The term “computerized adaptive testing,” or CAT, defines itself fairly well by its three component words. It is a way of using computers to administer tests that adapt during the testing period to the test-takers’ ability levels, based on their answers, the amount of time they take and other factors. It has also been called “tailored testing,” but that term is becoming obsolete because of the short and sweet “CAT” acronym.
CAT is starting to make inroads into classrooms at all educational levels, although it is still used primarily in admissions-related testing such as the SAT, ACT, GRE, GMAT and LSAT. There are some developmental issues that must be addressed to make CAT work in small test populations, since accurate scoring requires relatively large samples. The minimum number for a sample, according to the Education Resources Information Center, is 1,000 students, but 2,000 is a common benchmark. This sample size makes locally developed tests difficult, often impossible, to create—at least at present. With all the time, energy and investment being allocated to CAT development, solutions to existing weaknesses could appear at any time.
How CAT developed
Computers generated excitement in some educators, and arched eyebrows in others, when the 1980s brought rapid increases in computer power at lower and lower price points. Within a decade it was clear that computer technology made test administration, scoring, individual and group analysis, data management, storage and score reporting more accurate and less time-consuming, and therefore less costly. Computer-based testing, even before it evolved into adaptive models, was clearly a superior way of doing things.
With the continuing advance of the technology (and continuing reduction in cost), educators became increasingly interested in performance-based testing, which requires that students demonstrate the capacity to use what they’ve learned. Test designers and education researchers realized that they had to develop new means of performance assessment. Some 20 years of research, development, and trial and error have brought us to the point where CAT is firmly established in admissions-related examinations (SAT, GRE, etc.). Many nonprofit and for-profit firms and organizations also deploy CAT in various ways to help test-takers prepare for their exam(s).
How it works
As the test-takers proceed through a CAT exam, the software adapts continuously to them and then, on the basis of performance on both present and previous items, selects the next question for presentation. If the test-takers do well on intermediate-level questions, the CAT program will assemble a set of queries at the advanced level. On the other hand, if the test-takers do poorly at the intermediate level, the program will revert to basic-level questions for the next set.
Essentially, CAT software is constantly working to find the level at which each individual test-taker performs best. After a “mathematically appropriate” number of questions have been asked and answered, the examination terminates when a test-taker’s performance at a particular level is shown to be the highest that can be sustained.
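The up-and-down process described above can be sketched in a few lines of Python. This is a simplified illustration, not how any particular exam is implemented: the three level names come from the description above, while the function name run_adaptive_test, the answer_fn callback, the set size of three questions and the "more than half correct" rule are all assumptions made for the example.

```python
LEVELS = ["basic", "intermediate", "advanced"]


def run_adaptive_test(answer_fn, set_size=3, start_level=1):
    """Present sets of questions, moving up or down one level per set.

    answer_fn(level_index) -> bool is a hypothetical callback that presents
    one question at the given level and reports whether the answer was
    correct. Returns (highest sustained level or None, history of sets).
    """
    level = start_level
    passed, failed = set(), set()
    history = []  # one (level name, number correct) entry per set

    # The loop always ends: the level only moves in one direction until the
    # test-taker passes one level and fails the next, or reaches either end.
    while True:
        correct = sum(1 for _ in range(set_size) if answer_fn(level))
        did_well = correct * 2 > set_size  # assumed rule: more than half right
        history.append((LEVELS[level], correct))
        (passed if did_well else failed).add(level)

        if did_well:
            # Stop if there is nothing harder, or the harder level was failed.
            if level == len(LEVELS) - 1 or level + 1 in failed:
                return LEVELS[level], history
            level += 1
        else:
            if level == 0:
                return None, history  # could not sustain even the basic level
            # Stop if the easier level has already been passed.
            if level - 1 in passed:
                return LEVELS[level - 1], history
            level -= 1


# Simulated test-taker who can handle basic and intermediate questions only.
placement, history = run_adaptive_test(lambda level: level <= 1)
print(placement, history)
```

In this sketch the test stops as soon as the test-taker has done well at one level and poorly at the next one up, which is one simple reading of "the highest level that can be sustained." Operational CAT programs generally choose individual questions rather than fixed sets and use statistical stopping rules.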
What are CAT’s advantages?
In addition to the cost-effectiveness and other efficiencies mentioned above, CAT has many important advantages. These include, but are certainly not limited to, the following:
- CAT technology requires less time and fewer questions, compared to paper-and-pencil exams, to develop more accurate ratings of test-takers' proficiency.
- CAT scoring produces finer distinctions and more accurate assessments than can be derived from merely a total number of correct responses.
- The scoring algorithm considers not just the number of correct answers, but also the difficulty level of the questions answered correctly. Test-takers who correctly answer more difficult sets of questions score better than those who correctly answer easier ones.
- By excluding questions outside test-takers’ proficiency levels, CAT exams take less time than even other computer-based exams.
- The challenge faced by test-takers is both reasonable and realistic, since the questions are neither too difficult nor too easy.
- Exam security is greatly enhanced by the fact that every test-taker is given a unique set of questions, which is not available before the test begins.
- CAT technology lets administrators give test-takers nearly instantaneous feedback on their performance.
- Because test results are stored on a computer, educators can track and analyze students’ performance over a period of time or a sequence of exams, and use it to assess student needs and remedial requirements.
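The point about scoring, that the difficulty of the questions matters and not just the count of correct answers, can be illustrated with a small Python sketch. It assumes a one-parameter (Rasch) item response model, one common choice in adaptive testing; the article does not say which model any given exam uses. The function name estimate_ability and the difficulty values are made up for the example.

```python
import math


def estimate_ability(responses, iterations=25):
    """Maximum-likelihood ability estimate under an assumed Rasch model.

    responses is a list of (difficulty, answered_correctly) pairs. The
    estimate only exists when the answers are a mix of right and wrong.
    """
    theta = 0.0  # starting guess for the ability level
    for _ in range(iterations):
        # Probability of a correct answer to each question at this ability.
        probs = [1.0 / (1.0 + math.exp(-(theta - b))) for b, _ in responses]
        # Newton-Raphson step on the log-likelihood.
        gradient = sum((1.0 if ok else 0.0) - p
                       for (_, ok), p in zip(responses, probs))
        information = sum(p * (1.0 - p) for p in probs)
        theta += gradient / information
    return theta


# Two hypothetical test-takers, each with three correct answers out of five.
easier_set = [(-2.0, True), (-1.5, True), (-1.0, True), (-0.5, False), (0.0, False)]
harder_set = [(0.0, True), (0.5, True), (1.0, True), (1.5, False), (2.0, False)]

print(estimate_ability(easier_set))
print(estimate_ability(harder_set))
```

Both test-takers have the same raw score, but the one who faced the harder questions receives the higher ability estimate, which is the kind of distinction a simple count of correct answers cannot make.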
The future of CAT
You will pardon the pun, but this is one “CAT” that will have far more than nine lives. Because the technology is still advancing, CAT exams will become more and more closely tailored to each test-taker in the future. Not only will testing become more personalized, but the results will also be far more instructive for both teachers and students.
As CAT software grows in sophistication, it will be able to mine more, and more focused, meaning from tests that used to give one-dimensional numeric scores or letter grades. The future promises test results that don’t just yield a score, but provide enough information for educators to help students address their weak points and expand on their strengths. As far as the future of testing goes—watch out, here comes another pun—the CAT is out of the bag.