Abstract
A critical shortcoming of maximum likelihood estimation (MLE) for test score estimation is that it fails for certain response patterns, including those consisting entirely of 0s or entirely of 1s. This is especially problematic in the early stages of a computerized adaptive testing (CAT) administration and for short tests. To work around the problem, test practitioners often set lower and upper bounds on the theta estimate and truncate the estimate to one of those bounds when the log likelihood function has no peak because the responses consist only of 0s or only of 1s. Even so, this MLE with truncation (MLET) method still cannot handle response patterns in which all of the harder items are answered correctly and all of the easier items are answered incorrectly. Bayesian estimation methods such as the modal a posteriori (MAP) method or the expected a posteriori (EAP) method can be viable alternatives to MLE. MAP and EAP, however, are known to produce estimates biased toward the center of the prior distribution, which shrinks the score scale. This study introduces an alternative approach, called MLE with fences (MLEF). In MLEF, several imaginary “fence” items with fixed responses are introduced so that the log likelihood function remains workable even for abnormal response patterns. The findings suggest that, unlike MLET, MLEF can handle any response pattern and, unlike both MAP and EAP, yields score estimates that do not shrink the theta scale.
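The fence idea can be illustrated with a small numerical sketch. It is not the study's implementation. It assumes a two-parameter logistic (2PL) response model, a plain grid search over theta, and two hypothetical fence items whose parameters (a = 3.0, b = ±3.5) were picked only for illustration; the abstract does not state the number of fences or their parameters. The helper names (`p_correct`, `log_likelihood`, `grid_argmax`, `mlef`) are likewise invented for this example.

```python
import math

def p_correct(theta, a, b):
    """Probability of a correct response under an assumed 2PL model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def log_likelihood(theta, items, responses):
    """Log likelihood of a 0/1 response pattern at a given theta."""
    total = 0.0
    for (a, b), u in zip(items, responses):
        p = p_correct(theta, a, b)
        total += math.log(p) if u == 1 else math.log(1.0 - p)
    return total

def grid_argmax(items, responses, lo=-6.0, hi=6.0, step=0.01):
    """Return the grid point in [lo, hi] with the highest log likelihood."""
    n = int(round((hi - lo) / step))
    grid = [lo + i * step for i in range(n + 1)]
    return max(grid, key=lambda t: log_likelihood(t, items, responses))

# Hypothetical fence items (assumed values, not taken from the source):
# an easy, highly discriminating item always scored correct, and a
# hard, highly discriminating item always scored incorrect.
LOWER_FENCE = ((3.0, -3.5), 1)
UPPER_FENCE = ((3.0, 3.5), 0)

def mlef(items, responses, **kwargs):
    """Maximize the log likelihood after appending the two fence items."""
    fenced_items = list(items) + [LOWER_FENCE[0], UPPER_FENCE[0]]
    fenced_responses = list(responses) + [LOWER_FENCE[1], UPPER_FENCE[1]]
    return grid_argmax(fenced_items, fenced_responses, **kwargs)

# Three illustrative items as (a, b) pairs, all answered correctly.
items = [(1.0, -1.0), (1.2, 0.0), (0.8, 1.0)]
all_correct = [1, 1, 1]

plain = grid_argmax(items, all_correct)   # runs off to the search boundary
fenced = mlef(items, all_correct)         # peaks inside the search range

print("plain MLE on grid:", plain)
print("MLE with fences:  ", fenced)
```

For the all-correct pattern, the plain log likelihood increases monotonically in theta, so the search simply returns the upper boundary of the grid. With the two fence items appended, the function has an interior peak, because the always-incorrect upper fence penalizes very high theta values.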
