In applications of item response theory (IRT), an estimate of the reliability of the ability estimates or sum scores is often reported. However, analytical expressions for the standard errors of the reliability coefficient estimators are not available in the literature, so the variability of the estimated reliability is typically not reported. In this study, the asymptotic variances of the IRT marginal and test reliability coefficient estimators are derived for dichotomous and polytomous IRT models, assuming an asymptotically normally distributed item parameter estimator. The results are used to construct confidence intervals for the reliability coefficients. Simulations with the generalized partial credit model and the three-parameter logistic model show that the confidence intervals for the test reliability coefficient have good coverage properties in finite samples under a variety of settings. In contrast, the estimator of the marginal reliability coefficient has finite-sample bias, so its confidence intervals do not attain the nominal level for small sample sizes; the bias tends to zero as the sample size increases.
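As a rough illustration of how an asymptotically normal item parameter estimator can yield a confidence interval for a reliability coefficient, the sketch below applies the generic delta method with a numerical gradient. It is not the derivation from the study: the function names (`delta_method_ci`, `toy_reliability`), the toy ratio used as a stand-in for a reliability coefficient, and the example numbers are all assumptions made for illustration.

```python
import math

def delta_method_ci(g, theta_hat, cov_hat, z=1.959964, eps=1e-6):
    """Wald-type interval for g(theta) via the delta method.

    g         : hypothetical function mapping parameters to a scalar
                (here a stand-in for a reliability coefficient)
    theta_hat : list of parameter estimates
    cov_hat   : estimated covariance matrix of theta_hat (list of lists),
                already scaled by the sample size
    z         : normal quantile (default gives an approximate 95% interval)
    """
    k = len(theta_hat)
    # Central-difference approximation to the gradient of g at theta_hat.
    grad = []
    for j in range(k):
        up = list(theta_hat)
        down = list(theta_hat)
        up[j] += eps
        down[j] -= eps
        grad.append((g(up) - g(down)) / (2 * eps))
    # Quadratic form grad' * cov_hat * grad approximates Var(g(theta_hat)).
    var = sum(grad[i] * cov_hat[i][j] * grad[j]
              for i in range(k) for j in range(k))
    est = g(theta_hat)
    se = math.sqrt(var)
    return est, se, (est - z * se, est + z * se)

def toy_reliability(v):
    # Assumed toy form: true-score variance over total variance.
    return v[0] / (v[0] + v[1])

# Made-up estimates and covariance matrix, for illustration only.
est, se, ci = delta_method_ci(toy_reliability,
                              [4.0, 1.0],
                              [[0.04, 0.0], [0.0, 0.01]])
print(est, se, ci)
```

In an actual IRT application, `g` would map the estimated item parameters to the marginal or test reliability coefficient, and `cov_hat` would come from the fitted model's estimated parameter covariance matrix. The analytical gradients derived in the study would replace the numerical approximation used here.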
