Abstract
General cognitive diagnostic models (CDMs) such as the generalized deterministic input, noisy, “and” gate (G-DINA) model are flexible in that they allow for both compensatory and noncompensatory relationships among the subskills within the same test. Most previous CDM applications in the literature have been add-ons to simulation studies. Although some studies have applied CDMs such as the Fusion Model and the Rule Space Model to educational assessment data in general and to second-language data in particular, few have applied general models such as the G-DINA. The purpose of the present study was to demonstrate the application of the G-DINA model to the reading comprehension data of a high-stakes test. To this end, an initial Q-matrix was developed, validated, and cross-validated. The skill profiles of the test takers were estimated using the “CDM” package in R. Throughout, the process of constructing and validating a Q-matrix was elaborated on, the benefits of general models were emphasized, and implications for research investigating inter-skill relationships were discussed. Finally, suggestions were presented for further research that takes better advantage of the flexibility of general diagnostic models.
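To make the flexibility claim concrete, the following minimal Python sketch evaluates the identity-link G-DINA item response function for a single item that requires two attributes. The function name `gdina_prob` and all parameter values are hypothetical and chosen only for illustration; they are not taken from the study. Setting the main effects to zero leaves a conjunctive, DINA-like (noncompensatory) item, whereas setting the interaction to zero leaves an additive item, which is one common way of representing a compensatory relationship.

```python
from itertools import product


def gdina_prob(alpha, delta0, main, interaction):
    """P(correct) for a two-attribute item under the identity-link G-DINA.

    alpha       -- tuple of two 0/1 mastery indicators
    delta0      -- baseline probability when no attribute is mastered
    main        -- tuple of the two main effects
    interaction -- two-way interaction effect
    """
    a1, a2 = alpha
    return delta0 + main[0] * a1 + main[1] * a2 + interaction * a1 * a2


# Hypothetical parameter values, for illustration only.
items = {
    "DINA-like (noncompensatory)": dict(delta0=0.2, main=(0.0, 0.0), interaction=0.6),
    "additive (compensatory)": dict(delta0=0.2, main=(0.3, 0.3), interaction=0.0),
}

for label, params in items.items():
    print(label)
    # Enumerate the four possible mastery profiles for two attributes.
    for alpha in product((0, 1), repeat=2):
        print(f"  profile {alpha}: P(correct) = {gdina_prob(alpha, **params):.2f}")
```

In the DINA-like item only the profile that masters both attributes departs from the baseline, while in the additive item each mastered attribute raises the success probability on its own. Because the constraints are applied item by item, both patterns can occur within the same test under the general model.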

