Abstract
Some usability and interpretability issues for single-strategy cognitive assessment models are considered. These models posit a stochastic conjunctive relationship between a set of cognitive attributes to be assessed and performance on particular items/tasks in the assessment. The models considered make few assumptions about the relationship between latent attributes and task performance beyond a simple conjunctive structure. An example shows that these models can be sensitive to cognitive attributes, even in data designed to fit the Rasch model well. Several stochastic ordering and monotonicity properties are considered that enhance the interpretability of the models. Simple data summaries are identified that provide information about the presence or absence of cognitive attributes when the full computational power needed to estimate the models is not available.
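As a rough illustration of what a stochastic conjunctive relationship can look like, the sketch below assumes a slip-and-guess parameterization: an examinee who holds every attribute an item requires answers correctly with probability 1 - slip, and otherwise with probability guess. The required-attribute matrix `q`, the attribute vector `alpha`, and the `slip` and `guess` values are hypothetical and are not taken from the abstract; this is one possible instance of such a model, not necessarily the exact form analyzed in the paper.

```python
def has_all_required(alpha, q_row):
    """Conjunctive rule: True only if every attribute the item requires is present."""
    return all(a == 1 for a, needed in zip(alpha, q_row) if needed == 1)


def p_correct(alpha, q_row, slip, guess):
    """Probability of a correct response under the assumed slip/guess form."""
    # Assumption: mastery of all required attributes gives 1 - slip;
    # missing any required attribute drops the probability to guess.
    return 1.0 - slip if has_all_required(alpha, q_row) else guess


# Hypothetical setup: two items, two attributes.
# Item 0 requires attribute 0 only; item 1 requires both attributes.
q = [[1, 0],
     [1, 1]]
alpha = [1, 0]  # examinee holds attribute 0 but not attribute 1

probs = [p_correct(alpha, row, slip=0.1, guess=0.2) for row in q]
print(probs)
```

In this sketch, gaining an attribute can never lower the probability of a correct response as long as guess is no larger than 1 - slip, which is the kind of monotonicity the abstract refers to.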
