Abstract
Researchers in psychology and education strive to understand the intersections among validity, educational measurement, and cognitive theory. Guided by a mixed model conceptual framework, this study investigates how respondents' opinions inform the validation argument. Validity evidence for a science assessment was collected through traditional paper-and-pencil tests, surveys, and think-aloud and exit interviews with fifth- and sixth-grade students. Item response theory analyses supplied technical evidence concerning the assessment's internal structure. Surveys provided information on perceived item difficulty and fairness. Think-aloud and exit interviews supplied context and response-process information to clarify and explain emerging issues. This research demonstrates how quantitative and qualitative data can be used in concert to inform the validation process and highlights the use of think-aloud interviews as an explanatory tool.
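The abstract does not detail the item response theory analyses it mentions; as a minimal illustrative sketch (not the authors' actual implementation), the one-parameter Rasch model commonly used in such internal-structure analyses gives the probability of a correct item response as a logistic function of the gap between a respondent's ability and the item's difficulty:

```python
import math

def rasch_probability(theta: float, difficulty: float) -> float:
    """Probability of a correct response under the Rasch (1960) model:
    P(X = 1 | theta, b) = exp(theta - b) / (1 + exp(theta - b)),
    where theta is respondent ability and b is item difficulty."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))

# When ability exactly matches item difficulty, success is a coin flip.
print(rasch_probability(0.0, 0.0))   # 0.5
# Ability one logit above the item's difficulty raises the probability.
print(rasch_probability(1.0, 0.0))
```

In practice such analyses fit ability and difficulty parameters to observed response matrices and examine fit statistics to evaluate the internal structure of the assessment; the function above only shows the model's functional form.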