Abstract
We examine the construct validity of the anchoring method used with 12 noncognitive scales from the Programme for International Student Assessment (PISA) 2012 project. This method combines individuals’ responses to vignettes with their self-ratings on Likert-type items. The use of anchoring vignettes has been reported to reverse country-level correlations between academic achievement scores and noncognitive measures from negative to positive, thereby aligning them with the typically reported individual-level correlations. Using the PISA 2012 data, we show that the construct validity of this approach may be open to question because the anchored scales produce a different set of latent dimensions than the nonanchored scales, even though both sets of scales were created from the same individual responses. We also demonstrate that only one of the three vignettes may be responsible for the resolution of the “paradox,” which suggests that the choice of vignettes may be more important than previously reported.

