Research article
First published online May 2, 2011

Item Response Modeling of Forced-Choice Questionnaires

Abstract

Multidimensional forced-choice formats can significantly reduce the impact of numerous response biases typically associated with rating scales. However, if scored with classical methodology, these questionnaires produce ipsative data, which lead to distorted scale relationships and make comparisons between individuals problematic. This research demonstrates how item response theory (IRT) modeling may be applied to overcome these problems. A multidimensional IRT model based on Thurstone’s framework for comparative data is introduced, which is suitable for use with any forced-choice questionnaire composed of items fitting the dominance response model, with any number of measured traits, and any block sizes (i.e., pairs, triplets, quads, etc.). Thurstonian IRT models are normal ogive models with structured factor loadings, structured uniquenesses, and structured local dependencies. These models can be straightforwardly estimated using the structural equation modeling (SEM) software Mplus. A number of simulation studies are performed to investigate how latent traits are recovered under various forced-choice designs and provide guidelines for optimal questionnaire design. An empirical application is given to illustrate how the model may be applied in practice. It is concluded that when the recommended design guidelines are met, scores estimated from forced-choice questionnaires with the proposed methodology reproduce the latent traits well.
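
The contrast the abstract draws between classical scoring and comparative-data modeling can be illustrated with a minimal Python sketch. It rests on assumptions that are not in the abstract: the function names, the points rule (block size minus rank), and the toy triplet data are hypothetical, and the pairwise coding follows the general Thurstonian treatment of rankings rather than the authors' own code. The first function shows why classically scored blocks are ipsative (every respondent's scale scores sum to the same constant); the second shows how a ranked block of n items can be recoded as n(n-1)/2 binary outcomes, the kind of data a Thurstonian IRT model would take as input.

```python
from itertools import combinations


def classical_scores(rankings, traits, n_traits):
    """Classical scoring (assumed rule): in a block of n items, the item
    ranked r-th (1 = most like me) adds n - r points to its trait."""
    scores = [0] * n_traits
    for block_ranks, block_traits in zip(rankings, traits):
        n = len(block_ranks)
        for rank, trait in zip(block_ranks, block_traits):
            scores[trait] += n - rank
    return scores


def binary_outcomes(block_ranks):
    """Recode one ranked block as n(n-1)/2 pairwise outcomes:
    1 if item i is ranked above item k, else 0, for every pair i < k."""
    pairs = combinations(range(len(block_ranks)), 2)
    return [int(block_ranks[i] < block_ranks[k]) for i, k in pairs]


# Hypothetical design: two triplets, each with one item per trait (0, 1, 2).
traits = [[0, 1, 2], [0, 1, 2]]

# Hypothetical rankings for two respondents (1 = most like me).
respondent_a = [[1, 2, 3], [2, 1, 3]]
respondent_b = [[3, 2, 1], [3, 1, 2]]

scores_a = classical_scores(respondent_a, traits, 3)
scores_b = classical_scores(respondent_b, traits, 3)

# Different profiles, identical totals: the scores are ipsative.
print(scores_a, sum(scores_a))  # [3, 3, 0] 6
print(scores_b, sum(scores_b))  # [0, 3, 3] 6

# Pairs are ordered (0,1), (0,2), (1,2).
print(binary_outcomes([2, 1, 3]))  # [0, 1, 1]
```

Because the totals are fixed by the design, a high score on one scale forces lower scores elsewhere, which is the source of the distorted scale relationships mentioned above. The binary recoding carries no such built-in constraint on the trait scores.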

Issue published: June 2011

Keywords

  1. forced-choice format
  2. forced-choice questionnaires
  3. ipsative data
  4. comparative judgment
  5. multidimensional IRT

Rights and permissions

© The Author(s) 2011.

Authors

Affiliations

Anna Brown
SHL Group, Thames Ditton, Surrey, UK
Alberto Maydeu-Olivares
University of Barcelona, Barcelona, Spain

This article was published in Educational and Psychological Measurement.
