Abstract
In traditional paired comparison models, heterogeneity in the population is ignored and all subjects are assumed to share the same preference structure. In the models considered here, the preference for one object over another is modelled explicitly as depending on subject-specific covariates, which allows for heterogeneity in the population. Because the models contain a large number of parameters by construction, we propose penalized estimation procedures to obtain the parameter estimates. The regularized estimation approach penalizes the differences between the parameters that correspond to a single covariate. It enforces variable selection and makes it possible to identify clusters of objects with respect to covariates. We consider simple binary as well as ordinal paired comparison models. The method is applied to data from a pre-election study in Germany.
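The penalty idea described above can be illustrated with a minimal sketch. The abstract gives no formulas, so the following are assumptions made for illustration only: a binary Bradley-Terry-type logit link, object strengths that are linear in the subject's covariates, and an L1 penalty on all pairwise differences of object-specific coefficients within each covariate. The names `preference_prob`, `fusion_penalty`, `beta0` and `beta` are hypothetical and are not taken from the paper or from any package.

```python
import numpy as np
from itertools import combinations


def preference_prob(x, beta0, beta, r, s):
    """Probability that a subject with covariates x prefers object r to object s.

    Assumed model (not stated in the abstract): logistic link applied to the
    difference of object strengths, with strength_r = beta0[r] + beta[r] @ x.
    """
    strength = beta0 + beta @ x  # one strength per object, shape (n_objects,)
    return 1.0 / (1.0 + np.exp(-(strength[r] - strength[s])))


def fusion_penalty(beta):
    """Sum of |beta[r, j] - beta[s, j]| over all object pairs r < s and covariates j.

    A zero difference for a pair means the two objects share the effect of
    covariate j (a cluster); if all differences for covariate j are zero, the
    covariate has no influence on any pairwise preference.
    """
    n_objects = beta.shape[0]
    return sum(
        np.abs(beta[r] - beta[s]).sum()
        for r, s in combinations(range(n_objects), 2)
    )


# Hypothetical coefficients: 3 objects (rows), 2 covariates (columns).
beta0 = np.zeros(3)
beta = np.array([
    [0.5, 0.0],
    [0.5, 0.0],   # same as object 0 in covariate 0: the two objects form a cluster
    [-1.0, 0.0],  # covariate 1 is identical across objects: effectively deselected
])
x = np.array([1.0, 0.0])  # covariates of one illustrative subject

p_01 = preference_prob(x, beta0, beta, 0, 1)
p_02 = preference_prob(x, beta0, beta, 0, 2)
penalty = fusion_penalty(beta)
```

In this toy setting objects 0 and 1 are indistinguishable for every subject, so `p_01` equals 0.5, and only the two pairs involving object 2 contribute to the penalty. In an actual fit the penalty would be multiplied by a tuning parameter and subtracted from the log-likelihood; that step is omitted here.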
