Abstract
We propose the nuclear norm penalty as an alternative to the ridge penalty for regularized multinomial regression. This convex relaxation of reduced-rank multinomial regression has the advantage of leveraging underlying structure among the response categories to make better predictions. We apply our method, nuclear penalized multinomial regression (NPMR), to Major League Baseball play-by-play data to predict outcome probabilities based on batter–pitcher matchups. The interpretation of the results meshes well with subject-area expertise and also suggests a novel understanding of what differentiates players.
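To make the idea concrete, the following is a minimal sketch of how a nuclear-norm-penalized multinomial fit could be computed. It is not the authors' implementation. It assumes an objective of the form: mean multinomial negative log-likelihood plus lam times the nuclear norm (sum of singular values) of the p-by-K coefficient matrix B, with an unpenalized intercept. It minimizes this objective by plain proximal gradient descent, where the proximal step soft-thresholds the singular values of B. The function names, the fixed step size, and the iteration count are illustrative choices.

```python
import numpy as np

def softmax(Z):
    """Row-wise softmax with the usual max-shift for numerical stability."""
    Z = Z - Z.max(axis=1, keepdims=True)
    E = np.exp(Z)
    return E / E.sum(axis=1, keepdims=True)

def nuclear_prox(B, t):
    """Proximal operator of t * nuclear norm: soft-threshold singular values."""
    U, s, Vt = np.linalg.svd(B, full_matrices=False)
    return (U * np.maximum(s - t, 0.0)) @ Vt

def fit_npmr_sketch(X, Y, lam, n_iter=500):
    """Illustrative proximal gradient fit (hypothetical helper, not from the paper).

    X   : (n, p) feature matrix
    Y   : (n, K) one-hot outcome indicators
    lam : nuclear norm penalty weight (assumed parameterization)
    """
    n, p = X.shape
    K = Y.shape[1]
    alpha = np.zeros(K)       # unpenalized intercepts
    B = np.zeros((p, K))      # penalized coefficient matrix
    # Conservative fixed step from a bound on the gradient's Lipschitz constant.
    step = n / (n + np.linalg.norm(X, 2) ** 2)
    for _ in range(n_iter):
        P = softmax(alpha + X @ B)   # fitted class probabilities
        R = (P - Y) / n              # gradient of mean loss w.r.t. linear predictor
        alpha = alpha - step * R.sum(axis=0)
        # Gradient step on B, then singular value soft-thresholding.
        B = nuclear_prox(B - step * (X.T @ R), step * lam)
    return alpha, B
```

Under these assumptions, larger values of lam zero out more singular values of B, so the fitted coefficient matrix has lower rank. This is the sense in which the penalty acts as a convex relaxation of a hard rank constraint.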
