Abstract
The model selection literature has been generally poor at reflecting the deep foundations of the Akaike information criterion (AIC) and at making appropriate comparisons to the Bayesian information criterion (BIC). There is a clear philosophy, a sound criterion based in information theory, and a rigorous statistical foundation for AIC. AIC can be justified as Bayesian using a “savvy” prior on models that is a function of sample size and the number of model parameters. Furthermore, BIC can be derived as a non-Bayesian result. Therefore, the choice between AIC and BIC for model selection cannot be argued as a matter of Bayesian versus frequentist perspective. Instead, the philosophical context of what is assumed about reality, approximating models, and the intent of model-based inference should determine whether AIC or BIC is used. Various facets of such multimodel inference are presented here, particularly methods of model averaging.
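As a rough illustration of the criteria named above, the following sketch computes AIC, BIC, and the corresponding model weights for a few candidate models. It uses the standard definitions AIC = -2 log L + 2K and BIC = -2 log L + K log(n), with weights proportional to exp(-score/2). The log-likelihoods, parameter counts, and sample size are hypothetical values chosen only for illustration, and the function names are not taken from the article. The prior used here, proportional to exp(K log(n)/2 - K), is assumed to be one form of the “savvy” prior the abstract refers to; under that assumption the BIC-based posterior model probabilities coincide with the AIC weights.

```python
import math

def aic(log_lik, k):
    """AIC = -2 log L + 2K, where K is the number of estimated parameters."""
    return -2.0 * log_lik + 2.0 * k

def bic(log_lik, k, n):
    """BIC = -2 log L + K log(n), where n is the sample size."""
    return -2.0 * log_lik + k * math.log(n)

def model_weights(scores, log_priors=None):
    """Weights proportional to exp(-score/2) times an optional model prior.

    With no prior (equal prior on models) and AIC scores, these are the
    usual Akaike weights used for model averaging.
    """
    if log_priors is None:
        log_priors = [0.0] * len(scores)
    logs = [-0.5 * s + lp for s, lp in zip(scores, log_priors)]
    top = max(logs)  # subtract the maximum for numerical stability
    raw = [math.exp(v - top) for v in logs]
    total = sum(raw)
    return [r / total for r in raw]

def savvy_log_prior(k, n):
    """Log of a prior proportional to exp(K log(n)/2 - K) (assumed form)."""
    return 0.5 * k * math.log(n) - k

# Hypothetical candidate models: (maximized log-likelihood, parameter count).
n = 100
candidates = [(-120.0, 2), (-117.5, 3), (-117.0, 5)]

aic_scores = [aic(ll, k) for ll, k in candidates]
bic_scores = [bic(ll, k, n) for ll, k in candidates]

aic_w = model_weights(aic_scores)
bic_w = model_weights(bic_scores)
bic_savvy_w = model_weights(
    bic_scores, [savvy_log_prior(k, n) for _, k in candidates]
)

for name, w in [("AIC", aic_w), ("BIC", bic_w), ("BIC + savvy prior", bic_savvy_w)]:
    print(name, [round(x, 4) for x in w])
```

In this sketch the BIC weights under an equal model prior favor smaller models more strongly than the AIC weights do, while the BIC weights under the assumed savvy prior reproduce the AIC weights up to rounding error.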
