Simple models are preferred over complex ones, but overly simplistic models can lead to erroneous interpretations. The classical approach is to start with a simple model and to assess its shortcomings with residual-based model diagnostics. One then increases the complexity of this initial, overly simple model to obtain a better-fitting one. I illustrate how transformation analysis can be used as an alternative approach to model choice. Instead of adding complexity to simple models, step-wise complexity reduction is used to help identify simpler and more interpretable models. As an example, body mass index (BMI) distributions in Switzerland are modelled by means of transformation models to understand the impact of sex, age, smoking and other lifestyle factors on a person's BMI. In this process, I search for a compromise between model fit and model interpretability. Special emphasis is given to understanding the connections between transformation models of increasing complexity. The models used in this analysis range from evergreens, such as the normal linear regression model with constant variance, to novel models with extremely flexible conditional distribution functions, such as transformation trees and transformation forests.
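
To make the connection between the simplest and the more flexible models concrete, the sketch below shows, under stated assumptions, how the normal linear regression model with constant variance can be written as a transformation model of the form P(Y <= y | x) = Phi(h(y | x)), where h is linear in y. The function names and the parameter values (alpha, beta, sigma) are hypothetical and chosen only for illustration. They are not estimates from the BMI analysis, and the code does not use the mlt or trtf packages.

```python
import math


def std_normal_cdf(z):
    """Standard normal distribution function Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def normal_linear_cdf(y, x, alpha, beta, sigma):
    """P(Y <= y | x) when Y | x ~ N(alpha + beta * x, sigma^2)."""
    return std_normal_cdf((y - alpha - beta * x) / sigma)


def transformation_cdf(y, x, h):
    """P(Y <= y | x) = Phi(h(y, x)) for a transformation h increasing in y."""
    return std_normal_cdf(h(y, x))


# Hypothetical parameter values, for illustration only.
alpha, beta, sigma = 25.0, 0.05, 4.0


def linear_h(y, x):
    """Linear transformation: the special case that recovers the normal linear model."""
    return (y - alpha - beta * x) / sigma


# Both formulations give the same conditional distribution function.
p_classic = normal_linear_cdf(27.0, 40.0, alpha, beta, sigma)
p_transformation = transformation_cdf(27.0, 40.0, linear_h)
print(abs(p_classic - p_transformation) < 1e-12)
```

Replacing `linear_h` with a more flexible function that is still increasing in y, or one whose shape depends on x, gives models of increasing complexity within the same framework. Read in the other direction, it gives the step-wise simplifications described above.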
