Abstract
This tutorial article demonstrates how time-to-event data can be modelled flexibly by taking advantage of advanced inference methods recently developed for generalized additive mixed models. In particular, we describe the pre-processing steps needed to transform such data into a suitable format and show how a variety of effects, including a smooth nonlinear baseline hazard as well as potentially nonlinear and nonlinearly time-varying effects, can be estimated and interpreted. We also present graphical tools for model evaluation and for interpreting the estimated effects. Throughout, we demonstrate the approach using several application examples. The article is accompanied by a new
