Dynamic prediction of survival using multivariate functional principal component analysis: A strict landmarking approach

Dynamically predicting patient survival probabilities using longitudinal measurements has become increasingly important as routine data collection becomes more common. Many existing models utilize a multi-step landmarking approach for this problem, mostly due to its ease of use and versatility, but unfortunately most fail to do so appropriately. In this article we make use of multivariate functional principal component analysis to summarize the available longitudinal information, and employ a Cox proportional hazards model for prediction. Additionally, we consider a centred functional principal component analysis procedure in an attempt to remove the natural variation incurred by the difference in age of the considered subjects. We formalize the difference between a ‘relaxed’ landmarking approach, where only the validation data are landmarked, and a ‘strict’ landmarking approach, where both the training and validation data are landmarked. We show that a relaxed landmarking approach fails to effectively use the information contained in the longitudinal outcomes, thereby producing substantially worse prediction accuracy than a strict landmarking approach.


A. Simulation study - additional censoring scenarios
In this section we display some additional simulation results that highlight the effect of different censoring scenarios on the performance of the models. The scenarios shown here are:
• Scenario 1: Time-on-study data and light censoring (≈ 20% of observations censored).
The underlying multivariate data and survival times are generated identically to the procedure described in the article. Right-censoring times C_i were generated from an exponential distribution with rate exp(−3.5) (light), exp(−2.75) (medium), or exp(−2) (heavy) to obtain the censoring percentages stated above.
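The censoring step can be illustrated with a minimal sketch, which is not the code used for the simulation study. It draws exponential censoring times at the three rates given above and applies them to a set of event times. The event times here are placeholder Weibull draws chosen only for illustration (an assumption; the article's data-generating mechanism is not reproduced), so the resulting censoring fractions will not match the percentages reported for the scenarios. The names `censor`, `log_rates`, and `censored_fraction` are hypothetical.

```python
import numpy as np


def censor(event_times, log_rate, rng):
    """Apply independent exponential right-censoring with rate exp(log_rate)."""
    rate = np.exp(log_rate)
    # NumPy parameterizes the exponential by its scale (mean), i.e. 1 / rate.
    censor_times = rng.exponential(scale=1.0 / rate, size=event_times.shape)
    observed = np.minimum(event_times, censor_times)  # follow-up time
    event = event_times <= censor_times               # True if the event is observed
    return observed, event


rng = np.random.default_rng(0)

# Placeholder event times (assumption): Weibull draws, NOT the article's mechanism.
event_times = 10.0 * rng.weibull(1.5, size=5000)

# Log-rates of the censoring distribution, as stated in the text.
log_rates = {"light": -3.5, "medium": -2.75, "heavy": -2.0}

censored_fraction = {}
for label, log_rate in log_rates.items():
    observed, event = censor(event_times, log_rate, rng)
    censored_fraction[label] = 1.0 - event.mean()
```

A larger rate gives shorter censoring times on average, so the censored fraction increases from the light to the heavy setting for any fixed set of event times.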
The figures below display the performance measures of the considered methods for the different scenarios.

Figure 1. Time-dependent AUC (tdAUC), Brier Score and MSE in the third scenario (time-on-study data, light censoring) for the considered methods over landmark times at 3, 6 and 9 years after baseline. Landmark method ("LM"); age-based centered ("ABC"). Dashed lines: relaxed landmarked methods. Solid lines: strictly landmarked methods. Dotted lines: true probabilities. MFPCCox 1 (LM: Relax, LASSO: No, ABC: No) used as reference method. Number of people at risk at evaluation times displayed in red.

Figure 2. Time-dependent AUC (tdAUC), Brier Score and MSE in the fifth scenario (time-on-study data, heavy censoring) for the considered methods over landmark times at 3, 6 and 9 years after baseline. Landmark method ("LM"); age-based centered ("ABC"). Dashed lines: relaxed landmarked methods. Solid lines: strictly landmarked methods. Dotted lines: true probabilities. MFPCCox 1 (LM: Relax, LASSO: No, ABC: No) used as reference method. Number of people at risk at evaluation times displayed in red.

Figure 4. Time-dependent AUC (tdAUC), Brier Score and MSE in the eighth scenario (age-at-observation data, heavy censoring) for the considered methods over landmark times at 3, 6 and 9 years after baseline. Landmark method ("LM"); age-based centered ("ABC"). Dashed lines: relaxed landmarked methods. Solid lines: strictly landmarked methods. Dotted lines: true probabilities. MFPCCox 1 (LM: Relax, LASSO: No, ABC: No) used as reference method. Number of people at risk at evaluation times displayed in red.