MEGH: A parametric class of general hazard models for clustered survival data

In many applications of survival data analysis, the individuals are treated in different medical centres or belong to different clusters defined by geographical or administrative regions. The analysis of such data requires accounting for between-cluster variability. Ignoring such variability would impose unrealistic assumptions in the analysis and could affect the inference on the statistical models. We develop a novel parametric mixed-effects general hazard (MEGH) model that is particularly suitable for the analysis of clustered survival data. The proposed structure generalises the mixed-effects proportional hazards and mixed-effects accelerated failure time structures, among other structures, which are obtained as special cases of the MEGH structure. We develop a likelihood-based algorithm for parameter estimation in general subclasses of the MEGH model, which is implemented in our R package MEGH. We propose diagnostic tools for assessing the random effects and their distributional assumption in the proposed MEGH model. We investigate the performance of the MEGH model using theoretical and simulation studies, as well as a real data application on leukaemia.

We make the following technical assumptions.
A1. The parameter space Γ is compact.
A3. (Identifiability and continuity) The baseline hazard function h_0(t; θ) is continuous for each θ and t > 0, and h_0(·; θ) differs from the Weibull hazard function.
A4.
A7. There exist functions Λ_{k_1 k_2 k_3}(z) such that, for all 1 ≤ k_1, k_2, k_3 ≤ l and η ∈ B
The proof is based on adapting the proof of
where R represents the remainder term. Following the proof of Theorem 2.

Power Generalised Weibull
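The pdf, survival function and hazard function of the PGW distribution are as follows [3]:
f(t; η, ν, δ) = (ν / (δ η^ν)) t^(ν−1) [1 + (t/η)^ν]^(1/δ − 1) exp{1 − [1 + (t/η)^ν]^(1/δ)},
S(t; η, ν, δ) = exp{1 − [1 + (t/η)^ν]^(1/δ)},
h(t; η, ν, δ) = (ν / (δ η^ν)) t^(ν−1) [1 + (t/η)^ν]^(1/δ − 1),
where η > 0 is a scale parameter and ν, δ > 0 are shape parameters.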
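For readers who want to evaluate these quantities numerically, the following is a minimal Python sketch. It assumes the usual PGW parametrisation, in which the cumulative hazard is H(t) = [1 + (t/η)^ν]^(1/δ) − 1; the function names (pgw_hazard, pgw_cumhaz, pgw_survival, pgw_pdf) are hypothetical and are not taken from the MEGH R package.

```python
import numpy as np

def pgw_cumhaz(t, eta, nu, delta):
    """Cumulative hazard H(t) = [1 + (t/eta)^nu]^(1/delta) - 1 (assumed parametrisation)."""
    t = np.asarray(t, dtype=float)
    return (1.0 + (t / eta) ** nu) ** (1.0 / delta) - 1.0

def pgw_hazard(t, eta, nu, delta):
    """Hazard h(t), the derivative of H(t) with respect to t."""
    t = np.asarray(t, dtype=float)
    return (nu / (delta * eta ** nu)) * t ** (nu - 1.0) \
        * (1.0 + (t / eta) ** nu) ** (1.0 / delta - 1.0)

def pgw_survival(t, eta, nu, delta):
    """Survival function S(t) = exp(-H(t))."""
    return np.exp(-pgw_cumhaz(t, eta, nu, delta))

def pgw_pdf(t, eta, nu, delta):
    """Density f(t) = h(t) * S(t)."""
    return pgw_hazard(t, eta, nu, delta) * pgw_survival(t, eta, nu, delta)
```

Under this parametrisation, setting δ = 1 reduces the hazard to the Weibull hazard (ν/η^ν) t^(ν−1), which gives a quick sanity check on any implementation.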

Additional Simulation Results
In this section, we report some additional results regarding the simulations for the MEGH model (9). We first present simulation results for the case when the variance of the random effects is smaller, where we consider σ_u = 0.5. The results, which are shown in Figure 1, indicate that the MEGH model with the correct mixed hazard structure produces the smallest bias in this case as well.
We then present simulation results for the case when the random-effects distribution is misspecified. For this, we generate the random effects from a two-piece normal distribution so that σ_u = 1, while we fit the model assuming a normal distribution for the random effects. The results presented in Figure 2 suggest that the estimates from the MEGH model with the correct mixed hazard structure are quite robust to misspecification of the random-effects distribution. This finding is in line with the existing literature on this type of misspecification. However, we observe that the MEGH model with both an incorrect mixed hazard structure and a misspecified random-effects distribution produces substantially inaccurate estimates.
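As an illustration of how skewed random effects of this kind can be generated, the sketch below samples from a two-piece normal distribution with location mu and separate scales sigma1 (left of mu) and sigma2 (right of mu). This parametrisation, the function name rtwo_piece_normal, and the example parameter values are assumptions made here for illustration; this section does not state which parametrisation or parameter values were used to obtain σ_u = 1.

```python
import numpy as np

def rtwo_piece_normal(n, mu=0.0, sigma1=1.0, sigma2=1.0, seed=None):
    """Draw n values from a two-piece normal distribution (assumed parametrisation).

    The density is proportional to a N(mu, sigma1^2) density to the left of mu
    and to a N(mu, sigma2^2) density to the right of mu.
    """
    rng = np.random.default_rng(seed)
    z = np.abs(rng.standard_normal(n))                    # half-normal magnitudes
    left = rng.random(n) < sigma1 / (sigma1 + sigma2)     # P(X < mu) = sigma1 / (sigma1 + sigma2)
    return np.where(left, mu - sigma1 * z, mu + sigma2 * z)

# Hypothetical example: right-skewed random effects for 100 clusters
u = rtwo_piece_normal(100, mu=0.0, sigma1=0.5, sigma2=1.5, seed=123)
```

With sigma1 = sigma2 the draws reduce to an ordinary normal sample, so the degree of misspecification can be varied through the ratio of the two scales.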
We also present simulation results for the case when both mixed structures I and II are misspecified. For this, we generate simulated data from model (9) with the general mixed structure (1) and PGW baseline hazard. In this case, the two sets of random effects u_i and ũ_i are generated from normal distributions with σ_u = 1 and σ_ũ = 0.5, respectively, and with cov(u_i, ũ_i) = 0.2. The results, which are shown in Figure
The results, presented in Table 1, indicate that the Type I error of the test is at the nominal level 0.05 and that the power of the test is reasonably high even with a smaller number of clusters or a higher censoring rate.
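The pair of correlated random effects used in the scenario with the general mixed structure can be sketched as a draw from a bivariate normal distribution. The code below is a minimal illustration that assumes zero means and reads the value 0.2 as the covariance between the two random effects of a cluster; the function name simulate_random_effects is hypothetical.

```python
import numpy as np

def simulate_random_effects(n_clusters, sd_u=1.0, sd_u_tilde=0.5, cov=0.2, seed=None):
    """Draw (u_i, u_tilde_i) for each cluster from a zero-mean bivariate normal."""
    Sigma = np.array([[sd_u ** 2, cov],
                      [cov, sd_u_tilde ** 2]])
    # The covariance matrix must be positive definite to be valid
    if np.any(np.linalg.eigvalsh(Sigma) <= 0):
        raise ValueError("covariance matrix is not positive definite")
    rng = np.random.default_rng(seed)
    draws = rng.multivariate_normal(np.zeros(2), Sigma, size=n_clusters)
    return draws[:, 0], draws[:, 1]

# Hypothetical example: random effects for 50 clusters
u, u_tilde = simulate_random_effects(50, seed=2024)
```

With these values the implied correlation between the two random effects is 0.2 / (1 × 0.5) = 0.4.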
Finally, Figure 8 shows the confidence intervals for the leukaemia data. There seem to be slight differences between the confidence intervals obtained from the MEGH model and those from the model ignoring random effects. Under the model ignoring random effects, one may find that the parameter β_2 is not significant, whereas the MEGH model hardly supports that conclusion. It should be pointed out that the estimate of σ_u is relatively small for this data set, and we would expect the differences to be larger if σ_u were larger, as shown in our simulations.
Figure caption (fragment): (9) with the mixed structure I and log-logistic baseline hazard, and a two-piece normal distribution for the generated random effects with σ_u = 1.