Conditioning on a collider may or may not explain the relationship between lower neuroticism and premature mortality in Gale et al. (2017)

In their Commentary, Richardson, Davey Smith, and Munafò (2018) note that our findings of a health-protective effect of neuroticism could be due to our conditioning on a collider (self-rated health). They conducted exploratory analyses on 18 covariates and found evidence in support of this interpretation. However, in our paper and this Reply, we carried out analyses that suggested that the health-protective effects of neuroticism were attributable to a neuroticism facet related to worry and vulnerability. These analyses did not condition upon self-rated health or other possible colliders. As such, our results suggest that self-rated health may have been a suppressor variable. This interpretation is consistent with previous findings. Future studies will reveal whether self-rated health is a collider, a suppressor, or both. Until then, however, these results and those of our earlier study recommend an in-depth study of mortality and neuroticism at the level of facets.


Reply
We (Gale et al., 2017) analyzed data on 321,456 UK Biobank participants to address why higher neuroticism is sometimes related to lower mortality (e.g., Korten et al., 1999). We noted that the studies that revealed an inverse relationship between neuroticism and mortality included self-rated health as a predictor, perhaps because it is associated with mortality, even when multiple objective measures of health status are included in models (for reviews, see Benyamini & Idler, 1999; Idler & Benyamini, 1997). We therefore tested whether including self-rated health, which was modestly related to neuroticism, r_s = .23 (Gale et al., p. 1348), in a model changed the sign of the neuroticism-mortality relationship. We found that it did so and then set out to test two possible explanations for this phenomenon.
The first possible explanation that we tested was that self-rated health moderated the neuroticism-mortality association. We therefore included a Neuroticism × Self-Rated Health interaction in our models and also examined the association between neuroticism and mortality at each self-rated health stratum. The interaction was significant only for cancer death; analyses stratified by self-rated health suggested that neuroticism was protective against all-cause mortality among participants with "fair" self-rated health and protective against cancer mortality among participants with "fair" or "poor" self-rated health (Gale et al., 2017, pp. 1350-1352).
The second possible explanation that we tested was that self-rated health acted as a negative suppressor (Tzelgov & Henrik, 1991). In other words, we tested whether including self-rated health in models controlled for aspects of neuroticism related to poorer health. This explanation is based on the fact that in the presence of the pattern of correlations that gives rise to the statistical phenomenon known as negative suppression, mediation analyses will result in opposite signs for the direct and indirect effects (Tzelgov & Henrik, 1991); this is sometimes called "inconsistent mediation" (see MacKinnon, Krull, & Lockwood, 2000, p. 175). The self-rated-health and personality literatures, as much as it is possible for longitudinal and prospective studies to do so, provide findings that are consistent with this explanation.
This literature shows that neuroticism has a negative effect on (is on a causal pathway to) self-rated health (Löckenhoff, Terracciano, Ferrucci, & Costa, 2012), that neuroticism has a positive effect on mortality (Graham et al., 2017), and that poor self-rated health has a positive effect on mortality (for reviews, see Benyamini & Idler, 1999; Idler & Benyamini, 1997), including even non-illness-related mortality (Heistaro, Jousilahti, Lahelma, Vartiainen, & Puska, 2001). Neuroticism could therefore have a direct and positive effect on mortality and an indirect negative effect (via self-rated health) on mortality. This was in fact found, although only in women, in a previous study (Ploubidis & Grundy, 2009).
Because personality-health associations are often limited to one or a few personality facets (e.g., Terracciano et al., 2009), we followed a reviewer's suggestion and tested whether variance related to health-harming but not health-helping facets of neuroticism might have been controlled for by the inclusion of self-rated health in the model. Our analyses proceeded as follows. We first had to contend with the fact that participants completed the short Neuroticism scale from the Revised Eysenck Personality Questionnaire (EPQ-R; Eysenck, Eysenck, & Barrett, 1985), which does not operationalize facets. We therefore operationalized facets by conducting an exploratory bifactor analysis. This involves extracting factors using exploratory factor analysis and then rotating these factors so that all items have high loadings on a general factor and each item has a high loading on, at most, one of two or more special factors that are orthogonal to the general factor (Jennrich & Bentler, 2011). The latent variable for each special factor, therefore, is made up only of item variance related to that special factor; the latent variable score for the general factor consists only of the common item variance. A reviewer advised that we use bifactor analysis because, by not doing so, such as by using simple sum scores, it is not possible to determine whether an association between a facet and an outcome is related to the facet or the general factor, which would share a considerable portion of variance with the facet (see Wiernik, Wilmot, & Kostal, 2015, for a discussion). Alongside the general neuroticism factor, the analysis yielded two special factors, representing facets that we labeled "anxious/tense" and "worried/vulnerable" (see Fig. 1).
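The reviewer's concern about simple sum scores can be illustrated with a small simulation. The sketch below is not the exploratory bifactor analysis itself: it only generates data from a bifactor-style model with known latent scores. The loadings (.6 and .5), the three-item facet, the outcome coefficient, and all function names are hypothetical choices made for illustration. The outcome depends on the general factor alone, yet the facet sum score still correlates with it, whereas the special factor does not.

```python
import random


def pearson_r(a, b):
    """Pearson correlation of two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / (var_a * var_b) ** 0.5


def simulate_sum_score_confounding(n=20000, seed=0):
    """Simulate three items that load on a general and a special factor.

    All numeric values are hypothetical. The outcome is driven by the
    general factor only; the special factor has no effect on it.
    """
    rng = random.Random(seed)
    general_loading, special_loading = 0.6, 0.5
    unique_sd = (1 - general_loading ** 2 - special_loading ** 2) ** 0.5
    specials, sum_scores, outcomes = [], [], []
    for _ in range(n):
        general = rng.gauss(0, 1)
        special = rng.gauss(0, 1)  # orthogonal to the general factor
        items = [
            general_loading * general
            + special_loading * special
            + rng.gauss(0, unique_sd)
            for _ in range(3)
        ]
        outcome = 0.5 * general + rng.gauss(0, 1)
        specials.append(special)
        sum_scores.append(sum(items))
        outcomes.append(outcome)
    return pearson_r(sum_scores, outcomes), pearson_r(specials, outcomes)


r_sum_score, r_special = simulate_sum_score_confounding()
print(f"facet sum score vs outcome: r = {r_sum_score:.3f}")
print(f"special factor vs outcome:  r = {r_special:.3f}")
```

Under these assumptions the sum score inherits its association with the outcome entirely from the general factor, which is why scores that separate the two sources of variance are needed before a facet-level interpretation is made.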
We then examined associations between these latent variable scores and each mortality outcome in one model that included age, sex, and the general neuroticism factor and a second model that additionally included all the covariates, including self-rated health. The critical analyses were those related to the first model (Gale et al., 2017, Table 4). There could be no conditioning on a collider in this model because neither self-rated health nor other health-related covariates were included. For death from all causes, cancer, cardiovascular disease, and respiratory disease, but not from external causes, higher worried/vulnerable scores were associated with reduced risk. Anxious/tense scores were not associated with any mortality outcomes. Only the relationship between the worried/vulnerable facet and all-cause mortality survived adjusting for the covariates and correction for the false discovery rate.
Richardson and colleagues (2019) proposed that our findings relating to conditioning on self-rated health could be spurious if neuroticism and mortality risk factors independently influenced self-rated health because, in these circumstances, self-rated health is a collider (note that in their model, self-rated health is a consequence and not a cause of poor health). To explore this possibility, they examined associations between neuroticism and the health-related covariates from our analyses, both in the total sample and stratified by self-rated health. For some covariates, they found that neuroticism was associated with increased risk in the total sample and reduced risk in self-rated health strata, an example of Simpson's (1951) paradox. They interpreted this as evidence that our finding of a health-protective effect of neuroticism was spurious.
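The collider mechanism that Richardson and colleagues describe can be sketched with simulated data. This is a minimal illustration under stated assumptions, not a reanalysis: neuroticism and a single mortality risk factor are drawn independently, both raise a continuous "poor self-rated health" score with equal (hypothetical) weights, and the strata labels and cut points at -1 and 1 are arbitrary. In the whole sample the two variables are unrelated; within each stratum they become negatively correlated.

```python
import random


def pearson_r(a, b):
    """Pearson correlation of two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / (var_a * var_b) ** 0.5


def simulate_collider(n=30000, seed=0):
    """Stratify on a variable caused by two independent inputs.

    Returns the overall neuroticism-risk correlation and a dict of
    within-stratum correlations. All coefficients are hypothetical.
    """
    rng = random.Random(seed)
    strata = {"good": ([], []), "fair": ([], []), "poor": ([], [])}
    all_neuro, all_risk = [], []
    for _ in range(n):
        neuro = rng.gauss(0, 1)
        risk = rng.gauss(0, 1)  # independent of neuroticism by construction
        poor_rating = neuro + risk + rng.gauss(0, 1)  # the collider
        if poor_rating < -1:
            label = "good"
        elif poor_rating < 1:
            label = "fair"
        else:
            label = "poor"
        strata[label][0].append(neuro)
        strata[label][1].append(risk)
        all_neuro.append(neuro)
        all_risk.append(risk)
    overall = pearson_r(all_neuro, all_risk)
    within = {k: pearson_r(ns, rs) for k, (ns, rs) in strata.items()}
    return overall, within


overall_r, within_r = simulate_collider()
print(f"overall: r = {overall_r:.3f}")
for label, r in within_r.items():
    print(f"{label} stratum: r = {r:.3f}")
```

The negative within-stratum correlations arise purely from stratification, so a pattern of this kind is compatible with collider bias but does not by itself rule out other mechanisms.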
We thank Richardson and colleagues for their comments. Collider bias may lead to results such as those we reported. However, as noted, negative suppression may also lead to the same results, and the facet-level analyses that we described above did not include possible colliders and so seem to support this explanation of the phenomenon. We also wish to be clear that we did not conclude from our previous study that the phenomenon in question was the result of neuroticism having a different effect at different levels of self-rated health, which seems to be what Richardson and colleagues surmised. To prevent further misunderstanding, we note that we concluded in the final paragraph that "perhaps the most promising avenue for future research would be a closer examination of the role of Neuroticism's facets" (Gale et al., 2017, p. 1355). In other words, we judged that our results supported the possibility that self-rated health acted as a negative suppressor that revealed the effects of a specific facet of neuroticism.
In response to their Commentary, we now report further analyses that we carried out to test whether their findings can also be explained by the worried/vulnerable facet.

Method
To start, we tested whether neuroticism's bifactor structure replicated in two independent data sets: 8,158 participants in Generation Scotland (Smith et al., 2006) and the 1,434 participants used to develop the EPQ-R (Eysenck et al., 1985). We then used data from the participants from our previous article and from Richardson and colleagues' analyses to conduct three sets of new analyses, which we describe below.
First, we used multinomial logit regression to examine associations between self-rated health strata and the latent neuroticism scores. This model was adjusted for sex and age. Second, we examined associations between mortality and the general neuroticism factor, both neuroticism facets separately, and both facets together. Sex and age were present in all models, and we tested whether the health-related covariates attenuated these associations. Note that the sex- and age-adjusted models are critical because they do not include possible colliders.
Third, we examined associations between the health-related covariates and (a) general neuroticism, the anxious/tense facet, and the worried/vulnerable facet separately and (b) both facets together. This was done to test whether the general neuroticism factor and the facets, and especially the worried/vulnerable facet, were associated with these variables in different directions. That is, we tested whether Richardson and colleagues' results could be explained by the facet-level mechanism that we proposed.
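For readers unfamiliar with multinomial logit regression, the sketch below shows how such a model turns latent neuroticism scores into predicted probabilities for each self-rated health stratum. It is an illustration only: the category labels, the choice of reference category, and every coefficient are hypothetical, and the sex and age adjustments are omitted to keep the example short.

```python
import math


def multinomial_logit_probs(scores, coefs, reference="good"):
    """Predicted category probabilities from a multinomial logit model.

    scores: tuple of predictor values, here assumed to be
            (general, anxious_tense, worried_vulnerable).
    coefs:  {category: (intercept, slope_1, ..., slope_k)} for every
            non-reference category; the reference has linear predictor 0.
    """
    linear = {reference: 0.0}
    for category, (intercept, *slopes) in coefs.items():
        linear[category] = intercept + sum(
            b * x for b, x in zip(slopes, scores)
        )
    top = max(linear.values())  # subtract the maximum for numerical stability
    exponentiated = {c: math.exp(v - top) for c, v in linear.items()}
    total = sum(exponentiated.values())
    return {c: e / total for c, e in exponentiated.items()}


# Hypothetical coefficients, chosen only to show the mechanics.
example_coefs = {
    "fair": (-0.5, 0.30, 0.10, -0.15),
    "poor": (-1.5, 0.60, 0.20, -0.30),
}

average_person = multinomial_logit_probs((0.0, 0.0, 0.0), example_coefs)
high_worry = multinomial_logit_probs((0.0, 0.0, 1.0), example_coefs)
print(average_person)
print(high_worry)
```

Each non-reference coefficient is a log odds ratio relative to the reference stratum, so a negative slope for a facet means that higher scores shift probability away from that stratum and toward the reference.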

Results
The bifactor structure of neuroticism's items replicated in the new samples (see Table S1 in the Supplemental Material available online). The first set of analyses (see Table S2) revealed that higher general neuroticism and anxious/tense scores were associated with poorer self-rated health; higher worried/vulnerable scores were associated with better self-rated health.
The second set of analyses (see Table 1) revealed that higher general neuroticism was associated with greater all-cause mortality but only in the sex- and age-adjusted model; the anxious/tense facet was not associated with all-cause mortality. Higher scores on the worried/vulnerable facet, on the other hand, were associated with reduced all-cause mortality in the sex- and age-adjusted model (a key result) and in the fully adjusted model. For specific causes of death, we found a similar pattern of results in sex- and age-adjusted models, but adding the other covariates rendered effects nonsignificant. We did not find a significant association in the sex- and age-adjusted model for cancer when both facets were included simultaneously.
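Because Table 1 reports hazard ratios with 95% confidence intervals, it may help to recall how these are obtained from a log-hazard coefficient and its standard error. The sketch below assumes a proportional hazards model (the model type is not stated in this Reply) and uses made-up input values; the function name is hypothetical.

```python
import math


def hazard_ratio_with_ci(log_hr, se, z=1.959964):
    """Convert a log-hazard coefficient and its standard error into a
    hazard ratio with a Wald confidence interval (95% by default)."""
    hr = math.exp(log_hr)
    lower = math.exp(log_hr - z * se)
    upper = math.exp(log_hr + z * se)
    return hr, lower, upper


# Made-up coefficient for illustration; not taken from Table 1.
hr, lower, upper = hazard_ratio_with_ci(math.log(0.95), 0.01)
print(f"HR = {hr:.2f} [{lower:.2f}, {upper:.2f}]")
```

A hazard ratio below 1 whose interval excludes 1 corresponds to a protective association per unit increase in the predictor.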
The third set of analyses (see Table S3) revealed that for nearly every covariate, higher general neuroticism scores were associated with higher risk, and higher worried/vulnerable scores were protective (another key result).
Fig. 1. Bifactor structure of the short Neuroticism scale from the Revised Eysenck Personality Questionnaire (Eysenck, Eysenck, & Barrett, 1985). The circles and arrows at the bottom of the figure represent loadings of facets onto specific Neuroticism items from this scale. All loadings were positive. Loadings less than |.3| are not presented. Item numbers in the figure correspond to the position of the items presented in Table 9 in the article by Eysenck et al. Item wording is presented in Appendix 2 of Eysenck et al.
Table 1 note: Estimates represent hazard ratios (HRs); 95% confidence intervals are given in brackets. Estimates were first adjusted for age and sex and then, in addition, for other covariates at baseline: health behaviors (smoking status, frequency of alcohol intake, number of types of exercise engaged in, and daily consumption of fruits and vegetables), physical attributes (body mass index, forced expiratory volume in 1 s, systolic blood pressure, and grip strength), reaction time, existing illness (diagnosis of vascular or heart problems, diabetes, cancer, asthma, chronic lung disease, deep vein thrombosis, or pulmonary embolism at baseline), and socioeconomic position (Townsend index score and highest educational qualification). Alpha was set to .001.

Discussion
The worried/vulnerable facet of neuroticism is linked to better health in models that do not include possible colliders. This finding suggests that in addition to considering collider bias as an explanation for why the inclusion of self-rated health causes neuroticism to become protective, one must consider the possibility that self-rated health is a negative suppressor, which reveals the action of a neuroticism facet related to worry and vulnerability. The latter explanation has the same number of parameters as the collider-bias explanation but describes a scenario in which neuroticism has a direct positive effect on health and an indirect negative effect on health via self-rated health. However, the latter explanation, which we favor, appears to be more consistent with findings from the literature at the time of the study and those that have emerged since then. These findings include those relating to the association between self-rated health and mortality (Benyamini & Idler, 1999; Heistaro, Jousilahti, Lahelma, Vartiainen, & Puska, 2001; Idler & Benyamini, 1997), which we cited earlier, and those from the study by Ploubidis and Grundy (2009), who modeled the association between neuroticism and mortality. Their study found that, in women but not men, the direct effect of neuroticism was related to reduced risk and that the indirect effect of neuroticism via a somatic health factor (it loaded on self-rated health) was related to increased mortality. Similarly, another study found that "body vigilance" acted as a suppressor: When it was included in a model, the effect of neuroticism was protective (Weston & Jackson, 2018).
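The suppression scenario described in this paragraph can be made concrete with a small simulation. The following is a sketch under stated assumptions rather than a model of the UK Biobank data: all variables are continuous, and the path coefficients (-0.5 from neuroticism to self-rated health, 0.6 from self-rated health to health, 0.2 for the direct path) are hypothetical. Because the indirect effect (-0.5 × 0.6 = -0.3) outweighs the direct effect (0.2), the unadjusted slope is negative, and adjusting for self-rated health recovers the positive direct effect.

```python
import random


def slope_unadjusted(x, y):
    """Slope from a simple regression of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def slope_adjusted(x, m, y):
    """Coefficient of x when y is regressed on x and m together."""
    n = len(x)
    mean_x, mean_m, mean_y = sum(x) / n, sum(m) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    smm = sum((a - mean_m) ** 2 for a in m)
    sxm = sum((a - mean_x) * (b - mean_m) for a, b in zip(x, m))
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    smy = sum((a - mean_m) * (b - mean_y) for a, b in zip(m, y))
    # Solve the 2 x 2 normal equations for the coefficient of x.
    return (smm * sxy - sxm * smy) / (sxx * smm - sxm ** 2)


def simulate_suppression(n=20000, seed=0):
    """Return (total effect, direct effect) of neuroticism on health."""
    rng = random.Random(seed)
    neuro, rated_health, health = [], [], []
    for _ in range(n):
        x = rng.gauss(0, 1)
        m = -0.5 * x + rng.gauss(0, 1)            # higher = better rating
        y = 0.2 * x + 0.6 * m + rng.gauss(0, 1)   # higher = better health
        neuro.append(x)
        rated_health.append(m)
        health.append(y)
    total = slope_unadjusted(neuro, health)
    direct = slope_adjusted(neuro, rated_health, health)
    return total, direct


total_effect, direct_effect = simulate_suppression()
print(f"unadjusted slope: {total_effect:.3f}")
print(f"slope adjusted for self-rated health: {direct_effect:.3f}")
```

The sign reversal here has the same surface appearance as the one produced by conditioning on a collider, which is why the two explanations cannot be told apart from the adjusted and unadjusted coefficients alone.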
Our findings are also consistent with the fact that not all facets of a personality domain are responsible for personality-health associations (e.g., Terracciano et al., 2009) and that although Type 2 diabetes is related to lower neuroticism net of depression (Čukić & Weiss, 2014), Type 1 diabetes risk, which cannot be reduced by vigilance, is related to higher neuroticism, regardless of whether depression is included in the model (Čukić & Weiss, 2016). Finally, a Mendelian randomization analysis by Nagel, Watanabe, Stringer, Posthuma, and van der Sluis (2018) found that a similar facet, which they labeled "worry," was related to lower waist circumference and lower body mass index, both mortality risk factors. Thus, multiple strands point to a health-protective role for neuroticism via vigilance or for one or more neuroticism facets related to vigilance. That said, because Nagel et al. did not use a bifactor analysis to obtain this facet, it is not possible to know whether the relationships that they report reflect associations with the facet or with the general neuroticism factor (see our earlier discussion on this point).
Richardson and colleagues are correct: One should be alert to the possibility of collider bias, and we agree that that was a possible (but not the only) interpretation of our original results. Future studies on the causal association between self-rated health and mortality will be key to understanding which explanation is likely to be correct. Nonetheless, our results as a whole, and the literature, appear to better support the alternative possibility that some neuroticism facets are associated with higher health risks, some are neutral, and some are protective.

Action Editor
D. Stephen Lindsay served as action editor for this article.

Author Contributions
A. Weiss planned the reply and analyses and conducted the exploratory bifactor analyses. C. R. Gale conducted the analyses presented in Table 1 and in Table S2

Open Practices
Data from UK Biobank are available to approved researchers at https://www.ukbiobank.ac.uk/. The other data used in this study are available on request from the corresponding author.