Psychometric Validation of the Parental Bonding Instrument in a U.K. Population–Based Sample: Role of Gender and Association With Mental Health in Mid-Late Life

The factorial structure of the Parental Bonding Instrument (PBI) has been frequently studied in diverse samples but no study has examined its psychometric properties from large, population-based samples. In particular, important questions have not been addressed such as the measurement invariance properties across parental and offspring gender. We evaluated the PBI based on responses from a large, representative population-based sample, using an exploratory structural equation modeling method appropriate for categorical data. Analysis revealed a three-factor structure representing “care,” “overprotection,” and “autonomy” parenting styles. In terms of psychometric measurement validity, our results supported the complete invariance of the PBI ratings across sons and daughters for their mothers and fathers. The PBI ratings were also robust in relation to personality and mental health status. In terms of predictive value, paternal care showed a protective effect on mental health at age 43 in sons. The PBI is a sound instrument for capturing perceived parenting styles, and is predictive of mental health in middle adulthood.

parenting: care and control (Parker, Tupling, & Brown, 1979). The care dimension measures positive parenting, including parental warmth and affection. The control dimension measures negative parenting, including parental control and constraint.

Psychometric Properties of the PBI and Methodological Challenges
Despite its widespread use, there is no consensus regarding the factor structure of the PBI. While some studies have confirmed the original two-factor structure (Kitamura et al., 2009; Mackinnon, Henderson, Scott, & Duncan-Jones, 1989; Parker et al., 1979), other studies have suggested three- (Cox, Enns, & Clara, 2000; Cubis, Lewin, & Dawes, 1989; Heider et al., 2005; E. Murphy, Brewin, & Silka, 1997; Sato et al., 1999) or four-factor solutions (Behzadi & Parker, 2015; Liu, Li, & Fang, 2011; Uji, Tanaka, Shono, & Kitamura, 2006). In most previous studies converging on a three-factor solution, items within the control factor have been shown to form two distinct factors: (a) overprotection, consisting of items such as "[my mother/father] felt I could not look after myself unless she or he was around" and (b) autonomy, consisting of items such as "[my mother/father] let me decide things for myself." In four-factor solutions, items originally measuring the care factor also separated into two dimensions, although this was observed mainly in non-European samples of Japanese (Uji et al., 2006), Chinese (Liu et al., 2011), and Persian respondents (Behzadi & Parker, 2015).
Apart from cultural or linguistic differences, several methodological issues may explain these factor structure inconsistencies. While some studies relied on exploratory factor analytic (EFA) methods or principal component analyses (e.g., Gómez-Beneyto, Pedrós, Tomás, Aguilar, & Leal, 1993; E. Murphy et al., 1997), others utilized confirmatory factor analyses (CFAs; e.g., Behzadi & Parker, 2015; Terra et al., 2009; Tsaousis, Mascha, & Giovazolias, 2011). Although traditional EFA methods can be useful for determining the number of factors to retain, they typically do not provide goodness-of-fit information; it is therefore difficult to assess whether the model provides an adequate representation of the data. On the other hand, although CFA methods are able to test theory-driven models and provide goodness-of-fit information, these methods rely on strict assumptions that often do not hold in practice. For instance, CFA relies on the highly restrictive independent cluster assumption, which forces all cross-loadings to be zero. When nonzero cross-loadings are present in the population, such constraints can inflate the degree of association between factors (Marsh, Morin, Parker, & Kaur, 2013; Morin, Marsh, & Nagengast, 2013). An increasingly popular method, exploratory structural equation modeling (ESEM), combines features of CFA and EFA, thus overcoming the typical restrictions of both and allowing the free estimation of all possible cross-loadings between items and nontarget factors (Morin et al., 2013). The main advantage of ESEM over CFA is that it integrates the less restrictive assumptions of EFA with the benefits of structural equation modeling, such as goodness-of-fit indices, multigroup invariance analyses, and the ability to combine regression and structural equations within the same model (Marsh et al., 2009; Marsh et al., 2010; Marsh, Nagengast, & Morin, 2012).
Furthermore, simulation studies and studies of simulated data showed that ESEM tends to provide more exact estimates of true population values for factor correlations when cross-loadings are present in the population model, and to remain unbiased when the population model corresponds to the CFA assumption (Asparouhov, Muthén, & Morin, 2015).
A large body of empirical research (X. Chen, Liu, & Li, 2000; Cubis et al., 1989; Henry, Tolan, & Gorman-Smith, 2005; Hoeve, Dubas, Gerris, van der Laan, & Smeenk, 2011; Lansford, Laird, Pettit, Bates, & Dodge, 2014; Watson, Potts, Hardcastle, Forehand, & Compas, 2012) suggests that the association between parenting practices and offspring outcomes is dependent on the gender of the parent and of the offspring. Consistency of parenting style between both parents has also been investigated (Winsler, Madigan, & Aquilino, 2005). However, an important prerequisite to these comparisons is the demonstration that the PBI factors are psychometrically invariant across male and female offspring's ratings of their mothers and fathers (Byrne, Shavelson, & Muthén, 1989; Cheung, 2008). To our knowledge, no studies have examined the measurement invariance properties of the PBI in relation to the gender of the offspring as well as of the parents.
As with many other instruments relying on self-reported measures, PBI ratings have been shown to be influenced by current depressive states (Eleanor Murphy, Wickramaratne, & Weissman, 2010) or sad mood (Gillham, Putter, & Kash, 2007). Similarly, personality has also been suggested to represent a possible source of bias in PBI ratings due to the subjective ways in which items are interpreted, which itself can be influenced by various respondent personality characteristics (Jakobsen & Jensen, 2015; Randall & Fernandes, 1991). Psychometric methods, such as multiple indicators multiple causes (MIMIC) tests of differential item functioning (DIF), can address this issue by detecting the extent to which item response differs as a function of various characteristics of the respondents over and above the relations between these characteristics and scores on the PBI factors. To our knowledge, no such studies have formally studied the response bias of the PBI in relation to personality or depressive state using state-of-the-art psychometric methods.
Furthermore, the PBI items are rated using a 4-point, ordered-categorical, Likert-type response scale with a marked tendency toward nonnormality (Liu et al., 2011; Tsaousis et al., 2011). Under these conditions, research has shown that it is problematic to model data using an estimator (such as maximum likelihood or its robust alternatives) that assumes the underlying continuity of the ratings (DiStefano, 2002; Dolan, 1994). To date, no psychometric study of the PBI has properly taken into account the ordinal nature of the PBI items using an estimation method that models data as ordinal variables, such as the robust weighted least squares estimator (WLSMV; Finney & DiStefano, 2006).

Predictive Validity of the PBI
Although the PBI has been recognized to be predictive of future behavioral outcomes, many previous studies exploring parenting effects focus on young children (Cooper-Vince, Chan, Pincus, & Comer, 2014; Möller, Majdandžić, & Bögels, 2015; StGeorge, Fletcher, Freeman, Paquette, & Dumont, 2015) or adolescents (Lansford et al., 2014; McKinney & Renk, 2008). Few studies have followed participants into adulthood (Hoeve et al., 2011). Another weakness of previous investigations of the PBI is that very few have been conducted based on population-representative samples, thus rendering findings vulnerable to selection bias. In the present investigation, we relied on a longitudinal population-based sample from England, Wales, and Scotland to assess the psychometric properties of the PBI and its measurement invariance in relation to the gender of offspring and the parents, using multiple-group ESEM. Specifically, we assessed two psychometric properties of the PBI, as well as its predictive validity. Regarding the psychometric properties of the PBI, we address the following questions:
1. How many factors are necessary to represent PBI ratings, as assessed by ESEM analyses conducted separately for maternal and paternal PBI ratings?
2. Is the PBI underlying measurement model invariant for ratings provided by male and female offspring of the parenting style of their mothers and fathers?
3. Are PBI ratings biased (DIF) as a function of respondents' personality (measured at age 26) and mental health status (measured at age 43)?
Regarding the predictive validity of the PBI, we address the following question: What is the unique predictive effect of maternal or paternal PBI factors on respondents' mental health assessed at ages 43 and 53?

Sample
The study sample was based on the Medical Research Council (MRC) National Survey of Health and Development (NSHD), also known as the British 1946 birth cohort, which originally consisted of 5,362 singleton babies (2,547 girls and 2,815 boys) born in 1 week in March 1946 in England, Scotland, and Wales (Stafford et al., 2013).
The Parental Bonding Instrument. At age 43, the survey members rated their mothers' (24 items) and fathers' (24 items) parenting practices for the period up to the age of 16 years. These items were rated on a 1 to 4 ordered-categorical, Likert-type scale ranging from very like this to very unlike this. See Table 2 for a list of the PBI items.
Maudsley Personality Inventory. Study members completed six Neuroticism (e.g., "Do you sometimes feel happy, sometimes depressed, without any apparent reason?") and six Extraversion (e.g., "Are you happiest when you get involved in some project that calls for rapid action?") items from the Maudsley Personality Inventory (Eysenck, 1958, 1959) at age 26. The items had a binary response format of "no" and "yes." Kuder-Richardson 20 (the equivalent of Cronbach's α for binary items) scale score reliability coefficients were 0.554 for extraversion and 0.741 for neuroticism.
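For readers unfamiliar with the Kuder-Richardson 20 coefficient reported above, the following is a minimal Python sketch of its usual textbook form, KR-20 = k/(k - 1) × (1 - Σ p_j(1 - p_j) / σ²_total). The function name `kr20` and the toy response matrix are hypothetical illustrations, not the NSHD data or the authors' code. The sketch assumes complete 0/1 responses and population (divide-by-n) variances.

```python
def kr20(responses):
    """Kuder-Richardson 20 for a list of rows of 0/1 item scores.

    Assumes no missing data and at least two items (hypothetical helper).
    """
    n = len(responses)        # number of respondents
    k = len(responses[0])     # number of items

    # Proportion of respondents endorsing each item ("yes" coded as 1).
    p = [sum(row[j] for row in responses) / n for j in range(k)]

    # Sum of item variances; for a binary item the variance is p * (1 - p).
    item_var_sum = sum(pj * (1.0 - pj) for pj in p)

    # Population variance of the total (sum) scores.
    totals = [sum(row) for row in responses]
    mean_total = sum(totals) / n
    total_var = sum((t - mean_total) ** 2 for t in totals) / n

    return (k / (k - 1)) * (1.0 - item_var_sum / total_var)


# Toy data (hypothetical): four respondents, three binary items.
toy = [[1, 1, 1], [0, 0, 0], [1, 1, 1], [0, 0, 0]]
reliability = kr20(toy)
```

With only six items per scale, as here, coefficients of this kind tend to be modest, which is worth keeping in mind when reading the extraversion value.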
Psychiatric Symptom Frequency Scale. Anxiety and depression at age 43 were assessed through the interview-based Psychiatric Symptom Frequency scale (Lindelow, Hardy, & Rodgers, 1997). Participants provided ratings ranging from 0 (not in the past year) to 5 (very often) to 18 questions such as "Have you felt on edge or keyed up or mentally tense?" in the past 12 months (α = 0.896).
General Health Questionnaire. Participants completed the 28-item self-administered General Health Questionnaire (Goldberg & Hillier, 1979) at age 53. The 28-item General Health Questionnaire focuses on symptoms of anxiety and depression in the preceding 4 weeks (e.g., "Have you recently been getting scared or panicky for no good reason?"). Item data were coded on a 4-point Likert-type scale ranging from not at all to much more than usual (α = 0.926).

Statistical Analysis
Analyses were carried out using Mplus 7.11. We used the WLSMV estimator with theta parameterization. Data from questionnaire items were modelled as ordered-categorical polytomous ratings through a probit regression link with the corresponding latent variables. This corresponds to a graded response, two-parameter, normal ogive model in item response theory terms (Samejima, 1997). We used ESEM to determine whether the PBI data structure was invariant across groups formed on the basis of the gender of the offspring (sons and daughters) and their parents (mothers and fathers). Analyses started with the estimation of ESEMs using oblique geomin rotation, with an epsilon value of 0.5 (Marsh et al., 2009; Morin et al., 2013). Items related to maternal and paternal parenting styles were analyzed separately in order to determine the number of factors to retain, and to examine whether the factor structure was comparable across maternal and paternal measures. Since the wording of the items is identical for maternal and paternal ratings, a priori correlated residuals at the item level were included between items with parallel wording, as recommended by Marsh and Hau (1996). Next, measurement invariance was examined using multiple-group ESEM (Meredith & Teresi, 2006) for configural invariance, weak invariance (factor loadings), strong invariance (loadings and thresholds), and strict invariance (loadings, thresholds, and uniquenesses). Although it was not strictly part of the measurement invariance assessment, we also assessed the structural invariance of the PBI in terms of factor variances, covariances, and latent means across groups.
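To make the probit (normal ogive) graded response formulation concrete, here is a minimal Python sketch of how category probabilities for one 4-point item follow from a single latent factor score. It uses the standard textbook form P(y ≥ k | η) = Φ((λη - τ_k)/σ), with the residual standard deviation σ set to 1 by default. The function names, parameter values, and the single-factor simplification are illustrative assumptions; this is not Mplus output or the authors' code.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def category_probabilities(eta, loading, thresholds, residual_sd=1.0):
    """Response-category probabilities for one ordered-categorical item.

    eta         : latent factor score (hypothetical value)
    loading     : factor loading (lambda)
    thresholds  : ascending thresholds (tau); K - 1 values for K categories
    residual_sd : residual standard deviation of the latent response
    """
    # Cumulative probabilities of responding above each threshold.
    above = [normal_cdf((loading * eta - tau) / residual_sd)
             for tau in thresholds]
    upper = [1.0] + above   # P(y >= k) for k = 1..K
    lower = above + [0.0]   # P(y >= k + 1) for k = 1..K
    return [u - l for u, l in zip(upper, lower)]


# Hypothetical item with symmetric thresholds, at the factor mean and 1 SD above.
probs_at_mean = category_probabilities(0.0, 1.0, [-1.0, 0.0, 1.0])
probs_above = category_probabilities(1.0, 1.0, [-1.0, 0.0, 1.0])
```

In this sketch a higher factor score shifts probability mass toward the higher categories. Invariance of loadings and thresholds across groups then means that the same factor score implies the same category probabilities regardless of group.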
To assess DIF in relation to affective symptoms and personality measures, we used a MIMIC ESEM. DIF represents a direct association between the covariate and a particular item after accounting for the association between the covariate and the latent factor, which indicates that the covariate influences the response process on a particular item over and above its influence on the latent factor itself. DIF is thus similar to a case of threshold noninvariance across levels of the covariate, and suggests the presence of response bias at the item level (Kaplan, 2000; Morin et al., 2013). Specifically, three models are tested in a MIMIC analysis. In the MIMIC saturated model, the paths from the covariates to the latent factors are fixed at zero, but all direct paths from the covariates to the items are estimated.
In the MIMIC invariant model, the paths from the covariates to the items are fixed at zero, but the paths from the covariates to the latent factors are freely estimated. In the third MIMIC Null model, all paths from the covariates to the latent factors and items are constrained to be zero. A goodness-of-fit comparison between the first two models (Invariant and Saturated) and the last (Null) serves to assess whether the covariates have an effect on PBI ratings, whereas the comparison between the first two models (Invariant vs. Saturated) serves to assess the presence of DIF.
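The three MIMIC models described above can be summarized compactly in equation form. The notation below is illustrative rather than taken from the source: y*_i is the latent response underlying item i, η the vector of PBI factors, x the vector of covariates, β_i the direct (DIF) paths from covariates to item i, and Γ the paths from covariates to the factors.

```latex
\begin{aligned}
y_i^{*} &= \boldsymbol{\lambda}_i^{\top}\boldsymbol{\eta}
          + \boldsymbol{\beta}_i^{\top}\mathbf{x} + \varepsilon_i, \\
\boldsymbol{\eta} &= \boldsymbol{\Gamma}\,\mathbf{x} + \boldsymbol{\zeta}.
\end{aligned}
\qquad
\begin{array}{lll}
\text{Saturated:} & \boldsymbol{\Gamma} = \mathbf{0}, & \boldsymbol{\beta}_i \ \text{free for all } i \\
\text{Invariant:} & \boldsymbol{\Gamma} \ \text{free}, & \boldsymbol{\beta}_i = \mathbf{0} \ \text{for all } i \\
\text{Null:}      & \boldsymbol{\Gamma} = \mathbf{0}, & \boldsymbol{\beta}_i = \mathbf{0} \ \text{for all } i
\end{array}
```

Under this notation, uniform DIF on item i corresponds to a nonzero element of β_i once the covariate's effect on the factors is accounted for.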
Since the chi-square is known to be highly sensitive to sample size (Marsh, Balla, & McDonald, 1988; Marsh, Hau, & Grayson, 2005), a variety of sample size independent goodness-of-fit indices were also examined to assess the fit of the alternative models: the root mean square error of approximation (RMSEA), the Tucker-Lewis index (TLI), and the comparative fit index (CFI; Fan, Thompson, & Wang, 1999; Hu & Bentler, 1999; Marsh, Hau, & Wen, 2004; Yu, 2002). The TLI and CFI vary along a 0 to 1 continuum, and values greater than 0.90 and 0.95 typically reflect an acceptable and an excellent fit to the data, respectively. RMSEA values of less than 0.06 and 0.08 indicate a close fit and an acceptable fit to the data, respectively. In terms of model comparisons for multiple-group analyses, a more restrictive model is preferred if its fit indices are not substantially worse than those of the less restrictive model. For RMSEA, the change should be less than 0.015 (F. Chen, 2007). For CFI and TLI, the change should be less than 0.01 (F. Chen, 2007; Cheung & Rensvold, 2001). The WLSMV chi-square difference tests (computed with the DIFFTEST function) compare the model under investigation to a less restrictive alternative model.
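The cutoffs and change criteria above amount to a simple decision rule, sketched below in Python. The function names and the dictionary layout are hypothetical, and collapsing the three indices into a single label is a simplification made for illustration. The `m1` and `m2` values are the configural and weak invariance fit indices reported in this article; the `worse` model is invented to show a failing case.

```python
def absolute_fit(cfi, tli, rmsea):
    """Label overall fit using the cutoffs quoted in the text."""
    if cfi > 0.95 and tli > 0.95 and rmsea < 0.06:
        return "excellent"
    if cfi > 0.90 and tli > 0.90 and rmsea < 0.08:
        return "acceptable"
    return "poor"


def retains_invariance(less_restrictive, more_restrictive,
                       max_cfi_tli_drop=0.01, max_rmsea_rise=0.015):
    """True if the more restrictive model fits no worse than the criteria allow.

    A drop in CFI or TLI, or a rise in RMSEA, counts as deterioration;
    improvements give negative changes and always pass.
    """
    cfi_drop = less_restrictive["cfi"] - more_restrictive["cfi"]
    tli_drop = less_restrictive["tli"] - more_restrictive["tli"]
    rmsea_rise = more_restrictive["rmsea"] - less_restrictive["rmsea"]
    return (cfi_drop < max_cfi_tli_drop
            and tli_drop < max_cfi_tli_drop
            and rmsea_rise < max_rmsea_rise)


# Fit indices reported for the configural (m1) and weak invariance (m2) models.
m1 = {"rmsea": 0.040, "tli": 0.971, "cfi": 0.966}
m2 = {"rmsea": 0.033, "tli": 0.979, "cfi": 0.977}
# Hypothetical constrained model with a clearly worse fit (not from the study).
worse = {"rmsea": 0.045, "tli": 0.955, "cfi": 0.950}

weak_supported = retains_invariance(m1, m2)
worse_supported = retains_invariance(m1, worse)
```

Such a rule complements, rather than replaces, the chi-square difference test, which remains sensitive to sample size.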

Sample Demographics
Of the initial 5,362 newborn babies, 2,815 were male and 2,547 were female. For the main variables used in the current analysis (parental bonding at age 43, personality at age 26, and mental health data at ages 43 and 53), there were 1,217 participants with complete data, whereas information was completely missing for 1,373 participants (Supplement Table S1; all supplementary materials are available online at http://asm.sagepub.com/content/by/supplemental-data). In comparison to the participants with complete data, those with completely missing data included a higher percentage of males, came from families of lower occupational social class at age 11, and had lower occupational social class at age 43.

Psychometric Properties
The results of PBI psychometric properties are presented in relation to number of PBI factors, measurement invariance according to gender, and uniform DIF in relation to personality and mental health measures.

Number of PBI Factors.
Mother-and father-specific PBI items were separately analyzed using ESEM. Models including two, three, and four factors were compared, and the results showed that ESEMs including three or four factors provided a satisfactory level of fit to the data (Table 1). Parameter estimates from these models are reported in Tables 2 (mothers) and 3 (fathers). These results show that, in both models, Factor 1 describes the a priori "care" dimension of the PBI questionnaire, whereas Factors 2 and 3 describe the "overprotection" and "autonomy" dimensions, whose items jointly form the original "control" factor. In both maternal and paternal measures, the "care" factor was negatively correlated with "overprotection" factor but positively correlated with the "autonomy" factor, whereas the "overprotection" factor was negatively correlated with the "autonomy" factor. However, the four-factor solution was not fully equivalent across ratings of mothers and fathers. For ratings of fathers, the fourth factor merely corresponded to a single item ("wanted me to grow up") from the "overprotection" factor. In contrast, in ratings of the mothers, the "overprotection" factor was more cleanly split into two factors. A close examination revealed that the three items corresponding to the third factor in the rating of the mothers had largely parallel wording (i.e., "gave me as much freedom as I wanted," "let me go out as often as I wanted," and "let me dress in any way I pleased"). We thus included correlated residuals between these three items and reran the analyses for the three-factor model. This revised three-factor model, including correlated residuals, provided a very clear three-factor solution, with similar solutions across ratings of mothers and fathers, and corresponded to the a priori "care," "overprotection," and "autonomy" factors found in many previous studies. This factor structure fits the data well and is consistent across ratings of the mothers and fathers. 
Subsequent analyses are therefore based on this factor structure.

Measurement Invariance Across Parents and Offspring Gender.
Six multiple-group ESEMs were specified (Table 4, m1-m6). Offspring ratings of their mothers and fathers were both included in the same model, with sons and daughters forming two separate groups. Hence, the tests of measurement invariance conducted here are based on two types of ratings (mothers vs. fathers) provided by two (sons vs. daughters) groups of offspring. The baseline model (m1) tests whether the factorial structure is consistent across groups of offspring ratings of their mothers or fathers, allowing parameters to be freely estimated across respondents and parents. The baseline model provided an excellent model fit (RMSEA: 0.04, TLI: 0.971, CFI: 0.966), supporting the configural invariance of the model. In the weak invariance model (m2), factor loadings were constrained equal across groups of offspring and parental ratings. The model fitted the data well (RMSEA: 0.033, TLI: 0.979, CFI: 0.977), and in comparison with Model 1, there was improvement in goodness of fit in terms of RMSEA, CFI, and TLI, indicating equal factor loadings across groups and types of parental ratings. In the strong invariance model (m3), the focus is the invariance of the item thresholds. In addition to constraining factor loadings equal, thresholds were held equal across daughters and sons, as well as across ratings of mothers and fathers. Model 3 fitted the data well, and showed minimal change in model fit indices, including RMSEA, CFI, and TLI, supporting the strong measurement invariance of the model. In Model 4 (m4), strict invariance was imposed by additionally holding residual variances constant across all groups of offspring and parental ratings. Again, this model fitted the data well, and showed improved goodness of fit in comparison with the strong invariance model (m3), supporting the strict invariance of the model. In Model 5, the variances and covariances of all factors were constrained to be equal across groups and types of parental ratings. This model again resulted in improved goodness-of-fit results, thus supporting the invariance of the latent variance and covariance matrix across groups of offspring and parental ratings. Finally, tests of the invariance of the latent means across groups of offspring and parental ratings (m6) resulted in a slight decrease in goodness of fit in comparison with Model 5, where latent means were freely estimated. Even though the decrease in fit remained minimal (less than 0.01 in RMSEA, TLI, and CFI), we decided to explore latent means differences given their substantive interest. We thus retained Model m5 as the final model. The invariant latent correlations were estimated as part of this model, as well as latent means across all daughters' and sons' ratings of their mothers and fathers, and are reported in Table 4.
Factor correlations (Table 5) showed that, for both mothers and fathers, care was positively correlated with autonomy but negatively correlated with overprotection. There was a high level of agreement across parenting characteristics of mothers and fathers (correlation coefficients were 0.519 for care, 0.666 for overprotection, and 0.808 for autonomy). This is consistent with the observed invariance of the factor variances-covariances.
Across the set of models considered here, the latent means are constrained at zero in one group of offspring rating of one parent (e.g., sons' ratings of their mothers) for identification purposes, allowing for the free estimation of the latent means for all other ratings (e.g., sons' ratings of their fathers, and daughters' ratings of both parents). This way, all freely estimated latent means directly represent deviations, in standard deviation units, from the referent latent mean constrained at zero. In order to more specifically assess latent means differences, Model m5 was thus reestimated four times, each time with a different set of latent means set to zero (the reference point). Examination of these results (see Table 5) suggests differences in the parenting of mothers and fathers according to both sons and daughters. Compared with maternal measures, both daughters and sons rated fathers to be less caring (−0.532 for sons, −0.258 for daughters) and less overprotective (−0.455 for sons, −0.217 for daughters). Daughters also rated fathers as granting less autonomy (−0.117), although there was no difference for sons (−0.029, ns). There were also differences in how sons and daughters viewed the parenting of their mothers and fathers. There was no difference in how sons and daughters rated their mothers' parenting styles on care and overprotection. However, in comparison with the sons' perception of their mothers, daughters regarded their mothers as giving less autonomy (−0.251). Daughters also regarded fathers as more caring (0.314), more overprotective (0.301), and as granting less autonomy (−0.339) compared with sons.
DIF Analysis. In order to assess whether PBI ratings were subject to DIF in relation to covariates including personality and mental health measures, MIMIC ESEMs were estimated, starting from the model of strict measurement invariance (Model m4). These results are reported in Table 4. In comparison with Models m7 (MIMIC Saturated) and m8 (MIMIC Invariant), Model m9 (MIMIC Null) resulted in almost identical goodness-of-fit indices, suggesting that these covariates had no effects on sons' and daughters' ratings of their parents, thus also evidencing a lack of DIF and measurement biases related to these covariates.

Predictive Validity of PBI Factors
Starting again from a model of strict measurement invariance (Model m4), we first estimated a multiple-group model with both maternal and paternal factors included as predictors of mental health outcomes at ages 43 and 53 (Model m10, see Table 6). For daughters, none of the parenting style factors predicted mental health outcomes. However, for sons, paternal care predicted fewer mental health symptoms at age 43 (beta = −0.139). It is noteworthy that, although none of the predictions were significant in the daughters group, this group evidenced some inflated standardized regression coefficients and standard errors, suggesting the presence of multicollinearity among parenting style measures related to mothers and fathers. This implies that for parenting related to overprotection and autonomy, there was limited unique contribution of maternal or paternal parenting effects on later life mental health (predictions estimated in separate models for paternal and maternal measures are presented in the supplemental materials, Table S2).

Discussion
The present study is the first psychometric investigation of the PBI based on a large representative population-based sample from the United Kingdom. Analyses supported a three-factor structure in the study population, and multiple-group ESEM and MIMIC models demonstrated the robustness of the psychometric properties of the PBI as a function of the respondents' and parents' genders, as well as respondents' personality characteristics and mental health. The PBI was found to be predictive of mental health in midlife in a gender-specific manner.

Psychometric Properties of the PBI
ESEM analyses of the NSHD sample led to a three-factor structure (Tables 1, 2, and 3). The care factor corresponded to the original factor of this name (Parker et al., 1979), but the control factor split into two further factors, overprotection and autonomy, in line with some psychometric studies of Western samples (Cox et al., 2000; Heider et al., 2005; E. Murphy et al., 1997). Although previous studies in non-Western cultures have found that a four-factor structure explained the data better (Behzadi & Parker, 2015; Liu et al., 2011; Suzuki & Kitamura, 2011; Uji et al., 2006), the three-factor solution found in the present investigation is consistent with several previous studies of English-speaking populations or other Western cultures (Cubis et al., 1989; Gómez-Beneyto et al., 1993; Kendler, 1996; Mohr, Preisig, Fenton, & Ferrero, 1999; E. Murphy et al., 1997). Countries in which the four-factor solution was supported, such as Iran or Asian countries, are often characterized by a male-dominated culture in familial and societal environments, which may lead to differences in parenting styles when assessed using an instrument initially developed for Western cultures (Behzadi & Parker, 2015). Another possibility is the effect of cultural differences on the individual's response to the wordings of questionnaire items. The items that form additional factors of the PBI are often items that score in the same direction toward either positive parenting or negative parenting. It has been suggested that responses to questionnaire items with mixed wordings (positive and negative) can be subject to cultural influences. In particular, negatively worded items are often interpreted differently across cultures (Schmitt & Allik, 2005). It has also been suggested that additional factors arising from the positive/negative wording of items should be interpreted as artefactual (Greenberger, Chen, Dmitrieva, & Farruggia, 2003).
Our study is the first to demonstrate the measurement invariance of the PBI (m1-m6, Table 4) in relation to parental and offspring gender, utilizing sophisticated psychometric methods to verify the factor structure and measurement invariance across distinct gender groups. Under multiple-group ESEM analysis, the three-factor structure in the current sample was found to be fully invariant in these respects, which enables valid comparison of analyses involving maternal and paternal parenting practices, and in male and female offspring.
Both male and female participants rated their fathers to be less caring and less overprotective than their mothers, which is in line with previous findings that mothers tend to adopt warmer parenting styles compared with fathers (E. Murphy et al., 1997; Russell et al., 1998). This is in line with the gender role theory that women are socialized to be warmer and more caregiving compared with their male counterparts, whereas men are perceived as more authoritarian (Hosley & Montemayor, 1997). However, this result may potentially also reflect cohort effects, as participants from the present study were born in the 1940s.
The psychometric properties of the PBI ratings were also shown to be robust in relation to external covariates (Table 4, m7-m8). This finding lends stronger support to the validity of associations reported in previous studies investigating the relations between parental styles and personality/mental health outcomes. Although previous studies also looked at the potential bias of PBI ratings due to factors such as concurrent depressive mood (Duggan, Sham, Minne, Lee, & Murray, 1998; Gotlib, Mount, Cordy, & Whiffen, 1988; Rodgers, 1996a, also based on the NSHD sample), the present study is the first to rely on a psychometric approach that allows for the assessment of DIF in relation to individual items. This makes possible a more comprehensive examination of the measurement properties in relation to key covariates that are often studied as outcomes of parenting styles.

Predictive Validity of PBI Factors
In the present investigation, we did not find a unique effect of maternal or paternal parenting style factors for mental health outcomes in daughters. However, fathers' care predicted fewer mental health symptoms in sons. Although we were not able to find other studies with follow-up going well into adulthood, Lansford et al. (2014) reported the unique effect of father's autonomy on sons' externalizing behaviors. An earlier review (Pleck & Masciadrelli, 2004) also showed evidence of the father's involvement in longer term child outcomes. This might be explained from the perspective of role theory (Hosley & Montemayor, 1997) that sons are traditionally encouraged to be more independent and take more risks. Indeed, children tend to copy behavior from parents of the same sex (Laible & Carlo, 2004). Here, the results suggest that caring parenting from fathers predicts positive mental health in sons.
The lack of a unique parental style association for daughters is likely due to substantial confounding of maternal and paternal measurements, given the high correlation among maternal and paternal measures observed in the present study (r = 0.519-0.808). The separate results of maternal and paternal predictions (Table S5) confirmed that the gender-specific effect of the PBI was highly comparable for both sons and daughters. Studies with adolescents have also reported a moderate to high level of concordance between perceptions of mothers and fathers (Hoeve et al., 2011; Lansford et al., 2014). This indicates that it might be sufficient to measure PBI from one rather than from both parents.

Limitations and Future Directions
Sample attrition was found to relate to gender and occupational social class, which is consistent with a missing at random mechanism. To account for this limitation, the analyses included occupational social class as a covariate, and the analyses were also stratified by gender. The WLSMV estimator used in the current analysis employs a pairwise-present strategy for dealing with missing data, which has been shown to produce unbiased estimates under missing at random assumptions in relation to observed covariates.
A feature of the current study design is that parenting styles for the period prior to the age of 16 years were assessed when the participants were 43 years old. Although our study is unable to assess test-retest stability, as it is limited to only one wave of measurement, recollections of parenting styles have previously been shown to remain stable even when reassessed after a 20-year period (Eleanor Murphy et al., 2010; Wilhelm, Niven, Parker, & Hadzi-Pavlovic, 2005). Nevertheless, it would be interesting to study how parenting styles measured at different stages predict future outcomes.
The current study is based on a British sample for which English is the native language. Since the PBI has also been studied across a diverse range of countries (Spain: Gómez-Beneyto et al., 1993; China: Liu et al., 2011; Australia: Mackinnon et al., 1989; France: Mohr et al., 1999; E. Murphy et al., 1997; Pakistan: Qadir, Stewart, Khan, & Prince, 2005; Brazil: Terra et al., 2009; Japan: Uji et al., 2006), it would be interesting to assess whether the properties examined in the current study hold true for other cultures.
Another limitation of the current study is that all participants were White; future cross-cultural studies are therefore needed to address the lack of cultural diversity in the current sample.

Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by a Wellcome Trust grant (088869/Z/09/Z). PBJ acknowledges financial support from the NIHR CLAHRC East of England.