Why the Son-bias in Caregiving? Testing Sex-differences in the Associations Between Paternal Caregiving and Child Outcomes in England

Studies show that fathers across Western populations tend to provide more care to sons than daughters. Following a human behavioral ecological framework, we hypothesize that son-biases in fathering may (at least in part) be due to differences in fitness returns to paternal direct investments by child’s sex. In this study, we investigate sex-differences in the associations between paternal caregiving and children’s outcomes in stable, two-parent families. Using data from the Avon Longitudinal Study of Parents and Children, we test whether paternal caregiving in early childhood is associated with different effects on children’s school test scores and behavioral difficulties by children’s sex. Overall, we find that paternal caregiving is associated with higher school test scores and lower behavioral difficulty scores, but the association between paternal caregiving and school test scores was stronger for boys. Our findings highlight possible sex-differences in returns to paternal caregiving for certain domains of child outcomes in England.

Keywords child care, quantitative, father-child relationship, parent/child relations, gender and family

Background
Fathers are increasingly viewed as important caregivers in North American/Western European (henceforth Western) populations. In the United Kingdom (UK), father involvement in two-parent families has seen an upward trend, with fathers reportedly spending 11-17 minutes more a day providing childcare as a primary activity in 2014 compared to 2000 (Henz, 2017). In particular, fathers in the UK have increased their time providing interactive care (i.e., reading and playing), although there has also been a decrease in the provision of physical care (i.e., feeding and bathing; Henz, 2019). These increases in "involved" fathering in the UK have been accompanied by the normalization of fathers as caregivers, with a gradual shift in government policy from fathers as economic providers to fathers as caregivers (Atkinson, 2017; Gregory & Milner, 2011).
These sociocultural changes have co-occurred with increasing research around the benefits of fathers as caregivers in Western populations: A systematic review of 18 longitudinal studies controlling for socioeconomic status found that paternal caregiving was generally associated with better socio-emotional, behavioral, and cognitive outcomes in childhood and later life (Sarkadi et al., 2008), while a meta-analysis of 66 studies found that father involvement was associated with better educational outcomes for teenagers in urban settings (Jeynes, 2014). More specifically, a study from the United States found positive trends between paternal caregiving and children's educational and behavioral outcomes (Hsin & Felfe, 2014), while specific activities such as paternal book-reading have been associated with greater language and cognitive skills (Duursma, 2014). In the UK, disengaged fathering (i.e., lack of interaction) at 3 months of age was associated with greater behavioral problems at 12 months (Ramchandani et al., 2012), while higher paternal involvement (including caregiving) among secondary school children was associated with lower socio-emotional difficulties and higher levels of prosociality (Flouri, 2008). These associations seem to hold across family structures, with non-resident father involvement also associated with lower levels of behavioral difficulties in early childhood (Choi et al., 2018). In Western developed populations, paternal caregiving seems to be an important resource for child development, as has been found for maternal caregiving (Bono et al., 2016; Cummings & Davies, 1994; Tramonte et al., 2013).
Along with the increasing evidence on the benefits of paternal caregiving for children, studies on Western populations have also found relatively consistent differences in paternal involvement depending on the sex of the child, where fathers tend to be more involved with their sons compared to daughters (Lundberg, 2005b). Studies in the United States have found that fathers tend to spend more time with their sons than daughters (Lam et al., 2012; Mammen, 2011; Raley & Bianchi, 2006). A similar son-bias in paternal caregiving has been reported among fathers in Switzerland (Rouyer et al., 2007) and fathers with lower levels of educational attainment in Denmark (Bonke & Esping-Andersen, 2009; Rouyer et al., 2007). A cross-national study on secondary school children across England, Germany, the Netherlands, and Sweden also found that divorced fathers were more likely to have co-parenting arrangements and contact with sons than daughters (Kalmijn, 2015). In the UK, a study using the Avon Longitudinal Study of Parents and Children (ALSPAC) found that the level of paternal caregiving decreased faster as children aged for daughters than sons (Lawson & Mace, 2009), while a study using the UK Millennium Cohort Study found that fathers were more likely to share caregiving responsibilities with mothers for sons than daughters (Norman et al., 2014). Such sex-biases are found in other forms of paternal investments beyond direct caregiving, including financial support, time spent on housework, and time spent in paid work (Anderson et al., 2001; Lundberg, 2005b; Pollmann-Schult, 2015). Note that such son-biases are context-specific, and daughter preferences have been reported in other populations (Cronk, 2000; Fuse, 2013).
Why do fathers in Western contexts tend to provide more care to boys than girls? The current article approaches this question from a human behavioral ecological (HBE) perspective, drawing on the inclusive-fitness costs and benefits of paternal direct caregiving.

Paternal Direct Caregiving from a Human Behavioral Ecological Perspective
Caregiving by fathers in developed populations has been defined and studied in many ways across disciplines, including father-child relationship quality, paternal attitudes, and fathering style (Brown et al., 2018; Paquette, 2004). From an HBE perspective (a sub-discipline of biological anthropology), paternal caregiving is a form of direct paternal investment: a transfer of time and energy from the father to a child via direct contact (Emmott & Page, 2019). Direct caregiving is therefore expressed by observable behaviors such as physical care as well as playing and teaching. This is distinct from relationship and attitude-related fathering constructs, which we conceptualize as psycho-social drivers (or mechanisms) of direct caregiving. It is also distinct from paternal provisioning, which is a transfer of extrasomatic resources from father to child (such as provision of money and goods; Emmott & Page, 2019).
Importantly, paternal caregiving is not "free." Under HBE, we assume behaviors have costs and benefits to biological fitness (i.e., ability to survive and reproduce, translating to individual quality/capital), and that individuals adjust their behavior to maximize their inclusive-fitness (i.e., combined fitness of the individual plus their kin; Emmott & Page, 2019). Overall, this conceptualization of paternal caregiving is similar to "paternal engagement" (Lamb et al., 1987) or "father involvement" (Peck, 2010), but with explicit reference to the fitness-related costs and benefits of direct care incurred by fathers.
Within and between populations, not all fathers provide care. Fathering in humans is a facultative trait (i.e., not universally expressed) because it may benefit children to receive direct care from fathers but it is not always essential for their survival (Geary, 2015). Fathers are hypothesized to "pay the cost of caregiving" and provide care in contexts where it may lead to higher inclusive-fitness benefits, via increasing either their own fertility or child quality (by child quality we mean child health, growth, and other domains of internal capital manifesting as developmental outcomes; Emmott & Page, 2019). According to the embodied capital model (Kaplan, 1996; Kaplan & Lancaster, 2000), parents in developed populations bias their investments into increasing child quality over investing in fertility. Indeed, "modernization" of populations has been associated with increasing parental investments into children (Gibson & Lawson, 2011). Taking an HBE approach, we therefore assume that how much care fathers provide in Western developed contexts, or whether they provide care at all, is dependent on its impact on child quality.

Son-biases in Paternal Direct Caregiving in the West
With its focus on fitness costs and benefits, HBE provides a useful framework to explore the functional reasons behind son-biases in paternal caregiving. Here, our primary assumption is that fathers who experience greater fitness returns to paternal direct caregiving provide greater investments to their children. This leads to our hypothesis that fathers in some Western contexts may be investing more in sons because they "gain more" in terms of child quality.
Is it plausible that sons might gain more (in terms of child quality) from paternal caregiving relative to daughters in the West? First, there may be gender inequalities in future fitness-related outcomes that influence how fathers invest in their children. Across many developed populations including the UK, men achieve higher levels of income than women, although the severity of the gap varies between countries (Arulampalam et al., 2007; Pike, 2011). The gender pay gap in Western populations typically follows a "glass ceiling" effect, where women are less likely to be in high-earning positions (Arulampalam et al., 2007; Blau & Kahn, 2017), and academically high-achieving women may be disadvantaged when it comes to hiring decisions (Quadlin, 2018). Given the positive associations between wealth, health, and reproductive outcomes (Stulp & Barrett, 2016; van Doorslaer et al., 1997), the benefits of paternal caregiving for daughters may be constrained compared to sons, discouraging fathers from providing higher levels of care to daughters. While studies examining such parenting trade-offs in developed populations are rare, biased parental investments depending on children's future outcomes have been evidenced across traditional, high-fertility populations (Bereczkei & Dunbar, 1997; Mace, 1996; Quinlan, 2006).
Similarly, fathers may be more inclined to invest in sons if lack of caregiving is associated with relatively higher costs for boys than girls. It has been suggested that fathers have a special role in the development of boys, where boys specifically benefit from interacting with fathers or father figures (Cobb-Clark & Tekin, 2014; Morgan et al., 2002). Further, there is some evidence to suggest that boys are more vulnerable in stressful environments and require greater levels of parental investments to achieve better outcomes (Amato & Keith, 1991). For example, the socio-emotional development of boys may be more vulnerable in stressful family environments compared to girls (Amato & Keith, 1991), while boys with adverse childhood experiences are more likely to be permanently sick in later adulthood than girls (Fahy et al., 2017). Several studies have found that father absence or paternal neglect increases risk of delinquency specifically in boys (Cobb-Clark & Tekin, 2014), while lower father involvement had a stronger association with externalizing behavior for boys than girls (Carlson, 2006). If boys require and/or benefit more from paternal investments, fathers could be incentivized to provide more care to their sons.
Note that here we do not suggest that fathers are making conscious decisions to invest/not invest in sons or daughters based on calculated fitness returns. Rather, paternal caregiving is likely to be underpinned by complex mechanisms which are derived from, and feed into, the relationship between fathering and child quality. To clarify, sex-differences in the returns to paternal investments could lead to bio-social pathways that encourage fathers to provide more care to sons, which in turn could reinforce and amplify the biological fitness benefits of fathering for sons. For example, embodied gender-role norms have been associated with sex-biases in paternal caregiving in Western populations (Raley & Bianchi, 2006): Fathers with "traditional" gender-role norms (where men are viewed as providers and women as caregivers) are less likely to be involved with their children overall (Braun et al., 2011; Bulanda, 2004), and such norms could lead to sex-biases in fathering where fathers carry out "male-typical activities" with sons only. Where there are son-preferences by fathers, be it subtle or explicit, mothers may be able to negotiate higher levels of paternal involvement for sons (Lundberg, 2005a). Biological differences between boys and girls, such as differences in physical and cognitive developmental trajectories (Giedd et al., 1999; Marceau et al., 2011), may additionally impact paternal behavior in subtle ways which could amplify and/or reinforce such social and cultural norms (Raley & Bianchi, 2006). These bio-social mechanisms are unlikely to be mutually exclusive, with multiple factors influencing and being influenced by son-biases in paternal caregiving observed across Western populations.

Aims of the Current Study
The current study builds on the well-evidenced son-bias in paternal direct caregiving in Western populations such as the UK. From an HBE perspective, we predict this is influenced by sex-differences in the marginal returns to paternal caregiving in child quality. We therefore hypothesize that the positive association between paternal caregiving and child outcomes will be stronger for boys than girls in the UK.
We focus specifically on paternal caregiving, which has received less attention compared to father absence (e.g., Cobb-Clark & Tekin, 2014) or fathering relationships/attitudes (e.g., Brown et al., 2018; Paquette, 2004), perhaps as observable direct caregiving measures are less accessible for researchers. To date, the few studies explicitly investigating sex-differences in paternal caregiving and child outcomes present a mixed picture: Focusing specifically on the UK, lower levels of paternal engagement at age seven predicted adolescent delinquency in boys but not girls (Flouri & Buchanan, 2002), while disengaged paternal caregiving at three months of age was associated with later behavioral problems of boys but not girls (Ramchandani et al., 2012). However, a study on the ALSPAC birth cohort found no sex-differences in the association between paternal involvement during infancy and children's socio-emotional outcomes at age 9 and 11 (Opondo et al., 2016).
Here, we use data from the ALSPAC birth cohort (Fraser et al., 2013) to test whether paternal caregiving is associated with greater benefits on the outcomes of boys compared to girls. Specifically, we explore paternal caregiving through early childhood and test its associations with two outcomes reflecting different domains of child quality: school test scores and behavioral difficulty scores. School test scores, specifically relating to reading and math abilities, have been positively associated with school completion, later educational achievement, and adult economic success (Bynner & Joshi, 2002; Gregg & Machin, 2001). Behavioral difficulty scores, a proxy of socio-emotional development, have been associated with psychiatric disorders as well as economic, health, and social issues in later life (Champion et al., 1995; Goodman, 1997). As Western children's social networks tend to undergo extensive change in adolescence, transitioning from parent-focused to peer-focused networks (Larson et al., 1996), which introduces additional complexities around paternal caregiving and child outcomes, we restrict our analyses to the association between paternal caregiving and children's outcomes before age 10.

Sample
We use data from the ALSPAC. ALSPAC is a longitudinal cohort study based in the old county of Avon situated in South West England. The study began by recruiting pregnant women whose estimated delivery date fell between 1st April 1991 and 31st December 1992. A total of 14,541 women were initially recruited, with further recruitment of eligible mothers and children at around age 7 years. In total, ALSPAC data hold 15,083 unique child IDs. Ethical approval for the study was obtained from the ALSPAC Ethics and Law Committee and the Local Research Ethics Committees. The study website contains details of all the data that is available through a fully searchable data dictionary (http://www.bris.ac.uk/alspac/researchers/data-access/datadictionary/). Further information on the cohort is also available elsewhere (Fraser et al., 2013).
In the current study, we restrict our sample to households with singleton focal children (i.e., remove cases with twins, triplets, etc.) due to uncertainty with the interpretation of reported parental caregiving between the siblings. This led to removal of 205 cases (n = 14,878). As information on paternal caregiving is only available in father-present households, and stepfathers are predicted to have different incentives around caregiving (Emmott & Mace, 2014), we also restrict the sample to stable, two-parent households where biological fathers and mothers were present from birth to nine years of age (including married and unmarried couples). This led to removal of 4,906 cases, and our final ALSPAC eligible sample is composed of 9,972 cases.

Independent Variable: Paternal Caregiving
Paternal caregiving measures were reported by the mother on multiple occasions when the focal child was between ~6 months and ~5.5 years old. Each mother was asked how often her partner took part in various activities with the child via multiple self-completion questionnaires (How often do [you] / [your partner] do X with your child?). Each activity was reported on a 4-point scale, either on a subjective or objective scale. The subjective scale ranged from "never" (Score: 0), "rarely" (Score: 1), "sometimes" (Score: 2) to "often/nearly every day" (Score: 3), while the objective scale ranged from "never" (Score: 0), "less than once a week" (Score: 1), "3-5 times a week" (Score: 2) to "nearly every day" (Score: 3). These measures were specifically developed by ALSPAC, and further details on the various activities are presented in Table 1. Further information on ALSPAC questionnaires and survey methods more generally is available online (http://www.bristol.ac.uk/alspac/researchers/our-data/questionnaires/carer-questionnaires/).
We replicate Lawson and Mace (2009) in deriving paternal caregiving scores: For each wave, the scores of the caregiving activities were totaled for each father, then standardized to range from 0 to 10 ((observed/maximum) × 10). Here, caregiving scores are age-relative (i.e., "at wave x, how much is the father investing in direct care out of 10, if 10 is investing the maximum in terms of measured age-appropriate caregiving behaviors"). Overall, "1 point" of paternal caregiving can be interpreted as 10% of the possible total maximum score (Moeller, 2015). The descriptive statistics of paternal caregiving are presented in Table 2.
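The rescaling described above can be illustrated with a minimal Python sketch. This is not the original analysis code (the study reports using R); the function name, the assumption that every activity is scored 0 to 3, and the example item scores are hypothetical and chosen only to show the (observed/maximum) × 10 arithmetic.

```python
def caregiving_score(item_scores, max_per_item=3):
    """Rescale summed activity scores to 0-10: (observed / maximum) * 10."""
    if not item_scores:
        raise ValueError("at least one activity score is required")
    if any(s < 0 or s > max_per_item for s in item_scores):
        raise ValueError("item scores must lie between 0 and max_per_item")
    observed = sum(item_scores)                # total across activities in this wave
    maximum = max_per_item * len(item_scores)  # highest total attainable in this wave
    return 10 * observed / maximum


# Hypothetical wave with five activities, each scored 0-3 (illustrative values only)
example = caregiving_score([3, 2, 2, 1, 0])
```

Because the maximum depends on how many activities were asked about in a given wave, the same 0 to 10 scale can be used across waves with different numbers of items, which is what makes the score age-relative.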
We take these caregiving scores to be a proxy of direct investments by the father. As outlined earlier, we define paternal direct investments as any caregiving behavior directed to a child which leads to increased child fitness, with opportunity costs for the father. Caretaking activities included in the current measure, such as feeding and washing, address the basic needs of young children. The absence of such caretaking is often presented as neglect, which is associated with negative effects on child development (Hildyard & Wolfe, 2002). Similarly, childhood play, both supervised by and involving adults, has been argued to be a necessary component of childhood for optimal child development (Ginsburg, 2007). This paternal caregiving measure has been found to vary by household characteristics (Lawson & Mace, 2009) and to be associated with various child outcomes (Lawson, 2009). This suggests that the current measure has predictive validity meeting our theoretical assumptions, functioning as an appropriate proxy of paternal direct investments for our current study.

Dependent Variables: Child Outcomes
We conceptualize better child outcomes to represent higher child fitness (i.e., higher quality/individual capital). We focus on two domains of child outcomes: children's educational achievement and children's socio-emotional development.
Table 1. Note. The figures represent the percentage of fathers in the sample who were reported to take part in each activity at the highest frequency ("often" or "nearly every day"). Some activities were not measured for certain surveys, indicated with an X. The questions were asked on either a subjective scale ("How often?") or an objective scale ("How frequently?").
Table 2. Note. *We include median score for "behavioral difficulty score" due to its Poisson distribution, meaning median is a better measure of centrality.
Children's educational achievement is measured by school test scores, available from Local Entry Assessments (LEA; taken by children upon entering the British school system at age 4 or 5 years) and Key Stage 1 Standard Assessments (KS1; taken by children between the ages of 6 and 7 years). Both assessments were administered by teachers at school, and focus on Mathematics and English skills. Childhood reading and math abilities, in particular, have been positively associated with school completion, later educational achievement, and adult socioeconomic position (Reschly, 2010; Ritchie & Bates, 2013; Watts et al., 2014), and educational attainment more broadly has been associated with higher income and better later health outcomes (Cutler & Lleras-Muney, 2006). As the maximum test scores differed between LEA and KS1, the test scores were standardized to range from 0 to 100 (where "1 point" can be interpreted as 1% of the possible total maximum test score).
Children's socio-emotional development is measured by behavioral difficulty scores from the Strengths and Difficulties Questionnaire (SDQ), a questionnaire devised specifically to measure children's socio-emotional development, covering hyperactivity, emotional symptoms, conduct problems, and peer problems (Goodman, 1997). SDQs were completed by the focal child's mother on three occasions, at around 3.5 years, 7 years, and 9 years, where she was asked to rate "how true" various statements were relating to her child's behavior. Each statement is measured on a 3-point scale of "not true," "somewhat true," and "certainly true," with a total maximum behavioral difficulty score of 40 points. Studies suggest children with high behavioral difficulty scores are more likely to have psychiatric disorders such as anxiety and conduct problems, and health and social issues in later life (Champion et al., 1995; Goodman & Goodman, 2009; Knoester, 2003; Stone et al., 2010).

Control Variables
We control for maternal caregiving, measured through the same play and caretaking activities as fathers with the frequency of activities self-reported by mothers (but see issues with ceiling effect under the section "Results"). As with paternal caregiving, the scores of the caregiving activities were totaled for each mother and standardized to range from 0 to 10.
In addition, we control for child's age in months, number of siblings in the household, weekly income (3 categories: <£200 p/wk., £200-£399 p/wk., >£400 p/wk.), homeownership (2 categories: no, yes), reported financial difficulty (range = 0-15; higher scores = higher difficulties), mother's employment (2 categories: employed, not in paid employment), father's employment (2 categories: employed, not in paid employment), mother's highest qualification at time of birth (3 categories: O-level and equivalents, A-levels, degree), father's highest qualification at time of birth (3 categories: O-level and equivalents, A-levels, degree), mother's age at time of birth (years), and child's ethnicity (2 categories: White, other). The descriptive statistics for all variables used in the presented analyses are available in Table 2.

Analyses
First, we tested whether differences in paternal caregiving by children's sex existed throughout early childhood in our specific ALSPAC sample of stable, two-parent households with biological parents. For comparison, we carried out multilevel linear regression (random-intercept) models for maternal caregiving and paternal caregiving separately, with children's reported sex, child's estimated age in months, and question type (subjective or objective scale) as independent variables. This method means we are able to model repeated measures of parental caregiving through time (Level 1) while taking account of the repeated nature of measurements which cluster within each parent (Level 2). This was followed by multilevel regression models to investigate the associations between paternal caregiving throughout early childhood and the children's outcomes, with measurements as Level 1 and child as Level 2. To minimize the effects of reverse causality, we lagged the caregiving scores in all analyses. This means that the reported child outcomes are predicted by caregiving scores from the previous wave (i.e., caregiving at 42 and 65 months used to predict test scores at 55 and 88 months; caregiving at 38 and 65 months used to predict total difficulty scores at 42, 81, and 116 months). Previous studies have shown that attrition and non-response in ALSPAC are higher for households with younger mothers, those from lower socioeconomic positions, those with lower levels of education, and those from minority ethnic backgrounds (Fraser et al., 2013). We therefore include these household and socioeconomic covariates in the model. We also include parental employment status and number of focal child's siblings as covariates, as these factors are likely to be confounders.
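The lagging step can be illustrated with a short Python sketch. It is not the original analysis code; it assumes, purely for illustration, that "previous wave" means the latest caregiving measurement taken strictly before the outcome was measured. The function name and the caregiving scores are hypothetical, while the ages in months are those given above for school test scores.

```python
def lag_caregiving(caregiving_by_age, outcome_ages):
    """Pair each outcome age (months) with the latest earlier caregiving wave."""
    pairs = {}
    for outcome_age in outcome_ages:
        earlier = [age for age in caregiving_by_age if age < outcome_age]
        if not earlier:
            pairs[outcome_age] = None  # no earlier wave available to use as predictor
            continue
        lag_age = max(earlier)         # most recent wave before the outcome
        pairs[outcome_age] = (lag_age, caregiving_by_age[lag_age])
    return pairs


# Hypothetical caregiving scores (0-10) for one child at 42 and 65 months
care = {42: 6.7, 65: 5.0}
lagged = lag_caregiving(care, [55, 88])
```

Under this rule, the test score at 55 months is paired with caregiving at 42 months and the test score at 88 months with caregiving at 65 months, so the predictor is always measured before the outcome.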
We ran the following models based on the distribution of the dependent variables and model fit: random-intercept linear regression models for school test scores (with random-intercept term for child) and random-intercept random-slope Poisson models for behavioral difficulty score (with random-intercept term for child and random-slope term for child age). The random-slope component was added to the behavioral difficulty score model as it improved model fit based on the AIC score (∆AIC = −753), where a reduction in AIC by 3 or more points is broadly taken as evidence that it is a better-fit model (Burnham & Anderson, 2002). To test for the sex-dependent effects of paternal caregiving, we included interaction terms between paternal caregiving and sex of child. For comparison, we also tested for sex-dependent effects for maternal caregiving. All models were estimated using the lme4 package (Bates et al., 2015) in R v3.5.1.
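The AIC decision rule can be written out as a minimal Python sketch. It uses the standard definition AIC = 2k − 2 ln(L) and the 3-point threshold described above; the function names, log-likelihoods, and parameter counts are hypothetical illustrations, not values from the fitted models.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


def improves_fit(aic_simple, aic_extended, threshold=3.0):
    """True if the extended model lowers AIC by at least `threshold` points."""
    return (aic_extended - aic_simple) <= -threshold


# Hypothetical log-likelihoods and parameter counts, for illustration only
aic_without_term = aic(log_likelihood=-1000.0, n_params=5)
aic_with_term = aic(log_likelihood=-995.0, n_params=6)
delta = aic_with_term - aic_without_term  # negative values favor the extended model
keep_term = improves_fit(aic_without_term, aic_with_term)
```

Because AIC penalizes each additional parameter, an extra term is retained only when the gain in likelihood clearly outweighs the added complexity.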

Preliminary Analysis: Patterns of Caregiving by Children's Sex
Our study is based on the assumption that there is a sex-difference in paternal caregiving in the UK. Before our main analyses, we test this assumption in our sample of stable, two-parent families. We carry out random-intercept linear regressions to test the associations between paternal caregiving, children's sex, and children's age in months. For comparison, we also explore sex-biases in maternal caregiving, and we control for question type (subjective or objective measures of caregiving). We examined whether interaction terms between child's sex and child's age improved model fit based on changes to the AIC score, taking a reduction in AIC by 3 or more points to indicate a better-fit model (Burnham & Anderson, 2002). An interaction term between child's sex and age for paternal caregiving improved model fit (∆AIC = −37.1 points) but an interaction term did not improve model fit for maternal caregiving (∆AIC = +13.4 points). Table 3 displays our final models for paternal caregiving and maternal caregiving. Overall, our models predict that the mean paternal caregiving score at zero months of age is lower than the mean maternal caregiving score (predicted paternal caregiving at age 0 = 6.57/10; predicted maternal caregiving at age 0 = 8.17/10). Between individual parents, paternal caregiving levels vary more than maternal caregiving, as evidenced in the intercept variance (paternal caregiving intercept variance = 1.41; maternal caregiving intercept variance = 0.55).
Our final models suggest that paternal caregiving declines faster as children age compared to maternal caregiving, and this decline in paternal caregiving is steeper for girls than boys. In contrast, maternal caregiving levels are higher for girls than boys (Table 3). Figure 1 displays the predicted paternal and maternal caregiving by sex and child's age (note that the y-axis range varies between the two graphs to facilitate interpretation). This supports previous findings on a similar sample of ALSPAC families by Lawson and Mace (2009), where they found a son-bias in paternal caregiving as children became older and a daughter bias in maternal caregiving. Overall, our results replicate previous findings of son-biased paternal caregiving in our particular sample of stable, two-parent households with biological parents. We also note that maternal caregiving scores in our data are distributed at the upper end of the scale, with relatively smaller variation compared to paternal caregiving scores (see parental caregiving, Table 2). This suggests that most mothers in our eligible sample are investing very highly within the constraints of the measurement scale: In our data, 31% of maternal caregiving scores were above 9 (out of 10). Focusing specifically on maternal caregiving of infants (6 months of age), 58% of mothers scored 9 or above. This suggests that the maternal caregiving variables in the ALSPAC data may suffer from a ceiling effect, at least for our subsample of stable two-parent families. We are therefore less likely to find a large effect of maternal caregiving on child outcomes in our analyses. Nonetheless, we include maternal caregiving in our models as a point of comparison to paternal caregiving.

Paternal Caregiving and Child Outcomes
We hypothesize that the sex-differences in paternal caregiving may, at least in part, be explained by sex-differences in the returns to paternal caregiving in terms of children's outcomes. To test this, we carried out random-intercept regression models for school test scores and random-intercept random-slope Poisson regression models for behavioral difficulty scores. Interaction terms between paternal caregiving and sex of child were added to each model.
Figure 1. Notes. The y-axis range varies between paternal and maternal caregiving to facilitate comparability. Plots were created using the sjPlot package (Lüdecke, 2018) in R.
Table 4. Notes. a For the "school test score" model, "child's age" is modeled in months and mean-centered for each survey wave as it correlates with "school assessment type." For the "behavioral difficulty score" model, "child age" is modeled in years and centered at 3 years (due to convergence issues when "child age" was entered in months). b For the "behavioral difficulty score" model, the intercept is not the "incidence rate ratio" (IRR) but the predicted average behavioral difficulty score.

Interaction terms between maternal caregiving and sex of child were also added, as comparison. Interaction terms were only kept in the final model if they reduced the AIC score by 3 or more points, indicating a better-fit model (Burnham & Anderson, 2002). The final models for school test scores and behavioral difficulty score are displayed in Table 4. For school test score, the addition of an interaction term between paternal caregiving and sex of child reduced the model AIC score by 6.8 points (results not shown), indicating that it may be a better-fit model compared to the original model. Interactions between maternal caregiving and sex of child did not improve model fit and increased the AIC score (∆AIC = +1.1; results not shown).
Plotting the interaction term between paternal caregiving and sex of child shows that higher paternal caregiving is associated with higher test scores for both sexes. However, this association is stronger for boys than for girls (Figure 2). For boys, a 1-point increase in paternal caregiving is associated with an average 1.914-point increase in school test scores. For girls, a 1-point increase in paternal caregiving is associated with an average 0.774-point increase in school test scores. Given that boys tend to score lower in school tests than girls, our model suggests that the gender gap in test scores is narrower for children whose fathers provide higher levels of direct care. Overall, our results suggest that paternal caregiving is associated with higher school test scores but this association is stronger for boys.
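To make the narrowing of the gender gap concrete, the following sketch uses only the two slopes reported above. It assumes all other covariates are held constant and that the association is linear across the caregiving scale; the function name is hypothetical. The baseline gap between girls and boys is not reported in this section, so the sketch computes only the change in the gap, not its level.

```python
SLOPE_BOYS = 1.914   # reported: test score points per 1-point increase in paternal caregiving
SLOPE_GIRLS = 0.774  # reported: same quantity for girls


def gap_narrowing(delta_care: float) -> float:
    """Points by which boys' predicted scores gain on girls' predicted scores
    when paternal caregiving rises by `delta_care` points, other covariates fixed."""
    return (SLOPE_BOYS - SLOPE_GIRLS) * delta_care


for delta in (1.0, 3.0):
    print(f"+{delta:.0f} caregiving points: gap narrows by {gap_narrowing(delta):.2f} test score points")
```

The difference between the two slopes, 1.14 points per caregiving point, corresponds to the interaction between paternal caregiving and sex of child.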
For behavioral difficulty scores, the interaction terms between sex of child and paternal or maternal caregiving did not substantially improve model fit (∆AIC = +1.9 points and ∆AIC = −0.3 points, respectively), meaning the association between caregiving and children's behavioral difficulties is unlikely to vary meaningfully by sex in our data. Nevertheless, our results suggest that higher levels of paternal caregiving are associated with lower behavioral difficulties: a 1-point increase in paternal caregiving was associated with an average 4.7% reduction in behavioral difficulty score. Similarly, a 1-point increase in maternal caregiving score was associated with a 5.5% reduction in behavioral difficulty score.
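Because Poisson regression coefficients act multiplicatively, percentage reductions compound rather than add across several caregiving points. The sketch below assumes the reported percentages equal (1 − IRR) × 100 from a log-link model and that other covariates are held constant; the function name is hypothetical.

```python
def cumulative_reduction(pct_per_point: float, points: float) -> float:
    """Percentage reduction in the expected score after `points` caregiving points,
    assuming a constant incidence rate ratio (IRR) per point."""
    irr = 1.0 - pct_per_point / 100.0  # e.g. a 4.7% reduction implies IRR = 0.953
    return (1.0 - irr ** points) * 100.0


# Reported per-point reductions in behavioral difficulty score.
for label, pct in (("paternal", 4.7), ("maternal", 5.5)):
    print(f"{label}: 3-point increase -> {cumulative_reduction(pct, 3):.1f}% lower expected score")
```

Under these assumptions, a 3-point increase in paternal caregiving corresponds to a reduction of roughly 13.4%, slightly less than the 14.1% that simple addition would give.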

Discussion
In the current study, we investigated the possible effects of paternal caregiving on child outcomes in a UK sample and explored whether this is dependent on child's sex. Previous studies on fathers in Western populations have tended to focus on father absence or fathering relationships/attitudes. Here we investigated the association between paternal direct caregiving behavior throughout early childhood and child outcomes, providing additional evidence around the importance of father involvement in stable two-parent families in England.
Controlling for household and parental characteristics, we found that paternal caregiving predicted higher test scores and lower behavioral difficulty scores for both boys and girls. This is in line with previous studies suggesting that paternal caregiving has beneficial effects on child development in Western contexts (e.g., Jeynes, 2014;Sarkadi et al., 2008). However, the positive association between paternal caregiving and school test scores was stronger for boys: Both boys and girls achieved relatively similar levels of test scores when paternal caregiving was high but boys who experienced less paternal caregiving had notably lower school test scores compared to girls. Our results suggest that a lack of paternal caregiving may have greater detrimental effects on the educational outcomes of boys in our UK sample. While the exact mechanisms behind these findings are unclear, previous studies have found that parental involvement is positively associated with student motivation (Gonzalez-DeHass et al., 2005), and the association between parental involvement and children's educational outcomes may be mediated by children's own perception of competence (Topor et al., 2010). Given that boys tend to have lower student motivation than girls (such as less focus and persistence; Martin, 2004), it is possible that paternal caregiving has a stronger influence on improving such pathways for boys. Overall, our findings are in line with the broader discussion around the "greater vulnerability" of boys, where boys are thought to be more sensitive to stressful environments and require greater levels of parental investments to achieve better outcomes (Amato & Keith, 1991).
Contrary to our hypothesis, we did not find evidence of sex-differences in the association between paternal caregiving and children's behavioral difficulties in our data. While the reasons behind this null result are unclear, we note that previous studies which found sex-dependent associations between father absence/involvement and behavioral difficulties in U.S. samples focused on adolescent outcomes (e.g., Carlson, 2006; Cobb-Clark & Tekin, 2014), and emerging evidence suggests adolescence is a particularly important period for socio-emotional development (Blakemore & Mills, 2014; Steinberg, 2005). Therefore, one possibility is that the effects of paternal caregiving on socio-emotional outcomes do not differ by sex in childhood but that such differences manifest in adolescence. As our study focused on early childhood (before age 10), it is possible that our sample of children was too young to observe any sex-differences in the associations between paternal caregiving and behavioral difficulties.
Finally, despite the daughter-bias in the reported patterns of maternal caregiving in our data, we found no evidence of sex-differences in the associations between maternal caregiving and behavioral difficulty scores. This may be due to the ceiling effect of our maternal caregiving measure; we therefore advise caution around inference.
Taken together, our study adds to the current limited evidence around potential sex-differences in the association between direct caregiving by fathers and children's outcomes in the UK. Taking an HBE approach, we hypothesized that the well-evidenced son-biases in fathering across United States/Western European populations may be driven by differential fitness returns to parental investment (as measured by child quality), where paternal caregiving is more beneficial to sons than daughters. Our hypothesis was partially supported: low paternal caregiving was associated with a greater detriment to the school test scores of boys than girls, meaning the marginal fitness returns to paternal direct investments may be higher when investing in sons.
Biases in paternal caregiving and son-preferences tend to be explored in terms of sociocultural norms (e.g., Braun et al., 2011; Bulanda, 2004). We suggest that such norms may be embedded within a socioecological system where fathers who preferentially invest in sons receive "greater payoffs." Given the complexities of human behavior, however, it is unlikely that sex-differences in educational attainment are the only or primary driver of the son-bias in paternal caregiving. Rather, differences in the returns to caregiving may act as an additional factor influencing fathering within the broader bio-social pathways in Western populations: societal biases which lead to son-preferences may emerge from, and/or be reinforced by, sex-differences in the benefits of paternal care.

Limitations
We highlight several limitations. First, the current study focuses on the possible impact of paternal caregiving within relatively stable two-parent households, where biological fathers and biological mothers are both consistently present. This sample is therefore likely to capture a particular sub-population of parents and children in the ALSPAC data. Our study does not address the impact of paternal caregiving from non-resident fathers, and it is unclear whether there is a difference in the effects of paternal care on child development by household stability. Second, our ALSPAC sample is from a relatively ethnically and culturally homogeneous area in South West England, with 95% of the children in the final sample reported as being White. As gender-roles and sociocultural contexts can vary by ethnicity (Harris, 1994), the "payoffs" of fathering may also differ, meaning we cannot be confident that the identified association between paternal caregiving and child outcomes will be present among households with other cultural backgrounds. Third, as highlighted earlier, our current data likely suffer from a ceiling effect regarding maternal caregiving. We therefore call for caution regarding the interpretation of our findings around maternal caregiving and children's outcomes. Fourth, our measure of caregiving is derived from the frequency of various caregiving and play activities as reported by the mother and is subject to maternal response bias. We note that measuring activity frequency rather than perceived caregiving quality may mitigate some bias, and we control for possible confounders which may influence over- or under-reporting of paternal activities. Finally, we do not know how our caregiving measure relates to caregiving quality, parenting style differences, and time investments. For instance, reading a book to a child every day could equally be 5 minutes of daily reading with minimal engagement between parent and child or 30 minutes of daily reading with active teaching.
Our findings may be strengthened by future research which explores the costs and benefits of paternal caregiving by children's sex in different socioecological contexts. For example, are the costs and benefits of paternal caregiving in the 1990s different from the 2010s? Do they differ by household socioeconomic position? Such questions may be better addressed by conducting within-household comparisons, investigating the association between paternal caregiving and child outcomes among different-sex siblings (thereby addressing unobserved heterogeneity to an extent). Finally, for a holistic understanding of why fathers tend to invest more in sons than daughters in many Western contexts, there is a need to develop an in-depth understanding of the bio-social pathways between fathering and child outcomes. While our current study highlights the potential differences in the effects of paternal caregiving between boys and girls, it is not clear why this difference exists, and what impact this may have on sociocultural norms. As such, we encourage future research to consider both the costs and benefits of paternal caregiving in terms of biological fitness as well as sociocultural determinants of paternal care.

Acknowledgments
We are extremely grateful to all the families who took part in this study, the midwives for their help in recruiting them, and the whole ALSPAC team, which includes interviewers, computer and laboratory technicians, clerical workers, research scientists, volunteers, managers, receptionists, and nurses. The UK Medical Research Council and Wellcome and the University of Bristol provide core support for ALSPAC. We would like to thank Dr. David W. Lawson for his previous work on the parenting scores used in this study, and thank our colleagues in the Human Evolutionary Ecology Group at UCL for their comments on the draft material. This publication is the work of the authors who serve as guarantors for the contents of this article and does not reflect the views of the ALSPAC executive.

Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was funded by the Medical Research Council and the Economic and Social Research Council (Grant number G0900207) and ERC Advanced Grant (Grant ref: AdG 249347). ALSPAC was funded by the Medical Research Council and Wellcome (Grant ref: 102215/2/13/2). The funders had no role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript.