Psychometric Properties of the Student-Teacher Relationship Scale-Short Form in a Norwegian Early Childhood Education and Care Context

The Student-Teacher Relationship Scale-Short Form (STRS-SF) is one of the most frequently used instruments globally to measure professional caregivers’ perceptions of the relationship quality with a specific child. However, its psychometric properties for children younger than 3 years of age enrolled in early childhood education and care (ECEC) centers are largely unknown. Thus, this study aimed to evaluate the factorial validity of the STRS-SF and its measurement invariance across children’s gender and age by combining two large Norwegian community samples (N = 2900) covering the full age range of children enrolled in ECEC (1–6-year-olds). Our findings indicate promising psychometric properties for the STRS-SF; thus, its applicability is supported for both younger and older children regardless of their gender. However, some caution is advised when comparing latent means between older and younger ECEC children because professional caregivers interpret one closeness item of the STRS-SF differently depending on children’s age.


Introduction
Healthy development depends on the quality of young children's relationships with significant people in their lives. The relationship quality between children and adults established in children's early years lays the foundation for future developmental outcomes, such as mental health and academic achievement (Center on the Developing Child, 2009, 2010; Miller-Lewis et al., 2014). When young children transition from home to early childhood education and care (ECEC) centers, professional caregivers may act as ad hoc attachment figures, performing many of the same caregiving functions as parents (Verschueren & Koomen, 2021; Zhang & Sun, 2011). These early bonds may be especially important for the development of very young and vulnerable children, as their capacity for self-regulation is relatively limited and requires adult support. Additionally, teacher-child relationship quality appears to have a moderating effect that may compensate for early negative experiences, and it reciprocally influences a range of developmental outcomes (e.g., Griggs et al., 2009; Graziano et al., 2016; Sabol & Pianta, 2012; Schmitt et al., 2012; Skalická et al., 2015; Zhang & Sun, 2011).
The Student-Teacher Relationship Scale (STRS) (Pianta, 2001) is a widely used instrument to measure professional caregivers' and elementary school teachers' perceptions of the relationship quality with a specific child aged four and above. However, as more than a third of children in Organisation for Economic Co-operation and Development (OECD) countries attend ECEC centers (OECD, 2021), the instrument is frequently applied to children younger than 4 years old as well. The STRS consists of 28 items that measure three factors: closeness, conflict, and dependency. Closeness reflects the professional caregiver's perception of openness, warmth, and security in the professional caregiver-child relationship. Conflict refers to a negative perception of the relationship, characterized by discordance and unpredictability, while dependency refers to the perception of children as developmentally overreliant and possessive (Pianta, 2001).
Several researchers have highlighted reliability and validity issues with the dependency factor, suggesting that it may be more susceptible to cultural context and subjective interpretation (Beyazkurk & Kesner, 2005; Doumen et al., 2009; Drugli & Hjemdal, 2013; Fraire et al., 2013; Solheim et al., 2012; Tsigilis & Gregoriadis, 2008; Tsigilis et al., 2018). Consequently, the STRS-Short Form (STRS-SF) (Pianta, 2001), which contains 15 items measuring the closeness and conflict factors (the dependency factor of the original version is excluded), has been frequently applied by researchers investigating professional caregiver-child relationship quality. This two-factor model has shown satisfactory psychometric properties in ECEC samples from different cultures as well as partial gender invariance (Aboagye et al., 2019; Settanni et al., 2015; Tsigilis & Gregoriadis, 2008). However, it has been noted that professional caregivers report greater closeness in their relationships with girls than with boys (Howes et al., 2000; Solheim et al., 2012). Moreover, two items related to the closeness factor have been reported to be the main sources of model misspecification in a Norwegian ECEC sample, namely, "this child spontaneously shares information about himself/herself" and "this child openly shares his/her feelings and experience with me". Additionally, the closeness factor has been shown to display minor invariance problems across child genders for both an adaptation of the full version of the STRS and the short version (Aboagye et al., 2019; Koomen et al., 2012), indicating that professional caregivers may, to some degree, perceive closeness in the relationship differently for girls and boys. For instance, the item "I share an affectionate, warm relationship with this child" has been shown to be non-invariant across both child gender and age in a Ghanaian ECEC context (Aboagye et al., 2019), while the items "If upset, this child will seek comfort from me" and "this child is uncomfortable with physical affection or touch from me" were shown to be non-invariant across child gender in a Dutch sample. Invariance issues have also been reported regarding children's age using the STRS when comparing ECEC and elementary school children, suggesting that the factors of closeness and conflict may operate differently for younger and older children.
In Norway, 93.4% of children aged 1-5 years attend ECEC centers, where most spend 41 hours or more each week (Statistics Norway, 2022). This means that young children spend a considerable amount of time with professional caregivers. Even though there is some documentation of the psychometric properties of the original STRS in a Norwegian ECEC context based on a sample of Norwegian 4-year-old children, the psychometric properties of the STRS-SF have not yet been investigated in Norway or in any other Nordic country. Additionally, most studies that have investigated the psychometric properties of the STRS-SF in an ECEC context have only included children in the upper age range or used samples containing both ECEC and elementary school children. As the bidirectional professional caregiver-child relationship may vary across age groups depending on children's developmental maturity (Pianta et al., 2003), it is important to investigate the appropriateness of the STRS-SF for the youngest children, which is currently largely unknown. Hence, the aim of this study was to investigate and evaluate the factorial validity and measurement invariance of the STRS-SF in a sample covering the full age range of ECEC children (1-6 years old). As measurement invariance is a prerequisite for making meaningful comparisons between groups, such as groups defined by children's age and gender, knowledge of this psychometric property is important for assessing the applicability of the STRS-SF.

Methods
This study is based on combined baseline data from two different ECEC projects conducted in central and south-eastern Norway. Baseline data from the project Children in Central Norway were collected over the period 2012-2014, whereas data from Thrive by Three were collected in 2018.

Procedure and Participants
Children in Central Norway. Parents of children enrolled in ECEC centers in three municipalities in central Norway received information about the project through information letters and parent meetings before the project commenced. The information letter also gave parents the option to enroll their child in the project either by returning a signed consent form to their ECEC center or by consenting digitally with their unique invitation code to the project's online survey. Parental consent was obtained to allow the professional caregiver who knew the child best to complete an online survey regarding the child. Professional caregivers provided written consent with their own unique invitation codes for the online survey. A total of 1631 (77%) of the invited parents consented to enroll their child in the project, and 169 professional caregivers reported on 1430 children between 1 and 6 years old (mAge = 44 months, 51% boys).
Thrive by Three. The baseline data from Thrive by Three included 1471 children (mAge = 21 months, 51% boys) and 184 units/groups from 78 ECEC centers. A professional caregiver within the unit/group who knew the child best answered the STRS-SF. When combining Thrive by Three with the sample from Children in Central Norway, the total sample for this study included 2901 children (mAge = 33 months, 51% boys) and 353 professional caregivers. On average, professional caregivers reported on 8.2 children each.
For both above-mentioned projects, participation was voluntary, and consent could be withdrawn without reprisal at any time until the participation registry was anonymized.

Measurement
The Student-Teacher Relationship Scale-Short Form. As previously mentioned, the STRS-SF (Pianta, 2001) is a self-reported instrument regarding professional caregivers' perceptions of the relationship quality with a specific child. The instrument comprises 15 items (see Appendix for overview) measuring the two factors, closeness (eight items; e.g., "If upset, this child will seek comfort from me") and conflict (seven items; e.g., "This child and I always seem to struggle with each other") on a five-point Likert scale with response options ranging from 1 = Definitely does not apply to 5 = Definitely does apply. Thus, the closeness factor score ranged from 8 to 40, while the conflict factor score ranged from 7 to 35. Higher scores on the closeness factor indicate more positive interactions, while higher scores on the conflict factor indicate more negative interactions.
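The score ranges stated above follow directly from the number of items and the five response options. As a minimal sketch, assuming that items are summed with equal weight and that any reverse-keyed items have already been recoded, the arithmetic can be checked as follows. The function names `score_range` and `factor_score` and the example ratings are hypothetical and are not part of the STRS-SF materials.

```python
from typing import Sequence, Tuple

LIKERT_MIN, LIKERT_MAX = 1, 5  # response options described in the text


def score_range(n_items: int) -> Tuple[int, int]:
    """Return the lowest and highest possible sum score for n_items items."""
    return n_items * LIKERT_MIN, n_items * LIKERT_MAX


def factor_score(responses: Sequence[int]) -> int:
    """Sum one caregiver's item responses for a single factor.

    Assumes reverse-keyed items (if any) were recoded beforehand.
    """
    for r in responses:
        if not LIKERT_MIN <= r <= LIKERT_MAX:
            raise ValueError(f"response {r} is outside the 1-5 scale")
    return sum(responses)


closeness_range = score_range(8)  # eight closeness items
conflict_range = score_range(7)   # seven conflict items
# Hypothetical ratings for one child on the eight closeness items.
example_closeness = factor_score([5, 4, 4, 5, 3, 4, 5, 4])
```

Note that the analyses in this study treat the items as ordered categorical indicators of latent factors, so sum scores of this kind serve descriptive purposes only.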

Statistical Analyses
Children's age was dichotomized into younger ECEC children (under 36 months) and older ECEC children (36 months and older) to reflect the organizational structure of ECEC centers in Norway. First, the internal consistency of the two STRS-SF factors, closeness and conflict, was assessed for the full sample using the multi-level omega (ω) coefficient. The ω coefficient was preferred over the more commonly used Cronbach's alpha because the latter depends on rather strict assumptions, such as tau-equivalence and normally distributed scores, which can lead to biased estimates if violated (Dunn et al., 2014; McNeish, 2018; Peters, 2014; Sijtsma, 2009; Yang & Green, 2011). The multi-level ω with a 95% confidence interval was computed with the package "multilevelTools" (Wiley, 2022) in RStudio. The multi-level ω estimates are interpreted in the same way as alpha, where estimates ≥ .70 are considered to indicate satisfactory internal consistency (Taber, 2018).
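To illustrate what an ω coefficient expresses, the sketch below computes a simplified single-level ω from standardized factor loadings. This is not the multi-level estimate used in this study, which was obtained with "multilevelTools" in R; it assumes a one-factor model with uncorrelated residuals, and the loadings are hypothetical values chosen only for illustration.

```python
from typing import Sequence


def omega_total(std_loadings: Sequence[float]) -> float:
    """Single-level omega for a one-factor model with standardized loadings.

    Assumes uncorrelated residuals, so each residual variance
    equals 1 - loading**2.
    """
    common = sum(std_loadings) ** 2
    residual = sum(1.0 - loading ** 2 for loading in std_loadings)
    return common / (common + residual)


# Illustrative values only, not estimates from this study.
hypothetical_loadings = [0.7] * 8
omega = omega_total(hypothetical_loadings)
satisfactory = omega >= 0.70  # cutoff used in the text
```

Unlike alpha, this expression does not require all loadings to be equal, which is the tau-equivalence assumption mentioned above.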
Before proceeding to the next step, the intraclass correlation (ICC) for the two factors was investigated separately and combined using Stata 17 because of the nested structure of the data, as professional caregivers reported on 8.2 children each on average. The ICC was .23 for closeness and .20 for conflict, and the residual ICC was .23. Consequently, multi-level analyses were performed. Next, we investigated the factorial validity and measurement invariance of the STRS-SF using a series of multi-level multi-group confirmatory factor analyses (MGCFA) based on children's age and gender (Figure 1), as this two-factor structure has shown promising psychometric properties in previous studies (Aboagye et al., 2019; Settanni et al., 2015; Tsigilis & Gregoriadis, 2008). The purpose of carrying out MGCFA is to determine whether the respondents attribute the same meaning to the latent factors as well as whether the means and scores can be interpreted similarly across groups (van de Schoot et al., 2012). This is done by investigating the model fit indices while adding constraints to the models following a hierarchical structure ranging from configural (the least restrictive level) to scalar (strong invariance), more specifically:
1. Configural invariance (equal factor structure across groups)
2. Metric invariance (equal factor loadings across groups)
3. Scalar invariance (equal thresholds across groups, as the variables are ordered categorical)
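Regarding the nested data structure described above, one common way to gauge how much clustering matters is the design effect, 1 + (m - 1) × ICC, where m is the average cluster size. This quantity is not reported in this study; the sketch below is only an illustration that combines the reported ICCs with the reported average of 8.2 children per professional caregiver, and it assumes roughly equal cluster sizes. The function name `design_effect` is hypothetical.

```python
def design_effect(icc: float, avg_cluster_size: float) -> float:
    """Kish design effect: 1 + (m - 1) * ICC, assuming similar cluster sizes."""
    return 1.0 + (avg_cluster_size - 1.0) * icc


AVG_CHILDREN_PER_CAREGIVER = 8.2  # reported in the text

deff_closeness = design_effect(0.23, AVG_CHILDREN_PER_CAREGIVER)
deff_conflict = design_effect(0.20, AVG_CHILDREN_PER_CAREGIVER)
```

Under these assumptions, both values exceed 2, meaning that sampling variances would be more than twice those expected under independent observations, which is consistent with the decision to use multi-level analyses.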
Step 1 was to test the STRS-SF two-factor baseline model across children's age and gender (configural invariance), where all parameters could vary freely. Step 2 was to test a model in which only the factor loadings were constrained to be equal between groups while the thresholds could vary freely (metric invariance). In Step 3, we tested a model in which both the loadings and thresholds were constrained to be equal between groups (scalar invariance). Configural invariance exists if the two-factor model shows a good fit across the groups tested. Metric invariance exists if the more constrained model still shows a good model fit compared with the baseline model, whereas scalar invariance exists if the even more constrained model still shows a good model fit compared with the metric invariance model (Hirschfeld & Von Brachel, 2014).
The model fit was evaluated by inspecting the root mean square error of approximation (RMSEA), comparative fit index (CFI), and Tucker-Lewis index (TLI). RMSEA values of ≤ .05 indicate a good fit, and values between .05 and .10 indicate an acceptable fit (MacCallum et al., 1996). For the CFI and TLI, values of ≥ .95 are commonly used to indicate a good model fit (Hu & Bentler, 1999); however, Browne and Cudeck (1993) argue that these thresholds are too strict and instead recommend a threshold of > .90 to indicate a good model fit, with values of .80-.90 indicating an acceptable model fit. For the evaluation of invariance, Cheung and Rensvold (2002) recommend treating a CFI reduction of ≤ .01 when adding constraints to the model as an indication that the null hypothesis of invariance should not be rejected. The CFI difference between models was preferred as an indicator of invariance, as it is less sensitive to sample size and more sensitive to lack of invariance than chi-square (χ²) statistics (Meade et al., 2008). Multi-level MGCFA analyses were performed with Mplus v.8.4 (Muthén & Muthén, 1998-2017) using the weighted least squares mean and variance adjusted (WLSMV) estimator. The WLSMV estimator is appropriate for ordered categorical data and produces accurate parameter estimates (DiStefano & Morgan, 2014). Lastly, if scalar invariance was not found, we inspected the modification indices (χ²) to locate non-invariant items and then relaxed the constraints for the non-invariant items one by one, starting with the item with the greatest expected parameter change (EPC), to see if this improved the model fit. If the CFI of this less constrained scalar model was no more than .01 lower than that of the metric model, partial scalar invariance was considered supported.
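The fit cutoffs described above can be written as a small decision rule. The sketch below is one possible reading, assuming the more lenient CFI/TLI thresholds of > .90 (good) and .80-.90 (acceptable) described in the text; the label "poor" for values outside these ranges and the function names are assumptions made for illustration, and the input values in the example are hypothetical.

```python
def classify_rmsea(rmsea: float) -> str:
    """Classify RMSEA: <= .05 good, up to .10 acceptable."""
    if rmsea <= 0.05:
        return "good"
    if rmsea <= 0.10:
        return "acceptable"
    return "poor"  # label assumed here; the text defines only good/acceptable


def classify_cfi_tli(value: float) -> str:
    """Classify CFI or TLI using the lenient cutoffs: > .90 good, .80-.90 acceptable."""
    if value > 0.90:
        return "good"
    if value >= 0.80:
        return "acceptable"
    return "poor"  # label assumed here; the text defines only good/acceptable


# Hypothetical fit indices for one model.
example = {
    "RMSEA": classify_rmsea(0.07),
    "CFI": classify_cfi_tli(0.93),
    "TLI": classify_cfi_tli(0.88),
}
```

A model with mixed labels, as in this example, would be described as showing a good to acceptable fit.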

Results
One child was excluded because of missing STRS-SF data, resulting in a final sample of 2900 children. The means and standard deviations for the STRS-SF items, as well as for the closeness and conflict factors, separated by children's gender and age, are shown in Table 1. The fit indices for the unconstrained two-factor baseline model are shown in Table 2, indicating a good to acceptable overall model fit for the STRS-SF two-factor structure across children's gender (CFI = .932, TLI = .920, RMSEA = .069) and age (CFI = .920, TLI = .906, RMSEA = .069). Inspecting the CFI estimates in Table 3, the non-essential drop (≤ .01) in CFI when adding constraints for children's gender provides support for both metric and scalar invariance for these groups. Regarding measurement invariance based on children's age, Table 3 shows a non-essential drop when constraining factor loadings but an essential drop (> .01) when constraints are added to the thresholds. Thus, metric invariance for age groups is supported, whereas scalar invariance is not.
As the STRS-SF showed scalar non-invariance based on children's age, modification indices were inspected. Following this, the threshold constraints for Item 7 ("This child spontaneously shares information about himself/herself"; χ² = 160.09, EPC = −.42 for the younger age group and χ² = 160.12, EPC = .42 for the older age group), which belongs to the closeness factor, were relaxed. Relaxing the constraints for this item increased the CFI estimate from .904 to .924, which is no longer more than .01 below the CFI of the metric invariance model (.921). Consequently, partial scalar invariance for the STRS-SF across age groups was supported when the constraints for Item 7 were relaxed.
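The sequence of comparisons for the age groups can be traced with the CFI values reported in the text. The sketch below applies the ≤ .01 CFI-drop criterion to each pair of nested models; the function and variable names are hypothetical, and differences are rounded to three decimals to match the precision of the reported estimates.

```python
def cfi_drop(less_constrained: float, more_constrained: float) -> float:
    """CFI reduction when moving to the more constrained model (3 decimals)."""
    return round(less_constrained - more_constrained, 3)


def invariance_supported(less_constrained: float,
                         more_constrained: float,
                         max_drop: float = 0.01) -> bool:
    """True if the CFI drop does not exceed the .01 criterion."""
    return cfi_drop(less_constrained, more_constrained) <= max_drop


# CFI estimates for the age-group models as reported in the text.
CFI_AGE = {
    "configural": 0.920,
    "metric": 0.921,
    "scalar": 0.904,
    "partial_scalar": 0.924,  # Item 7 thresholds freed
}

metric_ok = invariance_supported(CFI_AGE["configural"], CFI_AGE["metric"])
scalar_ok = invariance_supported(CFI_AGE["metric"], CFI_AGE["scalar"])
partial_ok = invariance_supported(CFI_AGE["metric"], CFI_AGE["partial_scalar"])
```

The drop of .017 from the metric to the full scalar model exceeds the criterion, whereas the partial scalar model shows no drop relative to the metric model.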

Discussion
This study aimed to investigate and evaluate the factorial validity of the STRS-SF (Pianta, 2001) and its measurement invariance across ECEC children's gender and age. As the instrument is frequently used to measure professional caregivers' perceptions of the relationship quality with a specific child, knowledge about how the instrument works within the ECEC context and across subgroups is essential for making accurate estimates and interpretations. This study is the first to investigate the psychometric properties of the STRS-SF in a Nordic ECEC context, and it expands the knowledge about the instrument's applicability globally by including the full age range of ECEC children, where children under the age of three have previously received little or no attention. Our findings indicate that the STRS-SF has promising psychometric properties regarding internal consistency, factorial validity, and measurement invariance. However, some caution is warranted with respect to measurement invariance across children's age.

Factorial Validity and Measurement Invariance
In line with previous studies (Aboagye et al., 2019; Settanni et al., 2015; Tsigilis & Gregoriadis, 2008), this study found promising psychometric properties of the STRS-SF. The instrument showed satisfactory internal consistency, a good to acceptable model fit for the two-factor structure, and full or partial measurement invariance across children's gender and age. As in previous studies investigating the measurement invariance of the STRS-SF, some items were flagged as non-invariant. Our findings showed full scalar invariance across children's gender, indicating that the latent means of the closeness and conflict factors can be compared meaningfully between boys and girls, as they show similar structure and meaning across the groups. In other words, professional caregivers interpreted the items similarly, regardless of the child's gender. Regarding children's age, only partial scalar invariance was found. This indicates that the factor structure and strength of the factor loadings operate similarly between older and younger ECEC children, while caution is warranted when comparing latent means of the closeness factor, as one item operated differently depending on the child's age. Non-invariant items related to the closeness factor based on children's age have been reported in previous studies (Aboagye et al., 2019; Koomen et al., 2012), which is also the case in this study. However, and contrary to previous studies, the main source of the lack of full scalar invariance across children's age in this study was the item "this child spontaneously shares information about himself/herself". This indicates that professional caregivers interpret this item differently for older and younger ECEC children, which consequently may bias the closeness factor, as it contains an item that holds different meanings across age groups. Additionally, the same item was also flagged by Solheim et al. (2012) as one of the main sources of model misspecification in the original full version of the STRS among Norwegian four-year-old children.
As the above-mentioned non-invariant item has a communication element to it, it is plausible that older and younger ECEC children vary in their degree of spontaneity and ability to share information about themselves, which in turn leads professional caregivers to interpret this item differently. In other words, as children mature and develop their cognitive, communicative, and self-regulatory capacities, their communicative expression changes, which in turn may influence professional caregivers' interpretations of this item on the STRS-SF. That said, it has been shown that both professional caregiver and child characteristics influence professional caregivers' perception of the child-professional caregiver relationship (Choi & Dobbs-Oates, 2016). For instance, children with autism spectrum disorders have difficulties with initiating and maintaining social relationships (i.e., social communication) (Yoder et al., 2014), and it has been reported that children's level of autistic mannerisms is negatively related to professional caregivers' perception of closeness in the relationship (Blacher et al., 2014). Consequently, users of the STRS-SF should be aware of factors that may influence the ratings on the instrument.
Overall, this study supports the applicability of the STRS-SF for younger and older children in an ECEC context and across children's genders, even though the STRS-SF was originally intended to be used with children from the age of four. However, users of the instrument should be aware of the closeness item that is non-invariant across children's age, as well as of the rapid pace at which children develop during the preschool period and the limited capacities and behavioral repertoire of the youngest children. One way to deal with age-related non-invariance and developmental processes may be to use age-specific norms, acknowledging that even though the factor structure and factor loadings are comparable, the latent means may hold different meanings and thus may not be directly comparable.

Strengths and Limitations
The main strength of this study is the inclusion of two large community samples covering the full age range of children enrolled in ECEC centers. One limitation of the current study is that the models did not converge when the psychometric properties of the STRS-SF were investigated using one-year intervals rather than dichotomizing children's age into under 36 months and 36 months and older. From a developmental perspective, future research should investigate this further to pinpoint more precisely at what age non-invariance is introduced to the instrument. However, this dichotomization of children's age reflects the organizational structure of ECEC centers in Norway, where children are grouped based on their age. Another possible limitation is social desirability bias, which would be present if professional caregivers feel that forming high-quality relationships with children is expected, leading them to report higher relationship quality than is actually the case. In this study, we instructed the professional caregiver who knew the child best to complete the STRS-SF. Normally, children will interact with several adults during their days in an ECEC center. Unfortunately, an inter-rater approach was not available for this study. Thus, future research should investigate the inter-rater agreement between staff members in ECEC centers to examine more closely how perceptions of relationship quality are formed and which factors are related to them. Additionally, the psychometric properties of the STRS-SF should be explored further in other subgroups (e.g., typically developing children vs. children with developmental delays). Last, even though the findings from this study lend support to the factor structure of the STRS-SF, the non-invariant items seem to differ between cultures, indicating that professional caregivers from different cultures interpret items differently. Consequently, an interesting aim for future research would be to investigate the STRS-SF from a multicultural perspective to see how culture influences the perception of relationship quality.

Conclusion
This study adds to the knowledge about the psychometric properties of the STRS-SF, both globally and in Nordic countries, supporting its applicability in an ECEC context. The findings indicate that the STRS-SF can be applied to both younger and older children in ECEC centers regardless of children's gender. However, some caution is warranted when comparing latent mean scores of older and younger children because one item related to the closeness factor is non-invariant, indicating that professional caregivers interpret this item differently across children's age groups.