Interrater Reliability of the FOCUS-34: Parent-to-Parent and Parent-to-Clinician

This brief report presents interrater reliability data for the Focus on the Outcomes of Communication Under Six (FOCUS-34) between parents, and between parents and speech-language pathologists (SLPs). Reliability for all three raters combined was good to excellent across three assessments. Reliability for pairs of raters was variable but generally good.

The Focus on the Outcomes of Communication Under Six (FOCUS) is a widely used parent-report measure that captures changes in children's functional communication skills during speech-language therapies (Thomas-Stonell et al., 2010). Both versions (the original 50-item FOCUS and the FOCUS-34) are ideally completed by a parent, but if parents are absent, a clinician version is available. The FOCUS has demonstrated differential change over time in children with different functional abilities and communication impairments (Cunningham et al., 2021). Psychometric assessments have documented evidence of internal consistency (Thomas-Stonell et al., 2009), convergent and discriminant validity (e.g., Thomas-Stonell, Oddson, et al., 2013; Washington et al., 2013), and good interrater reliability between clinician raters (Oddson et al., 2013). Interrater reliability between parents and speech-language pathologist (SLP) raters was found to be good in one study, with fair reliability for change scores (Thomas-Stonell, Oddson, et al., 2013).
Thomas-Stonell, Oddson, et al. (2013) found good agreement between parents' and SLPs' ratings across multiple assessments; however, there was less agreement about change scores, and only mothers participated in the study. Additional evidence is needed to understand interrater reliability between parents (including fathers) and SLPs. To date, no data have been published showing reliability between parents. This evidence is important for clinicians using the FOCUS in clinical programs (e.g., Cunningham et al., 2018) since the same parent cannot always complete all assessments. While the original 50-item FOCUS has been used most in research, the FOCUS-34 is often used in clinical settings (Cunningham et al., 2021), as in this study. This study explored the interrater reliability of FOCUS-34 total and change scores between parents, and between parents and SLPs. We hypothesized good agreement for both total FOCUS-34 scores and FOCUS-34 change scores. Strength of agreement is further defined below.

Ethical Approval
This study was associated with a quality improvement contract involving the participating clinic, so formal ethics review was not required.

Participants
Twelve SLP volunteers from one community clinic recruited parent participants and collected data for this study. All families seen for assessment by these SLPs were invited to participate, but only families where both parents could attend the first appointment in person were included, to ensure all participants had the same knowledge of the FOCUS-34. Data for 24 children were reported; however, complete data were available for only 13 children. Data were considered complete if both parents and the SLP submitted data for the first two assessments.

Materials
FOCUS-34. The FOCUS-34 is a parent-report measure. Parents rate items about children's usual communicative participation on 7-point Likert scales (Oddson et al., 2019). Scores range from a minimum of 34 to a maximum of 238; however, it is the change score that is used to determine whether children have made clinically meaningful change, namely 11 or more points on the FOCUS-34 (Thomas-Stonell et al., 2020). The 11-point criterion was derived both statistically and clinically based on parents' and SLPs' judgments of whether meaningful change occurred (Oddson et al., 2019; Thomas-Stonell, Washington, et al., 2013). Written instructions and definitions of communication terms are provided for those completing the FOCUS (Thomas-Stonell et al., 2020).
Informal Data Collection Form. An informal data collection form captured basic demographic and service-based information at each assessment (i.e., SLP's name, child's study ID, child's age in months, and child's level of communicative function). Communicative function was described using the Communication Function Classification System (CFCS), a tool for categorizing children's abilities into one of five functional levels (Hidecker et al., 2011). Together with parents, SLPs identify a child's CFCS level by considering all methods of communication and how children usually engage in everyday situations requiring communication (see cfcs.us for more information).
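The FOCUS-34 score range and change criterion described above can be illustrated with a short sketch. It assumes that the total score is the simple sum of the 34 item ratings (which is consistent with the stated range of 34 to 238 on 7-point scales) and that meaningful change refers to an increase of 11 or more points. The function and constant names are hypothetical and are not part of any FOCUS scoring software.

```python
# Illustrative sketch only; names are hypothetical, not an official scoring tool.
N_ITEMS = 34                 # number of items on the FOCUS-34
SCALE_MIN, SCALE_MAX = 1, 7  # 7-point Likert scale
MEANINGFUL_CHANGE = 11       # criterion reported for the FOCUS-34


def total_score(item_ratings):
    """Sum the item ratings (assumed scoring rule: simple sum of all items)."""
    if len(item_ratings) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} ratings, got {len(item_ratings)}")
    if any(r < SCALE_MIN or r > SCALE_MAX for r in item_ratings):
        raise ValueError("each rating must be between 1 and 7")
    return sum(item_ratings)


def is_meaningful_change(score_before, score_after):
    """Return True if the score rose by 11 or more points (assumed direction)."""
    return (score_after - score_before) >= MEANINGFUL_CHANGE
```

Under these assumptions, a child rated 1 on every item scores 34, a child rated 7 on every item scores 238, and a rise from 100 to 111 meets the change criterion while a rise from 100 to 110 does not.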

Procedures
Participating SLPs reviewed a copy of the FOCUS-34 manual and attended an information session where administration procedures were described. The FOCUS-34 was completed independently by all participants (both caregivers and the SLP) at up to three assessment points: (1) when the child first attended therapy, (2) at the end of that therapy block (average = 4.25 months between), and (3) at a re-assessment following a period of no treatment (average = 5.39 months from Time 2). Ratings were based on participants' direct and informal observations of the child in various contexts. FOCUS-34 forms were submitted to a local coordinator who submitted de-identified files to the researchers.

Data Analysis
Descriptive statistics were used to profile child participants. Intraclass correlation coefficients (ICCs) were calculated using a one-way random effects model (Koo & Li, 2016) to determine level of agreement for FOCUS-34 scores at each assessment and for change scores. Agreement was calculated separately for each change interval (e.g., change from Time 1 to Time 2), and within each change interval for agreement between different pairs of raters (e.g., mother-to-father agreement) where possible. With 80% power and alpha set at 0.05, minimum sample sizes of 7 for the 2 × 2 ratings and 4 for the 3 × 3 categorizations were required to detect an ICC of 0.80 (Bujang & Baharum, 2017). If the minimum sample size was not met, ICCs were not calculated. ICC values < 0.5 were interpreted as "poor" reliability, 0.5 to 0.75 as "moderate," 0.75 to 0.9 as "good," and > 0.9 as "excellent" (Koo & Li, 2016).
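As an illustration of the analysis described above, the following sketch computes a one-way random effects ICC from a table of scores (one row per child, one column per rater) and maps the result onto the Koo and Li (2016) labels. It is not the authors' analysis code. Two assumptions are made: the single-measure form of the coefficient, ICC(1,1), is used because the source does not state whether single or average measures were reported, and values falling exactly on 0.75 or 0.9 are assigned to the "good" band because the stated ranges share those endpoints. Function names are hypothetical.

```python
def icc_oneway(ratings):
    """Single-measure one-way random effects ICC, i.e. ICC(1,1).

    ratings: list of rows, one row per child, each row holding the
    scores given to that child by k raters.
    """
    n = len(ratings)
    k = len(ratings[0]) if n else 0
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need at least 2 children and 2 raters per child")

    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]

    # Between-children and within-children mean squares (one-way ANOVA).
    ms_between = k * sum((m - grand_mean) ** 2 for m in row_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(ratings, row_means) for x in row
    ) / (n * (k - 1))

    denominator = ms_between + (k - 1) * ms_within
    if denominator == 0:
        raise ValueError("ICC is undefined when all scores are identical")
    return (ms_between - ms_within) / denominator


def interpret_icc(icc):
    """Label an ICC using the Koo & Li (2016) bands (boundary handling assumed)."""
    if icc < 0.5:
        return "poor"
    if icc < 0.75:
        return "moderate"
    if icc <= 0.9:
        return "good"
    return "excellent"
```

For example, `icc_oneway([[1, 2], [3, 4], [5, 6]])` gives a between-children mean square of 8 and a within-children mean square of 0.5, so the coefficient is 7.5 / 8.5, or about 0.88, which `interpret_icc` labels "good".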

Results
Results are presented for 12 participants. One case was identified as an outlier (i.e., more than 3 SD from the mean difference between raters' scores) and removed (Oddson et al., 2013). The average age of these children at the first assessment was 36.5 months (SD = 3.9). Children represented the full span of functional communication levels: CFCS Level I (n = 1), II (n = 3), III (n = 4), IV (n = 3), and V (n = 1).
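The outlier rule mentioned above (a case more than 3 SD from the mean difference between raters' scores) could be sketched as follows. This is an assumption-laden illustration rather than the authors' procedure: it compares one pair of raters at a time and uses the sample standard deviation, neither of which is specified in the source. The function name is hypothetical.

```python
from statistics import mean, stdev


def flag_outliers(scores_a, scores_b, n_sd=3.0):
    """Return indices of cases whose rater difference is more than n_sd
    sample standard deviations from the mean difference.

    scores_a, scores_b: total scores from two raters, in the same child order.
    """
    if len(scores_a) != len(scores_b) or len(scores_a) < 2:
        raise ValueError("need two equal-length lists with at least 2 cases")

    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    centre = mean(diffs)
    spread = stdev(diffs)  # sample SD (assumed; the source does not specify)
    if spread == 0:
        return []  # all differences identical, so nothing can be flagged
    return [i for i, d in enumerate(diffs) if abs(d - centre) > n_sd * spread]
```

One design point worth noting: with the sample SD, the largest possible standardized deviation in a set of n cases is (n − 1) / √n, so a 3 SD rule can only flag a case once there are at least 11 cases.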
Interrater reliability for the three raters combined was excellent at Times 1 and 2, and good at Time 3. Reliability was excellent between all pairs of raters at Time 1 and Time 2. At Time 3, reliability remained good between all three raters (i.e., mothers, fathers, and SLPs), and good between fathers and SLPs, but was moderate between mothers and SLPs (see Table 1).
Overall interrater reliability was good when comparing change scores across the three raters for Time 1 to 2 (see Table 2). Reliability was good between mothers and fathers, excellent between mothers and SLPs, and good between fathers and SLPs. Good reliability was also observed between the three raters for change from Time 2 to 3; however, the small sample size limited our ability to calculate ICCs for the pairs of raters in this change period.

Discussion
Several clinical and research programs have adopted the FOCUS as an outcome measure and use it to assess the impact of interventions for children across impairment types and levels of ability (Cunningham et al., 2021). To ensure research rigor, the same caregiver would ideally complete all assessments. In practice this is not always feasible, so it was important to determine whether the FOCUS-34 could be completed reliably by different raters. Results indicated good interrater reliability between three raters (i.e., mothers, fathers, and SLPs) across multiple assessment points. Reliability was lower at Time 3 for mother-SLP ratings. This may be due to the smaller sample size (N = 7), combined with one large disagreement between two raters.
Across assessments, most mothers scored their children higher than the SLPs did. This could be due to the vastly greater opportunities mothers have to observe their children's communication in multiple contexts, because familiarity facilitates communication with their child, or perhaps because children perform differently in different environments. It is also possible that SLPs see more children communicate and may unknowingly compare between children (not the intended use of the FOCUS), leading to their lower ratings.
Outcome measures are designed to assess change. Therefore, it was essential to evaluate interrater reliability for change scores. Reliability for change scores in this study was good for both change intervals. When examining change between Times 1 and 2, moderate reliability was noted between the father-SLP pairs. While we did not collect this information, we suspect, based on past research with this program (e.g., Thomas-Stonell, Washington, et al., 2013), that it was primarily mothers who brought the children to speech-language therapy sessions. Therefore, the mothers may have had more ongoing contact with the SLPs and more opportunities to discuss their child's progress. It is also possible that mothers had more opportunity to learn about their child's communication attempts through observation at home and during therapy sessions. These interpretations may also explain the slightly higher reliability for mothers when compared to fathers. Ongoing contact between parent and SLP may be an important factor in ensuring high reliability over time. It is thus recommended that a consistent rater complete the FOCUS-34 over time, but our results indicate that, if needed, different raters can reliably complete the FOCUS-34. It was beyond the scope of this study to explore differences in participants' ratings, but future work should explore whether there are differences in how parents and SLPs perceive specific aspects of communication.
It is important to note that larger sample sizes are typically used in assessing reliability (Koo & Li, 2016). This study used a small sample of convenience obtained by SLPs recruiting parent volunteers. As such, selection bias is possible, and the predictive value of our results may be limited. The FOCUS-34 was designed to be applicable for children with a variety of communication differences, and while the children in this study represent a range of functional abilities, they may not be representative of all preschoolers attending speech-language therapy.

Conclusions
While it remains preferable to use a consistent rater to complete the FOCUS-34 over time, results demonstrate that different raters can reliably complete it. This finding has practical implications in large clinical programs. Results also suggest that reliability may be improved with ongoing contact between parents and SLPs. Future research should include replication of this study with a larger sample size and evaluate reliability of the FOCUS-34 when completed by teachers and early childhood educators, with whom SLPs often consult.

Table 1.
Interrater Reliability for Total FOCUS-34 Scores at Each Assessment Point.