A Systematic Review on Evaluating Responsiveness of Parent- or Caregiver-Reported Child Maltreatment Measures for Interventions

Aims: Child maltreatment (CM) is a global public health and social problem, resulting in serious long-term health and socioeconomic consequences. As parents are the most common perpetrators of CM, parenting interventions are appropriate strategies for preventing CM. However, research on parenting interventions for CM has been hampered by a lack of consensus on which measures are most responsive in detecting a reduction in parental maltreating behaviours after a parenting intervention. This systematic review aimed to evaluate the responsiveness of all current parent- or caregiver-reported CM measures. Methods: A systematic search was conducted in CINAHL, Embase, ERIC, PsycINFO, PubMed and Sociological Abstracts. The quality of studies and the responsiveness of the measures were evaluated using the COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) guidelines for systematic reviews of patient-reported outcome measures. Only measures developed and published in English were included. Studies reporting data on the responsiveness of the included measures were selected. Results: Sixty-nine articles reported on the responsiveness of 15 identified measures. Study quality was overall adequate. The responsiveness of the measures was overall insufficient or not reported; high-quality evidence on responsiveness was limited. Conclusions: Only the Physical Abuse subscale of the ISPCAN Child Abuse Screening Tool for use in Trials (ICAST-Trial) can be recommended as most responsive for use in parenting interventions, with high-quality evidence supporting sufficient responsiveness. All other scales or subscales of the 15 included measures were identified as promising based on current data on responsiveness; additional psychometric evidence is required before they can be recommended.


PRISMA Checklist (item descriptions with the page numbers on which each item is reported)

Rationale (Item 3): Describe the rationale for the review in the context of what is already known. Reported on pages 3-7.

Objectives (Item 4): Provide an explicit statement of questions being addressed with reference to participants, interventions, comparisons, outcomes, and study design (PICOS).

Protocol and registration (Item 5): Indicate if a review protocol exists, if and where it can be accessed (e.g., Web address), and, if available, provide registration information including registration number. N/A.

Eligibility criteria (Item 6): Specify study characteristics (e.g., PICOS, length of follow-up) and report characteristics (e.g., years considered, language, publication status) used as criteria for eligibility, giving rationale. Reported on page 9.

Information sources (Item 7): Describe all information sources (e.g., databases with dates of coverage, contact with study authors to identify additional studies) in the search and date last searched. Reported on page 9.

Search (Item 8): Present full electronic search strategy for at least one database, including any limits used, such that it could be repeated. Reported on page 9.

Study selection (Item 9): State the process for selecting studies (i.e., screening, eligibility, included in systematic review, and, if applicable, included in the meta-analysis). Reported on pages 9-10.

Data collection process (Item 10): Describe method of data extraction from reports (e.g., piloted forms, independently, in duplicate) and any processes for obtaining and confirming data from investigators. Reported on page 10.

Data items (Item 11): List and define all variables for which data were sought (e.g., PICOS, funding sources) and any assumptions and simplifications made. Reported on pages 10-15.

Risk of bias in individual studies (Item 12): Describe methods used for assessing risk of bias of individual studies (including specification of whether this was done at the study or outcome level), and how this information is to be used in any data synthesis. Reported on page 14.

Synthesis of results (Item 14): Describe the methods of handling data and combining results of studies, if done, including measures of consistency (e.g., I²) for each meta-analysis. Reported on page 13.

Risk of bias across studies (Item 15): Specify any assessment of risk of bias that may affect the cumulative evidence (e.g., publication bias, selective reporting within studies). Reported on page 14.

Additional analyses (Item 16): Describe methods of additional analyses (e.g., sensitivity or subgroup analyses, meta-regression), if done, indicating which were pre-specified.

Results of individual studies (Item 20): For all outcomes considered (benefits or harms), present, for each study: (a) simple summary data for each intervention group and (b) effect estimates and confidence intervals, ideally with a forest plot. Reported on page 21.

Synthesis of results (Item 21): Present results of each meta-analysis done, including confidence intervals and measures of consistency. Reported on pages 17-18.

Risk of bias across studies (Item 22): Present results of any assessment of risk of bias across studies (see Item 15).

Summary of evidence (Item 24): Summarize the main findings including the strength of evidence for each main outcome; consider their relevance to key groups (e.g., healthcare providers, users, and policy makers). Reported on pages 18-26.

Limitations (Item 25): Discuss limitations at study and outcome level (e.g., risk of bias), and at review level (e.g., incomplete retrieval of identified research, reporting bias). Reported on pages 23-24.

Conclusions (Item 26): Provide a general interpretation of the results in the context of other evidence, and implications for future research.

FUNDING

Funding (Item 27): Describe sources of funding for the systematic review and other support (e.g., supply of data); role of funders for the systematic review. N/A.

As per CINAHL 63
Notes. All searches were performed on the 15th and 16th of January 2020, with an update on the 23rd of March 2021. a Search terms in PubMed and Sociological Abstracts are the same as in CINAHL, except that the names of measures are enclosed in double quotation marks.
Table S3. Risk of Bias checklist for assessing the methodological quality of studies, adapted from the COSMIN manual for systematic reviews of measures (Mokkink et al., 2018).

Item description

Responsiveness: comparison before and after an intervention.

Design requirements: Was an adequate description of the intervention provided?

Statistical methods: Was the statistical method appropriate for the hypotheses to be tested?

Other flaws: Were there any other important flaws in the design or statistical methods of the study?
Note. AUC = Area Under the Curve. The Risk of Bias checklist was used for assessing the methodological quality of studies (Step 2 in Figure 1). a Each standard on methodological quality was rated using a four-point rating scale: inadequate, doubtful, adequate, and very good. The overall methodological quality per study was determined by calculating a percentage of the ratings (Cordier et al., 2015): inadequate = 0-25%, doubtful = 25.1-50%, adequate = 50.1-75%, and very good = 75.1-100%.
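The percentage-to-label mapping in this note can be sketched as a small helper. This is an illustrative reading of the cut-offs stated above, not code from the review; the function name is a hypothetical choice.

```python
def quality_rating(pct: float) -> str:
    """Map the percentage of methodological-quality ratings to an overall
    label, following the cut-offs in the Table S3 note (Cordier et al., 2015):
    0-25% inadequate, 25.1-50% doubtful, 50.1-75% adequate,
    75.1-100% very good."""
    if not 0 <= pct <= 100:
        raise ValueError("percentage must be between 0 and 100")
    if pct <= 25:
        return "inadequate"
    if pct <= 50:
        return "doubtful"
    if pct <= 75:
        return "adequate"
    return "very good"
```

For example, a study with 60% of standards rated adequate or better would be labelled "adequate" under these cut-offs.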
Table S4. Criteria for good responsiveness, adapted from the COSMIN manual for systematic reviews of measures (Mokkink et al., 2018). Note. AUC = Area Under the Curve. The criteria for good responsiveness were used for rating the results of single studies on responsiveness (Step 3.1 of Figure 1) and for rating the pooled results of all studies per measure (Step 3.2 of Figure 1). a + = Sufficient, - = Insufficient, ? = Indeterminate, and NR = Not Reported. b The quality criterion for good responsiveness on comparison of change scores before and after intervention was set at a medium effect size (Hedges' g = 0.5), using Cohen's (1988) conventions for interpreting effect size; this criterion was decided by the review team for the current review, as suggested by the COSMIN manual (Mokkink et al., 2018).
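As an illustration of footnote b, the Hedges' g computation and the medium-effect criterion can be sketched as follows. This is a hypothetical sketch using the SD-of-paired-differences convention for change scores; the review itself used the formulas of Borenstein et al. (2009), which may differ in detail, and the function names are assumptions.

```python
import math

def hedges_g_change(pre, post):
    """Bias-corrected standardized mean difference from paired pre/post
    scores: mean of the paired differences divided by their SD, times the
    small-sample correction J = 1 - 3/(4*df - 1) (Hedges & Olkin, 2014)."""
    assert len(pre) == len(post) and len(pre) > 1
    diffs = [p - q for p, q in zip(pre, post)]  # reduction after intervention
    n = len(diffs)
    mean_d = sum(diffs) / n
    sd_d = math.sqrt(sum((x - mean_d) ** 2 for x in diffs) / (n - 1))
    j = 1 - 3 / (4 * (n - 1) - 1)  # small-sample bias correction
    return j * (mean_d / sd_d)

def rate_responsiveness(g, threshold=0.5):
    """'+' (sufficient) if g meets the medium-effect criterion, else '-'."""
    return "+" if g >= threshold else "-"
```

For example, pre-intervention scores [10, 12, 14, 16] and post-intervention scores [8, 11, 12, 13] give g of roughly 1.78, which would be rated '+' against the 0.5 threshold.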
Table S5. Modified GRADE approach for grading the quality of evidence on responsiveness per measure, adapted from the COSMIN manual for systematic reviews of measures (Mokkink et al., 2018). (Table fragment: all studies not addressing the construct or target population of the review.) Note. The modified GRADE approach was used for grading the quality of summarized evidence on responsiveness (Step 3.3 of Figure 1). The starting point is a 'high' quality of evidence; the level of evidence quality is downgraded by the sum of scores per factor. a The criterion for inconsistency was determined by the review team for the current review, as suggested by the COSMIN manual (Mokkink et al., 2018). The review team decided to evaluate inconsistency (heterogeneity) in results across studies using the I-squared (I²) statistic, which is the percentage of the total variability in a set of effect sizes across the studies that is due to heterogeneity; values of less than 50%, 50% to 74%, and higher than 75% denote low, moderate, and high heterogeneity, respectively (Higgins et al., 2003).

a Random sample allocation indicates that the sample is randomly allocated to an intervention or control group; non-random sample allocation indicates that the sample is not randomly allocated to an intervention or control group (Altman, 1991). b Sample size is the total number of participants completing the measures both before and after the intervention in the treatment group.

a Subscales were included if data on factor analysis and Cronbach's alpha determined per subscale could be retrieved from the literature, thus confirming the scale's multidimensional structure (Mokkink et al., 2018). b Methodological quality was evaluated using the Risk of Bias checklist for assessing the methodological quality of studies on responsiveness (Online Supplemental Table S3) in Step 2 of Figure 1. c The statistical method for the mean difference before and after intervention was used either to calculate p-values or to estimate effect sizes in the included studies. P-values were calculated through paired t-tests or repeated-measures ANOVAs in most cases; effect size was estimated by calculating standardized mean differences (SMD) such as Cohen's d or Hedges' g (Hedges & Olkin, 2014). d Random sample allocation indicates that the sample is randomly allocated to an intervention or control group; non-random sample allocation indicates that the sample is not randomly allocated to an intervention or control group (Altman, 1991). e Effect size was calculated using the formulas presented by Borenstein et al. (2009); Hedges' g = a statistic measuring the effect size from change scores between before and after intervention (Hedges & Olkin, 2014); CI = Confidence Interval. f The rating of each study's result was determined using the criteria for good responsiveness (Online Supplemental Table S4) in Step 3.1 of Figure 1; + = Sufficient, ? = Indeterminate, - = Insufficient, ± = Inconsistent.

a Subscales were included if data on factor analysis and Cronbach's alpha determined per subscale could be retrieved from the literature, thus confirming the scale's multidimensional structure (Mokkink et al., 2018). b Quality of evidence consists of four factors: risk of bias (methodological quality of the studies: Step 2 in Figure 1), inconsistency (inconsistent results across the studies: final pooled results from Step 3.2 in Figure 1), imprecision (small pooled sample size of the studies resulting in wide confidence intervals), and indirectness (evidence from populations other than the ones of interest in the review). c Effect size was calculated using the formulas presented by Borenstein et al. (2009); Hedges' g = a statistic measuring the effect size from change scores between before and after intervention (Hedges & Olkin, 2014); CI = Confidence Interval; I² = I-squared as a measure of inconsistency (the percentage of total variability across studies due to heterogeneity; Higgins et al., 2003). d Publication bias refers to bias that may inflate pooled effects because studies with small sample sizes and small effects may remain unpublished and missing (Higgins & Green, 2011). A publication-bias p-value obtained by Begg's test (Begg & Mazumdar, 1994) of less than 0.05 indicated that significant publication bias existed in the pooled effect size. If publication bias was indicated, the trim-and-fill test by Duval and Tweedie (2000) was then performed using the fixed-effect model to produce an adjusted pooled effect size and confidence interval after accounting for missing studies due to publication bias. Publication bias was not tested when fewer than three studies were available (Higgins & Green, 2011). e The overall rating of the pooled result of all studies was determined using the criteria for good responsiveness (Online Supplemental Table S4) in Step 3.2 of Figure 1; + = Sufficient, ? = Indeterminate, - = Insufficient, ± = Inconsistent; the same criteria were applied to determine the overall rating of adjusted pooled results accounting for missing studies due to publication bias (Duval & Tweedie, 2000). f The overall quality of evidence was downgraded using the modified GRADE approach (Online Supplemental Table S5) for grading the quality of summarized evidence on responsiveness (Step 3.3 of Figure 1) when there were concerns regarding any factor of evidence quality: High, Moderate, Low, and Very Low denote, respectively, high, moderate, low, and very low levels of confidence in the overall ratings. Publication bias was not considered for grading the quality of evidence in the modified GRADE approach due to a lack of registries for studies on psychometric properties, according to the COSMIN manual (Mokkink, Prinsen, et al., 2018).
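The fixed-effect pooling and the I² heterogeneity statistic referred to in these notes can be sketched as follows. This is a minimal illustration of the Q-based I² of Higgins et al. (2003) under inverse-variance weighting; the function and variable names are assumptions, and this is not the review's analysis code.

```python
def pool_and_i_squared(effects, variances):
    """Fixed-effect (inverse-variance) pooled effect size and Cochran's
    Q-based I² = max(0, (Q - df) / Q) * 100 (Higgins et al., 2003)."""
    assert len(effects) == len(variances) >= 2
    weights = [1.0 / v for v in variances]
    pooled = sum(w * g for w, g in zip(weights, effects)) / sum(weights)
    q = sum(w * (g - pooled) ** 2 for w, g in zip(weights, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, i2

def heterogeneity_label(i2):
    """Thresholds from the Table S5 note: below 50% low, 50-74% moderate,
    75% and above high heterogeneity."""
    if i2 < 50:
        return "low"
    if i2 < 75:
        return "moderate"
    return "high"
```

With identical study effects, Q is zero and I² is 0% (low heterogeneity); two studies with effects 0.2 and 0.8 and equal variances yield an I² above 90% (high heterogeneity).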

Note. AAPI-2 = Adult Adolescent Parenting Inventory-2, APT = Analog Parenting Task, CNQ = Child Neglect Questionnaire, CNS-MMS = Child Neglect Scales-Maternal Monitoring and Supervision Scale, CTS-ES = Child Trauma Screen-Exposure Score, CTSPC = Conflict Tactics Scales: Parent-Child version, FM-CA = Family Maltreatment-Child Abuse criteria, ICAST-Trial = ISPCAN (International Society for the Prevention of Child Abuse and Neglect) Child Abuse Screening Tool for use in Trials, IPPS = Intensity of Parental Punishment Scale, MCNS = Mother-Child Neglect Scale, MCNS-SF = Mother-Child Neglect Scale-Short Form, P-CAAM = Parent-Child Aggression Acceptability Movie task, POQ = Parent Opinion Questionnaire, PRCM = Parental Response to Child Misbehavior questionnaire, SBS-SV = Shaken Baby Syndrome awareness assessment-Short Version; NR = Not Reported.

Note. AAPI-2 = Adult Adolescent Parenting Inventory-2, APT = Analog Parenting Task, CNQ = Child Neglect Questionnaire, CNS-MMS = Child Neglect Scales-Maternal Monitoring and Supervision Scale, CTS-ES = Child Trauma Screen-Exposure Score, CTSPC = Conflict Tactics Scales: Parent-Child version, FM-CA = Family Maltreatment-Child Abuse criteria, ICAST-Trial = ISPCAN (International Society for the Prevention of Child Abuse and Neglect) Child Abuse Screening Tool for use in Trials, IPPS = Intensity of Parental Punishment Scale, MCNS = Mother-Child Neglect Scale, MCNS-SF = Mother-Child Neglect Scale-Short Form, P-CAAM = Parent-Child Aggression Acceptability Movie task, POQ = Parent Opinion Questionnaire, PRCM = Parental Response to Child Misbehavior questionnaire, SBS-SV = Shaken Baby Syndrome awareness assessment-Short Version; NE = Not Evaluated due to no intervention study assessing responsiveness, NR = Not Reported due to no relevant data found to calculate effect size.

Note. AAPI-2 = Adult Adolescent Parenting Inventory-2, APT = Analog Parenting Task, CNQ = Child Neglect Questionnaire, CNS-MMS = Child Neglect Scales-Maternal Monitoring and Supervision Scale, CTS-ES = Child Trauma Screen-Exposure Score, CTSPC = Conflict Tactics Scales: Parent-Child version, FM-CA = Family Maltreatment-Child Abuse criteria, ICAST-Trial = ISPCAN (International Society for the Prevention of Child Abuse and Neglect) Child Abuse Screening Tool for use in Trials, IPPS = Intensity of Parental Punishment Scale, MCNS = Mother-Child Neglect Scale, MCNS-SF = Mother-Child Neglect Scale-Short Form, P-CAAM = Parent-Child Aggression Acceptability Movie task, POQ = Parent Opinion Questionnaire, PRCM = Parental Response to Child Misbehavior questionnaire, SBS-SV = Shaken Baby Syndrome awareness assessment-Short Version; NE = Not Evaluated due to no intervention study assessing responsiveness; NA = Not Applicable due to there being either a limited number of studies or nonsignificant publication bias.

Table S2.
Database Search Strategies.

Table S6.
Descriptions of included articles on responsiveness of measures for the assessment of child maltreatment.

Table S7.
Single analysis at scale level results and ratings on responsiveness: Detailed findings for Step 3.1 in Figure 1.

Table S8.
Pooled results, overall ratings, and quality of evidence on responsiveness per measure: Detailed findings for Steps 3.2 and 3.3 in Figure 1.