A systematic review of training methods to increase staff’s knowledge and implementation of positive behaviour support in residential and day settings for individuals with intellectual and developmental disabilities

Behaviour support plans (BSPs), if accurately implemented, have been found to increase skills and decrease challenging behaviour of individuals with intellectual and developmental disabilities. Training is essential for staff to acquire the skills necessary for accurate implementation. The aim of this systematic literature review was to evaluate procedures used to train staff in Positive Behaviour Support (PBS), on both knowledge of PBS and implementation of BSPs. Systematic searches of four databases identified 18 studies as meeting criteria. Findings indicate that description alone was not consistently effective in increasing knowledge and should be used in combination with other training strategies. Staff implementation of BSPs was increased by different combinations of the following training components: description, feedback, modelling, role-play, monetary incentive, and escape contingency. To identify evidence-based practice when training staff on BSPs, it is necessary to evaluate active and feasible training components from current training models.


Introduction
It is estimated that 10-15% of people with intellectual and developmental disabilities present with challenging behaviour (CB; Allen et al., 2005; Emerson et al., 2001). Challenging behaviour negatively impacts on an individual's educational and social life and, if not addressed, may persist across an individual's lifetime (Kiernan and Kiernan, 1994). This can result in limited access to the community, education, employment and social relationships (social isolation, exclusion) (Emerson et al., 2001; Horner et al., 2002; Royal College of Psychiatrists, 2007). Therefore, it is crucial to identify suitable interventions to increase the quality of life of individuals and those supporting them, as CB can impact on staff working directly with individuals engaging in such behaviours, often resulting in staff burnout and high staff turnover (Devereux et al., 2009; Felce et al., 1993).
Positive Behaviour Support (PBS) is one such approach which has yielded positive outcomes in increasing positive skills and reducing CB for individuals with intellectual and developmental disabilities, thus increasing overall quality of life (LaVigna and Willis, 2012). PBS evolved from applied behaviour analysis in the 1980s and 1990s. Since then, there have been developments in the core understanding of PBS as a values-based, person-focused approach that uses long-term, systems change and educational methods to achieve outcomes. PBS is now recommended as best practice within special educational law in the USA (IDEA, 2004) and is mandated within residential services in Ireland for individuals who engage in CB (Health Act, 2007) in the form of a Behaviour Support Plan (BSP). This has paved the way for PBS to be adapted at both an organizational and an individual level within services.
A BSP incorporates several behavioural interventions across four main categories: i) environmental accommodations (ecological; to reduce the likelihood of CB occurring); ii) skills teaching interventions (positive programming; teaching alternative appropriate skills to replace CB); iii) direct interventions (focused support strategies; alternative appropriate skills are reinforced over CB); and iv) reactive strategies (non-aversive strategies to ensure safety and the dignity of the person when responding to CB) (LaVigna and Willis, 2005).
For a BSP to have the desired effect of reducing CB and improving the quality of life of individuals with intellectual and developmental disabilities, accurate implementation is essential. Frontline staff often hold primary responsibility for implementation of BSPs; therefore, staff training is pivotal in achieving accurate implementation (Hastings and Brown, 2000; Reid, 2004). Staff training is also vital from an ethical perspective, as incorrectly implemented interventions can impact negatively on both individuals supported by the service (who will be referred to as clients) (Carroll et al., 2013) and staff (Kazemi et al., 2015). When contemplating staff training it is important to consider the content, the process of delivery, and the resulting outcomes (e.g. skills acquired by staff). Because PBS is mandated through legislation, organizations have undergone system change, including the provision of training on PBS, which was not evident to the same degree when PBS was coming to the fore in the 1990s. Existing literature on staff training indicates that a variety of training packages (e.g. instructions, modelling, role-play, and feedback) have been used for teaching behavioural interventions.
A large-scale scoping review of training practices for staff supporting individuals with intellectual and developmental disabilities indicated that staff implementation was the most frequently evaluated variable; however, few studies focused on client outcomes (Gormley et al., 2019b). A meta-analysis examining the impact of training staff who support individuals with intellectual disabilities who engage in CB indicated that, while training was moderately effective in changing staff outcomes, there was no conclusive evidence of change in levels of CB for the supported individuals (Knotter et al., 2018). To date, MacDonald and McGill (2013) is the only systematic literature review (SLR) undertaken to examine the effectiveness of staff training specifically in PBS for interventions targeting CB of individuals with intellectual and developmental disabilities. MacDonald and McGill (2013) examined the effects of staff training in PBS on staff outcomes, including knowledge or skills. The SLR reported on the length, format and content of the training; however, insufficient detail was provided on the methods used to train staff, preventing the training components responsible for training efficacy from being identified. More recently, Brady et al. (2019) undertook an SLR which summarized outcomes of studies with respect to procedural fidelity of behavioural interventions (which included but was not specific to BSPs) for individuals with intellectual and developmental disabilities in human services. They concluded that feedback was the most commonly used training strategy, with moderate effect sizes when used in isolation and large effect sizes when used as an element of a training package, whereas instruction and teaching had the highest effect size but were only ever used in combination with other training strategies. Their review was completed across human services settings, where there are large variances in available resources, with a wide range of participants, and included studies that provided training on both BSPs and single behavioural interventions.
Consequently, the objective of the current systematic literature review is to extend the review undertaken by MacDonald and McGill (2013) by evaluating procedures for training staff in PBS from 1990 to 2019, across staff supporting children and adults with intellectual and developmental disabilities, on the outcome variables of knowledge and implementation. In addition, the current review aims to refine the review carried out by Brady et al. (2019) by examining the effects of staff training on the implementation of BSPs in residential and day settings for individuals with intellectual and developmental disabilities.

Protocol and registration
Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines (Moher et al., 2009) were adhered to throughout the SLR process. The SLR protocol was developed by the research team and was registered on the International Prospective Register of Systematic Reviews (PROSPERO) database: http://www.crd.york.ac.uk/PROSPERO/display_record.php?ID=CRD42017081748.

Search strategy
Four electronic databases were searched separately: PsycINFO, Scopus, Psychology and Behavioral Sciences, and Web of Science. Reference lists of identified studies were also hand searched. The key terms were separated into three lists. List 1: staff train*, carer train*, paraprofessional train*; List 2: positive behav*, behavior* intervention; List 3: intellectual disab*, developmental disab*. The key terms were entered so that every term from List 1 was paired with every term from List 2 and every term from List 3. The search was limited to peer-reviewed journals published in English from 1990 to July 2019.
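The pairing of key terms across the three lists can be illustrated with a short sketch. The term lists are those reported in the search strategy; the Boolean `AND` operator and the function name `build_queries` are assumptions made for illustration, as the exact query strings entered into each database are not reported.

```python
from itertools import product

# Key terms as reported in the search strategy.
LIST_1 = ["staff train*", "carer train*", "paraprofessional train*"]
LIST_2 = ["positive behav*", "behavior* intervention"]
LIST_3 = ["intellectual disab*", "developmental disab*"]


def build_queries(*term_lists):
    """Return one query string per combination of terms (one term per list).

    The "AND" operator is an assumed syntax; individual databases may
    require different operators, quoting, or field codes.
    """
    return [" AND ".join(combo) for combo in product(*term_lists)]


queries = build_queries(LIST_1, LIST_2, LIST_3)
print(len(queries))  # 3 x 2 x 2 = 12 combinations
print(queries[0])    # staff train* AND positive behav* AND intellectual disab*
```

Under this reading of the procedure, each database would receive 12 term combinations.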

Search procedures
Systematic searches were undertaken in two stages. Stage 1 consisted of screening titles and abstracts according to the inclusion and exclusion criteria. Ambiguous abstracts were further reviewed in Stage 2, which consisted of using the inclusion and exclusion criteria to review full-text articles and identify studies to be included in the SLR. See Figure 1 for a summary of search strategies. Horner et al. (1990) highlight nine characteristics of PBS: an emphasis on lifestyle change, functional analysis, multicomponent interventions, manipulation of ecological and setting events, emphasis on antecedent manipulations, teaching adaptive behaviour, building environments with effective consequences, minimizing the use of punishers, distinguishing emergency procedures from proactive programming, and social validation and the role of dignity in behavioural support (pp. 127-129). This definition was used to inform the inclusion/exclusion criteria of the current review.

Inclusion and exclusion criteria
Studies were included if: a) They had a quasi-experimental or experimental design; b) Staff participants supported individuals with intellectual and developmental disabilities in residential and day settings; c) Training was provided on behavioural principles and/or BSPs (including behavioural interventions from two or more of the four categories of multicomponent BSPs: environmental accommodations, skills teaching interventions, direct interventions and reactive strategies), and the methods of training staff were clearly outlined (e.g. feedback, role-play); and d) Data on staff implementation of BSPs to reduce CB or staff knowledge about behavioural principles were included as a dependent variable post training. Horner et al. (1990) highlighted 'minimizing the use of punishers', rather than eliminating the use of punishers, as a characteristic of PBS; for this reason, comprehensive BSPs which incorporated a punishment element were included in the current review.
Studies were excluded if: a) They had qualitative data or a review research design; b) The participants trained were caregivers or worked outside residential or day settings, such as in schools; c) No description was provided of staff training components, or the intervention was not informed by functional assessment, focused solely on training data collection methods, encompassed only one element of a BSP, or included components that were not based on PBS (training on therapies with a behavioural component, e.g. dialectical behaviour therapy, cognitive behavioural therapy, acceptance and commitment therapy, mindfulness-based PBS); or d) CB was not the focus of the intervention, or staff perceptions/attitudes of CB/PBS or staff efficacy/confidence were the sole dependent variables.

Data extraction
Articles identified as meeting the inclusion criteria were summarized and data were extracted for: a) reference and the country where the research was undertaken; b) participant information (i.e., number included, setting, previous experience with PBS); c) training content; d) training components (i.e., description, feedback, modelling, rehearsal); e) format/duration; f) research design; g) dependent variables/measures; and h) staff and client outcomes.
Training components operational definitions. Instruction included both written and vocal sessions in which PBS theory, including functions of challenging behaviour, or information on the behavioural interventions was the focus. Discussions were included within this. Modelling included the trainer providing a model of how to implement the intervention with a confederate or with a client, either in vivo or in a video format outlining correct and incorrect examples. Rehearsal included the trainee role-playing the intervention with a confederate or colleague, or carrying out the intervention with the client. Feedback included vocal or written feedback from the trainer, comprising both corrective feedback for partial or incorrect implementation and praise for elements correctly implemented. Feedback could be immediate or delayed, and delivered in vivo or via video. Fluency training involved the use of a precision teaching approach to train to fluency using SAFMEDS. Monetary incentive meant that staff received a financial payment if they met a set criterion. Negative reinforcement meant that the session ended if the staff member achieved a set criterion during role-play or in vivo; if they did not meet the criterion, they continued rehearsing until the criterion was met.
Certainty of evidence. Certainty of evidence was used to describe the methodological rigour of studies and was rated as 'suggestive', 'preponderant' or 'conclusive' (Smith, 1981). Studies were described as suggestive if they did not utilize a true experimental design (e.g., if they used a quasi-experimental design) to evaluate outcomes. Studies were described as preponderant if they had: (i) an experimental design (e.g., multiple-baseline design or between-groups design with random assignment); (ii) dependent variables that were operationally defined; (iii) inter-observer agreement for at least 20% of sessions with agreement at 80% or higher; and (iv) interventions that were explained clearly enough to enable replication; but (v) were not able to control for other variables that could have impacted on intervention outcomes. Conclusive was the highest level of certainty. Studies were rated at this level if they contained all the preponderant level characteristics and, in addition, included treatment fidelity measures and attempted to control for other variables/factors that could have impacted on intervention outcomes.
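The rating rules can be expressed as a simple decision procedure. The following is a minimal sketch rather than the coding instrument used in the review: the `StudyQuality` record and its field names are hypothetical, and the handling of experimental studies that miss one of criteria (ii) to (iv) is an assumption, since the text does not specify it.

```python
from dataclasses import dataclass


@dataclass
class StudyQuality:
    """Hypothetical record of the rigour indicators described in the text."""
    experimental_design: bool       # e.g. multiple baseline or randomized groups
    operational_definitions: bool   # dependent variables operationally defined
    ioa_sessions_pct: float         # % of sessions with inter-observer agreement data
    ioa_agreement_pct: float        # agreement level across those sessions
    replicable_description: bool    # intervention described clearly enough to replicate
    treatment_fidelity: bool        # treatment fidelity measured
    controls_other_variables: bool  # attempted to control for other variables


def certainty_of_evidence(s: StudyQuality) -> str:
    meets_core = (
        s.experimental_design
        and s.operational_definitions
        and s.ioa_sessions_pct >= 20
        and s.ioa_agreement_pct >= 80
        and s.replicable_description
    )
    if not meets_core:
        # Assumption: a study missing any core criterion is rated suggestive.
        return "suggestive"
    if s.treatment_fidelity and s.controls_other_variables:
        return "conclusive"
    return "preponderant"


example = StudyQuality(
    experimental_design=True,
    operational_definitions=True,
    ioa_sessions_pct=25.0,
    ioa_agreement_pct=90.0,
    replicable_description=True,
    treatment_fidelity=False,
    controls_other_variables=False,
)
print(certainty_of_evidence(example))  # preponderant
```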

Synthesis of results.
A descriptive narrative synthesis was used to summarize the evidence relating to the effectiveness of different types of staff training in PBS/BSP. Staff training results were described as positive, negative, or mixed (e.g., Machalicek et al., 2008). Outcomes for single-subject research design studies were described as positive if visual analysis indicated improvement for all participants on implementation; mixed, if visual analysis indicated improvement for only some of the participants on implementation; and negative, if visual analysis indicated no improvement for any participants on implementation. Outcomes for between-subjects and within-subjects design studies were described as positive if there was statistically significant improvement for participants (PBS group and pre-post) on all dependent variables related to PBS knowledge and implementation; mixed, if there were statistically significant improvements for participants (PBS group and pre-post) on some but not all dependent variables related to PBS knowledge and implementation; and negative, if there were no statistically significant improvements for participants (PBS group and pre-post) on any dependent variables related to PBS knowledge and implementation. Client outcomes were described using the same classification.
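The positive/mixed/negative classification reduces to an all/some/none rule. A minimal sketch follows, assuming each study is summarized as a list of true/false judgements: one per participant for single-subject designs (improvement on visual analysis) or one per dependent variable for group designs (statistically significant improvement). The function name `classify_outcome` is hypothetical.

```python
def classify_outcome(improved):
    """Classify a study outcome from a list of per-unit improvement flags.

    `improved` holds one boolean per participant (single-subject designs)
    or per dependent variable (between- or within-subjects designs).
    """
    if not improved:
        raise ValueError("at least one judgement is required")
    if all(improved):
        return "positive"
    if not any(improved):
        return "negative"
    return "mixed"


print(classify_outcome([True, True, True]))    # positive
print(classify_outcome([True, False, True]))   # mixed
print(classify_outcome([False, False]))        # negative
```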

Reliability of search procedures and inter-rater agreement (IRA)
Stage 1 and Stage 2 search strategies were conducted independently by the first and second author, with 94% and 89% agreement respectively. To calculate the inter-rater score, the number of agreements was divided by the sum of agreements and disagreements and multiplied by 100%. A third rater (fourth author) was consulted for all disagreements, and group discussion was used to resolve inconsistencies until 100% agreement was reached. The first author undertook data extraction and the second author independently performed inter-rater agreement on each variable extracted (including certainty of evidence and synthesis of results), with 88% agreement. For all disagreements, consensus was reached through discussion until 100% agreement was reached.
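As a worked illustration, percentage agreement is conventionally computed as agreements divided by the sum of agreements and disagreements, multiplied by 100. The sketch below assumes that convention; the counts in the example are hypothetical and are not taken from the review.

```python
def inter_rater_agreement(agreements: int, disagreements: int) -> float:
    """Percentage agreement: agreements / (agreements + disagreements) x 100."""
    total = agreements + disagreements
    if total == 0:
        raise ValueError("no rated items")
    return 100 * agreements / total


# Hypothetical counts: 47 items rated the same way, 3 rated differently.
print(inter_rater_agreement(47, 3))  # 94.0
```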

Results
Eighteen articles met the criteria for inclusion in this review from 1990 to 2019, 1 from Australia, 3 from Ireland, 7 from the UK, 6 from the USA, and 1 from the Netherlands. Included articles are summarized in Table 1 in alphabetical order. Table 2 summarizes the staff training methods utilized for each study and the associated intervention outcomes.

Participants
The 18 studies included 1352 participants, and 10 of the 18 studies (55.6%) reported age characteristics, with ages ranging from 19 to 63 years. Eleven of the 18 studies (61.1%) included information on gender characteristics (1027 participants); 76.7% (n = 788) were female and 23.2% (n = 238) were male. Participants included staff that supported individuals with intellectual disabilities (n = 6, 33.3%), intellectual and developmental disabilities who engaged in challenging behaviour (CB) (n = 5, 27.8%), severe disabilities (Macurik et al., 2008), developmental disabilities and psychiatric disorders (Berryman et al., 1994), and varying diagnoses who engaged in self-injurious behaviour (Courtemanche et al., 2014; Shore et al., 1995). Participants also included nursing assistants, nurses and managers in health care services (Lowe et al., 2007), students completing a diploma (McGill et al., 2007), and staff in disability services (Crates and Spicer, 2012; Tierney et al., 2007). Eight of the 18 studies (44.4%) reported information on participants' prior level of PBS knowledge. Experience of participants varied across studies. Berryman et al. (1994) and Hardesty et al. (2014) reported that participants received orientation training related to behaviour management/behavioural principles and strategies upon commencing their job. In addition, Berryman et al. (1994) described that participants received 2 days of training yearly and asked participants to rate how familiar they were with the content of training; similarly, in Gormley et al. (2019a) participants were asked if they had received training on the specific content taught. In contrast, other studies required that participants had not participated in comparable training regarding the management of CB for at least 2 years (van Oorsouw et al., 2010) or had not attended CB training within the last 6 months (Rose et al., 2014).
Across other studies, participants had limited experience with CB training (not specified) (McGill et al., 2007) or no prior training (Branch et al., 2018). Jarmolowicz et al. (2008) compared experienced versus inexperienced participants. Experienced participants had at least 1 year's experience of graduate study in Applied Behaviour Analysis (ABA), whereas inexperienced participants had no formal training.

Format of training
Format was not directly reported for five studies (27.8%) but was extracted from the procedure description. Of the studies which took measures of participant knowledge (n = 13, 72.2%), 76.9% (n = 10) had a group format, Campbell and Hogg (2008) used an individual format, as it was a distance-learning course, and McGill et al. (2007) used a combined group and individual format. Fifty per cent (n = 4) of implementation studies used a group format, 25% (n = 2) used an individual format and 25% (n = 2) used a combined format.

Duration of training
Studies which measured knowledge. Training duration varied across studies which solely focused on knowledge, from 2-hour workshops (Hardesty et al., 2014), 1-day workshops (Berryman et al., 1994; Dowey et al., 2007; Rose et al., 2014), 3-day workshops (Tierney et al., 2007), 4-hour workshops with practice across 4 weeks and support once a week (Branch et al., 2018), and 24.5 hours across 7 workshops (van Oorsouw et al., 2010), to 57 days of workshops spread across 2 years (McGill et al., 2007). Other formats were distance learning, with each module to be completed within a 3-month period (Campbell and Hogg, 2008), and 80 hours of direct teaching across 10 consecutive days (32 h home-based study leave, 40 h work-based study time, and participants were expected to contribute 28 h of their personal time) (Lowe et al., 2007).

Studies which measured both knowledge and implementation.
Duration also varied across studies which assessed both implementation and knowledge (n = 3, 16.7%). Macurik et al. (2008) found
Studies which measured implementation. Of the implementation studies, three studies (37.5%) did not report duration of training but provided information on the criterion required to progress. Of the studies which did report this variable, duration of training ranged from 10 full days of training and coaching across 9 months (McClean and Grey, 2012) to 13 days, including a longitudinal practicum spread over a 9-month period (Crates and Spicer, 2012).
Studies which measured knowledge. Five studies (38.5%) used description as the sole method of training (in conjunction with small and group activities). Three studies (23.1%) utilized both description and role-play (Dowey et al., 2007; Tierney et al., 2007; van Oorsouw et al., 2010). One study (7.7%) used description and fluency training (Branch et al., 2018), and Hardesty et al. (2014) utilized description and feedback, comparing individual feedback to group feedback through the use of response cards.
Studies which measured implementation. Of the eight studies that measured implementation, 75% (n = 6) utilized feedback and 50% (n = 4) used modelling and role-play. Two studies (25%) utilized the training package of description, modelling, and feedback (Crates and Spicer, 2012; Shore et al., 1995). Shore et al. (1995)

Outcomes and certainty of evidence
Studies which measured knowledge. Six out of 13 studies (46.2%) that measured knowledge (three standardized measures and four unstandardized) had positive outcomes (Berryman et al., 1994; Branch et al., 2018; Dowey et al., 2007; Hardesty et al., 2014; Rose et al., 2014; van Oorsouw et al., 2010). Five studies (38.5%) had mixed outcomes, with two standardized measures and two with both standardized and unstandardized (Campbell and Hogg, 2008; Gormley et al., 2019a; Lowe et al., 2007; MacDonald et al., 2018; McGill et al., 2007). Tierney et al. (2007) had negative outcomes (standardized measure). The results of Macurik et al. (2008) were not interpretable. Of the studies that had positive outcomes, Berryman et al. (1994) had the strongest certainty of evidence: preponderant, due to not including a measure of treatment fidelity. This study utilized the sole intervention component of description. All other studies that had positive outcomes had suggestive levels of certainty. Of the studies that had mixed outcomes, Gormley et al. (2019a) had the strongest certainty of evidence: preponderant, due to no baseline data being measured for skills. However, a positive outcome was identified for knowledge from a training package of description, modelling, role-play and feedback (Gormley et al., 2019a). All other studies that had mixed outcomes had suggestive levels of certainty. For MacDonald et al. (2018), an increase in knowledge on the specific PBS test, but not the CHABA, was identified for managers. However, there were no significant increases for frontline staff on either measure. Tierney et al. (2007), which had negative outcomes, had a suggestive level of certainty and combined description and role-play. MacDonald et al. (2018) was the only knowledge study which took client measures, with mixed outcomes.
Studies which measured implementation. Three out of eight studies that took measures on implementation had positive outcomes (Courtemanche et al., 2014; Jarmolowicz et al., 2008; Shore et al., 1995). Five studies (62.5%) were not interpretable (Crates and Spicer, 2012; Gormley et al., 2019a; MacDonald et al., 2018; Macurik et al., 2008; McClean and Grey, 2012). Jarmolowicz et al. (2008) and Shore et al. (1995) had the next strongest level of certainty: preponderant, due to treatment fidelity not being reported. Jarmolowicz et al. (2008) utilized description and role-play. They compared a technical language description (negative outcomes), which they found to be less effective, with a non-technical description (positive outcomes). Shore et al. (1995) evaluated a training package consisting of description, modelling, and feedback. Description and video modelling were compared to description, video modelling (carried over from the previous condition) and in vivo feedback. Of the studies that had mixed outcomes, Gormley et al. (2019a), which utilized behavioural skills training, was preponderant due to no baseline data being measured for skills. Five out of eight studies which evaluated participant implementation took measures on client outcomes. Two of these studies (Crates and Spicer, 2012; McClean and Grey, 2012) had positive outcomes for clients. Courtemanche et al. (2014), MacDonald et al. (2018), and Shore et al. (1995) had mixed outcomes for clients.

Maintenance and generalization
Eleven studies (61.1%; nine knowledge, two implementation) took measures of maintenance, with follow-up ranging from 2 weeks (Branch et al., 2018; Hardesty et al., 2014) to 1 year (Lowe et al., 2007). Berryman et al. (1994) found effects to be maintained at 9-month follow-up, and Rose et al. (2014) and van Oorsouw et al. (2010) at 2 months. Branch et al. (2018) and Hardesty et al. (2014) found effects to be maintained for the fluency group and response card groups at 2 weeks. Lowe et al. (2007) found that at 1-year follow-up, CHABA scores had generally returned to baseline levels, but for the sub-group which completed the knowledge questionnaire there was an increase in knowledge score at follow-up. Campbell and Hogg (2008) found no statistical difference between groups at the 3-month post-test. There were no measures of generalization taken for knowledge. Generalization of skills learned in training to working with the client was measured in 50% (n = 4) of implementation studies.

Social validity
Three knowledge studies (23.1%) measured social validity of the training delivered (Hardesty et al., 2014; Lowe et al., 2007; van Oorsouw et al., 2010). Participants in the response card group in Hardesty et al. (2014) and participants in Lowe et al. (2007) rated training positively. Participants in van Oorsouw et al. (2010) rated the training from acceptable to good. Four implementation studies (50%) measured social validity of the training carried out (Courtemanche et al., 2014; Crates and Spicer, 2012; Gormley et al., 2019a; Macurik et al., 2008), with all trainings being rated positively. Jarmolowicz et al. (2008) and McClean and Grey (2012) measured social validity of the procedures trained rather than of the training method.

Discussion
The current review aimed to evaluate the most effective staff training methods used to increase knowledge and implementation of BSPs among staff supporting individuals with intellectual and developmental disabilities in residential or day settings. In addition, the review aimed to identify the components responsible for training efficacy. A total of 18 studies were included in the review, 13 of which measured knowledge and 8 measured implementation, while 3 articles measured both dependent variables. Descriptive analysis of the included studies led to the following conclusions.
Description, despite being the most commonly used training component, did not consistently increase staff knowledge of PBS when used in isolation. Of the two studies that had positive outcomes (Berryman et al., 1994; Rose et al., 2014), certainty of evidence was at the preponderant and suggestive levels, thus prohibiting support for the use of description in isolation to increase staff knowledge of PBS. Description when used in combination with other strategies resulted in acquisition of knowledge, in particular when description was combined with fluency training or group feedback. However, description combined with role-play was found to have mixed results (effective within two studies but ineffective in another). Descriptions from two of the three studies outlined that role-plays were used to demonstrate specified physical intervention skills (van Oorsouw et al., 2010) and personal safety techniques (Tierney et al., 2007). Therefore, it is not clear if the role-plays targeted knowledge specifically. Gormley et al. (2019a) was found to have positive outcomes for knowledge following the use of a training package consisting of description, rehearsal and feedback. Small and group activities were often used in conjunction with description, but sufficient detail was not provided; therefore, it was not possible to determine whether modelling, role-play, or other strategies were incorporated within these.
Description, modelling, feedback and role-play were the most commonly used training components, in different combinations, across all eight implementation studies. Description was effective at increasing implementation when used in combination with: (i) feedback; (ii) role-play; and (iii) modelling, role-play, feedback, monetary incentive and escape contingency (Courtemanche et al., 2014). Shore et al. (1995) found that description and video modelling alone were not successful, requiring the addition of in vivo feedback. Current literature evaluating component analysis of behavioural skills training (BST) has utilized single-subject research design methods and is not specific to training frontline staff to implement BSPs (Davis, Thomson and Connolly, 2019).
As well as variance in the combination of components included in the training packages, there were differences across the type of components. For example, modelling included in vivo modelling with staff members (Courtemanche et al., 2014) and video modelling which included clients (Gormley et al., 2019a; Shore et al., 1995). Similarly, the types and quantity of feedback given varied, including written, verbal, in vivo, delayed, and immediate. Within the current review, in vivo modelling was effective in a combination package, while video modelling when used in isolation with description was not effective. These findings further emphasize the need for component analysis of training strategies in order to identify the most effective and efficient components.
One factor which may warrant further consideration during training is the language used. Jarmolowicz et al. (2008) reported that the use of non-technical description and role-play resulted in greater staff implementation compared to the use of a technical description and role-play. As the majority of training is designed for frontline staff, the use of non-technical language should be considered when preparing training materials as well as when developing BSPs. The use of a monetary incentive was found to be an effective component within a training package in one study. However, the use of this component within a training package requires careful consideration, as it may not be feasible for organizations to support this component in the long term.
Of the studies which examined staff knowledge, client outcomes were not measured. Therefore, the impact of change in knowledge upon the individuals supported is not known. In contrast, five of the eight studies which measured implementation reported client outcomes. All five studies reported a reduction in CB (Courtemanche et al., 2014; Crates and Spicer, 2012; MacDonald et al., 2018; McClean and Grey, 2012; Shore et al., 1995), and one study measured quality of life (QOL; MacDonald et al., 2018), with no significant change reported. Therefore, further research is needed to investigate the impact of BSPs on the quality of life of the individual supported (MacDonald and McGill, 2013). An observation from these findings is that it may be challenging to assess QOL following briefer workshops, whereas this may be more feasible in longitudinal trainings such as in McClean and Grey (2012). Alternatively, it may be worth gathering data on whether the clients acquired adaptive behaviours as a result of the staff training, to assess if the training was having a direct impact on socially significant behaviours for the clients.
The current review builds on MacDonald and McGill (2013) by including a thorough descriptive analysis of training components and related outcomes. Similar to the findings of their review, there was variation in the format, duration, content and measures used across the studies reported, which must be taken into account when interpreting findings. Within the current review, seven studies included frontline staff managers in the training; for five of these, there was no differentiation between supervisor outcomes and frontline staff outcomes. Shore et al. (1995) trained the managers to train the frontline staff, and in the MacDonald et al. (2018) study, the managers were the primary participants and oversaw the implementation of the BSPs after receiving training. Shore et al. (1995) demonstrated that the managers were effective in training the frontline staff to implement the BSPs. These findings also indicate that key personnel (in this instance managers) could be successfully trained (Shore et al., 1995). This is a consideration for organizations in terms of cost effectiveness. MacDonald et al. (2018) utilized managers to observe and provide feedback to staff on their implementation; however, as data were not taken on the frontline staff's implementation, it is not possible to interpret its impact. Nevertheless, having managers on-site who are skilled in this way may have the further benefit of providing ongoing feedback and monitoring. Therefore, further research should evaluate pyramidal training.
To enhance the current knowledge and practice base, it is essential to assess and identify active components of current training models. Future research should also consider the cost effectiveness of brief skills-based training (e.g. Gormley et al., 2019a) versus longitudinal training (McClean and Grey, 2012) with respect to best outcomes for acquiring skills, knowledge, client QOL, generalization, maintenance, social validity and what is ecologically valid for organizations. However, prior to evaluations of person-focused, brief and pyramidal training, which would impact on the development of overall training approaches, it is crucial to identify the active training components (e.g. feedback) necessary for frontline staff to acquire implementation skills, which could then be incorporated within these training models.
Within the current review, a wide variety of measures (both standardized and unstandardized) were used to measure change in participants' knowledge, attributions, behavioural explanations and implementation of specific behavioural interventions. Standardized measures had mixed outcomes, whereas unstandardized measures all had positive outcomes. Only two studies used both standardized and unstandardized measures. In order to evaluate the full impact of training, it may be important to include both standardized and training-specific measures to investigate changes in both attributions/understanding of CB and knowledge of taught procedures. It also raises the question of whether there is a need for refresher courses on the theory around functions of behaviour, whereas knowledge of interventions that are being implemented on a daily basis may be more likely to be maintained.
Across all studies (both implementation and knowledge), there was a lack of information provided on participants' previous experience. The information provided focused on how long participants had been working with individuals with intellectual and developmental disabilities or in the particular setting, but did not specifically relate to PBS training or experience.
In terms of study designs, it was not possible to interpret the findings with respect to implementation for larger scale group designs (Gormley et al., 2019a; MacDonald et al., 2018; Macurik et al., 2008; McGill and Murphy, 2018), as no baseline measures were taken. These studies incorporated training staff on the principles of PBS/ABA and on specific elements of BSPs. The challenges around obtaining baseline measures of implementation in large group designs, in residential or day-care settings, must be acknowledged due to reactivity, time constraints and the resources required. Simulated role-play baseline assessments, which are more often used in educational settings (Shapiro and Kazemi, 2017), may be a viable alternative when combined with an on-site observation for a select intervention. If the most commonly recommended behavioural interventions of BSPs are identified, they could be used as the target interventions for baseline assessment and incorporated as the training content. These interventions are likely to be implemented more frequently and so to give an accurate representation of baseline skills, which can then be assessed post training. This may also serve to enhance the social validity of training for staff. Where BSPs are already developed, specific examples from active BSPs may further enhance social validity.
Implementation studies which measured generalization focused on generalization from the training environment to working with the client. A consideration for Jarmolowicz et al. (2008) was that staff's performance was assessed with a confederate (not a client), which highlights the need to determine whether those skills would have generalized to working with the individuals the staff support. Across studies, maintenance was assessed more frequently in knowledge-based studies. However, it is recommended that further evaluations of knowledge include longer term follow-up, such as 9 months (Berryman et al., 1994) and 1 year (Lowe et al., 2007). In addition, future research could examine maintenance across attributions about CB versus acquired knowledge of PBS and specific behavioural interventions. Finally, within the current review, 12 studies had suggestive certainty of evidence, which indicates a lack of fidelity measures taken. Therefore, the quality of the training provided to the staff is not clear.
Future research in the area of staff training could be enhanced by specifying participants' previous experience, as, to date, the impact of previous PBS training, knowledge and experience is unknown. In order to control for previous experience as an extraneous variable, it is necessary for studies to account for this factor. Similarly, clear descriptions of 'additional activities' are necessary to enhance rigor, as is the use of experimental designs to evaluate the effects of staff training on knowledge and implementation of BSPs. While there were many strengths to this review, limiting the participants to staff who provide care for individuals with intellectual and developmental disabilities in residential and day settings may have resulted in missed studies examining BSP implementation in school settings. However, other reviews have included these settings.
This review further supports Brady et al.'s (2019) findings on the need to evaluate different components of current training models, and builds on them by specifically focusing on BSPs and residential/day settings. It also extends the work of Brady et al. (2019) by highlighting in detail the methodological factors necessary to achieve high quality research that would enable evidence-based practice to be identified.
In conclusion, to teach knowledge of PBS to staff, description alone is not a consistently effective training component and should be used in combination with other training strategies. In intellectual and developmental disabilities services (residential and day settings), the primary goal is the implementation of BSPs, which is supported by staff knowledge of both CB and the behavioural interventions. Targeting both of these areas of knowledge is important. For staff to acquire the necessary skills to consistently implement the BSPs, description combined with feedback or role-play are effective packages, as are more complex training packages that include more than four components. However, direct evaluation using rigorous experimental designs should be conducted to determine which of these components are efficacious, necessary, efficient, feasible and cost-effective, in order to identify evidence-based practice when training staff to implement BSPs.