Community sanctions in youth justice compared to other youth crime responses: A meta-analysis

This meta-analysis examines the official recidivism effects of two types of community sanctions in youth justice, namely community service and behavioural intervention programmes. Two analyses were conducted: one comparing the effects of community sanctions with those of custodial sanctions, and one comparing the effects of community sanctions with those of dismissals. Following a systematic literature search, data extraction and analysis, mean effect sizes were calculated using the (log) odds ratio as the main effect measure. To explore heterogeneity, a meta-regression was conducted with four moderator variables: methodological rigour, referral stage, main focus of sanction and sample risk level. The hypotheses were that recidivism would be significantly lower for delinquent youth subject to community sanctions compared with those subject to custodial sanctions, but that differences in recidivism between delinquent youth subject to community sanctions versus dismissals would be non-significant. In total, 23 studies were deemed eligible for inclusion (Ncust = 7, Ndism = 16). Final results were in favour of the hypotheses, namely, significantly lower recidivism rates for community sanctions compared with custodial sanctions, and no significant differences for community sanctions compared with dismissals. For both comparisons, the 95% confidence interval indicated that the effects ranged from close to zero to substantially in favour of community sanctions. Finally, moderator analysis revealed that studies of lower methodological quality and mixed referral stages were more likely to report larger effect sizes.


Introduction
In various (Western) criminal justice systems, responses to youth crime are differentiated from criminal justice system responses to adult crime (Dünkel et al., 2010; Leenknecht et al., 2020; Winterdyk, 2002). As stated by article 40(1) of the United Nations Convention on the Rights of the Child, member countries must treat individuals under the age of 18 who come in contact with the criminal justice system in a manner that justly considers their age, reintegration and future participation in society (UN General Assembly, 1989). As a result, youth justice proceedings have different punishment aims and sentencing goals than adult justice proceedings. In particular, this applies to punishment involving custody (such as prison sentences and other incarceration measures), because a significant body of research suggests custodial sanctions can be more harmful than beneficial, especially amongst youth (Fagan and Kupchik, 2011; Harrison et al., 2020; Sampson and Laub, 1993).
In contrast, researchers, lawyers and practitioners believe that youth crime responses delivered in non-custodial settings can eliminate the harmful effects of custodial sanctions while still allowing for rehabilitation, addressing the root causes of criminal behaviour, and performing better at preventing future delinquency (Cullen and Gendreau, 2001). Especially during the 1970s and 1980s, this notion led to the development of alternatives to custodial punishment (Carter and Klein, 1976; Junger-Tas, 1994; Mulligan, 2009; Newton, 1981; Pratt, 1986; Walgrave, 1998).
Perhaps the most commonly imposed punishment alternative in youth justice at present is the 'community sanction', a mode of punishment that may contain work, learning, therapeutic and/or supervision components, intended as a viable alternative to custodial sanctions that can be performed in the community instead of in custodial settings (Aebi et al., 2014; Beijerse uit, 2019; Winterdyk, 2002). Exact figures on the prevalence of community sanctions are lacking, but in Western countries the majority of youth offending is dealt with by means of non-custodial sanctions, in which community sanctions play an important role. For example, in 2018 in the United States (US), 28 percent of adjudicated youth received a custodial sanction, while the remaining 72 percent received another type of sanction or measure not involving detention (Hockenberry and Puzzanchera, 2020). Other Western countries such as Canada (87 percent) and the Netherlands (90 percent) show even higher rates of youth with a community sanction or measure (Miladinovic, 2019; Weijters et al., 2019) and, overall, European conviction statistics indicate that 55 percent of adjudicated youth between 2007 and 2011 received a community sanction (Aebi et al., 2014).
What constitutes a community sanction varies widely between countries and jurisdictions, given that it may include community service and behavioural intervention programmes, but also out-of-court diversion, fines, (intensive) supervision orders and probation (Beijerse uit, 2019; McNeill and Robinson, 2013). In this article, the focus is on the first two types of community sanctions, namely community service and behavioural intervention programmes. In essence, these sanctions are different from traditional sanctions such as custody and fines because they do not mainly focus on retribution but predominantly emphasize other sanction goals, such as rehabilitation and reconciliation. They are community sanctions because the sanctioned juveniles are not held in custody but have to be active in their community to repair or to improve their personality and behaviour. Likewise, they constitute community sanctions because they are mandatory, with a clear beginning and end.
Overall, one must note that the evidence base on the effectiveness of community sanctions is relatively thin (McNeill and Robinson, 2013). Previous studies have examined the effects of one broad type of community sanction, namely diversion programmes and measures (in comparison with custodial sanctions) in the adult criminal justice system (Dowden and Andrews, 1999; Illescas et al., 2001; Lipsey and Cullen, 2007; Smith et al., 2002; Villettaz et al., 2015), and the effects of various prevention, intervention and (police) diversion programmes for juvenile offenders (Bouchard and Wong, 2018; Dowden and Andrews, 1999; Krisburg et al., 1995; Lipsey and Wilson, 1999; Wilson et al., 2018). However, research focused on the effectiveness of more formal community sanctions for youth, such as community service and mandatory behavioural intervention programmes, is scarce and a meta-analysis summarizing the existing state of evidence has yet to be conducted. As such, the current evidence base appears incomplete: it neither convincingly demonstrates that community sanctions are significantly more effective responses to youth crime than other reactions, nor offers deeper insight into for whom and why this would be the case. This knowledge gap is remarkable, considering the frequent application of community sanctions as a punishment response to youth crime. Therefore, with this article, we aim to contribute to the much-needed evidence base on community sanctions for youth offenders. In order to make an adequate evaluation, we compare the effects of two main types of community sanctions (community service and behavioural intervention programmes) with two other prominent reactions in the youth criminal justice system, namely custodial sanctions (involving confinement or detention) and dismissals (such as counsel and release practices) (Aebi et al., 2014; Beijerse uit, 2019; Winterdyk, 2002).

Prior research on the effects of community sanctions
Much of the evidence base on effective punishment in criminal justice emerged as a result of the 'what works' movement that arose at the end of the 20th century (Cullen and Gendreau, 2001; McGuire, 1995). In essence, this research direction was a reaction to the 'nothing works' view, which argued that the causes of crime were irresolvably embedded within societal structure (Cullen and Gendreau, 2001; Martinson, 1974). Throughout the mid-1970s, this pessimistic perspective led to significant decreases in support for alternative punishment within the (youth) criminal justice system and increased emphasis on the use of custodial sanctions for (youth) offenders (Fagan and Kupchik, 2011). In contrast, the 'what works' movement called for a renewed focus on the search for effective punishment, as this would be beneficial to the treatment of offenders and, ultimately, beneficial to society as a whole by means of effectively reducing and preventing crime (Cullen and Gendreau, 2001). Likewise, the 'what works' movement stimulated efforts to integrate the results from various empirical studies in the domain of sanctioning in the criminal justice system by means of systematic reviews and meta-analyses, including reviews in the field of youth justice.
Although there are no systematic reviews on the effects of community sanctions overall in comparison with other youth justice reactions, several meta-analyses have been published on the effects of 'alternative treatment' for youth offenders, which are relevant to the current study. Generally, one can divide these into two types: meta-analyses that summarize the effects of different (diversion) programmes and interventions, versus meta-analyses that review the effects of differential justice system processing.
Reviews on programmes and interventions for youth typically analyse differences in recidivism amongst offenders subject to one type of intervention, compared with offenders subject to another intervention with different characteristics or focuses. Krisburg et al. (1995) reviewed the effectiveness of graduated sanction programmes for youth and found that behavioural skill and multi-model programmes were more effective than other programmes. Dowden and Andrews (1999) examined youth justice interventions by making comparisons between interventions with and without 'risk, needs and responsivity' principles. This provided support for the assumption that targeting criminogenic needs (as opposed to non-criminogenic needs), when tailored to the individual risk and response levels of offenders, is essential in sanction administration. Other reviews found larger reductions in subsequent recidivism for family-based interventions and programmes with family-focused components (Latimer, 2001), therapy-oriented programmes and interventions (such as counselling) (Lipsey, 2009), and programmes with restorative justice (RJ) components (Wong et al., 2016).
Meta-analyses and systematic reviews in the domain of differential criminal justice system processing typically examine differences in recidivism between youth offenders subject to minimal justice system contact, versus youth who are processed further into the criminal justice system. Schwalbe et al.'s (2012) meta-analysis of different types of diversion programmes for youth reported no significant differences in recidivism amongst youth subject to diversion versus youth subject to traditional criminal justice processing, except for family-treatment programme types.
Yet, ethnographic studies in youth justice have emphasized how any diversion from criminal justice system processing could significantly contribute to decreasing the future odds of custody and its harmful effects as mentioned before (Fagan and Kupchik, 2011; Harrison et al., 2020; Travers, 2012). Additionally, diversion could (temporarily) protect delinquent youth from enrolment in (and the possible harmful effects of) a system of continuous community supervision as a coercive measure of control (Paik, 2011). Indeed, Wilson and Hoge's meta-analysis (2013), covering a greater number of studies, did find differences in recidivism amongst youth subject to diversion, versus youth subject to traditional criminal justice system processing. The authors found that the odds of recidivism were lower for youth subject to diversion compared with youth subject to more traditional forms of criminal justice system processing. Likewise, a recent systematic review (Wilson et al., 2018) on the effects of police-led diversion programmes compared with traditional criminal justice system processing also found results in favour of diversion rather than more traditional forms of criminal justice system processing. It was found that police-led diversion was especially beneficial for low-risk youth with no prior system contact (Wilson et al., 2018).
In sum, a growing body of research exists suggesting that programmes tailored to offenders' needs and risk levels and focused on differential criminal justice system processing are related to a decreased risk of recidivism. This is in line with longitudinal studies suggesting that the key to effective sanctioning lies in maximum diversion. For example, in their analysis of the impact of the Scottish youth justice system, McAra and McVie (2007) found that recidivism rates were highest for youth drawn furthest into the criminal justice system, even when controlling for various risk factors. These findings are in line with various ethnographic studies that have demonstrated how diversion can contribute to decreasing the future odds of custody and its harmful effects, and protect delinquent youth from a system of continuous coercive supervision (Fagan and Kupchik, 2011; Harrison et al., 2020; Paik, 2011; Travers, 2012).
Nonetheless, the above-mentioned meta-analyses and reviews did not compare the effects of community sanctions with custodial sanctions for youth. Additionally, there are no meta-analyses or systematic reviews that have compared the effects of community sanctions with diversion (such as dismissals). Further noteworthy is that the majority of existing reviews highlight significant limitations regarding the methodological quality of the available studies, and some study findings appear to be contradicting one another. In short, it remains unclear which punishment option (including community sanctions) is more effective, for which offenders and in which circumstances.

Current study
The current study seeks to examine what is known about the effects of community sanctions as a type of punishment response administered to delinquent youth who come in contact with the criminal justice system. Moreover, we wish to analyse whether there are differences in positive versus negative effect findings depending on the type of comparison group, the methodological quality of studies, participant risk levels, intervention type and referral stage. In order to objectively collect, measure and interpret existing evidence on this topic (Kugley et al., 2017), we conduct a meta-analysis to calculate mean effect sizes of the recidivism rates for youth subject to community sanctions versus custodial sanctions (examination A), and youth subject to community sanctions versus dismissals without any additional treatment (examination B). Additionally, we conduct moderator analysis to investigate which of the above-mentioned variables play a role in moderating the effect size.
In the present study, community sanctions are defined as sanctions, measures, or modes of punishment that contain work, learning, therapeutic and/or supervision components, intended as a viable alternative to custodial sanctions that can be performed in the community instead of custodial settings. As mentioned before, we focus on two types of community sanctions, namely: community service and behavioural intervention programmes. Further, custodial sanctions are defined as sanctions that predominantly consist of detention, whether referred to as incarceration, confinement, custody or institutionalization. Dismissals are defined as any justice system response that does not involve the administration of a sanction, treatment or intervention, but concerns the (immediate) release of the suspect to the community (if applicable) or unconditional case dispositions. Lastly, effectiveness is defined as the extent to which a punishment response to youth crime encourages desistance from crime, as measured by registered recidivism.
Given previous study findings on the adverse effects of recidivism and the harmful nature of detention, we expect that the odds of recidivism are lower for youth subject to community sanctions compared with youth subject to custodial sanctions. Therefore, our hypothesis for examination A is that the recidivism rates of delinquent youth subject to a community sanction will be significantly lower than the recidivism rates of delinquent youth subject to a custodial sanction. However, because prior meta-analyses on different forms of diversion, including dismissals, have produced mixed results (some finding no differences in recidivism and some finding lower recidivism rates for diversion), we expect a failure to detect differences in recidivism rates for youth subject to community sanctions when compared with youth subject to dismissals. Thus, our hypothesis for examination B is that there is no significant difference between the recidivism rates of delinquent youth subject to a community sanction versus those subject to dismissals.

Systematic literature search
As a first step in the current study, we conducted a systematic literature search following the Campbell Collaboration guidelines for systematic reviews (Kugley et al., 2017). A set of key search terms, databases and other (grey) sources of information was established prior to the search and contained the following four elements: (1) population type (youth), (2) population characteristic (delinquents), (3) sanction type (community), and (4) outcome (effect on official recidivism). Subsequently, a search was formulated utilizing these four elements in conjunction with Boolean logic and truncation techniques.
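To illustrate how four search elements can be combined with Boolean logic and truncation, the following is a minimal sketch in Python. The search terms themselves are hypothetical placeholders and are not the terms used in this study; the `*` truncation symbol and the quoting of phrases are also assumptions, since the exact syntax differs between databases.

```python
# Hypothetical search terms for the four elements; NOT the study's actual terms.
concepts = {
    "population": ["youth*", "juvenile*", "adolescen*"],
    "characteristic": ["delinquen*", "offend*"],
    "sanction": ["community service", "community sanction*", "diversion"],
    "outcome": ["recidiv*", "reoffen*", "rearrest*"],
}

def or_group(terms):
    """Join synonyms with OR; wrap multi-word phrases in double quotes."""
    quoted = [f'"{t}"' if " " in t else t for t in terms]
    return "(" + " OR ".join(quoted) + ")"

# Synonyms within an element are combined with OR; the elements with AND.
query = " AND ".join(or_group(terms) for terms in concepts.values())
print(query)
```

Grouping synonyms with OR inside each element and joining the elements with AND keeps the query broad within a concept but requires every concept to be present in a record.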
The language of studies was limited to the authors' competencies, namely studies published in either English or Dutch, and the publication date ranged from no start date up to studies published in 2019. A minimum start date was not determined because we expected a relatively low number of studies that would meet our inclusion criteria. An extensive list of databases to be searched was constructed, including strategies for gathering grey and unpublished literature and conducting cross-referencing (as per recommendations in Kugley et al., 2017).
Next, we developed a strict set of inclusion criteria: (1) the primary target in each study must be a youth in the sense that they had to be tried (or dismissed) under youth justice provisions (9-23), (2) the study must report on at least one type of delinquent behaviour, (3) the study must have a (quasi-)experimental design with a minimum of two comparison groups, (4) one comparison group must be composed of youth who received either a custodial sanction or dismissal, (5) the delivery of the community sanction must be outside an institutional setting, (6) the study must report recidivism rates (in some form) for both groups, and (7) the study must report enough information to be able to compute an effect size.
Studies were automatically excluded if they failed to meet any one of the above-mentioned criteria. Specific attention was paid to the application of criterion 4. When considering dismissal studies, we paid particular attention to ensure the control group genuinely constituted a 'no treatment' group, without any intervention or referral to services. Upon conducting the database search with the established set of key search terms, the total number of search results retrieved per database was recorded (see Figure 1). Subsequently, we reviewed the titles and abstracts of each search result; studies that clearly did not fit the topic of the present meta-analysis were not exported. All other studies that (at first glance) appeared to fit the current topic were exported to a Mendeley database. Any doubt in relation to the eligibility of a study resulted in a second review at a later date with two other researchers as to its (non)eligibility for the current study, upon which all qualified studies were reviewed once again to double-check their eligibility.
At this stage, we determined additional restrictions that needed to be placed upon the type of sanctions examined in the experimental groups of eligible studies. Given that the sanctions had to fit the definition of community sanctions as defined in the current study, it was concluded that bootcamp studies should be excluded, because they may contain an element of confinement or detention. Studies on probation and intensive supervision programmes administered in isolation (without treatment, intervention or referral to either, but solely with a supervision order) were also excluded, because they are neither clearly community sanctions nor dismissals (although they are frequently administered as such) as defined in the current study.

Data extraction
As a second step in the current meta-analysis, we extracted data from the eligible studies. According to guidelines provided by the Cochrane Handbook for Systematic Reviews of Interventions (CHSRI), a data collection form was drafted prior to commencing the data extraction and data coding process (Higgins et al., 2019). The data collection form included all relevant characteristics that had to be extracted from the studies deemed eligible for inclusion, such as (A) methodological characteristics, (B) participant characteristics, (C) sanction (and implementation) characteristics, (D) study findings and (E) study conclusions. In later stages, these data were used to investigate effect sizes for analysis and to construct a limited number of moderator variables: (1) methodological rigour, (2) referral stage, (3) main focus of sanction, and (4) sample risk level. Methodological rigour was determined based on the group assignment procedures used in each study. Experimental studies with random assignment received the highest score for methodological quality (3), quasi-experimental studies with matched comparisons received a middle-range score (2), and weakly/unmatched comparisons received the lowest score (1). The average risk level of participants, low or medium/high, was determined based on the study sample's proportion with prior system contact and the seriousness of the delinquent behaviour engaged in (if reported).
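The three-level rigour score described above can be expressed as a small coding rule. The sketch below is illustrative only: the names `Rigour` and `code_rigour` and the two boolean inputs are hypothetical, and the actual coding was performed with a data collection form rather than with code.

```python
from enum import IntEnum

class Rigour(IntEnum):
    """Methodological rigour score, following the 1-3 scale in the text."""
    WEAK_OR_UNMATCHED = 1
    MATCHED_QUASI_EXPERIMENT = 2
    RANDOMIZED_EXPERIMENT = 3

def code_rigour(random_assignment: bool, matched_comparison: bool) -> Rigour:
    """Assign a rigour score from the group assignment procedure of a study."""
    if random_assignment:
        return Rigour.RANDOMIZED_EXPERIMENT
    if matched_comparison:
        return Rigour.MATCHED_QUASI_EXPERIMENT
    return Rigour.WEAK_OR_UNMATCHED

# Hypothetical study: quasi-experimental design with a matched comparison group.
print(int(code_rigour(random_assignment=False, matched_comparison=True)))
```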
Two studies were trial coded to test the completeness of the data collection form. Minor adjustments were made to the form to further streamline the extraction process, upon which the remainder of the eligible studies were coded and the extracted data were recorded in an Excel file. For interrater reliability purposes, error-sensitive variables (such as implementation fidelity) were trial coded by multiple researchers. Subsequently, the assigned codes were compared with each other to resolve any discrepancies and establish a set way of interpretation.

Data analysis
The third step involved data analysis, and preparations for this were made following the CHSRI instructions (Higgins et al., 2019), which included selecting the correct calculation model (fixed effects or random effects), choosing a common effect measure, and establishing heterogeneity. As recommended by Borenstein et al. (2009), the selection of a calculation model was made in consideration of the sampling frame of included studies. The random effects (RE) model was chosen as appropriate for our meta-analytical calculations because (1) the current study assesses a sample comparable to the universe of populations (not a sample from one population), and (2) the studies in the present analysis are based on populations that differ from each other (rather than being identical to one another).
Next, an effect measure had to be chosen to calculate summary statistics for each study, and to subsequently calculate a summary statistic for the combined effects of studies. Given that all studies reported effect sizes constructible in a 2×2 dichotomous table, and given that relative effect measures are typically more consistent than absolute effect measures, the odds ratio (OR) was chosen as the main effect measure. ORs compare the odds of an event occurring in one group with the odds of it occurring in another, and thereby can assist in providing an indication of the strength of association between the type of sanction received and the incidence of recidivism.
For all studies, the 2×2 dichotomous data (number of events in the experimental group versus number of events in the control group) were converted into ORs utilizing the Comprehensive Meta-Analysis (CMA) software program. However, for comprehension purposes, we reported the results in this article in log odds ratio (LOR), where negative LORs are indicative of lower odds of recidivism (and, thus, in favour of the experimental group) and positive LORs are indicative of higher odds of recidivism (and, thus, an adverse effect). As to the calculations of LORs, the generation of forest plots and summary effect sizes, the CMA software program was used.
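The conversion from a 2×2 table to a log odds ratio can be sketched as follows. This is a minimal illustration using the standard textbook formulas, not a reproduction of the CMA software; the function name, the example counts and the 0.5 continuity correction for zero cells are assumptions made for the example.

```python
import math

def log_odds_ratio(events_exp, n_exp, events_ctl, n_ctl, correction=0.5):
    """Return (LOR, SE) for a 2x2 table of recidivism events.

    The continuity correction (0.5, an assumption) is added to every cell
    only when at least one cell is zero.
    """
    a = events_exp            # recidivists, community sanction group
    b = n_exp - events_exp    # non-recidivists, community sanction group
    c = events_ctl            # recidivists, comparison group
    d = n_ctl - events_ctl    # non-recidivists, comparison group
    if min(a, b, c, d) == 0:
        a, b, c, d = (x + correction for x in (a, b, c, d))
    lor = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return lor, se

# Hypothetical study: 30 of 100 youths reoffend after a community sanction,
# 40 of 100 after the comparison response.
lor, se = log_odds_ratio(30, 100, 40, 100)
print(f"LOR = {lor:.3f}, SE = {se:.3f}")
```

In this hypothetical example the LOR is negative, which under the convention used here indicates lower odds of recidivism in the community sanction group.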
In the CMA software program, study weights are typically assigned based on the inverse of within-study error variance, where studies with more variance are assigned less weight than studies with lower variance (Borenstein et al., 2009). However, within the RE model, study weights were based on two sources of total error variance (within-study and between-study), meaning effect sizes were weighted by the inverse of the sum of both variances, $1/(V + T^2)$, where $V$ is the within-study variance and $T^2$ the between-study variance. Upon the calculation of summary effect sizes, studies were tested for publication bias by means of a triangulated procedure, namely a visual examination of funnel plots, Begg's rank correlation test for publication bias (Begg's test) and Egger's regression intercept test (Egger's test) (Begg and Mazumdar, 1994; Egger et al., 1997). Furthermore, studies were tested for the presence of heterogeneity. As highlighted by Higgins et al. (2019), heterogeneity refers to variation in either methodological or statistical measures amongst studies and must be examined in meta-analysis. As a result, the presence of heterogeneity was investigated utilizing the Cochran's Q test, and the $I^2$ statistic was used to interpret the heterogeneity results after the calculation of Cochran's Q. Given that the significance of the $I^2$ statistic depends on the extent and the direction of the effect, and the strength of evidence in relation to the presence of heterogeneity (Higgins et al., 2019), the following guidelines of interpretation were used: 0-40 percent = 'might not be important', 30-60 percent = 'may represent moderate heterogeneity', 50-90 percent = 'may represent substantial heterogeneity', 75-100 percent = 'considerable heterogeneity' (Higgins et al., 2019). Because significant heterogeneity was detected, it was warranted to conduct moderator analyses utilizing meta-regression techniques (with the CMA software program). According to Borenstein et al. (2009), meta-regression in meta-analysis is similar to multiple regression analysis in primary studies. However, in the latter case one examines relationships between covariates and the dependent variable of the study, whereas in meta-regression one examines relationships between the presence of covariates in each study and the outcomes (effect size) of each study (Borenstein et al., 2009).
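The random-effects weighting and the heterogeneity statistics described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the procedure implemented in CMA: the text does not name the estimator of the between-study variance, so the DerSimonian-Laird method-of-moments estimator is assumed here, and the input values are hypothetical.

```python
import math

def random_effects_pool(effects, variances):
    """Pool effect sizes with inverse-variance weights 1 / (V + T^2).

    T^2 is estimated with the DerSimonian-Laird method of moments
    (an assumption; the article does not name an estimator).
    """
    k = len(effects)
    w = [1 / v for v in variances]                       # fixed-effect weights
    mean_fe = sum(wi * y for wi, y in zip(w, effects)) / sum(w)
    q = sum(wi * (y - mean_fe) ** 2 for wi, y in zip(w, effects))  # Cochran's Q
    df = k - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)                        # between-study variance
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0  # I^2 in percent
    w_re = [1 / (v + tau2) for v in variances]           # random-effects weights
    mean_re = sum(wi * y for wi, y in zip(w_re, effects)) / sum(w_re)
    se_re = math.sqrt(1 / sum(w_re))
    return {"lor": mean_re, "se": se_re, "q": q, "df": df,
            "tau2": tau2, "i2": i2}

# Hypothetical log odds ratios and within-study variances for four studies.
lors = [-0.60, -0.10, 0.05, -0.35]
variances = [0.04, 0.09, 0.06, 0.12]
result = random_effects_pool(lors, variances)
print({name: round(value, 3) for name, value in result.items()})
```

When the estimated between-study variance is zero, the random-effects weights reduce to the fixed-effect weights, so the two models give the same pooled estimate.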

General search
The search of all relevant databases resulted in 2939 documents, including cross-reference and grey literature sources. Consequently, these studies were scanned and assessed for eligibility in light of the pre-specified inclusion/exclusion criteria. Ultimately, 19 studies with 23 effect sizes were deemed eligible for inclusion, of which seven qualified as community versus custodial sanction studies, and 16 qualified as community versus dismissal studies. Figure 1 provides a visual display of the selection process, as described in the Methodology section. Table 1 provides an overview of the characteristics of all included studies, including sanction characteristics, outcome measurement and participant characteristics. Most studies were conducted in the US. Remarkably, the majority of studies are journal articles published prior to the 21st century, regardless of whether they belonged to the custodial sanction or dismissal category. Overall, most studies had sample sizes ranging from 100 to 500 participants, and the total number of participants in all studies together amounted to 8997, comprising 1280 participants for the group of custodial sanction studies and 7717 participants for the group of dismissal studies. Most studies with custodial sanction as the comparison group (85.7 percent) were rated as having medium methodological quality, whereas most studies with dismissal as the comparison measure (68.8 percent) were rated as having high methodological quality.

Systematic overview of included studies
As indicated in Table 1, the referral stage of sanctions was divided into police, prosecutorial, court or mixed categories. Most studies fell into either the police or the court category. Court referral was most common for comparison with the custodial sanction studies, whereas police referral was most common for comparison with dismissal studies.
The included community sanctions were further divided into four categories, namely community service/restorative focus, cognitive treatment, family systems/social support enhancement, and skills training. Most community sanctions in the dismissal comparison studies were classified as cognitive treatment, whereas studies in the custodial sanction comparison group were most likely to have skills training and enhancement as the main focus of the community sanction in the experimental condition.
The majority of dismissal studies utilized registered data as the outcome measurement source and defined recidivism as any new justice system contact. Custodial sanction studies were almost equally as likely to report utilizing either registered or both registered and self-report data as the outcome measurement source. It was found that most studies targeted medium-/high-risk participants (with prior system contact and/or serious offences): about half of the dismissal studies but 100 percent of the custodial sanction studies.
Finally, when examining the direction and significance of effects in each study, it was found that most studies reported effects in favour of community sanctions, but with a standard error that made them statistically non-significant (at the .05 level). In sum, 16 of the included studies found non-significant differences in effect sizes, six studies found significant effects in the expected direction, and one study found a significant effect in the opposite (adverse) direction. Amongst the group of custodial sanction studies, six out of seven found non-significant effects, and only one study had a significant effect in the expected direction. Out of the 16 dismissal studies, 10 yielded non-significant findings, five yielded significant findings that indicated lower odds of recidivism for community sanctions, and one yielded opposite effects.

Meta-analytic results
Pooled results of the meta-analysis are presented in the following paragraphs and, given that we distinguished between two types of control groups (custodial sanction and dismissal), the results of each examination are presented separately. Figure 2 presents a forest plot with statistics for each of the seven custodial sanction studies, including a pooled effect estimate. LOR represents the summary statistic. LCL and UCL are the lower and upper limits of the 95% confidence interval (CI), and WGHT refers to the weight of each study in terms of its contribution to the total, summary effect size. The point estimate (LOR) of the latter was −0.24, with a standard error (SE) of 0.13. Because we hypothesized the direction of the effect to be in favour of community sanctions, we interpreted a one-tailed p-value. The two-tailed test yielded z = −1.91, with p = .06, translating into a one-tailed p = .03, indicating that the average effect of community sanctions is significantly larger than the effect of custodial sanctions. However, the 95% CI of the effect estimate ranged from −0.49 to 0.01, that is, from a strong, negative effect at the lower limit to an effect slightly above zero at the upper limit. Additionally, the 95% CI differs between the individual studies as indicated by the lower and upper limits, but, in line with the overall confidence interval, most studies have a CI ranging from a strong negative effect of community sanctions on recidivism rates (in comparison with custodial sanctions) to virtually no effect. The Cochran's Q test did not provide evidence for substantial heterogeneity (Q = 5.32, df = 6, p = .503). This result may be due to the low number of studies (N = 7), but even if there were evidence of heterogeneity it would not be feasible to conduct moderator analyses because of the small number of included studies. Figure 2 indicates one significant outlier, namely the study from Van der Laan and Essers (1990).
Upon careful examination of the original report and CMA output, it was determined that no mistakes were made in the data entry process. However, we noted that the self-report data in the original study indicated a negative effect on recidivism (in line with our hypothesis) rather than an adverse effect. The authors also concluded in favour of community sanctions themselves. Yet the current meta-analysis extracted only the registered data results, because most other studies reported their results in terms of registered rather than self-report data, and, in the study of Van der Laan and Essers (1990), the registered data results were opposite to our expectations. Nonetheless, the striking difference with the self-report data suggests that caution is warranted in interpreting the findings from this study. Figure 3 presents a funnel plot of possible publication bias. When examining Figure  3, the funnel plot appears fairly symmetrical, which is not indicative of publication bias. Consequently, Begg's test was conducted to examine publication bias in light of the inverse correlation in publication bias between-study sample size and study effect size, and no evidence of publication bias was found: Kendall's tau (corrected) was 0.09 and p = .38 (Begg and Mazumdar, 1994). In addition, Egger's test results were also generated, given that in some circumstances the Egger's test may be more powerful than Begg's test (Egger et al., 1997). Again, no evidence of publication bias was found, as the intercept (B0) was 3.23, with p = .12. Figure 4 presents a forest plot with statistics for all the dismissal studies, including a pooled effect estimate. The point estimate (LOR) was −0.21, with an SE of 0.11, and a 95% CI ranging from −0.43 to 0.02. As we did not expect to find differences between community sanctions and dismissals in either direction, we conducted a two-tailed significance test for this group of studies. 
The test of the null yielded z = −1.81, with p = .07, just above the significance rate of .05. However, as with the group of custodial sanction studies, the 95% CI of individual studies varied widely. As also demonstrated by the forest plot in Figure 4, effect sizes ranged from strong, negative effects (lower recidivism rates for community sanctions) to slightly positive effects. Overall, it is clear that there are some studies with clearly negative effects (in favour of community sanction), but most studies are grouped near a null effect. Only one study displayed a displayed a positive effect (in favour of dismissal).
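The relationship between a pooled log odds ratio, its standard error, the z-test and the 95% CI can be illustrated with a minimal sketch. It assumes a normal approximation and uses the rounded LOR and SE values reported in the text, so the resulting z- and p-values differ slightly from the reported ones, which were presumably computed from unrounded estimates. The function names are hypothetical and do not reflect the software used in the analysis.

```python
import math

def normal_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def summarize(lor, se):
    """z statistic, p-values and 95% CI for a pooled log odds ratio."""
    z = lor / se
    p_two = 2.0 * (1.0 - normal_cdf(abs(z)))   # two-tailed test
    p_one_lower = normal_cdf(z)                # one-tailed, H1: LOR < 0
    half_width = 1.959964 * se                 # 97.5th normal percentile
    return {
        "z": z,
        "p_two": p_two,
        "p_one_lower": p_one_lower,
        "ci": (lor - half_width, lor + half_width),
    }

# Rounded inputs taken from the text above
custodial = summarize(-0.24, 0.13)
dismissal = summarize(-0.21, 0.11)

for label, res in (("custodial", custodial), ("dismissal", dismissal)):
    print(label, {k: res[k] for k in ("z", "p_two", "p_one_lower")}, res["ci"])
```

With these rounded inputs, the custodial comparison is significant in the one-tailed test but not in the two-tailed test, and the dismissal comparison is not significant in the two-tailed test, which matches the pattern described above.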

Pooled results examination B (dismissals).
Again, visual outliers were examined to ensure the data were entered correctly. The studies from Collingwood and Genthner (1980) and Klein (1986) were both examined in this regard, but no data entry mistakes were detected. Upon closer examination of the study characteristics, it was found that Klein (1986) reported multiple intervals for the length of recidivism measurement. Because most studies in the current meta-analysis reported one-year recidivism rates, we generally extracted the one-year recidivism data. In Klein's (1986) study we extracted the measurement closest in time to one year (the 15-month recidivism data), and this choice may have affected its position on the forest plot. In the study of Collingwood and Genthner (1980), there was a large difference between the size of the experimental sample versus the size of the control sample, which may have affected the results. Figure 5 presents a funnel plot of publication bias test results. Again, the funnel plot appears fairly symmetrical, meaning there is no evidence of publication bias. Begg's test was conducted to examine publication bias by means of a statistical test and, indeed, no evidence of publication bias was found (Kendall's tau = 0.06, with p = .38). Egger's test results provided a similar conclusion: the intercept (B0) was 1.31, with p = .10.

Moderator analysis examination B (dismissals).
Based on the Cochran's Q test for heterogeneity, evidence of heterogeneity was found amongst the dismissal studies (q = 54.23, df = 15, p < .001, I2 = 72.34). Because p is statistically significant, it is unlikely that all study variance is due to sampling error, meaning one should explore heterogeneity. Likewise, the I2 statistic yields a value of 72.34, which may be interpreted as the presence of 'substantial heterogeneity' (Higgins et al., 2019). Therefore, meta-regression techniques were employed to assess whether variation in effect sizes amongst the individual studies might be explained by covariates. In this meta-analysis, possible moderator variables (pre-defined in the protocol stage) were: (1) methodological rigour, (2) referral stage, (3) main focus of sanction, and (4) sample risk level. To create sufficiently large groups for the meta-regression, some of the subcategories reported in Table 1 were combined so that each subcategory contained a sufficient number of studies.
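The I2 value can be recovered from the reported Q statistic and its degrees of freedom. The sketch below assumes the standard definition, I2 = (Q − df) / Q × 100, truncated at zero; the function name is hypothetical.

```python
def i_squared(q, df):
    """I2 in percent: the share of total variation across studies
    attributed to heterogeneity rather than sampling error."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0

# Dismissal studies: Q = 54.23 on 15 degrees of freedom
i2_dismissal = i_squared(54.23, 15)

# Custodial studies: a Q of 5.32 on 6 degrees of freedom is below df,
# so I2 is truncated at zero
i2_custodial = i_squared(5.32, 6)

print(round(i2_dismissal, 2), i2_custodial)
```

For the dismissal studies this reproduces the reported value of 72.34. For the custodial studies, Q falls below its degrees of freedom, so I2 is zero, in line with the absence of evidence for heterogeneity in that group.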
Upon entering these four covariates into a meta-regression model, the test of the (entire) model yielded q = 16.59 with df = 5 and p = .01. Because p < .05, we can reject the null hypothesis that there is no relationship between effect size and any of the covariates in the model. Rather, at least one covariate is likely related to the effect size. The R-squared value for the model was .66, which implies that 66 percent of the initial variance between studies is accounted for by the covariates included in this regression model. Subsequently, the goodness of fit (GoF) test statistics were interpreted. The GoF test examines the completeness of the regression model, namely, whether the true effect size still varies even if the studies are identical on all covariates assessed in the current meta-regression model. The results yielded q = 17.47, with df = 10 and p = .06. Because p > .05, we conclude that there is no longer statistically significant evidence that the effect size varies when studies share the same values on all covariates. In other words, the regression model may be complete, but, given that p is near statistical significance, this conclusion should be interpreted with caution.
Figure 6 presents the meta-regression results of the individual moderator variables. Methodological rigour yielded a statistically significant result with p = .03. The plotted results show that studies with low to medium methodological quality were more likely to yield an effect in favour of community sanctions (LOR = −0.55), whereas studies of high methodological quality were more likely to yield a null effect (LOR = −0.02). Referral stage also yielded a statistically significant result at the .05 level (p = .04). Studies in the police/prosecutorial (LOR = −0.31) and court (LOR = −0.34) categories were likely to yield negative LORs, which is indicative of a positive effect in favour of the experimental group. However, studies in the mixed subgroup (LOR = 0.44) were more likely to yield positive effect sizes, which are indicative of adverse treatment effects. Yet, caution is advised when interpreting these results considering the low number of studies in this particular subgroup (N = 3). The moderator variables 'predominant focus' (p = .87) and 'sample risk level' (p = .58) yielded non-significant results. Thus, for these two variables there is no evidence to suggest that belonging to one subgroup or another would yield significant differences in effect sizes.

Discussion and conclusion
The current meta-analysis included 23 effect sizes from studies in which the experimental group received a community sanction and the control group received either a custodial sanction or a dismissal (without any additional treatment). The pooled effect estimate of the comparison between community sanctions and custodial sanctions was negative (LOR = −0.24), which translates into an OR below 1 (OR = 0.78) and indicates a negative association. This suggests that the odds of recidivism are smaller amongst youth who received a community sanction compared with youth who received a custodial sanction. Assuming a 50 percent base rate for recidivism, an OR of 0.78 implies that the recidivism rate for youth subject to a community sanction would be 44 percent, compared with 50 percent for youth subject to custodial sanctions.
Furthermore, the pooled effect estimate is significantly lower than zero (p = .03), which is in line with our hypothesis that community sanctions would yield lower odds of recidivism than custodial sanctions. However, a more precise estimate of what one can state with relative certainty about the effect is offered by the 95% CI of the pooled effect estimate, which ranged from a substantial lower limit of −0.49 to slightly above zero (0.01). This implies that the difference in effect between the two groups could be none, but it may also be substantially positive. These findings are relatively similar to the results of reviews that have examined differences in recidivism between adult delinquents receiving either a custodial or a community sanction (Nagin et al., 2009;Smith et al., 2002;Villettaz et al., 2015). In these reviews, the pooled effect was also in favour of non-institutional treatment, although the different studies often did not yield statistically significant results.
The pooled effect estimate for the comparison between community sanctions and dismissals (LOR = −0.21), which also indicates a negative association, suggests that the odds of recidivism are smaller for youth who receive a community sanction compared with youth who receive a dismissal. Moreover, this estimate translates into an OR of 0.82, and assuming a 50 percent base rate for recidivism, this implies that the recidivism rates for youth subject to a community sanction would be 45 percent, compared with 50 percent for youth subject to dismissals.
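The conversion from an odds ratio to an expected recidivism rate under an assumed base rate can be sketched as follows. This is an illustration of the arithmetic only, with hypothetical function names. Note that exponentiating the rounded LORs (−0.24 and −0.21) gives approximately 0.79 and 0.81, slightly different from the reported ORs of 0.78 and 0.82, presumably because the reported values were derived from unrounded estimates.

```python
import math

def or_from_lor(lor):
    """Convert a log odds ratio to an odds ratio."""
    return math.exp(lor)

def rate_from_or(odds_ratio, base_rate):
    """Expected recidivism rate in the experimental group, given the
    odds ratio and the recidivism rate in the comparison group."""
    base_odds = base_rate / (1.0 - base_rate)
    new_odds = odds_ratio * base_odds
    return new_odds / (1.0 + new_odds)

# Reported ORs with the 50 percent base rate assumed in the text
custodial_rate = rate_from_or(0.78, 0.50)   # community vs custodial
dismissal_rate = rate_from_or(0.82, 0.50)   # community vs dismissal

print(round(custodial_rate * 100), round(dismissal_rate * 100))
```

The implied rates depend on the assumed base rate: the same OR applied to a lower or higher base rate yields a different absolute difference in recidivism.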
As hypothesized, the pooled effect estimate amongst the group of dismissal studies was not significantly different from zero (p = .07). However, again it is perhaps more informative to look at the range of potential effects: the 95% CI ranged from a substantial −0.43 to almost zero (0.02), which indicates that the effect could be zero but could also be substantially in favour of the experimental condition, namely community sanctions. This suggests that administering a community sanction, with some level of programming or intervention, is not necessarily more harmful than administering a dismissal and could be more effective.
Interestingly, this finding differs from reviews that examined differences in recidivism amongst youth offenders processed further into, versus diverted from, the criminal justice system. Most (although not all) of these reviews found results in favour of diversion, although, as in the current study, the effects tended to be non-significant (Schwalbe et al., 2012;Wilson and Hoge, 2013;Wilson et al., 2018). If community sanctions are indeed related to less recidivism than are dismissals, this may be the result of learning mechanisms that are part of the more effective content of the punishment received, but it may also be that youth receiving a community sanction are more likely to come in contact with (professional) services that aim to address the underlying characteristics that play a role in their involvement in delinquency. In contrast, youth who receive a complete dismissal (without any referral to services, treatment or programmes) may miss out on access to (professional) services, and thus the underlying causes of their involvement in delinquent behaviour remain unaddressed.
Yet, because significant heterogeneity was detected amongst the dismissal studies, moderator analysis was conducted by means of a meta-regression, which indicated that methodological rigour and referral stage were significant moderators of effect size. Studies with weaker methodologies were more likely to report a positive sanction effect, which is consistent with moderator analysis findings in similar meta-reviews (Latimer, 2001;Lipsey, 2009;Schwalbe et al., 2012;Wilson and Hoge, 2013;Wong et al., 2016). Studies where sanction referral occurred by means of mixed stages (a combination of school, police, prosecutor, court) were significantly more likely to report positive effect sizes (LOR = 0.44), indicative of adverse effects, than were studies with single referral stage options, although the small size of this subgroup (N = 3) calls for caution.
Findings on sample risk level were non-significant, contrary to study findings suggesting that high-risk youth benefit more than low-risk youth from sanction interventions or programmes (McAra and McVie, 2007;Wilson and Hoge, 2013). Likewise, findings on sanction focus were also non-significant. Yet studies have shown that certain types of focus within sanctions, such as a focus on (family) therapy or the presence of RJ elements, are more effective than sanctions without such focus (Lipsey, 2009;Schwalbe et al., 2012;Wong et al., 2016). It is possible that the non-significant findings in the current study imply that the findings from previous meta-analyses about moderator variables cannot be translated to the differences in effect between community sanctions and dismissals amongst youth. However, the different findings may also be due to substantial within-group variation amongst the subgroups for both moderator variables for which we did not find an effect. Further research would be needed to explore this finding.

Limitations and directions for future research
The current study has several limitations that should be discussed. Firstly, the moderator variables entered into the regression model explained 66 percent of the between-study variance. Although this is a substantial percentage, it should be noted that 34 percent of the variance remained unexplained, possibly because there are other moderating variables that we failed to explore in the current study. Additionally, some moderator variable subcategories had to be re-grouped due to an insufficient number of studies per category; this may have affected the within-subgroup variation and, therefore, could have affected the moderator analysis results.
Secondly, selection choices made during the data extraction process may have biased the results we obtained. For example, we chose to extract only registered recidivism data and not self-report or a combination of registered and self-report data. If a study measured recidivism at multiple intervals, we extracted only the one-year recidivism data, or the data closest to a one-year recidivism measurement. Selection choices such as these may have affected the final results.
In relation to this, the current study focused on (registered) recidivism data as the outcome measure. However, the results of the current study do not take into account other effect measurements that could yield alternative results. Not enough studies were available that used self-report data as the outcome measure, and we did not encounter studies that included youth's personal perceptions and experiences, or practitioners' perspectives on sanction effects as the dependent variable. Future studies are needed to shed more light on these, and other potentially important outcomes.
Thirdly, although the strict inclusion/exclusion criteria are a strength of the current study, it is possible the inclusion criteria were too specific, leading to the exclusion of relevant studies that otherwise would have been included. Likewise, it is possible we missed relevant studies, even though a thorough search of (grey) literature sources was conducted. A greater number of included studies would have yielded more power to the final results and, perhaps, would have allowed for moderator analysis amongst the group of custodial sanction studies, which we could not conduct at this time.
Fourthly, over half of the included studies were of low to medium methodological quality and, as shown by the meta-regression, this influenced differences in effect sizes amongst the studies comparing community sanctions with dismissals. If we could have found more methodologically sound studies, our effect estimates might have been different and also more reliable. Likewise, most included studies were conducted in the US (74 percent), and most of these were published prior to the 21st century (70 percent). Needless to say, the youth justice landscape has gone through significant changes since then; see, for example, the 'what works' movement that emerged in the late 20th century (Cullen and Gendreau, 2001;McGuire, 1995).
Together, the low number of included studies and the methodological weaknesses amongst these studies highlight a significant research gap in relation to the existence of (high-)quality experimental studies comparing the effects of different sanctions on recidivism. Effect studies comparing youth subject to custodial sanctions versus youth subject to community sanctions (as per the definition in the current study) were found to be especially scarce and sometimes dated. These findings are consistent with scholars calling for more effect studies in the domain of effective punishment options in youth justice for over a decade (Bouchard and Wong, 2018;Gensheimer et al., 1986;Latimer, 2001;Schwalbe et al., 2012;Wilson and Hoge, 2013;Wong et al., 2016). We join this chorus of scholars and emphasize the great need to conduct more research on the effects of community sanctions, in particular for youth, in comparison with the effects of custodial sanctions or dismissal options as youth crime responses.
Finally, the current study results highlight that the evidence base for community sanctions appears thin in comparison with its frequent application as a punishment response to youth crime. Most certainly, a greater volume of rigorous studies is needed to make a warranted judgement on which youth justice punishment options adequately address the root causes of crime and therefore are most effective at preventing subsequent offending. More importantly, effect studies on this topic should seek to examine for whom, and in which circumstances, certain sanctions are (highly) effective.
However, conducting high-quality effect studies, such as randomized experiments, in youth justice can be difficult due to legal constraints or for ethical reasons. Yet one could conduct quasi-experimental research, for example by means of employing sophisticated matching procedures, such as propensity score or coarsened exact matching (Blackwell et al., 2010;Shadish, 2013). High-quality studies of this nature allow one to create two comparison groups that approximate those of a randomized experiment, analyse (effect) differences between the two groups, and thereby contribute to expanding the evidence base for community sanctions in youth justice, in line with the decade-long scholarly call. Ideally, the results of objective (quantitative) measurements such as those suggested here are considered in light of other types of research, such as studies from an emic or ethnographic perspective. As was demonstrated in this meta-analysis, even in the (quasi)experimental studies that have been conducted to date, it is not clearly evident what specific impact community sanctions, compared with custodial sanctions or dismissals, have on the lives of delinquent youth who come in contact with the criminal justice system.