A Rubric to Assess the Design and Intervention Quality of Randomized Controlled Trials in Health and Wellness Coaching

Objective: To collect health and wellness coaching (HWC) literature related to treatment of obesity and Type 2 Diabetes (T2D) for systematic assessment using a novel rubric. Data Source: PubMed, CINAHL, and PsycINFO. Study Inclusion and Exclusion: Of 282 articles retrieved, only randomized controlled trials meeting a criteria-based definition of HWC were included; studies with interventions of <4 months or <4 sessions were excluded. Data Extraction: Rubric assessment required that details of two theoretical frameworks (i.e., study design and HWC intervention design) be extracted from each included paper. Data Synthesis: Data were derived from a 28-item rubric querying items such as sampling characteristics, statistical methods, coach characteristics, HWC strategy, and intervention fidelity. Results: 29 articles were reviewed. Inter-rater rubric scoring yielded a high intraclass correlation (ICC = .85). Rubric assessment of HWC literature resulted in moderate scores (56.7%), with study design scoring higher than intervention design; within intervention design, T2D studies scored higher than obesity studies. Conclusions: A novel research design rubric is presented and successfully applied to assess HWC research related to treatment of obesity and T2D. Most studies reported beneficial clinical findings; however, rubric results revealed moderate scores for study and intervention design. Implications for future HWC research are discussed.

More specifically, HWC is a patient-centric process supporting individualized goals often related to eating well, exercising regularly, managing stress effectively, and identifying important resources to promote healthful living. 4 An extensive and growing evidence base for HWC exists describing prospects for both treating and preventing these disorders. 5,6 However, methodological questions related to HWC research exist. While a systematic review has shed some light on the definition of HWC, 7 the HWC strategies applied during intervention are of particular concern in the present study. 8 A systematic and effective assessment of this area of extant HWC literature is lacking.
There are over 100 randomized controlled trials (RCTs) of HWC interventions. 5,6 RCTs are widely considered the gold standard of original research required for advancing medical knowledge. However, RCTs can contain design shortcomings (e.g., uncontrolled concomitant treatments, insufficient power, selection bias) that affect study quality and the unbiased results that are sought. 9,10 There are many evaluations of the quality of RCTs (e.g., psychological interventions by Temple et al. 11 ). Such evaluations of research allow policymakers and clinicians to make informed decisions about implementing psychological treatments in clinical services. Though assessment of quality for psychological interventions exists, 9-11 such inspection of HWC research does not. As HWC moves toward becoming an important and credible health and medical intervention, it seems reasonable and potentially valuable to develop a systematic means to assess relevant RCTs.
The purpose of this review is to evaluate the quality of RCTs in HWC research. Because a review of all available RCTs would not be feasible, this work will focus on some of the most pressing present health crises, obesity and T2D. 12,13 Obesity increases the risk of various cardiovascular diseases and other comorbid symptoms and diseases, while T2D is a chronic disease leading to life-threatening health consequences (e.g., atherosclerosis, renal failure, blindness). 14,15 Moreover, these disorders are linked, with obesity and subsequent insulin resistance leading to the development of T2D. 15 Effective prevention and management of these conditions is necessary for optimization of well-functioning and successful contemporary healthcare systems. Therefore, it is essential to have credible research describing effective interventions for obesity and T2D. Such evidence may enable future HWC researchers and practitioners to learn from best practices in HWC research and provide valuable information for evidence-based practice.
To achieve the purpose of the present study, a comprehensive scoring system for HWC research was developed and applied. The HWC Research Design Rubric (HWC-RDR) with supportive criteria is introduced, and the evaluative review of obesity and T2D literature is presented. We chose obesity and T2D because there is a critical mass of research with these conditions being the most frequently studied in HWC literature. 5,6 Data from the HWC-RDR are used to illustrate the strengths of, and challenges to, HWC literature.

Overview
This project was completed in three phases. First, a systematic review of the health coaching literature produced RCTs describing HWC as an intervention for obesity and T2D. Second, a research scoring rubric was developed and then selected RCTs were assessed (for study design and HWC intervention design) using the rubric. Finally, rubric-generated data were inspected and analyzed to show trends and allow discussion of HWC research. Greater detail on these three methodological phases is provided below.

Literature Search Strategy
A two-part strategy was used to locate relevant HWC literature. First, all eligible articles in the Compendium of HWC Literature related to obesity and T2D were identified. 5,6 In total, 41 randomized studies in obesity and T2D were part of the original compendium. This set of studies was complemented by articles from personal libraries, for a total of 105. Second, a professional librarian used search strategies previously applied for the Compendium review 5 to identify recent articles (i.e., published after the compendium) to add to that literature. In the search, 177 additional abstracts were identified. All papers retrieved from the compendium and the new search were filtered using HWC inclusion criteria as applied previously. 5 After the review of abstracts and removal of duplicates, 209 records were excluded. The remaining 73 articles were screened for eligibility. Inclusion criteria were as follows: the study had to be an RCT, and the intervention had to include at least four coaching sessions over a four-month period, allowing ample time for behavior change and providing an opportunity for coach-patient relationship development. 4 As seen in the accompanying PRISMA flow chart (Figure 1), ultimately, 29 RCTs (18 obesity and 11 T2D) were included for review.
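The screening counts reported in this section can be checked with simple arithmetic. The following is a minimal sketch of that check; the function and variable names are illustrative assumptions, and only the counts themselves come from the text.

```python
# Sanity check of the article-flow counts reported in the text.
# Function and field names are illustrative, not part of the published method.

def screening_flow(compendium_and_libraries: int, new_search: int,
                   excluded_at_abstract: int) -> dict:
    """Return records retrieved and records left for eligibility screening."""
    retrieved = compendium_and_libraries + new_search
    screened = retrieved - excluded_at_abstract
    return {"retrieved": retrieved, "screened_for_eligibility": screened}


flow = screening_flow(compendium_and_libraries=105, new_search=177,
                      excluded_at_abstract=209)
included = {"obesity": 18, "T2D": 11}  # final RCT counts by condition

print(flow)
print("included:", sum(included.values()))
```

The totals agree with the 282 retrieved articles, 73 screened articles, and 29 included RCTs stated in the text.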

Development of HWC Research Design Rubric
In general, the rubric was developed using previously established research quality assessment tools and input from HWC subject-matter experts. The HWC-RDR was designed for use with all HWC research, and its questions are not specific to obesity and T2D. There are two theoretical frameworks shaping the HWC-RDR: (A) Study Design and (B) Intervention Design, with each section having three subcategories.
A. Study Design Rubric. The study design framework was developed by reviewing recent articles on the evaluation of RCTs in psychological disorders (i.e., emotional distress in breast cancer; 11 eating disorder prevention; 9 depression and neurosis 10 ). The aim was to generate a brief set of questions covering the most important aspects of HWC study design. The main conceptual areas included (1) the acquisition of participants (i.e., recruitment type, clear inclusion and exclusion criteria, reporting sample characteristics, sample size determination), (2) RCT design (i.e., allocation, concealment, blinding, control of confounding treatment, outcome measure quality), and (3) statistical analyses (i.e., baseline comparison, intent-to-treat analysis, appropriate analysis, control for covariates). The study design section included 15 questions (scoring range 0-1 for 7 questions and 0-2 for 8 questions), with a total maximum design score of 23 (see Table 1).
B. Intervention Design Rubric. Most items for the HWC intervention design framework were derived from the job task analysis of HWC. 4 Input and discussion from HWC subject-matter experts were also sought and incorporated into item formation. The intervention design questions represented three main conceptual areas: (1) Coach qualities (i.e., education, training, and experience); (2) HWC program definition and design (i.e., client centered, goal setting, accountability, personal relationship, session length and number, and program duration); and (3) Adherence to HWC strategy (i.e., using established behavior change principles and ensuring fidelity of HWC application). The intervention design section included 13 questions with 0-2 response scores for a maximum score of 26 (see Table 2).
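The maximum scores stated for the two rubric sections follow directly from the item counts and scoring ranges. A minimal sketch of that arithmetic is given here; the data layout and function names are assumptions made for illustration and are not taken from the rubric itself.

```python
# Maximum rubric scores implied by the item counts described in the text.
# The (number_of_items, max_points_per_item) layout is an illustrative assumption.

STUDY_DESIGN_ITEMS = [(7, 1), (8, 2)]   # 7 items scored 0-1, 8 items scored 0-2
INTERVENTION_ITEMS = [(13, 2)]          # 13 items scored 0-2


def max_score(items):
    """Sum the highest attainable points over groups of items."""
    return sum(n_items * points for n_items, points in items)


def percent_of_max(score, maximum):
    """Express a raw rubric score as a percentage of the maximum, to one decimal."""
    return round(100 * score / maximum, 1)


study_max = max_score(STUDY_DESIGN_ITEMS)          # 23
intervention_max = max_score(INTERVENTION_ITEMS)   # 26
total_max = study_max + intervention_max           # 49

print(study_max, intervention_max, total_max)
```

A hypothetical paper with a raw total of 30, for example, would be scored with `percent_of_max(30, total_max)`.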

Reviewing Process and Reliability Check
All studies were reviewed by all authors, and general questions about the scoring system were clarified before applying the HWC Research Design Rubric. Subsequently, the first author scored all articles using the study design questions from the rubric. The second author scored all articles using the rubric's intervention design questions. The third author scored all included articles using all rubric questions. The third author's scores were compared to the first two authors' scores using intraclass correlation coefficients, which yielded very good agreement (ICC = .85; 95% CI = .68-.93).
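For readers unfamiliar with the statistic, an intraclass correlation can be computed from a table of ratings using two-way ANOVA mean squares. The text does not state which ICC form was used, so the sketch below assumes one common choice, ICC(2,1) (two-way random effects, absolute agreement, single rater). The ratings are illustrative values, not data from this review.

```python
# ICC(2,1): two-way random effects, absolute agreement, single rater.
# Assumption: the ICC form is not specified in the text; this is one common choice.

def icc_2_1(ratings):
    """ratings: list of rows, one row per rated article, one column per rater."""
    n = len(ratings)        # number of rated articles
    k = len(ratings[0])     # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Partition the total sum of squares into articles, raters, and residual.
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Illustrative ratings (6 articles x 4 raters); not data from this review.
example = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
print(round(icc_2_1(example), 2))
```

Because this form penalizes systematic differences between raters as well as random disagreement, it tends to give lower values than consistency-based forms when one rater scores uniformly higher or lower than another.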

Data Summary and Statistical Analysis
After scoring all included articles with the HWC-RDR, variables represented with continuous data were summarized as means, medians, and standard deviations. All categorical data were expressed in counts and percentages. Before parametric statistics were performed, normality and appropriate assumptions for each test were checked. For comparison of continuous data between obesity and diabetes studies, independent t-tests with Cohen's d effect sizes were calculated. Relationships between variables were assessed via Pearson's correlation coefficients. Alpha was set at .05 and statistical analyses were conducted using JASP 0.14.1 or SPSS 27. To assist discussion of best research practices, the top three rubric-scoring obesity and T2D papers were identified and used to illustrate study and intervention design highlights.
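The descriptive summaries and Pearson correlations described here were run in JASP or SPSS; the sketch below only illustrates the same calculations using the Python standard library. The function names and the example values are assumptions for illustration and are not scores from the reviewed studies.

```python
import math
import statistics


def describe(values):
    """Summarize continuous data as mean, median, and sample standard deviation."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "sd": statistics.stdev(values),
    }


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical values for illustration only; not scores from the reviewed studies.
years = [2010, 2012, 2014, 2016, 2018]
scores = [33, 30, 29, 27, 24]

print(describe(scores))
print(pearson_r(years, scores))
```

A negative coefficient from `pearson_r` would indicate that later values of the first variable go with lower values of the second.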

HWC Research Design Rubric Scores
Total HWC-RDR score is the sum of study design and intervention design scores from all rubric items. Average total HWC-RDR score across all studies was 27.8/49 (SD = 4.73, range = 16-35, median = 28) or 56.7%. This overall score indicates moderate ranking for this HWC literature base describing intervention for obesity and T2D. The evaluated studies were better at general study design characteristics than planning and describing application of HWC. This is not surprising given that HWC is a relatively new clinical approach and evolving as an allied healthcare field. 45,46 Strengths and weaknesses for study design and intervention design are described and discussed below.
The mean overall HWC-RDR score for obesity studies was 26.3 (SD = 4.99) and for diabetes was 30.1 (SD = 3.24).
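A standardized effect size for this group difference can be approximated from the reported means and standard deviations together with the group sizes (18 obesity and 11 T2D studies). The sketch below assumes a pooled-variance Cohen's d and a Student t statistic with equal variances; the function names are illustrative, and any value it produces is an approximation from rounded summary statistics, not a result reported by the authors.

```python
import math


def pooled_sd(sd1, n1, sd2, n2):
    """Pooled standard deviation of two independent groups."""
    return math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))


def cohens_d(m1, sd1, n1, m2, sd2, n2):
    """Cohen's d (group 2 minus group 1) using the pooled standard deviation."""
    return (m2 - m1) / pooled_sd(sd1, n1, sd2, n2)


def student_t(m1, sd1, n1, m2, sd2, n2):
    """Equal-variance independent t statistic and its degrees of freedom."""
    se = pooled_sd(sd1, n1, sd2, n2) * math.sqrt(1 / n1 + 1 / n2)
    return (m2 - m1) / se, n1 + n2 - 2


# Summary statistics from the text: obesity (group 1) and T2D (group 2).
obesity = (26.3, 4.99, 18)
t2d = (30.1, 3.24, 11)

d = cohens_d(*obesity, *t2d)
t, df = student_t(*obesity, *t2d)
print(round(d, 2), round(t, 2), df)
```

If the group variances are judged unequal, a Welch-type test would be a reasonable alternative to the equal-variance form assumed here.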

Table 1.
Health and Wellness Coaching Research Design Rubric: Study Design Criteria.

To allow clinical application or methodological replication, it is necessary to elaborate details of an intervention trial. Specific to HWC, Hill et al. 8 and Olsen and Nesbit 47 called for the need to clearly describe the intervention. Simpson et al. 31 provided an excellent description of intervention components and coaching theory. For many papers, however, a low intervention design score was often related to inadequate description of the HWC coaches and strategy. Regarding coaching background, very few studies described coaching experience, and none had coaches with certification. National board certification (NBC-HWC 48 ) was only established in 2017, so it was not surprising that none of the included RCTs mentioned this valued but recently recognized national credential. Only 2/18 (11.1%) obesity and 2/11 (18.2%) T2D studies described employing health coaches with ample coaching experience. Wolever et al. 43 had a very thorough description of coach background and training. This is valuable because using well-trained and experienced coaches seems essential to providing a quality intervention.
It is interesting to note that the best scoring articles from this review were conducted in the late 2000s and early 2010s. A negative correlation between year of publication and rubric score suggests a decline in quality of study design in recent publications. One possible explanation is the recent increase in internet publications contributing to quality decline, 49 while pressure to publish may also be lowering research quality. 50 Alternatively, this may represent a small-sample anomaly, with a timeline of only 10 years reflected in the analysis.

Discussion
The purpose of the present study was to evaluate the quality of RCTs focusing on HWC in obesity and T2D. Using the generated HWC-RDR, the evaluation revealed key strengths and areas for improvement in future research. The HWC-RDR is a usable instrument in this context. A key strength is that it evaluates the study design element as well as the intervention design, which are both critical to the quality of an RCT in HWC. The usability of the rubric extends past the fields of obesity and T2D, and the rubric may also be given consideration when designing new RCTs in the field across a range of disorders.
The RCTs reviewed in the present study revealed several study design strengths. For example, with over 300 participants on average, most examined studies were well powered to detect small to medium effect sizes. Sample sizes were, in most studies, well justified using a priori power analyses. While participant recruitment was mostly via convenience sampling, most studies used random assignment to groups as opposed to cluster random sampling. It is also worth noting that top scoring articles recruited from primary care 19,31 or clinical settings. 20 Recruiting from a place of sufficient patient flow rather than a less centralized approach (e.g., posters, online ads) appears to be an important study design characteristic. One consideration for future improvement may be the inclusion of underrepresented or minoritized groups, as only 3 studies 36,41,42 focused on a lower economic status population. Data from diverse samples may provide a more externally valid picture of the effectiveness of HWC.
The review of the studies also revealed areas for study design improvement. For example, strategies to blind researchers and participants for allocation or measurement purposes were rarely applied in these studies. An exception was Ball et al., 20 who had a biostatistician perform randomization to groups and only revealed this information to intervention providers, but not to the participants or research team. Appel and colleagues 19 also blinded assessors to condition by training these staff without allocation knowledge to perform necessary measurements. While a HWC study can never be fully blinded (i.e., coaches and participants inherently know they are involved in treatment), it is valuable to apply available blinding strategies to limit potential for bias towards intervention effectiveness. For most included studies, statistical analyses were well justified and often controlled for important covariates.
Intent-to-treat analysis was performed in many of these RCTs. These studies were conducted in several countries (e.g., US, Canada, Denmark, Turkey), enhancing cross-cultural generalizability of findings. In summary, many of these RCTs applied well-conceived study design characteristics.
In terms of intervention design, the reviewed studies generally scored well on number and duration of HWC sessions and program length. This finding is undoubtedly related to our requirement of a minimal program length (i.e., 4 months) and number of sessions for inclusion consideration. These variables provided good potential for a successful coaching process while allowing development of a coach-patient relationship. 3,4 Adherence to a health coaching strategy and monitoring of health coaching integrity/fidelity were questions receiving moderate rubric scores. Monitoring HWC integrity involves assessing coaching quality (e.g., use of open-ended questions, active listening skills, reflection techniques) and use of specific behavioral change techniques (e.g., motivational interviewing, cognitive behavioral therapy). 4,31,51 Only 4/18 (22.2%) obesity studies and 2/11 (18.2%) T2D studies reported assessing coaching intervention fidelity, which is best done by recording and rating coaching sessions. This process was occasionally done, 20,39 with Nashita et al. 38 transcribing randomly identified sessions and having three independent researchers rate coach quality for those sessions. Wolever et al. 43 also described and performed a form of HWC intervention check. Fidelity evaluation of motivational interviewing, 52 applied in several MI-focused HWC studies, 20,31,53
Application of these research design recommendations should improve the consistency and quality of future HWC research. Accordingly, the HWC-RDR can be useful not only for assessing existing research but also for informing the design of future studies. Ideally, the HWC-RDR will evolve and become a better tool for these purposes. The hope is that an expanding, high-quality literature base will better define the scope and best practices required to optimize effectiveness of HWC intervention for obesity, T2D, and other lifestyle-related disorders.

Limitations
The present work presented a novel rubric, which may benefit from further testing. While we assessed the reliability of the rubric through inter-rater scoring and expert input, efforts to provide a higher level of validation should be made with future use. Additionally, we chose to limit our analysis to obesity and T2D studies. This may have limited the generalizability of the present findings, as studies of other patient populations (e.g., depression or cancer) may score differently using the HWC-RDR. Research in those patient populations may reveal more generalizable information and should be a target of future work. Scoring bias may also have been introduced by limiting the included studies to RCTs. Other research designs (e.g., case series or pre-post cohort studies) may provide important information about the effectiveness of HWC interventions. However, rubric scores may be different for non-RCT designs, and we did not examine these studies. In summary, the HWC-RDR should be a useful tool that will benefit from additional validation and more widespread application. We invite others to test the HWC-RDR and make modifications if needed. For example, some may argue that the concept of coaching fidelity is only marginally captured in the scale (e.g., by the item "Supervision/Fidelity") because not all components of fidelity may be covered (e.g., autonomy, agency). 51 Future work in this area may be warranted. Nonetheless, it is our hope that the systematic examination of published HWC research in the present study and the HWC-RDR will provide practical implications for HWC practitioners and researchers.

Conclusions
The primary objective of this study was to assess HWC research done with obese and T2D patients. Rather than simply making general comments, we sought to develop a tool to bring consistency and objectivity to assessing HWC. The HWC-RDR is a 28-item rubric permitting systematic evaluation of HWC research strengths and weaknesses within two theoretical frameworks (i.e., study design and intervention design). Assessment of literature can be valuable to future research, particularly with an emerging field such as HWC.

Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Abbreviations: HWC, Health and Wellness Coaching.

Table 2.
Health and Wellness Coaching Research Design Rubric: Intervention Design Criteria.
Abbreviations: HWC, Health and Wellness Coaching; NB, National Board.

Table 3.
Overview of Included Studies.

Table 4.
Health and Wellness Coaching Research Design Rubric Scores by Item.

Table 5.
Health and Wellness Coaching Research Design Rubric Scores by Randomized Controlled Trials.