Variability in quantifying the Hill-Sachs lesion: A scoping review

Background Currently, there is no consensus on a widely accepted measurement technique for quantifying the Hill-Sachs lesion (HSL). The purpose of this review is to provide an overview of the techniques and imaging modalities used to assess the HSL pre-operatively. Methods Four online databases (PubMed, Embase, MEDLINE, and COCHRANE) were searched for literature on the various modalities and measurement techniques used for quantifying HSLs, from database inception to 20 November 2021. The Methodological Index for Non-Randomized Studies tool was used to assess study quality. Results Forty-five studies encompassing 3413 patients were included in this review. MRA and MRI showed the highest sensitivity, specificity, and accuracy values. Intrarater and interrater agreement was highest for MRA. The most common reference tests for measuring the HSL were arthroscopy, radiography, arthro-CT, and surgical techniques. Conclusion MRA and MRI are reliable imaging modalities with good diagnostic test properties for assessment of HSLs. There is a wide variety of measurement techniques and imaging modalities for HSL assessment; however, comparative studies are lacking. Thus, it is not possible to comment on the superiority of one technique over another. Future studies comparing imaging modalities and measurement techniques are needed, and they should incorporate a cost-benefit analysis.


Introduction
A Hill-Sachs lesion (HSL) is a bony defect of the posterosuperolateral humeral head, often caused by prior episodes of anteroinferior glenohumeral dislocation. 1,2 Recurrent instability at the glenohumeral joint is often observed after an HSL because the posterolateral aspect of the humeral head impacts the anterior glenoid, resulting in subsequent pain and difficulty moving the shoulder joint. 1,3,6,7
Measurement of the HSL has been an area of interest for clinicians, as quantification of bone loss is crucial in treatment decisions for patients with shoulder instability. Currently, various modalities can be used to measure an HSL, such as computed tomography (CT), magnetic resonance imaging (MRI), 3D CT and 3D MRI, and magnetic resonance arthrography (MRA), among others. 8-10 Measurement methods such as the well-known "on-track" and "off-track" concept use the glenoid track, which consists of the contact area between the humeral head and glenoid during shoulder abduction and external rotation. This method determines whether the HSL engages the anterior glenoid rim, resulting in shoulder dislocation, in which case it is termed "off-track," or does not engage, in which case it is termed "on-track." 11
Currently, there is extensive literature reporting imaging agreement for other specific measurements such as glenoid bone loss. Most notably, Walter et al. determined the most accurate imaging techniques for measuring glenoid bone loss in anterior glenohumeral instability. 12 However, no comparable consensus exists for measuring the HSL. 14-16 This acts as a major barrier, as minor changes in HSL measurements in the context of bipolar bone lesions in shoulder instability can have significant implications for a patient's surgical treatment. Hence, the purpose of this review is to provide an overview of the imaging modalities and techniques used to measure the HSL and to assess their diagnostic properties. It was hypothesized that 3D computed tomography (3D-CT) and/or 3D MRI would be the most prevalent and reliable imaging modalities to quantify the HSL.

Search strategy
The search terms included "shoulder," "Hill-Sachs," "bone loss," and similar phrases (Appendix Table 1). PUBMED, EMBASE, MEDLINE, and COCHRANE databases were searched for literature on the reliability of imaging modalities and measurement techniques for quantifying the HSL from database inception to 20 November 2021. The search terms were then entered into Google Scholar to ensure that articles were not missed. Inclusion criteria were: (1) HSL; (2) quantification by imaging modalities; (3) presentation of a method for measuring HSLs; (4) human studies; and (5) English language. The exclusion criteria were: (1) measurement of other major shoulder pathologies (e.g. glenohumeral, Bankart lesions) without mention of an HSL; (2) review articles; (3) non-imaging studies; (4) cadaver studies; and (5) case reports and editorials.

Study screening
Systematic screening complied with the Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA) and Revised Assessment of Multiple Systematic Reviews (R-AMSTAR) guidelines. 17,18 Two reviewers (S.K., H.F.) independently screened the titles, abstracts, and full texts in duplicate. Discrepancies were discussed and resolved with the input of a third reviewer (A.S.). The references of included studies were also screened using the same systematic approach to capture any additional relevant articles.

Data abstraction
Data were extracted independently by two reviewers (S.K., H.F.), who abstracted relevant data from included articles and recorded them in a Microsoft Excel spreadsheet (Version 2016; Microsoft, Redmond, Washington) designed a priori. Authors were contacted for clarification if data were unclear or not reported. Extracted data included, but were not limited to, year and journal of publication, sample size, study design, level of evidence, and patient demographics (e.g. gender, age). Information regarding modality and measurement techniques, reliability, and diagnostic test properties was documented when present.

Statistical analysis
Due to high statistical and methodological heterogeneity, a meta-analysis could not be performed, and the results are summarized descriptively. Descriptive statistics such as mean, range, and measures of variance (e.g. standard deviations, 95% confidence intervals [CI]) are presented where applicable. The intraclass correlation coefficient (ICC) was used to evaluate inter-reviewer agreement for assessing study quality. A kappa (κ) statistic was used to evaluate inter-reviewer agreement at all screening stages. Agreement was categorized a priori as follows: an ICC/κ of 0.81 to 0.99 was considered almost perfect agreement; 0.61 to 0.80, substantial agreement; 0.41 to 0.60, moderate agreement; 0.21 to 0.40, fair agreement; and 0.20 or less, slight agreement. 19 Statistics were performed using Microsoft Excel (Version 2016; Microsoft, Redmond, Washington).
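The a priori agreement bands above can be expressed as a small lookup. The following is a minimal sketch, not the authors' procedure (the review reports using Microsoft Excel). The function name `agreement_category` is hypothetical, and two choices are assumptions: values falling between the printed bands (e.g. 0.805) are assigned to the lower band, and a value of 1.00 is treated as almost perfect agreement.

```python
def agreement_category(value: float) -> str:
    """Map a kappa or ICC value to the agreement labels stated in the text.

    Assumption: each printed lower bound (0.81, 0.61, 0.41, 0.21) is used
    as an inclusive cut-off, so in-between values fall into the lower band.
    """
    # Lower bound of each band, highest first.
    bands = [
        (0.81, "almost perfect"),
        (0.61, "substantial"),
        (0.41, "moderate"),
        (0.21, "fair"),
    ]
    for lower_bound, label in bands:
        if value >= lower_bound:
            return label
    return "slight"  # 0.20 or less


if __name__ == "__main__":
    # Illustrative inputs only.
    for v in (0.15, 0.45, 0.759, 0.838):
        print(v, agreement_category(v))
```

Under these assumptions, a κ of 0.759 is labelled substantial and a κ of 0.838 almost perfect.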

Quality assessment
The methodological quality of non-randomized studies was evaluated using the methodological index for non-randomized studies (MINORS). 14 A score of 0, 1, or 2 is given for each item on the 12-item MINORS checklist, of which the first 8 apply to non-comparative studies, giving a maximum score of 16 for non-comparative studies and 24 for comparative studies. Methodological quality was categorized a priori as follows: a score of 0-8 or 0-12 was considered poor quality, 9-12 or 13-18 was considered fair quality, and 13-16 or 19-24 was considered excellent quality, for non-comparative and comparative studies, respectively.
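The MINORS quality bands can be sketched as a simple classifier. This is an illustrative sketch only: the function name `minors_quality` is hypothetical, and the handling of non-integer scores (such as a mean score) is an assumption, namely that any score above a band's upper limit falls into the next band.

```python
def minors_quality(score: float, comparative: bool) -> str:
    """Classify a MINORS score using the a priori bands stated in the text.

    Non-comparative: 0-8 poor, 9-12 fair, 13-16 excellent.
    Comparative:     0-12 poor, 13-18 fair, 19-24 excellent.
    Assumption: a non-integer score above a band's upper limit is placed
    in the next band (e.g. 8.5 is treated as fair for non-comparative).
    """
    max_score = 24 if comparative else 16
    if not 0 <= score <= max_score:
        raise ValueError(f"score must be between 0 and {max_score}")

    # Upper limits of the poor and fair bands for each study type.
    poor_max, fair_max = (12, 18) if comparative else (8, 12)
    if score <= poor_max:
        return "poor"
    if score <= fair_max:
        return "fair"
    return "excellent"


if __name__ == "__main__":
    print(minors_quality(14, comparative=False))
    print(minors_quality(14, comparative=True))
```

The same raw score can therefore land in different bands depending on study type: a score of 14 is excellent for a non-comparative study but only fair for a comparative one.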

Study characteristics
The initial search yielded a total of 4250 articles. After removing 861 duplicates, a systematic screening process yielded 45 articles that met inclusion criteria (Figure 1). One study was found upon reviewing references of included studies. Of the included studies, 19 were retrospective cohort studies (42%), 18 were prospective cohort studies (40%), and seven were of other study designs (16%). One of the included studies was a conference abstract (2%) (Table 1).

Study quality
There was substantial agreement between the reviewers for title and abstract screening (κ = 0.759; 95% CI 0.461-0.877), and almost perfect agreement for full-text screening (κ = 0.838; 95% CI 0.662-1.000). The largest share of the studies (44%; n = 19) were level II evidence, whereas 18 studies (42%) were level III evidence and six studies (14%) were level IV evidence. The mean MINORS scores across comparative and non-comparative studies were 20.1 and 9.9, indicating excellent and fair quality of evidence, respectively. Furthermore, there was almost perfect agreement between the raters in their classification of these studies (ICC = 0.99; 95% CI 0.99-0.99) (Table 1).

Measurement techniques
Computed tomography (CT). Reported measurement techniques varied among the studies as well as across modalities.

Clinically relevant bone loss
This scoping review identified four studies (10%) that reported glenoid bone loss percentages ranging from 8.9% to 23.5%. 9,10,56,58 Threshold values varied among the modalities used in the studies. Hardy et al. assessed a threshold value to define a risk factor for failure after an arthroscopic stabilization procedure. 27 The ratio between the depth of the Hill-Sachs lesion (D) and the humeral head radius (R) on conventional radiographs was analyzed: when the D/R ratio was more than 15%, the failure rate was 56%, compared with only 16% when the D/R ratio was less than 15%. Stefaniak et al. determined that, for CT measurements, good or moderate ICC values were observed, with "reasonable" or above threshold values of 30% of the minimal detectable change (MDC 95%). 8 Beason et al. chose arbitrary threshold value ranges based on those previously reported in the literature for the glenoid (25%) and humeral head (<20%, 20-40%, >40%). 55 In agreement with previous studies, Ozaki et al. state that many reports suggest a large HSL to be one of the most important risk factors for postoperative recurrence after arthroscopic Bankart repair. 22 Critical sizes of these lesions have been reported as a depth of more than 16% of the humeral head diameter, an area of more than 25% of the articular surface of the humeral head, and a volume greater than 250 mm³. 31,60-62 Shijith et al. determined that CT is an effective modality for assessing the amount of bone loss on the glenoid side or the head of the humerus, with glenoid width bone loss of more than 9.8% or a Hill-Sachs defect of more than 14.8 mm being the critical defects beyond which the frequency of dislocations increases (Table 7). 39
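The D/R ratio described by Hardy et al. is a simple quotient, and the arithmetic can be sketched as follows. This is an illustration of the calculation only, not a clinical tool and not the authors' code. The function names and the example measurements are hypothetical. The 15% cut-off is taken from the text above, which does not say how a ratio of exactly 15% should be handled, so it is treated here as not exceeding the threshold (an assumption).

```python
def depth_radius_ratio_pct(depth_mm: float, radius_mm: float) -> float:
    """Return the Hill-Sachs depth (D) as a percentage of humeral head radius (R)."""
    if radius_mm <= 0:
        raise ValueError("radius_mm must be positive")
    if depth_mm < 0:
        raise ValueError("depth_mm must be non-negative")
    return 100.0 * depth_mm / radius_mm


def exceeds_dr_threshold(depth_mm: float, radius_mm: float,
                         threshold_pct: float = 15.0) -> bool:
    """True when D/R is strictly above the threshold (15% by default).

    Assumption: a ratio exactly equal to the threshold does not exceed it.
    """
    return depth_radius_ratio_pct(depth_mm, radius_mm) > threshold_pct


if __name__ == "__main__":
    # Hypothetical radiograph measurements in millimetres.
    for depth, radius in ((3.0, 24.0), (4.8, 24.0)):
        ratio = depth_radius_ratio_pct(depth, radius)
        print(f"D={depth} mm, R={radius} mm, D/R={ratio:.1f}%, "
              f"above threshold: {exceeds_dr_threshold(depth, radius)}")
```

Because both measurements are taken from the same image, the ratio is unitless and does not depend on the radiograph's magnification, provided D and R are measured on the same scale.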

Discussion
In the current review, there is significant variability in imaging modality and measurement techniques, with MRI and depth being the most prevalent, respectively. The current literature on the assessment of HSL demonstrates a wide range of measurement techniques and imaging modalities, with support for MRI and MRA. However, results should be interpreted with caution due to the small number of included studies for each modality, the variability in study designs, and the lack of high-quality and comparative studies. In addition, the variety of measurement techniques corroborates the lack of standardization and agreement regarding the best modality and measurement method, suggesting that at this point there is no clear superiority of one imaging modality or measurement technique over another.
MRA showed the highest sensitivity and specificity values amongst the different imaging modalities, but with values that range considerably from 69% to 100%. 24,26,32,34,46,49 Accuracy (81-100%) and intra-rater and inter-rater agreement were highest for MRA compared with all other modalities. MRA is reliable in diagnosing various shoulder pathologies such as intra-articular cartilage and ligament injuries, labral tears, and rotator cuff disease, amongst others. 63 MRA can also be effective in measuring HSLs in adolescent patients and can help address bony complications of HSLs to accurately assess the lesion. 59 Another consideration is the viewing angle of the MRI. For example, the abduction-external rotation view and the apprehension test position are both recommended as useful techniques for detection of anterior shoulder instability, with the latter possibly being more beneficial in HSL examination when using indirect arthrography. 35 Despite its numerous benefits, MRA has drawbacks, most notably that it is an expensive imaging tool. In addition, metal in the vicinity of the lesion can interfere with the true signal. 30 Furthermore, the reproducibility and accuracy of MRA assessments are moderate. 32
Many studies have shown success with measuring HSLs with other modalities, most notably standardized CT protocols. In fact, most clinicians use CT imaging as part of their preoperative assessment when dealing with patients with shoulder instability. However, given the variability and inconclusive results among CT studies in this review, it is difficult to draw conclusions about its reliability despite its use by clinicians. Another consideration is the potential radiation exposure patients may experience during CT imaging. 58,59,64 Fortunately, newer low-dose CT scan protocols have reduced radiation exposure, thereby decreasing this concern. 65,66 There is a need for analyses that directly compare imaging modalities with regard not only to accurate measurement of HSLs, but also to their safety profile and potential exposure risks to patients.
In 3D-CT HSL measurement methodologies, analysis of a two-dimensional image of a three-dimensional object leaves many discrepancies. This often leads raters to misinterpret findings because of image imperfections or measurement errors. 8 However, 3D imaging adds the benefit of modeling, which can show the nature of the defect alongside its location when considering operative repair. 21 MR imaging more accurately quantifies the Hill-Sachs interval, as the rotator cuff insertion is more clearly visible than with CT scans, and allows for evaluation of soft-tissue injuries accompanying primary anterior shoulder instability. 58 Unfortunately, accurate measurement of HSL volume is often difficult. The wide variety of measurement methods reflects the lack of agreement in this area, and imaging findings do not always reflect what is observed arthroscopically. 59,67 Recently, arthroscopic evaluation has been questioned due to poor accuracy and reliability when compared to CT scans, with the potential of overestimating bone defects in patients with glenoid bone loss. 68 Thus, even though arthroscopy was the most commonly used reference or gold standard test in this scoping review, in around 85% of the studies, there are concerns about its precision.
Preoperative planning in glenohumeral instability plays a pivotal role in determining appropriate treatment plans for patients; therefore, analyzing imaging modalities and measurements of glenoid and humeral bone loss is essential for the treatment decision-making process. Among the 45 studies, CT-based tests and magnetic resonance-based tests were the most prevalent imaging modalities used as index tests. Currently there is no consensus on an accepted threshold value for HSL that will lead to a certain surgical treatment, as its importance relies more on the bipolar defect concept. This creates numerous complications for quantifying bone loss, predicting engagement prior to surgery, and deciding the best treatment for patients with anterior glenohumeral instability. 16 In contrast, most of the attention has been given to quantifying glenoid bone defects. The threshold values of glenoid bone loss above which arthroscopic Bankart repairs may fail have been widely accepted as ≥25% glenoid width loss, equivalent to ≥19% of the glenoid length and ≥20% of the surface area created by a best-fit circle on the inferior surface of the glenoid. 16,69 However, a better understanding of shoulder instability as a bipolar problem, reflected in the glenoid track concept and its potential treatment implications, warrants a more precise quantification of HSLs to offer patients the best treatment alternative.
An analysis of the quantification methods for HSLs identified in the included studies shows that measurement of the depth of the lesion is most prevalent. HSL depth measurements have been reported alongside other quantification methods such as length and width measurements. Given the limited quantitative data and the variability in modalities used across the different techniques, difficulties arise in identifying a "gold standard" for quantifying HSLs. To address the discrepancies between preoperative and intraoperative measurements of HSLs, a precise method for quantification of the HSL needs to be established amongst clinicians and radiologists. Although our understanding of glenohumeral pathologies has grown exponentially, there remains a lack of consistency and agreement in the evaluation of this injury. For current surgeons, it is equally important that each technique's benefits and drawbacks are extensively studied and considered for each unique patient presentation to achieve the most accurate diagnosis of the HSL to dictate intervention planning. On the other hand, there is a need to establish the role of imaging modalities to optimize the decision-making process while reducing the economic burden on the healthcare system when using these resources.

Limitations
This review has limitations. Firstly, a meta-analysis was not performed, as there was high statistical and methodological heterogeneity among the studies; thus, results are summarized descriptively. Furthermore, although multiple imaging modalities and measurement techniques were investigated, evidence of good quality and sufficient quantity was lacking for each. Thus, our ability to comprehensively comment on a "gold standard" and provide meaningful recommendations is limited.
High-quality comparative studies with large sample sizes should be conducted in the future to determine an optimal imaging modality and to identify the most effective measurement technique. To that end, future studies should standardize assessments of accuracy and reliability for imaging modalities and measurement techniques in quantifying the HSL. Future studies should also assess how treatment decisions can change based on the use of MRI with or without MRA. Lastly, an economic/cost-benefit analysis of imaging modalities should be conducted to help guide clinicians and radiologists on the best modality to measure HSLs.

Conclusion
MRA and MRI are reliable imaging modalities with good diagnostic test properties for assessment of HSLs. There is a wide variety of measurement techniques and imaging modalities for HSL assessment; however, comparative studies are lacking. Thus, it is not possible to comment on the superiority of one technique over another. Future studies should directly compare the accuracy and reliability of imaging modalities and measurement techniques while also conducting cost-benefit analyses.
All authors contributed substantially to the conception and design, or acquisition of data, or analysis and interpretation of data; drafted the article or revised it critically for important intellectual content; provided the final approval of the version to be published; and agreed to act as guarantor of the work (ensuring that questions related to any part of the work are appropriately investigated and resolved).
Table 1. Study characteristics and methodological quality.

Table 2. Detecting the presence of Hill-Sachs lesions with computed tomography (CT).

Table 3. Detecting the presence of Hill-Sachs lesions with computed arthrotomography (CTA).

Table 4. Detecting the presence of Hill-Sachs lesions with magnetic resonance imaging (MRI).

Table 5. Detecting the presence of Hill-Sachs lesions with magnetic resonance arthrography (MRA).

Table 6. Detecting the presence of Hill-Sachs lesions with ultrasound (US).

Table 7. Detecting the presence of Hill-Sachs lesions compared with various modalities.