Introduction
Cerebrospinal fluid (CSF) rhinorrhea occurs when fistulas form between the subarachnoid space and sinonasal cavity. CSF rhinorrhea can be clinically overt or insidious, and suspicion should merit further investigation due to an increased risk of recurrent bacterial meningitis.1,2 Preoperative localization aids clinical decision-making by facilitating the selection of approaches for repair, reducing adjuncts (ie, intrathecal fluorescein, postoperative lumbar drains), and identifying multiple leaks.3
Zapalac et al4 proposed a diagnostic algorithm for skull base CSF leaks in 2002 involving confirmation with beta-2 transferrin and subsequent imaging to guide clinical decision-making. Further literature built on this work by discussing alternative algorithms and incorporating imaging fusion approaches.5–8 Chemical tests (eg, beta-2 transferrin, beta trace) have excellent test characteristics and are relatively cheap but lack localizing capabilities. Despite the many localizing investigations available, evidence regarding their diagnostic characteristics remains suboptimal. Test characteristics range broadly in the literature and are derived from etiologically heterogeneous populations.5
Given the importance of preoperative localization, achieving a better understanding of test characteristics is vital. This systematic review aims to assess the diagnostic accuracy of various modalities for preoperative localization of CSF rhinorrhea.
Materials and Methods
Eligibility Criteria
This review was completed in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines (PRISMA Checklist) and the Cochrane Handbook for Systematic Reviews of Diagnostic Test Accuracy. Studies of patients with clinical suspicion of CSF rhinorrhea were included. Index tests were not prospectively selected; however, those with an insufficient number of studies or outdated techniques were later excluded. The target condition was CSF rhinorrhea. The reference standard was intraoperative examination; for patients with a negative index test, unremarkable clinical follow-up alone was also accepted as a negative reference standard. Original research studies published in peer-reviewed journals were included. Reviews, studies with fewer than 3 patients, case reports, letters, editorials, opinions, and unpublished abstracts were excluded. Studies of spinal or otologic CSF leaks, non-CSF rhinorrhea, and intraoperative or immediately postoperative CSF rhinorrhea (< 3 months), as well as non-human studies and non-English texts, were also excluded.
Search Strategy
MEDLINE and EMBASE databases were searched by 2 authors (MX and KZ) from their inception dates to November 2019. Reference lists and existing reviews were also searched. Search terms combined “CSF rhinorrhea,” “leak,” or “fistula” with index tests. The medical subject heading “cerebrospinal fluid rhinorrhea” and its subheadings “diagnosis” and “diagnostic imaging” were searched. Duplicate studies were removed. Two authors (MX and KZ) screened titles, abstracts, and full-text articles for final inclusion using Covidence (Veritas Health Innovation, Melbourne, Australia). If a disagreement occurred, both authors discussed the article until consensus was reached. A third author (DDS) was consulted if disagreement persisted.
Data Extraction and Analysis
Information on study design, selection criteria, recruitment, patient demographics, etiology and site of fistulas, investigations, and results was extracted. The primary outcome of interest was the number of true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN) for each index test. Data from patients who did not receive a reference standard or who fulfilled exclusion criteria were not extracted. Descriptive statistics were compiled using Microsoft Excel. RevMan 5 (The Cochrane Collaboration, Copenhagen, Denmark) was used to collate the sensitivity, specificity, and risk of bias. Cohort studies, which described specific inclusion criteria, exclusion criteria, and study periods, were analyzed separately from case series because selection bias likely gives the latter higher sensitivities and specificities.
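The relationship between the extracted counts and the reported test characteristics can be sketched as follows. This is a minimal illustration, not the RevMan implementation; the function name and the example counts are hypothetical, and `None` is an assumed way of marking a value that cannot be calculated.

```python
from typing import Optional, Tuple


def sensitivity_specificity(
    tp: int, fp: int, fn: int, tn: int
) -> Tuple[Optional[float], Optional[float]]:
    """Return (sensitivity, specificity) from 2x2 counts.

    None marks a value that is incalculable because its denominator is zero.
    """
    with_leak = tp + fn      # patients with a leak on the reference standard
    without_leak = tn + fp   # patients without a leak on the reference standard
    sensitivity = tp / with_leak if with_leak > 0 else None
    specificity = tn / without_leak if without_leak > 0 else None
    return sensitivity, specificity


# Hypothetical study in which every patient had a confirmed leak:
# sensitivity is defined, specificity is not.
sens, spec = sensitivity_specificity(tp=18, fp=0, fn=2, tn=0)
print(sens, spec)  # 0.9 None
```

A study that enrolled only patients with confirmed leaks has no TN or FP results, so its specificity has a zero denominator and is returned as `None` here.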
Statistical Analysis
Specificities could not be derived alongside the corresponding sensitivities where studies had inadequate TN and FP results. Therefore, meta-analysis using hierarchical summary receiver operating characteristic curve and bivariate random-effects models was not performed. Qualitative analyses were performed by stratifying data by index test, etiology, and leak site. Data are presented as the median and interquartile range (Q1-Q3).
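As an illustration of the summary format, the sketch below computes a median and interquartile range across per-study values. The input values are hypothetical, and the quartile method (Python's "inclusive" interpolation) is an assumption, since the quartile convention used in the review is not stated.

```python
import statistics
from typing import List, Tuple


def median_iqr(values: List[float]) -> Tuple[float, float, float]:
    """Return (median, Q1, Q3) using inclusive quartile interpolation (assumed)."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q2, q1, q3


# Hypothetical per-study sensitivities for one index test.
study_sensitivities = [0.5, 0.8, 0.9, 1.0, 1.0]
med, q1, q3 = median_iqr(study_sensitivities)
print(f"{med:.2f} ({q1:.2f}-{q3:.2f})")  # 0.90 (0.80-1.00)
```

Other quartile conventions can give slightly different Q1 and Q3 values for small numbers of studies.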
Risk of Bias Analysis
The Quality Assessment of Diagnostic Accuracy Studies 2 (QUADAS-2) tool was used to evaluate risk of bias in cohort studies based on patient selection, index test, reference standard, and flow and timing. Two authors (KZ and SK) applied the QUADAS-2 and resolved disagreements by discussion.
Results
We identified 5167 studies, from which 1128 duplicates were removed. Of the remaining 4039 unique studies, 3560 were excluded after screening titles and abstracts with an interrater reliability kappa of 0.592 (95% CI 0.560-0.625), indicating moderate agreement. Of the 479 studies eligible for full-text screening, 77 met inclusion criteria with an interrater reliability kappa of 0.739 (95% CI 0.661-0.817), indicating substantial agreement. A total of 53 cohort studies and 24 case series describing 1433 and 189 patients, respectively, published between 1965 and 2019 were included (Figure 1, Summary of Studies). There were no randomized controlled trials.
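For readers unfamiliar with the kappa statistic, the sketch below shows how Cohen's kappa is computed from two screeners' include/exclude decisions. The counts are hypothetical because the underlying agreement tables are not reported here. The labeling function uses the widely cited Landis and Koch bands; that the review used this scale is an assumption, although its "moderate" and "substantial" labels are consistent with it.

```python
def cohens_kappa(both_include: int, a_only: int, b_only: int, both_exclude: int) -> float:
    """Cohen's kappa for two raters making include/exclude decisions."""
    n = both_include + a_only + b_only + both_exclude
    observed = (both_include + both_exclude) / n
    a_include = (both_include + a_only) / n   # rater A's inclusion rate
    b_include = (both_include + b_only) / n   # rater B's inclusion rate
    # Agreement expected by chance if the raters decided independently.
    expected = a_include * b_include + (1 - a_include) * (1 - b_include)
    return (observed - expected) / (1 - expected)


def agreement_label(kappa: float) -> str:
    """Landis and Koch descriptive bands (assumed scale)."""
    if kappa < 0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Hypothetical screening table for 100 records.
kappa = cohens_kappa(both_include=20, a_only=5, b_only=5, both_exclude=70)
print(round(kappa, 3), agreement_label(kappa))  # 0.733 substantial
print(agreement_label(0.592), agreement_label(0.739))  # moderate substantial
```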
The median number of participants in cohort studies was 19 (range: 4-259). Diagnostic tests included high-resolution CT, CT cisternography, MR cisternography, radionuclide cisternography, contrast-enhanced MR cisternography, office endoscopy, topical intranasal fluorescein, and intrathecal fluorescein with 32, 26, 23, 12, 7, 6, 2, and 2 studies, respectively.
Patient Demographics
A total of 52% of the 1001 patients with reported sex data were female. The most common CSF rhinorrhea etiologies were traumatic (44%), spontaneous (28%), and delayed postoperative/iatrogenic (12%) (Figure 2). A total of 90% of patients underwent surgery; 10% had negative index tests and clinical follow-up only. Of the sites confirmed intraoperatively, the ethmoid roof was the most common (33%), followed by cribriform plate (28%), sphenoid sinus (22%), and frontal sinus (8%) (Figure 3).
High-Resolution CT (HRCT)
A total of 26 cohort studies9–34 and 6 case series6,35–39 used HRCT in 622 patients. The cohort studies reported a sensitivity of 0.93 (0.65-1.00) (median [IQR]) and specificity of 0.50 (0.00-1.00), while the case series reported a sensitivity of 1.00 (0.79-1.00) and specificity of 1.00 (0.00-1.00). A total of 54% of cohort studies and 83% of case series reported sensitivities ≥ 90%. A total of 69% of cohort studies and 50% of case series had incalculable specificities.
MR Cisternography (MRC)
A total of 21 cohort studies9,11,13,14,17,19,21,25,27,29,31,32,40–48 and 2 case series6,35 used MRC in 396 patients. The sensitivity and specificity of the cohort studies were 0.94 (0.81-1.00) and 0.77 (0.17-0.97), respectively. The sensitivities of the case series were 1.00 and 0.82, while the specificities were 0.00 and 1.00. A total of 67% of cohort studies and 1 case series reported sensitivities ≥ 90%. A total of 71% of cohort studies had incalculable specificities.
CT Cisternography (CTC)
A total of 14 cohort studies14,15,17,19,25,30,33,40,44,49–53 and 12 case series6,38,54–63 used CTC in 285 patients. Cohort studies and case series yielded high sensitivities of 0.95 (0.73-1.00) and 1.00 (1.00-1.00), respectively. Cohort studies also yielded a high specificity of 1.00 (0.75-1.00). Two case series had calculable specificities of 1.00.
Radionuclide Cisternography (RNC)
Eight cohort studies19,27,30,64–68 and 4 case series69–72 used RNC in 469 patients. The sensitivities of the cohort studies and case series were 0.90 (0.81-1.00) and 1.00 (0.94-1.00), respectively. The specificity of the cohort studies was 0.50 (0.00-1.00). Two case series had calculable specificities of 1.00.
Contrast-Enhanced MR Cisternography (CEMRC)
Six cohort studies22,41,73–76 and 1 case series35 used CEMRC in 154 patients. The cohort studies yielded near-perfect sensitivity and specificity of 0.99 (0.96-1.00) and 1.00 (1.00-1.00), respectively. The case series had a sensitivity and specificity of 1.00.
Office Endoscopy
Four cohort studies12,14,22,23 and 2 case series77,78 used endoscopy in 66 patients. The sensitivity of the cohort studies was 0.58 (0.13-1.00). One case series had a calculable sensitivity of 0.33. One cohort study and 1 case series had calculable specificities of 1.00.
Topical Intranasal Fluorescein (TIF)
Two cohort studies79,80 used TIF in 40 patients with sensitivities of 1.00 and incalculable specificities.
Intrathecal Fluorescein (ITF)
Two cohort studies52,81 used ITF in 76 patients. One study reported perfect sensitivity and specificity while the other reported a sensitivity of 0.92 and incalculable specificity.
Etiology-Specific Studies
Four cohort studies40,43,67,73 and 2 case series39,60 described 344 patients with traumatic CSF rhinorrhea; the most common leak site was ethmoid roof followed by cribriform plate. Nine cohort studies9,13,15,18,26,28,32,33,53 and 6 case series36,38,61,63,70,78 studied spontaneous CSF rhinorrhea in 180 patients; the most common leak site was cribriform plate followed by ethmoid roof. Spontaneous leaks were not routinely stratified based on intracranial pressure (ICP). A summary of test characteristics within each etiology is shown in Table 2.
Site-Specific Studies
Few studies examined CSF rhinorrhea at a specific site. One cohort study and 1 case series studied only ethmoid roof leaks39,73; 1 cohort study and 3 case series examined cribriform plate leaks36,38,53,61; 2 cohort studies studied sphenoid sinus leaks.14,28 However, due to the small sample sizes, no clinically significant conclusion was drawn.
Methodological Quality of Included Studies
The assessment of methodological quality for cohort studies is described in Figure 4.
Domain 1: Patient Selection
A total of 41 studies used a consecutive or random sample, 52 avoided case-control design, and all avoided inappropriate exclusions. A total of 40 studies complied with all signaling questions and were low-risk for patient selection bias.
Domain 2: Index Test
Index tests were evaluated independently of the reference standard in 51 studies. Information was unclear in 1 study. Criteria for diagnosis were prespecified in 23 studies. All studies had a low concern that the index test, its conduct, or interpretation differed from the review question.
Domain 3: Reference Standard
The reference standard of surgical exploration or follow-up was likely to correctly classify CSF rhinorrhea. The results of the reference standard were always interpreted with knowledge of index tests.
Domain 4: Flow and Timing
The interval between imaging and surgery was usually unspecified (51 of 53). All patients received the same reference standard of surgical exploration in most studies (34 of 53). In the remaining studies, some participants received follow-up only. All studies included all patients in the analysis. The risk of bias for patient flow was low in 3 studies, high in 19, and unclear in 31.
Discussion
The development of diagnostic tests and algorithms for CSF rhinorrhea is ongoing. This systematic review summarizes the literature regarding test characteristics of localizing investigations. Overall, invasive techniques demonstrate better diagnostic characteristics than noninvasive ones but come with the risks of lumbar puncture and intrathecal dyes. Preoperative ITF is an invasive diagnostic procedure that appears to be less useful without surgical dissection and, based on the limited studies, performs worse than imaging. Of all investigations, CEMRC was the best for diagnosing and localizing a CSF leak.
Oakley et al refined the diagnostic approach first proposed by Zapalac et al.4 Both authors recommended a two-stage approach involving confirmation with a chemical test and subsequent radiographic localization. HRCT is recommended as a first-line investigation due to its sufficient diagnostic characteristics, low cost, and noninvasive nature. MRC, which had better test characteristics but greater cost, was reserved as a second-line investigation after a negative initial work-up. Unfortunately, small and intermittent leaks may not be amenable to chemical testing due to lack of analyzable specimen. Zapalac endorsed the use of RNC for confirmation in settings of high clinical suspicion but lack of chemical confirmation, whereas Oakley advocated substituting RNC with MRC as a noninvasive alternative with similar diagnostic potential and costs.4
In this systematic review, we qualitatively examined the characteristics of diagnostic modalities compared to surgically confirmed CSF leaks or follow-up for negative index tests. The data represent diagnostic performance in patients with and without initial or additional confirmatory chemical tests. HRCT had good sensitivity and likely remains a good first-line imaging modality given its accessibility and low cost. As a second-line investigation, MRC appears to have better sensitivity and offers better localizing information than RNC images or examination of oriented intranasal pledgets for radiotracers. Moreover, MRC has similar, and likely higher, sensitivity compared with CTC without the added risk of intrathecal dyes, but also has lower specificity. However, in the context of ruling out CSF leaks, MRC appears to be the better overall option. Previous studies did not differentiate between MRC and CEMRC. This review demonstrates that existing evidence indicates very strong test characteristics for CEMRC, with a sensitivity of 0.99 (IQR 0.96-1.00) and a specificity of 1.00, but with the caveats of invasiveness and cost. This renders CEMRC a strong candidate as a third-line investigation in patients with low-to-moderate clinical suspicion of CSF rhinorrhea with negative HRCT and MRC. Furthermore, patients with a sufficiently high index of suspicion or CSF leak morbidity can be considered for ITF and/or surgical exploration with the repair of any identified sites. In contrast, those with low-to-moderate suspicion may avoid an unnecessary general anesthetic and surgical risk if CEMRC successfully rules out a leak, though there remain risks associated with lumbar puncture and intrathecal dye.
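The tiered sequence discussed above can be restated as a small decision function. This is only an illustrative restatement of the paragraph, not a validated clinical algorithm; the function name, the two-level suspicion flag, and the assumption that a positive study ends the localization work-up are all hypothetical simplifications.

```python
from typing import Optional


def next_investigation(
    hrct_positive: Optional[bool],
    mrc_positive: Optional[bool],
    high_suspicion: bool,
) -> str:
    """Suggest the next step; None means the test has not been done yet."""
    if hrct_positive is None:
        return "HRCT"                      # first-line imaging
    if hrct_positive:
        return "leak localized"            # assumption: positive test ends work-up
    if mrc_positive is None:
        return "MRC"                       # second-line after negative HRCT
    if mrc_positive:
        return "leak localized"
    # Both HRCT and MRC are negative at this point.
    if high_suspicion:
        return "ITF and/or surgical exploration"
    return "CEMRC"                         # third-line for low-to-moderate suspicion


print(next_investigation(None, None, high_suspicion=False))    # HRCT
print(next_investigation(False, False, high_suspicion=False))  # CEMRC
```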
Although TIF and ITF demonstrated near-perfect sensitivities, the small number of studies limits the validity and precision of the data. Moreover, ITF is commonly used intraoperatively but is confounded by surgical manipulation and such studies were excluded from this review. Further research is required to understand the effectiveness and practicality of preoperative ITF and TIF. Lastly, as expected, office endoscopy alone is inadequate for localization.
Understanding the risks and benefits of each investigation tool may assist selection in specific clinical scenarios. For instance, HRCT uses narrower slices (1-2 mm) to maximize resolution with the downside of increased radiation exposure and partial averaging artifacts with smaller defects. Additionally, HRCT depends on indirect leak visualization using bony defects, intracranial lesions, complications of leaks, and air-fluid levels. This makes it difficult to rule in a diagnosis and identify an active site in the context of multiple leaks.35,41,48
MR imaging directly visualizes fistulous tracts without ionizing radiation or interference from bony artifacts.35 However, the presence of inflammation may cause false positives given its similarly hyperintense signal. Additionally, when compared to HRCT, MR has lower spatial resolution, higher cost, and more contraindications.6
Invasive imaging involves intrathecal contrast injection. Contrast enhances the signal-to-noise ratio, thus improving diagnostic accuracy, but introduces risks including hypersensitivity, stroke, seizures, bleeding, and other neurological complications.53,60 CTC relies on intrathecal contrast, which passes through the fistula to provide direct visualization.16,25 However, it requires leaks to be active during imaging; in cases of inactive leaks, overpressure techniques can be used.64
RNC uses radioactive material administered via lumbar puncture and tracks its subarachnoid distribution with serial scintillation scanning.27,66,68 Similarly, it works best during active leaks, and its localization loses spatial resolution in cases of large or vigorous leaks that contaminate adjacent areas.19,27,65,69,74,76
CEMRC combines MRC with intrathecal contrast (eg, gadopentetate dimeglumine). Gadopentetate's low viscosity and similar density to CSF allow for better subarachnoid distribution compared to CTC dyes.75 It also remains in the subarachnoid space for up to 24 h, allowing for delayed imaging of insidious leaks.73,75 CEMRC better highlights CSF-containing spaces and smaller defects.22,28,36
Specific Etiologies
A total of 44% of patients had traumatic leaks compared to 80% in the literature.60 However, 70% cease without intervention, and these patients would have avoided treatment and been excluded from the current study.82 Postoperative/iatrogenic etiologies are less represented in this review than in the literature (12% vs 16%),82 likely due to exclusion of intraoperative or immediate postoperative repairs (< 3 months). A total of 28% of patients had nontraumatic leaks compared to 4% in the literature.15,32 This could be attributed to the tendency for nontraumatic leaks not to respond to conservative management and to necessitate surgery.73 CEMRC, CTC, and HRCT for traumatic leaks and HRCT, MRC, CTC, and RNC for spontaneous leaks all appear to have high diagnostic accuracy. However, the smaller number of patients and studies makes it difficult to draw satisfactory conclusions; more subgroup-specific data are needed.11
Diagnosis by Site
The most common sites were ethmoid roof, cribriform plate, and sphenoid sinus, which agrees with the literature.39 Unfortunately, minimal conclusions could be drawn surrounding site-specific diagnostics because of sparse data.
Limitations
The nature of a systematic review lends itself to publication bias, where favorable studies are more likely to be published. The retrospective nature of included studies carries selection bias, where clinical decision-making influences the selection of index tests (eg, patients with high clinical suspicion forgo invasive tests, and patients with intermittent or harder to detect leaks receive more invasive tests). A total of 130 studies had unstratified results, which prevented sensitivity and specificity calculations, and were excluded. Most TN results were determined by clinical follow-up with resolution of symptoms or subsequent negative diagnostic tests, increasing the likelihood of missing FNs. The follow-up times were also generally not reported or standardized. Many studies required a positive beta-2 transferrin (B2T) test for inclusion, which is extremely sensitive for CSF. These studies had no TN and FP results, leading to incalculable specificities and precluding meta-analysis of data. Additionally, there is significant inter-study variation regarding sample sizes, etiologies, imaging protocols, and diagnostic criteria. This review does not consider the accessibility, cost, and risk associated with index tests. The authors appreciate the contribution of these factors in clinical decision-making from cost-effectiveness and cost-benefit perspectives and hope that future studies will expand on these specific areas.
Most included studies focused on treatment of CSF rhinorrhea, not its diagnosis or localization. Future studies should concentrate on directly comparing index tests or studying them in combination for specific clinical settings (eg, leak characteristics, site, etiology). Investigating the impact of etiology and site on diagnostic accuracy will help develop a standardized treatment algorithm. It would also be beneficial to study whether patient choice and satisfaction play a role in test selection, particularly for invasive tests.