Inter-reader agreement of 18F-FDG PET/CT for the quantification of carotid artery plaque inflammation

Introduction A significant proportion of ischemic strokes are caused by emboli from unstable atherosclerotic carotid artery plaques. Inflammation is a key feature of plaque instability. Positron emission tomography/computed tomography (PET/CT) with 2-deoxy-2-(18F)-fluoro-D-glucose (18F-FDG) is a promising technique to quantify plaque inflammation, but a consensus on the methodology has not been established. High inter-reader agreement is essential if 18F-FDG PET/CT is to be used as a clinical tool for the assessment of unstable plaques and stroke risk. Methods We assessed the inter-reader variability of different methods for quantification of 18F-FDG uptake in 43 patients with carotid artery stenosis ≥70%. Two independent readers delineated the plaque and collected maximum standardized uptake value (SUVmax) from all axial PET slices containing the atherosclerotic plaque. Results Uptake values with and without background correction were calculated and intraclass correlation coefficients were highest for uncorrected uptake values (0.97–0.98) followed by those background corrected by subtraction (0.89–0.94) and lowest for those background corrected by division (0.74–0.79). Conclusion Quantification methods without background correction have the highest inter-reader agreement for 18F-FDG PET of carotid artery plaque inflammation. The use of the single highest uptake value (max SUVmax) from the plaque will facilitate the method’s clinical utility in stroke prevention.


Introduction
Ischemic strokes caused by thromboembolism from an unstable atherosclerotic plaque in the carotid artery can be prevented by carotid endarterectomy (CEA). [1][2][3] Patients are selected for CEA based on the degree of carotid artery stenosis and presence or absence of cerebral ischemic symptoms. In recent years it has become increasingly clear that the degree of stenosis alone is not the best predictor of stroke risk. This has led to the concept of the 'unstable plaque' describing carotid plaques that carry high risk of stroke irrespective of the degree of artery stenosis and increased focus on factors that destabilize the plaque. Inflammation plays a key role in the development of an unstable plaque. [4][5][6] macrophage rich areas in carotid artery plaques. 7 The goal of the imaging technique is to detect carotid plaques that are at high risk of rupture and therefore carry high risk of stroke. 18 F-FDG PET for the detection of unstable plaques is not in clinical use, 8 partly due to lack of feasible PET protocols and consensus regarding imaging procedure, method for 18 F-FDG uptake quantification and assessment of stroke risk, although several recommendations exist. 9,10 PET is an imaging modality with limited anatomical information, and it might therefore be challenging to define the vesselsegment-of-interest. Computed tomography angiography (CTA) is often used together with 18 F-FDG PET when assessing patients with carotid artery stenosis, but selection of the plaque area for uptake measurements varies. [11][12][13] A requirement for introducing a diagnostic method into clinical routine is high interreader agreement. Inter-reader agreement has been studied for a few selected uptake parameters with generalized vascular inflammation 14,15 and in patients with symptomatic carotid stenosis, 12,13 but to our knowledge no study has compared inter-reader agreement for different quantification methods.
The aim of this study was to assess inter-reader variability of different methods used for quantification of 18 F-FDG uptake at PET/CT of carotid artery plaques.

Study population
The study cohort consisted of forty-three patients with ultrasound-confirmed atherosclerosis with internal carotid artery stenosis !70% according to consensus criteria of the Society of Radiologists in Ultrasound. 16 Patient characteristics are summarized in Table 1. There were 30 men (66 AE 9 years) and 13 women (67 AE 8 years) with a mean age of 66.2 years. The study protocol conformed with the ethical guidelines of the 1975 Declaration of Helsinki and was approved by the Norwegian Regional Committee for Medical and Health Research Ethics South-East A. Written informed consent was obtained from all patients prior to study inclusion. 18

F-FDG PET/CT examination
After a minimum of six hours fasting the patients were injected with 5 MBq/kg 18 F-FDG and blood glucose, weight, and height were recorded. After approximately 90 minutes a two-bed position PET/CT from the base of the skull to the aortic arch was performed with 15 minutes per bed position using a hybrid PET/CT scanner (Siemens Biograph 64, Siemens Medical Systems, Erlangen, Germany). The PET images were acquired with a 256 Â 256 matrix and the images were reconstructed to two millimetre thick slices, with four iterations/eight subsets ordered subset expectationmaximization (OSEM) algorithm and Gaussian post-reconstruction filter with 3.5 mm full width half maximum (FWHM). In addition to a non-contrast CT for attenuation correction a CTA with contrast filling of the arteries (minimum 40 mL Iomeron (iodine 350 mg/ mL; Bracco Imaging S.P.A, Milan, Italy) or Visipaque (iodine 320 mg/mL); GE Healthcare, Chicago, USA) was acquired immediately after the PET when still lying in the scanner for 16 of the 43 patients. For 24 patients CTA was performed at other radiologic departments. For three patients no CTA was available when the PET images were analysed.

Image analyses and 18 F-FDG quantification
The images were assessed with Hybrid Viewer 2.0 software (Hermes Medical Solutions AB, Stockholm, Sweden). Two experienced nuclear medicine senior consultants independently evaluated the 18 F-FDG PET/CT examinations. The two readers (R1 and R2) did not undergo any joint training before assessing the images, but they agreed on how to perform the analyses. The instructions were to use the CTA as guide for drawing the region of interests (ROIs) on the fused slices (PET and non-contrast CT). The plaque was defined as vessel wall thickening and a lumen contrast-filling defect on CTA. 11 The ROIs were drawn around the entire vessel wall and lumen on all plaque-containing axial PET slices ( Figure 1). For patients without CTA available, the plaque was defined as vessel wall with calcification and fat deposits in the level of the carotid bifurcation. Uptake in structures close to the plaque (e.g. lymph nodes, paravertebral muscles or salivary glands) that could falsify the The patient material is included in previously published studies. 18,19 plaque uptake values were excluded from the ROI. The number of plaque-containing slices for each patient was recorded. The pixel values in the PET images were converted into SUV and normalized to lean body mass. 17 SUV max in all plaque containing ROIs were recorded. Background blood pool activity was obtained from four ROIs placed in the lumen of the jugular vein away from structures with 18 F-FDG uptake but preferably in the same craniocaudal level as the plaque. The background was calculated as the mean of the SUV mean in these four ROIs. Five different measures of 18  Blood background corrected values were calculated as the 18 F-FDG uptake values divided by the mean blood pool activity (TBR) and subtraction of the blood pool activity from the 18 F-FDG uptake values (corrected SUV (cSUV)).

Statistical analysis
The IBM SPSS Statistics software for Windows (version 25.0; IBM Corp., Armonk, USA) was used for data analyses. Groups of paired data were compared using the Wilcoxon signed rank test for non-normally distributed variables. Inter-reader agreement was calculated using intraclass correlation coefficients (ICC's; model two-way random, type absolute agreement). All statistical results were considered significant when p < 0.05.
There were no differences in 18 F-FDG uptake between the two readers ( Table 2). The ICC for the different 18 F-FDG quantification methods was highest for uncorrected SUVs (0.97-0.98) followed by cSUVs (0.89-0.94) and TBRs (0.74-0.79), and 0.77 for the background blood pool ( Table 2). The differences in the median for the uptake values between the readers ranged from 0.00 and 0.01 for the uncorrected SUVs to 0.04-0.14 for TBRs (0.14 for TBR MDS5). The difference for the background value was 0.02 (Table 2). Figure 2 shows the differences in max SUV max and mean SUV max for individual patients for the two readers without background correction (a and b), and the corresponding values when the 18 F-FDG uptake is corrected for background blood pool by division (TBR; 2 (c) and (d)) and by subtraction (cSUV; 2(e) and (f)). The difference in venous background is shown in Figure 2(g). The difference between the readers is highest for the uptake values corrected for background blood pool by division (2(c) and (d)), and lowest for the uptake values without background correction (2(a) and (b)).

Discussion
In this study we found high inter-reader agreement between different methods for 18 F-FDG uptake quantification of inflammation in high grade carotid artery stenosis. The inter-reader agreement was highest for the methods without background correction. Two studies in patients with carotid stenosis supports our finding that methods without correction for background blood activity have higher inter-reader agreement than background corrected values: Kwee et al. 12 reported an ICC of 0.61 for TBR mean SUV max and 0.65 for TBR max SUV max, and Marnane et al. 13 found an ICC of 0.99 for mean SUV max .
In our study the highest ICC was found for max SUV max (0.98). For the methods without background blood pool correction only 12% of the max SUV max and 14% of the mean SUV max measurements differed with more than AE0.10 (Figure 2(a), (b)). Patient number 42 is an outlier with an inter-reader difference of 0.38. This is probably due to different delineations of the plaque ROIs as this patient had high uptake in neighbouring muscle (Figure 3). Reader 1 can have excluded more of the plaque ROIs to be sure to avoid spill-in activity than reader 2. The problem with spill-in from neighbouring structures is due to the relatively low spatial resolution of PET combined with unspecific uptake of 18 F-FDG. Data are given as median (range). P-value from Wilcoxon signed ranks test. SUV, standardized uptake value; MDS, most diseased segment; TBR, targetto-background ratio; cSUV, background subtracted SUV; ICC, intraclass correlation coefficient.
For the background corrected values, the difference was larger with 40% of TBR max SUV max and 30% of TBR mean SUV max having a difference of AE0.25 or more (Figure 2(c), (d)). In our previous study exploring 18 F-FDG-uptake in symptomatic versus asymptomatic patients 19 the difference in median mean SUV max between the groups was 0.32 (1.75 versus 1.43). In two studies using TBR max SUV max as uptake parameter the difference was found to be 0.19 and 0.29. 20,21 Thus, methods with reader difference of 0.25 prohibit differentiation between symptomatic and asymptomatic patients.
We found an ICC for background blood pool activity of 0.77. This discordant assessment of background blood pool activity introduces variation in TBR and cSUVs due to methodology rather than biology. The background blood pool activity in our study was obtained from four ROIs within the lumen of the jugular vein preferably in the same craniocaudal level as the plaque. The vena jugularis has a small diameter and it was often challenging to draw reproducible ROIs within the vein that excluded contribution from neighbouring structures. In a 18 F-FDG PET study of generalized vascular inflammation in which the background blood pool activity was obtained from eight ROIs in the jugular vein the ICC for TBR mean SUV max of the carotid arteries was 0.94-0.96. 14 This suggests that including data from more slices or from a larger vessel segment such as the vena cava superior or atria of the heart could have reduced the inter-reader variability of measuring the blood pool activity. In this study the two readers also had trained together by co-reading several pilot studies before they established an analysis protocol. 14 This is optimal for research studies, but hard to accomplish in larger trials where the readers often are located in different departments.
There is a large amount of studies that quantifies the 18 F-FDG uptake in the vessel wall of patients with suspected generalized vascular inflammation (atherosclerosis not necessarily confirmed by other imaging methods). Although our findings cannot automatically be generalized, one might question the need for background correction for these patients.
Reader 2 included significantly more plaquecontaining slices than reader 1. This did not reduce the ICC of the 18 F-FDG measurements, supporting that the plaque slices with the highest uptake values all were included in both readers plaque area and that the number of slices included in the plaque area has minimal influence on mean SUV max . Our interpretation of this finding is that the plaque inflammation we can detect with 18 F-FDG PET is homogeneously spread out, and also present in the extreme tails of the plaque. This was also one of our main findings when we explored associations between different 18 F-FDG uptake parameters and plaque inflammation at histopathology. 18 Furthermore, this is in accordance with the study results from Kwee et al. 12 who found a strong correlation between TBRs of ipsilateral symptomatic plaques and contralateral asymptomatic plaques and supports the hypothesis that plaque inflammation is systemic to some extent.
A strength of our study is a relatively large patient population with a wide range of uptake values (max SUV max from 1.18 to 2.66) representing low to high plaque inflammatory activity confirmed by histology. 18 A weakness of the study is that only patients with highgrade stenosis !70% were included. In a clinical setting, 18 F-FDG PET will be used for decision making also in lower grade stenosis. However, a study by Marnane et al. 13 did not indicate that also including a lower stenosis degree would worsen the inter-reader agreement. They found an ICC of 0.99 for mean SUV max in symptomatic patients where the stenosis was 50 to 69% in 30 of 60 patients, and !70% in the rest. Another weakness is that no CTA was available for three patients. The lack of CTA could have affected the results but their uptake measures were in line with the values for the other 40 patients and there was also a high degree of concordance between the two readers.
In conclusion, our study confirms the reproducibility of quantification of 18 F-FDG uptake in carotid artery plaques and supports the superiority of quantification methods that do not include blood pool background. The ICC was highest for max SUV max (the single highest uptake value within the plaque) and thus, our suggestion is to further explore this parameter for atherosclerosis imaging.