Measuring intraobserver and intermethod reliability of endometriotic cyst volumes: A comparison between MRI and 3D transvaginal ultrasound in endometriosis

Objectives: To identify the intraobserver and intermethod reliability of three-dimensional transvaginal ultrasound (3D-TVUS) using the software VOCAL and XI VOCAL compared with magnetic resonance imaging (MRI) for volumetric measurement of ovarian endometrioma. Methods: The intermethod and intraobserver reliability of endometrioma volumes were assessed in 16 women diagnosed with endometriosis through laparoscopy with histologic confirmation and presenting with uni- or bilateral endometriomas. In total, volumes of 23 endometriomas were assessed with two-dimensional and three-dimensional transvaginal ultrasound and magnetic resonance imaging with a slice thickness of 6 mm. Examinations took place at two moments in one menstrual cycle: day 2–4 (T0) and day 20–22 (T1). Results: The intraclass correlation for intraobserver reliability is good to very good for all three techniques, ranging from the lowest value of 0.953 to the highest of 1.000. MRI has the narrowest limits of agreement (−3.93 to 4.53), followed by XI VOCAL (−5.16 to 5.65), while VOCAL has the widest limits of agreement (−10.22 to 11.39). Intraclass correlations are poor in the comparison of XI VOCAL to MRI, moderate between VOCAL and XI VOCAL, and good for the comparison between VOCAL and MRI. Limits of agreement vary per technique. When comparing 3D imaging techniques with 2D TVUS, XI VOCAL versus 2D TVUS provides the smallest limits of agreement. Conclusions: MRI and XI VOCAL provide the best intraobserver reliability. The different imaging techniques are not interchangeable. As TVUS is a more readily available and cost-efficient imaging technique, the use of XI VOCAL is advised.


Introduction
Endometriosis is a benign estrogen-dependent disorder, affecting approximately 2%-10% of the female population of reproductive age. 1,2 Ovarian endometriosis is the most common presentation, occurring in approximately 17%-44% of endometriosis patients. 3 In clinical practice, visualization of endometriomas is easily achieved with two-dimensional transvaginal ultrasound (2D-TVUS). 4 Major volumetric changes can be visualized over time by using three orthogonal planes, which can be useful for treatment choices in individual patients. 5 Determining (small) volumetric changes in endometriomas can give more insight into the development and circumstances under which they grow or decrease in size. Three-dimensional transvaginal ultrasound (3D-TVUS) might be more accurate for measuring the volume of an endometrioma by using the software Virtual Organ Computer-aided AnaLysis (VOCAL) and eXtended Imaging Virtual Organ Computer-aided AnaLysis (XI VOCAL). Both techniques provide volume calculations based on manual delineations of the contour of objects. 6 Previous research has demonstrated that 3D ultrasound has better accuracy in determining volumes of both regularly and irregularly shaped objects when compared to 2D-TVUS. 7 In vitro research has identified that intraobserver reliability is high for both VOCAL and XI VOCAL. 8 Splenic volume and parathyroid volume have previously been reliably and accurately measured using 3D ultrasound. 9,10 In the uterus, endometrial and uterine niche volumes can be measured adequately by 3D ultrasound as well. [11][12][13][14]
In current practice, MRI has proven to be a diagnostic tool for various manifestations of endometriosis including endometriomas. 15,16 MRI can also be used to perform volumetric measurements. To date, no research has been performed using 3D-TVUS and MRI imaging techniques to determine endometrioma volumes.
Therefore, the objective of this study was to investigate whether MRI and TVUS are reproducible and reliable methods for volume measurements of endometriomas by evaluating the intraobserver and intermethod reliability of three-dimensional imaging techniques. In addition, we aimed to determine whether endometriomas show volumetric changes during one menstrual cycle, since we hypothesized that endometriomas may undergo a volumetric increase due to estrogenic stimulation during the follicular phase.

Methods and materials
This study was part of a single-center, prospective case-control study investigating Sampson's retrograde menstruation theory by modern imaging technologies, which was performed between January 2010 and September 2016. The study was approved by the institutional review board of the Amsterdam UMC, location VUmc (METC VUmc 2009/329) and was registered in the Dutch Trial Register (Trial NL2106 (NTR2223)).
Recruitment took place at the endometriosis and fertility department of the Amsterdam UMC, location VUmc. All women who were diagnosed with an endometriotic cyst at transvaginal ultrasound during their appointment at the outpatient clinic and met the inclusion criteria were approached. Women diagnosed with endometriosis through laparoscopy with histological confirmation and presenting with uni- or bilateral endometriomas on TVUS were included in this study after obtaining informed consent. Other inclusion criteria for this prospective study were: (1) age between 18 and 43 years and (2) a regular menstrual cycle (28 days ± 3 days). Patients were excluded if they (1) had a positive pregnancy test, (2) had any contraindications for undergoing MRI, (3) used hormonal treatment, and/or (4) had a suspicion of uterine, peritoneal, enteral, and/or ovarian malignancy after transvaginal ultrasound and/or MRI.

Procedures
All participating women were seen for examinations at two time points during one menstrual cycle. The first examination took place at the beginning of the menstrual cycle (cycle day 2-4, T0) and the second examination was planned after ovulation (cycle day 20-22, T1). At both time points, a 2D and 3D TVUS using an Accuvix XQ (Medison, Hoofddorp, The Netherlands) and an MRI at 1.5 T (Avanto, Siemens, Erlangen, Germany) were performed. Women were asked to empty their bladder prior to transvaginal ultrasound. Transvaginal ultrasound examinations were performed according to the principles advised by the IDEA group. 17 MRI included high-resolution T2-weighted imaging in three planes and axial T1-weighted imaging with fat saturation, with a slice thickness of 6 mm. No intravenous contrast was used. All images were stored according to hospital policy. All TVUS examinations were performed by the same expert sonographer with experience in endometriosis. MRI images were described by the same radiologist specialized in endometriosis imaging. An overview of the procedures can be found in the supplementary data.

Volume measurements
The dataset collected according to the above stated procedures allowed for the determination of endometrioma volumes at a later time point. Based on the three measurements recorded by 2D transvaginal ultrasound, a volume could be derived using the following formula:
The obtained 3D image dataset was assessed by one observer (LvdH) using VOCAL and XI VOCAL. The VOCAL method made use of an 18° rotational step. The axial plane was used as a reference to mark the extremities of the cyst, after which the cyst was rotated along its axis to obtain 10 rotational views. The area of the cyst was manually traced on each plane. Subsequently, a volume was automatically calculated by the program, together with a visual representation of the shape of the cyst (Figure 1). The XI VOCAL method uses 10 planes. Manual demarcation of the contours was performed per slice. The software automatically calculated the volume as well as the distance between the first and last slice (Figure 2). The 18° rotational step and the 10 slices were chosen so that both techniques consist of 10 measurement steps.
Endometriomas were assessed with MRI on axial T2-weighted images by one observer (JHvW). The surface area of the cyst was traced, from which a volume was calculated using the following formula:
In order to calculate intraobserver reliability, all cyst volume measurements had to be performed twice by one observer. A period of at least 7 weeks separated the first and second cyst volume measurement.
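The formulas referred to in this section do not appear in the text as given. As a minimal sketch under stated assumptions, the Python functions below illustrate two volume estimates that are commonly used for this kind of data: an ellipsoid approximation from three orthogonal diameters (assumed here for 2D-TVUS) and a summation of traced areas multiplied by the slice thickness (assumed here for MRI, using the 6 mm slice thickness stated in the Procedures). The function names and example values are hypothetical, and these are not confirmed to be the formulas used in this study.

```python
import math


def ellipsoid_volume_ml(d1_cm, d2_cm, d3_cm):
    """Ellipsoid approximation (assumption): V = pi/6 * d1 * d2 * d3.

    Diameters are in cm, so the result is in cm^3, which equals mL.
    """
    return math.pi / 6.0 * d1_cm * d2_cm * d3_cm


def slice_summation_volume_ml(areas_cm2, slice_thickness_mm=6.0):
    """Slice summation (assumption): sum of traced areas times slice thickness.

    Assumes contiguous slices without gaps; areas are in cm^2.
    """
    thickness_cm = slice_thickness_mm / 10.0
    return sum(areas_cm2) * thickness_cm


def planes_for_rotation_step(step_deg):
    """Number of planes traced when rotating through 180 degrees in fixed steps."""
    return int(round(180.0 / step_deg))


if __name__ == "__main__":
    # Hypothetical cyst measuring 4 x 3 x 3 cm on 2D ultrasound.
    print(round(ellipsoid_volume_ml(4.0, 3.0, 3.0), 2))
    # Hypothetical traced areas on five consecutive 6 mm MRI slices.
    print(round(slice_summation_volume_ml([2.0, 5.0, 6.0, 5.0, 2.0]), 2))
    # An 18 degree rotational step corresponds to 10 traced planes.
    print(planes_for_rotation_step(18.0))
```

The last helper only restates the arithmetic behind the 18° rotational step: 180° divided by 18° gives the 10 rotational views described above.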

Statistical analysis
Statistical analysis was performed using SPSS 24.0 software (IBM, USA). Patient characteristics were summarized using medians, interquartile ranges, and percentiles. Intraobserver reliability and intermethod reliability were calculated with the intraclass correlation coefficient (ICC) (two-way mixed, absolute agreement, single measurement). The ICCs are reported together with 95% confidence intervals.
To assess the value of the intraclass correlation, cut-off values advised by Martins and Nastri 19 were used: a value of <0.70 is considered a very poor correlation, a value of 0.70-0.90 is a poor correlation, 0.90-0.95 is a moderate correlation, 0.95-0.99 is a good correlation, and a value of >0.99 is an excellent correlation. The limits of agreement for the intraobserver and intermethod reliability were calculated according to Bland and Altman. 18 The values presented in the Bland-Altman plots are the differences between the measurements. The limits of agreement are set to include 95% of the calculated values (Supplemental Figures 4 to 12).
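To illustrate what an absolute-agreement, single-measurement ICC expresses, the sketch below computes the point estimate from the mean squares of a two-way layout (subjects by measurements). It is not the SPSS procedure used in the study and does not compute confidence intervals. The function name and the example data are hypothetical, and the sketch assumes that the point estimate takes the same form under the two-way mixed and two-way random models.

```python
def icc_absolute_single(ratings):
    """ICC for absolute agreement, single measurement, two-way layout.

    ratings: list of rows, one row per cyst, one column per measurement.
    Point estimate: (MSR - MSE) / (MSR + (k-1)*MSE + k*(MSC - MSE)/n).
    """
    n = len(ratings)        # number of subjects (cysts)
    k = len(ratings[0])     # number of measurements per subject
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in ratings for v in row)
    ss_error = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)                 # between-subject mean square
    msc = ss_cols / (k - 1)                 # between-measurement mean square
    mse = ss_error / ((n - 1) * (k - 1))    # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


if __name__ == "__main__":
    # Hypothetical volumes (mL): a constant offset of 1 mL between the first
    # and second measurement lowers absolute agreement.
    shifted = [[1.0, 2.0], [2.0, 3.0], [3.0, 4.0]]
    print(round(icc_absolute_single(shifted), 3))
```

Because absolute agreement is requested, a systematic offset between the first and second measurement lowers the ICC even when the two series are perfectly correlated, as the shifted example shows.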
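The cut-off values and the limits of agreement described above can be sketched in a few lines of Python. This is an illustrative sketch, not the analysis code of the study. The function names are hypothetical, the handling of values that fall exactly on a boundary (0.90 and 0.95 are assigned to the higher category) is an assumption, and the factor 1.96 assumes approximately normally distributed differences.

```python
import statistics


def classify_icc(icc):
    """Map an ICC to the categories quoted from Martins and Nastri.

    Boundary handling at exactly 0.90 and 0.95 is an assumption.
    """
    if icc < 0.70:
        return "very poor"
    if icc < 0.90:
        return "poor"
    if icc < 0.95:
        return "moderate"
    if icc <= 0.99:
        return "good"
    return "excellent"


def limits_of_agreement(first, second):
    """Bland-Altman bias and 95% limits: mean difference +/- 1.96 * SD."""
    diffs = [a - b for a, b in zip(first, second)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation of differences
    return bias - 1.96 * sd, bias, bias + 1.96 * sd


def bias_and_sd_from_limits(lower, upper):
    """Recover the implied bias and SD of the differences from reported limits."""
    return (lower + upper) / 2.0, (upper - lower) / (2.0 * 1.96)


if __name__ == "__main__":
    print(classify_icc(0.953))
    # Hypothetical repeated volume measurements (mL) of four cysts.
    print(limits_of_agreement([10.0, 20.0, 30.0, 40.0], [9.0, 21.0, 29.0, 41.0]))
    # Limits reported for MRI intraobserver agreement in the abstract.
    print(bias_and_sd_from_limits(-3.93, 4.53))
```

Applied to the MRI intraobserver limits reported in the abstract (−3.93 to 4.53), the last helper gives an implied mean difference of about 0.30 and a standard deviation of the differences of about 2.16, in the same units as the limits.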

Results
In total, 23 cysts from 16 patients were included. Four patients were excluded due to missing or incomplete three-dimensional imaging data. Patient characteristics are presented in Table 1.

Intraobserver reliability
The intraclass correlation for intraobserver reliability is good to very good for all three techniques, ranging from the lowest value of 0.953 to the highest of 1.000.
Table 4 provides an overview of the average changes in cyst volumes measured with MRI at T0 and T1. The difference in mean cyst diameter is added to show relative growth of the cyst compared to volume change.

Discussion
In this study, we investigated the reliability of 18° VOCAL and XI VOCAL (10 slices) 3D-TVUS and MRI in the measurement of endometriomas. Intraobserver reliability was excellent for MRI. In comparison, XI VOCAL demonstrated very good intraobserver reliability. The imaging modalities are not interchangeable due to poor intermethod reliability, especially with increasing cyst size. We were unable to observe any significant changes in cyst volume within one menstrual cycle. To our knowledge, this is the first study to determine the intraobserver and intermethod reliability of different 3D imaging techniques when measuring endometriomas. A strength of this study is that all patients included had a cystectomy after receiving both scans, allowing for histological confirmation of all visualized cysts. In addition, we were able to obtain a relatively large dataset due to measurements at two time points during the cycle. A limitation of our study is the lack of an absolute volume to compare the imaging data to, as we were unable to measure the exact volume of the cysts during surgery. The lack of a comparison makes it difficult to determine an acceptable error margin for the 3D imaging techniques. However, a previous in vitro study described VOCAL and XI VOCAL to overestimate the size of irregularly shaped objects by 7.8% and 4%, respectively. 8 For the intraobserver reliability, XI VOCAL and MRI can be reliably used to reproduce measurements. We found that MRI provides the highest level of reliability when measuring endometriomas. XI VOCAL did produce smaller limits of agreement than VOCAL, which could be explained by its ability to more accurately depict irregularly shaped objects. 8 Additionally, we did not repeat 2D transvaginal ultrasound measurements to perform a first and second assessment per time point (T0 and T1), and therefore cannot provide insight into intraobserver reliability for 2D transvaginal ultrasound.
A comparison between XI VOCAL and VOCAL has previously only been reported in one earlier study, 20 which showed similarly high intraobserver reliability for both methods. That study did, however, report smaller limits of agreement with both VOCAL and XI VOCAL. Intermethod reliability varies from poor to good between the 3D imaging techniques. In the comparison between 3D imaging techniques and 2D TVUS, the smallest limits of agreement are found in the comparison between 2D TVUS and XI VOCAL. From the intermethod Bland-Altman plots it seems that the measurement differences increase with cyst size. The measurement differences relative to cyst size are, however, not statistically significant. For future research, we advise the use of one imaging modality to measure volumetric changes, as the limits of agreement are smaller when using one measurement technique. XI VOCAL can be advised above VOCAL, as XI VOCAL provides smaller limits of agreement and more accurate measurements with larger cyst sizes. XI VOCAL can be a more accurate measurement technique in comparison to 2D-TVUS, as it takes into account the irregular shape of the cyst yet has the same practical benefits as TVUS: it is relatively cheap, less time-consuming, and readily available at most gynecological outpatient departments. The benefit of 3D-TVUS imaging techniques compared to 2D-TVUS for use in clinical practice has yet to be determined.
Within the current set-up of the study, we added a second objective: to determine whether any cyclical variation occurred in cyst volume. During one menstrual cycle we found no difference in endometrioma volume; we were therefore unable to show endometrioma growth under estrogenic stimulation in a single natural cycle. It could also be hypothesized that the cyst grows due to a build-up of menstrual debris. However, as the measurements were limited to one menstrual cycle, we were unable to show volumetric changes during menstruation. The lack of growth seen within one cycle can be explained by the slow progressive nature of endometriosis, making it unlikely that any significant anatomical changes can be measured within this time frame. Future research could focus on tracking the development of endometriomas during multiple cycles, as it remains unclear under which circumstances endometriomas grow or decrease in size. In addition, the growth of endometriomas could be a source of pain in women with endometriosis.
In conclusion, for measuring cyst volumes, MRI provides the most reliable measurements of all three techniques. However, accurate measurements can also be performed by using 3D-TVUS (XI VOCAL) as a more readily available and cost-efficient imaging technique.

Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Ethical approval
Ethical approval for this study was obtained from the institutional review board of the Amsterdam UMC, location VUmc (METC VUmc 2009/329).

Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.

Informed consent to participate
Written informed consent was obtained from all subjects before the study.

Informed consent to publish
Written informed consent was obtained from the patients for their anonymized information to be published in this article.

Supplemental material
Supplemental material for this article is available online.