Pitfalls in EEG Analysis in Patients With Nonconvulsive Status Epilepticus: A Preliminary Study

Objective: Electroencephalography (EEG) interpretations through visual (by human raters) and automated (by computer technology) analysis are still not reliable for the diagnosis of nonconvulsive status epilepticus (NCSE). This study aimed to identify typical pitfalls in the EEG analysis and make suggestions as to how those pitfalls might be avoided. Methods: We analyzed the EEG recordings of individuals who had clinically confirmed or suspected NCSE. Epileptiform EEG activity during seizures (ictal discharges) was visually analyzed by 2 independent raters. We investigated whether unreliable EEG visual interpretations quantified by low interrater agreement can be predicted by the characteristics of ictal discharges and individuals’ clinical data. In addition, the EEG recordings were automatically analyzed by in-house algorithms. To further explore the causes of unreliable EEG interpretations, 2 epileptologists analyzed EEG patterns most likely misinterpreted as ictal discharges based on the differences between the EEG interpretations through the visual and automated analysis. Results: Short ictal discharges with a gradual onset (developing over more than 3 s) were liable to be misinterpreted. An extra 2 min of ictal discharges contributed to an increase in the kappa statistics of >0.1. Other problems were the misinterpretation of abnormal background activity (slow-wave activities, other abnormal brain activity, and the ictal-like movement artifacts), continuous interictal discharges, and continuous short ictal discharges. Conclusion: A duration criterion for NCSE-EEGs longer than the 10 s commonly used in NCSE working criteria is recommended. Using knowledge of historical EEGs, individualized algorithms, and context-dependent alarm thresholds may also help avoid the pitfalls.


Introduction
NCSE is characterized by its inconspicuous motor signs with prolonged electrographic seizure activities. 1 Given the subtle and variable clinical presentations, EEG that confirms ictal discharges (epileptiform EEG activity during a seizure) is an essential diagnostic tool for NCSE. The reliability of the diagnostic tool can be assessed by the interrater agreement of visual EEG interpretation. 2 A low agreement could indicate the presence of certain EEG misinterpretations and therefore less reliable diagnosis of NCSE. 3 Currently, NCSE-EEG interpretations through visual analysis by human raters and automated analysis by computer technology are still not very reliable. [4][5][6][7][8][9][10] The Salzburg consensus criteria (SCC) 11,12 achieved reasonably high accuracy and interrater agreement among human raters for the diagnosis of NCSE. 13,14 However, the accuracy and agreement are relatively low when human raters cannot assess the effect of intravenous antiepileptic drugs on the EEG of individuals during NCSE and when raters interpret short EEG recordings. 5,15,16 The visual EEG interpretation relies heavily on subjective judgments based on human raters' experience and knowledge about the characteristics of ictal discharges such as location, morphology, frequency, and persistence. EEG readers who are inexperienced with NCSE-EEG patterns are more liable to misinterpret EEG, which results in an unreliable NCSE diagnosis. A previous study 10 summarized several pitfalls in the EEG interpretations of intensive care unit (ICU) patients with NCSE, including misinterpreting artifacts as ictal discharges, assuming that the stereotypical patterns of ictal discharges are observed in the same subject, and assuming that a dichotomy exists between ictal and interictal discharges (epileptiform EEG activity between seizures) in patients with encephalopathy.
On the other hand, automated EEG analysis can assist clinicians and caregivers in the detection of ictal discharges because of its greater time efficiency and more objective judgments than visual analysis. [6][7][8][9] However, pitfalls of EEG misinterpretations (misclassifications) also exist in automated analysis: 8,9 the misclassification of preictal discharges (epileptiform EEG activity before a seizure), postictal discharges (epileptiform EEG activity after a seizure), high-frequency artifacts, or similar background EEG activity. These misinterpretations of EEG by both human raters and computer technology could lead to wrong diagnoses of NCSE and cause serious consequences. 17, 18 EEG readers need to be alert to potential pitfalls in EEG analysis for reliable NCSE diagnosis. Nevertheless, studies on the pitfalls in the visual and automated EEG analysis for NCSE are relatively scarce. 4,10 This preliminary study, therefore, addresses the pitfalls of unreliable EEG interpretations in the visual and automated EEG analysis for NCSE patients with chronic epilepsy and brain development disorders when a pharmacological test is not available. We explored whether the reliability of EEG interpretation can be predicted by the characteristics of ictal discharges and the clinical data of patients. Moreover, we further identified potential pitfalls which could cause EEG misinterpretations and therefore low reliability in the NCSE diagnosis. Strategies to avoid the pitfalls were also proposed in this study.

Study Design
This is a retrospective explorative study approved by the Medical Research Ethics Committee of Kempenhaeghe in the Netherlands. We retrieved and analyzed the EEG and the clinical records of 30 subjects with preceding seizures between 2008 and 2016 in this study. Twenty of the 30 subjects' data were recorded when they had clinically confirmed NCSE, which was diagnosed based on the clinical information, such as clinical signs, response to treatment, and EEG recordings. The other 10 subjects' data were recorded during clinically suspected NCSE, that is, the subjects were first suspected to have NCSE based on clinical signs, but the suspicion was disconfirmed based on the response to treatment and EEG recordings. We included subjects <65 years and excluded EEG recordings with excessive artifacts and those of subjects from whom we did not receive permission (from themselves and/or their legal guardians) to use their recordings for scientific research.
The study had 3 phases. (1) Subjects' data were screened according to the inclusion and exclusion criteria, and the clinical background was evaluated by an experienced epileptologist. (2) Ictal discharges in the EEG recordings were annotated by 2 independent raters. (3) Pitfalls in the visual and automated EEG analysis were investigated.

Data Sources
The EEG recordings were acquired by 3 different systems (BrainRT equipment from Onafhankelijke Software Groep, EEG Stellate from Natus Europe GmbH, and Micromed from SIGMA Medizin-Technik GmbH). The sampling rate of the EEG recordings was 100, 200, or 256 Hz. EEG electrodes were positioned according to the 10-20 system. For the algorithm development of the automated EEG analysis, 21 electrodes were used (19 electrodes in the 10-20 system, plus electrodes F9 and F10). All recordings were continuous except in 9 subjects, where 5 min per hour were stored when no clinical abnormalities were recorded. From each subject's clinical data and the corresponding clinical report of EEG recordings, we collected information about age, sex, intellectual disability level, sleep status, preexisting epileptic encephalopathy (defined as severe brain dysfunction at an early age), clinical signs during NCSE, and seizure history.

EEG Visual Analysis and Interrater Agreement
The raters were asked to focus on the EEG visual analysis without checking corresponding videos, because NCSE is difficult to discriminate from normal behavior and its clinical presentation varies between patients. Annotations of ictal discharges were made with the assistance of the open-source software "EDFbrowser." 19 Four types of EEG montages were available during the visual analysis: longitudinal bipolar montage, transverse bipolar montage, average montage, and original archived data montage.
The raters annotated ictal discharges and labeled their characteristics (duration, certainty, onset location, onset visibility, and morphology patterns) according to the predefined criteria shown in Table 1. As NCSE takes a relatively long time to develop and disappear, we set a minimum of 20 s as the duration criterion for the EEG pattern. The presence of ictal discharges was primarily assessed according to the combination of both SCC and the American Clinical Neurophysiology Society's (ACNS) Standardized Critical Care EEG Terminology criteria. 11,12,20 The raters categorized the certainty of their annotations as either definite or possible ictal discharges, and the rest of the recordings was categorized as episodes without ictal discharges. In addition, the ictal discharges were labeled either generalized or focal according to their onset location, and their onset visibility was labeled either sudden or gradual according to the ACNS criteria. Five categories were used to describe the dominant morphology patterns: 3 of the 5 categories ("Spike Wave," "Wave," and "Fast Spike" pattern) were summarized from several terms in the ACNS criteria, and 2 empirical categories were added ("Seizure-related EMG Artifacts" 21 and "Unknown Type"). We summarized the annotated ictal discharges by the 2 raters via an annotation code (Appendix A) and estimated the interrater agreement. Cohen's kappa was calculated to assess the agreement on episodes with and without ictal discharges. Fleiss' kappa was estimated to assess the agreement on episodes with definite ictal discharges, with possible ictal discharges, and without ictal discharges. The 95% confidence intervals (CIs) of the kappa statistics were also calculated. The level of the agreement was interpreted according to the suggestions of Landis and Koch. 22
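The agreement statistics described above can be illustrated with a minimal sketch. This is not the study's analysis code: the function names, the per-epoch label format, and the example labels are assumptions made for illustration, and the agreement levels follow the commonly cited Landis and Koch bands.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters who labeled the same epochs."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and equally long")
    n = len(labels_a)
    # Observed agreement: fraction of epochs that received the same label.
    p_observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each rater's own label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_chance = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_chance == 1.0:
        return float("nan")  # kappa is undefined when only one label is used
    return (p_observed - p_chance) / (1.0 - p_chance)


def fleiss_kappa(ratings):
    """Fleiss' kappa; `ratings` holds one tuple of labels (one per rater) per epoch."""
    n_items, n_raters = len(ratings), len(ratings[0])
    totals = Counter()
    agreement_sum = 0.0
    for item in ratings:
        counts = Counter(item)
        totals.update(counts)
        # Proportion of agreeing rater pairs for this epoch.
        pairs = sum(c * c for c in counts.values()) - n_raters
        agreement_sum += pairs / (n_raters * (n_raters - 1))
    p_mean = agreement_sum / n_items
    p_chance = sum((t / (n_items * n_raters)) ** 2 for t in totals.values())
    if p_chance == 1.0:
        return float("nan")
    return (p_mean - p_chance) / (1.0 - p_chance)


def landis_koch_level(kappa):
    """Map a kappa value to the Landis and Koch agreement level."""
    if kappa < 0:
        return "poor"
    for upper, name in [(0.20, "slight"), (0.40, "fair"), (0.60, "moderate"),
                        (0.80, "substantial")]:
        if kappa <= upper:
            return name
    return "almost perfect"


# Hypothetical epoch labels from two raters (invented, not study data).
rater_a = ["ictal", "ictal", "none", "none", "ictal",
           "none", "none", "none", "ictal", "none"]
rater_b = ["ictal", "none", "none", "none", "ictal",
           "none", "ictal", "none", "ictal", "none"]
kappa = cohen_kappa(rater_a, rater_b)
print(f"Cohen's kappa = {kappa:.2f} ({landis_koch_level(kappa)})")
```

The same `fleiss_kappa` function accepts three labels per epoch (for example definite, possible, and none), which mirrors the 3-category comparison; confidence intervals are omitted from this sketch.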

Investigation of Unreliable EEG Interpretations Through Visual Analysis
We applied linear regression models to determine whether the reliability of EEG interpretations, quantified by the interrater agreement, was influenced by the characteristics of ictal discharges and the clinical data of subjects. Two models with Cohen's kappa and Fleiss' kappa as the dependent variables were fitted. The independent variables were the characteristics of ictal discharges (duration, certainty, onset location, onset visibility, and morphology patterns) and the clinical data (age, sex, intellectual disability level, sleep status, preexisting epileptic encephalopathy, clinical signs during NCSE, and seizure history). The analysis was implemented in SPSS Statistics version 25. A 2-tailed P-value < .05 indicated a significant linear relationship.
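As a rough illustration of such a model with a single predictor, the sketch below fits an ordinary least-squares line and reports the slope, the explained variance, and the F statistic with its degrees of freedom. The analysis in this study was run in SPSS; the function name and the data below are hypothetical and serve only to show the calculation. The P-value is not computed here; it could be obtained from an F distribution, for instance with `scipy.stats.f.sf`.

```python
def simple_linear_regression(x, y):
    """Ordinary least squares with one predictor (closed-form solution)."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least 3 paired observations")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    syy = sum((yi - mean_y) ** 2 for yi in y)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_residual = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    r_squared = 1.0 - ss_residual / syy
    # F statistic for one predictor: F(1, n - 2).
    if ss_residual == 0:
        f_stat = float("inf")
    else:
        f_stat = (syy - ss_residual) / (ss_residual / (n - 2))
    return {"slope": slope, "intercept": intercept,
            "r_squared": r_squared, "f": f_stat, "df": (1, n - 2)}


# Hypothetical per-subject values (invented, not study data):
# mean duration of gradual-onset ictal discharges (s) and Cohen's kappa.
mean_duration_s = [30, 60, 120, 180, 240]
subject_kappa = [0.30, 0.35, 0.42, 0.47, 0.55]
fit = simple_linear_regression(mean_duration_s, subject_kappa)
print(f"slope = {fit['slope']:.4f} per second, R^2 = {fit['r_squared']:.3f}, "
      f"F{fit['df']} = {fit['f']:.1f}")
```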

Analysis of EEG Misinterpretations Through Visual and Automated Analysis
Comparing the EEG interpretations from the visual and automated analysis can help us point out the hidden pitfalls. Therefore, we developed a customized multimodal viewer (a Matlab graphical user interface shown in Figure 1) that presents the EEG recordings, the annotated ictal discharges by raters, and the EEG signal classification results generated by our in-house automated analysis system. The automated analysis system included preprocessing, feature extraction, and synthetic 3-class classification (more details can be found in Appendix B). The system mainly analyzed features in the time-frequency domain and built a synthetic 3-class classifier to classify EEG epochs into 3 categories: ictal discharges, suspicious activity, and normal activity. The suspicious activity indicated the misinterpreted EEG signals (signals misclassified as ictal discharges) in the automated analysis. In the lower panel of the multimodal viewer (Figure 1), the distribution of the 3 categories was presented in units of 10 s.
Two epileptologists investigated and discussed the pitfalls of EEG misinterpretations for each subject. They used the multimodal viewer to compare the EEG interpretations through the visual and automated analysis. Meanwhile, they checked the corresponding clinical data and interrater agreement. The pitfalls of EEG misinterpretations in the automated analysis were investigated among the EEG episodes without annotated ictal discharges but automatically classified as definite or suspected ictal discharges. The pitfalls of EEG misinterpretations in the visual analysis were investigated among the EEG episodes with annotated ictal discharges but automatically classified as normal activity.
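The selection of episodes for this review can be sketched as a simple comparison of per-epoch labels. The sketch assumes that both the rater annotations and the classifier output are available per 10-s epoch; the function name, the label strings, and the example data are hypothetical and do not describe the in-house system.

```python
def find_candidate_pitfalls(annotated, classified):
    """Return epoch indices to review for automated and for visual pitfalls.

    annotated:  list of bool, True if the raters annotated an ictal discharge
    classified: list of str, one of "ictal", "suspicious", "normal"
    """
    if len(annotated) != len(classified):
        raise ValueError("both inputs must cover the same epochs")
    automated_candidates, visual_candidates = [], []
    for index, (is_annotated, label) in enumerate(zip(annotated, classified)):
        if not is_annotated and label in ("ictal", "suspicious"):
            # No annotation, but the classifier flagged the epoch.
            automated_candidates.append(index)
        elif is_annotated and label == "normal":
            # Annotated by raters, but the classifier saw normal activity.
            visual_candidates.append(index)
    return automated_candidates, visual_candidates


# Hypothetical 10-s epochs (invented, not study data).
annotated = [True, False, False, True, False]
classified = ["ictal", "suspicious", "normal", "normal", "ictal"]
automated_idx, visual_idx = find_candidate_pitfalls(annotated, classified)
print("review for automated pitfalls:", automated_idx)
print("review for visual pitfalls:", visual_idx)
```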

Subjects
The subject flow through the 3 study phases is shown in a recruitment tree (Figure 2). In the first phase, 2 subjects were excluded, and 2 were moved from the diagnosed NCSE group to the suspected NCSE group. In the second phase, the EEG recordings of 2 subjects in the NCSE group and 9 subjects in the suspected NCSE group did not include any agreed annotated ictal discharges. All 16 subjects in the NCSE group were included in the third phase for the statistical analysis. Fourteen of the 16 subjects were used in the automated EEG analysis system development. In addition, 2, 9, and 5 of the 16 subjects' EEG recordings in the NCSE group were sampled at 100, 200, and 256 Hz, respectively. In the suspected NCSE group, 1 subject's recording was sampled at 200 Hz, and the others were at 256 Hz.
In the first phase, the median age of the 16 NCSE subjects was 21 years, ranging from 6 to 43 years, and the median age of the 12 suspected NCSE subjects was 19 years, ranging from 4 to 61 years. Figure 3 presents an overview of the demographics and the clinical characteristics of the subjects. Of note, 2 of 16 NCSE subjects had preexisting epileptic encephalopathy, whereas in the suspected NCSE group, the subjects with and without preexisting epileptic encephalopathy were equally distributed. Moreover, the EEG recording duration in the suspected NCSE group was generally shorter than that in the NCSE group. A more comprehensive list of the clinical signs, seizure histories, and other characteristics of individual NCSE and suspected NCSE subjects is provided in Appendix C.

Interrater Agreement of Visual EEG Interpretations
In the NCSE group, the interrater agreement for both 2 categories (episodes with ictal discharges and without ictal discharges) and 3 categories (episodes with possible ictal discharges, with definite ictal discharges, and without ictal discharges) was moderate, with Cohen's kappa = .53 (95% CI [0.37, 0.69]) and Fleiss' kappa = .41 (95% CI [0.25, 0.57]), respectively. The interrater agreement of the individual subjects is shown in Appendix D. In the suspected NCSE group, the interrater agreement was poor (Cohen's kappa = 0; Fleiss' kappa = −.08); hence, we did not further analyze the EEG recordings in this group.

Annotated Ictal Discharges in the NCSE Group
In the NCSE group, 338 ictal discharges were annotated among ∼183 h of EEG recordings. The annotated ictal discharges lasted for ∼14.7 h in total. The ictal discharges were summarized (Figure 4) according to their characteristics: certainty, onset location, onset visibility, and morphology patterns. The definite and possible ictal discharges accounted for similar proportions. With respect to the onset location, generalized ictal discharges made up the larger proportion. In addition, the ictal discharges with a sudden onset, developing from absent in less than or equal to 3 s, accounted for a slightly higher proportion than the ictal discharges with a gradual onset, developing from absent in >3 s. Morphologically, the majority (66%) of the ictal discharges had a "Spike Wave" pattern.

Pitfalls of Unreliable EEG Interpretations Through Visual EEG Analysis
Linear models showed a significant relationship between the average duration of the ictal discharges with a gradual onset and interrater agreement measured by the kappa statistics. For Cohen's kappa, β = .001, F(1,14) = 10.861, P = .005, and the independent variable (the average duration) explained 43.7% of the variability. For Fleiss' kappa, β = .001, F(1,14) = 12.89, P = .003, and the independent variable explained 44.2% of the variability. In other words, an extra 2 min of the ictal discharges with a gradual onset contributed to an increase in the kappa statistics of >0.1. This implies that human raters interpreted ictal discharges with a gradual onset less reliably when the duration was shorter. Using a short duration criterion in annotating these ictal discharges could be a pitfall in visual EEG analysis (Table 2).
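The step from the regression coefficient to the 2-min statement can be made explicit with a short calculation. It assumes that β is expressed per second of average duration, which is an interpretation inferred from the reported numbers rather than a unit stated in the text; the function name is hypothetical.

```python
def predicted_kappa_change(beta_per_second, extra_seconds):
    """Change in kappa predicted by a linear model for a given extra duration."""
    return beta_per_second * extra_seconds


# beta = .001 (assumed to be per second) and 2 extra minutes of duration.
delta_kappa = predicted_kappa_change(0.001, 2 * 60)
print(f"predicted increase in kappa: {delta_kappa:.2f}")
```

Under that assumption the predicted increase is about 0.12, which is in line with the stated increase of >0.1.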

Expert Opinion of the Reasons of Misinterpretations by Visual and Automated EEG Analysis
The epileptologists summarized the pitfalls of EEG misinterpretations using the multimodal viewer. The pitfalls were the misinterpretation of the following:
- abnormal background activity (Figure 5A to C),
- continuous interictal discharges (Figure 5D),
- continuous short ictal discharges (Figure 5E) whose durations were shorter than 20 s.
The abnormal background activity was categorized into 3 types: slow wave activities (Figure 5A) regularly observed in a dysfunctional brain or in a drowsy stage, other abnormal brain activity (Figure 5B) whose appearance was subject-dependent, and ictal-discharge-like movement artifacts caused by rhythmic movements, such as repetitive chewing movements (Figure 5C). The pitfalls of misinterpretations are summarized in Table 2, and the remarks by the epileptologists are provided in Appendix E.

Discussion
A reliable and correct EEG interpretation to diagnose NCSE is currently still difficult from visual as well as automated analysis. 5,18,23 In this retrospective study, we visually and automatically analyzed NCSE-EEG recordings, and the interrater agreement in the NCSE group was moderate (Cohen's kappa = 0.53 and Fleiss' kappa = 0.41), consistent with the findings by Goselink et al. 5 We found that using a short duration criterion for ictal discharges with a gradual onset contributed to an unreliable EEG interpretation. Moreover, we pointed out other factors for EEG misinterpretation: abnormal background activity, continuous interictal discharges, and continuous short ictal discharges, which extended the findings of a previous study. 10

Table 2. Summary of the pitfalls of EEG misinterpretations.
Pitfalls of unreliable EEG interpretations through visual analysis:
- Using a short duration criterion in annotating ictal discharges with a gradual onset.
Pitfalls of EEG over-interpretations:
- Misinterpretation of abnormal background activity:
  - Misinterpretation of slow wave activities observed in a dysfunctional brain or in a drowsy stage.
  - Assuming that the EEG background of individual subjects was consistently normal, and over-interpreting the abnormalities as ictal.
  - Misinterpretation of the envelope of ictal-like movement artifacts leads to false alarms in computer-assisted EEG analysis.
- Misinterpretation of continuous interictal discharges.
- Misinterpretation of continuous short ictal discharges.

One pitfall was to use a short ictal duration criterion for ictal discharges with a gradual onset. In this study, we set 20 s as the duration criterion for ictal discharges, but even then the interrater agreement was still low. The 10-s criterion in SCC, 12 defined according to commonly used EEG reading duration via software, could be too short to support the reliability of visual EEG interpretations in clinical practice. To increase the reliability of NCSE diagnosis and conform to the natural course of the disease, a longer duration criterion can be recommended in the future, especially when EEG readers rate ictal discharges with a gradual onset.
However, using an overly long duration criterion might increase the risk of ignoring intervals carrying essential information. Further research should be undertaken into the optimal length. Until now, automated EEG analysis algorithms for NCSE have regularly used short (eg, 3 s) EEG signals as a classification epoch, but, following the earlier discussion, this epoch duration is too short to reliably detect the ictal discharges for NCSE diagnosis. To be consistent with the way clinicians diagnose NCSE, future studies about automated EEG analysis algorithms should focus on ictal discharge detection using longer-duration EEG epochs. Moreover, other signals, such as electrocardiography (ECG) and respiratory signals, also contribute to the detection performance of ictal discharges. 24 A multimodal analysis of polygraphy signals (EEG, ECG, and respiratory patterns) to increase the reliability of NCSE diagnosis in short ictal discharges should be further investigated.
The pitfalls of EEG misinterpretations in both visual and automated analysis were misinterpreting the following as ictal discharges: (1) abnormal background activity ([a] slow wave activities, [b] other abnormal brain activity, and [c] the ictal-like movement artifacts), (2) continuous interictal discharges, and (3) continuous short ictal discharges.
(1) The abnormal background activity comprised 3 types of activity. (a) Slow-wave activities are frequently observed in a damaged brain (eg, in epileptic encephalopathies) or during drowsiness. In visual analysis, EEG readers with insufficient training in reading EEGs from a dysfunctional brain or sleep-EEG may misinterpret the EEG. 18 In automated analysis, slow-wave activities and ictal discharges could be confused because their frequency bands are similar. An advanced signal processing technique, such as extracting features in the morphological domain in addition to the time-frequency domain, 25 may help distinguish them in future studies. (b) The other abnormal brain activity is individually variable, and its misinterpretation can be avoided by the use of historical EEG data. Its misinterpretation (false alarms) in automated analysis was also mentioned in the previous studies. 8,9 (c) The ictal-like movement artifacts could be misinterpreted in automated analysis when the repetitive movements occupy a frequency band similar to that of ictal discharges. The low-pass filter of the analysis algorithms helps filter out high-frequency components caused by muscle activities but keeps the envelope of signals caused by the repetitive movements.
In summary, to avoid misinterpretation of abnormal background activity, we suggest reading more EEG recordings from the same subject to help human raters better recognize subject-specific ictal discharges and individualizing automated analysis algorithms.
(2) Interictal discharges are not always clearly distinct from ictal discharges, and misinterpretation of continuous interictal discharges occurs especially in patients with encephalopathy. 10 The duration and presence of repetitive spiking or bursting activity cannot be the only criteria to identify interictal discharges by EEG readers. 26 Further investigation of criteria for interictal discharges is needed.
(3) High concentrations of short ictal discharges can be observed in the EEG recording of a subject presenting many short and unstable epileptic activities. Raters with hyper-sensitivity may misinterpret them as ictal discharges. We would suggest that EEG readers carefully interpret such EEG recordings and consult other readers if in doubt. In addition, we recommend tuning alarm thresholds according to subject-dependent clinical practice in the automated EEG analysis for these particular recordings.
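One way to picture a subject-dependent alarm threshold is a sliding-window rule over the per-epoch classifier output, where the window length and the minimum number of ictal epochs are tuned per subject. The sketch below is purely illustrative: the rule, the function name, and the parameter values are assumptions and were not part of the automated analysis system used in this study.

```python
from collections import deque


def alarm_epochs(epoch_is_ictal, window_epochs, min_ictal_epochs):
    """Return indices of epochs at which an alarm would be raised.

    An alarm is raised when at least `min_ictal_epochs` of the last
    `window_epochs` epochs were classified as ictal discharges.
    """
    recent = deque(maxlen=window_epochs)  # keeps only the latest window
    alarms = []
    for index, flag in enumerate(epoch_is_ictal):
        recent.append(bool(flag))
        if len(recent) == window_epochs and sum(recent) >= min_ictal_epochs:
            alarms.append(index)
    return alarms


# Hypothetical classifier output per 10-s epoch (1 = ictal, 0 = not ictal).
epochs = [0, 1, 0, 1, 1, 1, 0, 0]

# A sensitive setting and a stricter setting for a subject with many
# short, unstable discharges (both values are invented).
sensitive = alarm_epochs(epochs, window_epochs=3, min_ictal_epochs=2)
strict = alarm_epochs(epochs, window_epochs=3, min_ictal_epochs=3)
print("sensitive alarms at epochs:", sensitive)
print("strict alarms at epochs:", strict)
```

Raising the required count reduces alarms triggered by isolated short discharges, at the cost of later or missed alarms; where that trade-off should sit would have to be decided clinically for each subject.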
One unanticipated finding is that the interrater agreement in the suspected NCSE group was poor. Based on our previous discussions and the clinical data of the suspected NCSE subjects, we could assume that 2 factors may explain the unreliable EEG interpretations in this group. (1) Almost all the recordings were <1 h, and the raters may not have had enough EEG recordings from the same subject to correctly distinguish ictal discharges from the abnormal background activity.
(2) Half of the subjects had preexisting epileptic encephalopathy. The slow wave activities and the interictal discharges in the subjects with encephalopathy were probably misinterpreted as ictal discharges. Additional studies are needed to confirm the causes of the poor interrater agreement in the EEG interpretations of patients with suspected NCSE.
This study has several limitations. Given that the number of recordings from patients with NCSE at Kempenhaeghe is relatively small, we also included several discontinuous archived EEG recordings. These discontinuous recordings may hinder human readers and automatic algorithms from correctly interpreting EEG. Continuous recordings from more data sources should be used in the future. The EEG recordings were acquired by 3 systems, and their sampling rates were different (the lowest sampling rate was 100 Hz [n = 2]). The heterogeneity of the sampling rates was not expected to affect our results (eg, through aliasing) because the frequency band of interest, such as the frequency range occupied by Spike Wave and Wave patterns, is much lower than 50 Hz. Nevertheless, future work should be undertaken to further confirm the influence of the different system hardware and setup on our results. Given that this is an explorative study, the number of recruited subjects (n = 30), EEG readers (n = 2), and research centers (n = 1) was small; hence, the generalization of the conclusions is limited. In addition, because this study is explorative and covering a broad patient population would require a large sample size, the included subjects were limited to outpatients with chronic seizures and brain development disorders aged between 4 and 61 years; thus, the conclusions do not extend beyond this population.
Future work should include more subjects, especially those of neonatal age, elderly patients, patients without preexisting epilepsy, and ICU patients, and more EEG readers from different research centers for generalizability. In the development of the automated EEG analysis algorithm, we primarily used time-frequency domain features, which limited our conclusion of the pitfalls in the automated analysis.
In future, morphological features should be added to the automated analysis.
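The sampling-rate argument in the limitations above rests on the Nyquist frequency, which is half the sampling rate. The sketch below checks this for the 3 sampling rates used in the recordings. The 30 Hz upper band edge is a hypothetical value chosen for illustration, since the text only states that the band of interest lies well below 50 Hz, and the function names are assumptions.

```python
def nyquist_hz(sampling_rate_hz):
    """Highest frequency that can be represented at a given sampling rate."""
    return sampling_rate_hz / 2.0


def band_is_representable(sampling_rate_hz, band_upper_hz):
    """True if the upper band edge lies below the Nyquist frequency."""
    return band_upper_hz < nyquist_hz(sampling_rate_hz)


# Sampling rates reported for the recordings; band edge is an assumption.
sampling_rates_hz = [100, 200, 256]
assumed_band_upper_hz = 30.0
for rate in sampling_rates_hz:
    ok = band_is_representable(rate, assumed_band_upper_hz)
    print(f"{rate} Hz: Nyquist = {nyquist_hz(rate):.0f} Hz, band representable = {ok}")
```

This check only covers the band of interest. Whether components above the Nyquist frequency fold back into that band depends on the anti-aliasing filtering of each acquisition system, which is a separate hardware question.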

Conclusion
We visually and automatically analyzed NCSE-EEG recordings to explore the causes of EEG misinterpretations. To avoid the pitfalls in NCSE-EEG analysis, a longer duration criterion than the one suggested by the Salzburg criteria is recommended. Using knowledge of historical EEGs, individualized algorithms, and context-dependent alarm thresholds may also avoid mistakes.