Weighted Performance Metrics for Automatic Neonatal Seizure Detection Using Multiscored EEG Data

Amir Hossein Ansari; Perumpillichira Joseph Cherian; Alexander Caicedo Dorado; Katrien Jansen; Anneleen Dereymaeker; Leen De Wispelaere; Charlotte Dielman; Jan Vervisch; Paul Govaert; Maarten De Vos; Gunnar Naulaers; Sabine Van Huffel

doi:10.1109/JBHI.2017.2750769

Research output: Contribution to journal › Article › peer-review

14 Scopus citations

Abstract

In neonatal intensive care units, there is a need for around the clock monitoring of electroencephalogram (EEG), especially for recognizing seizures. An automated seizure detector with an acceptable performance can partly fill this need. In order to develop a detector, an extensive dataset labeled by experts is needed. However, accurately defining neonatal seizures on EEG is a challenge, especially when seizure discharges do not meet exact definitions of repetitiveness or evolution in amplitude and frequency. When several readers score seizures independently, disagreement can be high. Commonly used metrics such as good detection rate (GDR) and false alarm rate (FAR) derived from data scored by multiple raters have their limitations. Therefore, new metrics are needed to measure the performance with respect to the different labels. In this paper, instead of defining the labels by consensus or majority voting, popular metrics including GDR, FAR, positive predictive value, sensitivity, specificity, and selectivity are modified such that they can take different scores into account. To this end, 353 hours of EEG data containing seizures from 81 neonates were visually scored by a clinical neurophysiologist, and then processed by an automated seizure detector. The scored seizures were mixed with false detections of an automated seizure detector and were relabeled by three independent EEG readers. Then, all labels were used in the proposed performance metrics and the result was compared with the majority voting technique and showed higher accuracy and robustness for the proposed metrics. Results were confirmed using a bootstrapping test.
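The contrast the abstract draws between majority voting and metrics that use every rater's score can be sketched as follows. This is a minimal illustration only: this record does not give the paper's actual metric definitions, so the weighting used here (each event counts in proportion to the fraction of raters who labeled it a seizure) and the function names `majority_gdr` and `weighted_gdr` are assumptions, not the authors' formulas.

```python
from typing import Sequence


def agreement_weight(votes: Sequence[int]) -> float:
    """Fraction of raters who labeled the event a seizure (assumed weighting)."""
    return sum(votes) / len(votes)


def majority_label(votes: Sequence[int]) -> int:
    """1 if strictly more than half of the raters labeled the event a seizure."""
    return int(2 * sum(votes) > len(votes))


def majority_gdr(votes_per_event: Sequence[Sequence[int]],
                 detected: Sequence[bool]) -> float:
    """Good detection rate over events the majority labeled as seizures."""
    hits = [d for v, d in zip(votes_per_event, detected) if majority_label(v)]
    return sum(hits) / len(hits) if hits else float("nan")


def weighted_gdr(votes_per_event: Sequence[Sequence[int]],
                 detected: Sequence[bool]) -> float:
    """Good detection rate where each event counts by its rater agreement."""
    weights = [agreement_weight(v) for v in votes_per_event]
    total = sum(weights)
    hit = sum(w for w, d in zip(weights, detected) if d)
    return hit / total if total > 0 else float("nan")


# Hypothetical example: four candidate events, three raters each (1 = seizure).
votes = [(1, 1, 1), (1, 1, 0), (1, 0, 0), (0, 0, 0)]
detected = [True, False, True, False]  # hypothetical detector output per event

print(majority_gdr(votes, detected))
print(weighted_gdr(votes, detected))
```

Under majority voting the third event, scored as a seizure by only one rater, is discarded entirely, whereas the weighted variant still gives the detector partial credit for finding it. That partial credit is the kind of information a consensus label throws away.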

Original language	English (US)
Pages (from-to)	1114-1123
Number of pages	10
Journal	IEEE Journal of Biomedical and Health Informatics
Volume	22
Issue number	4
DOIs	https://doi.org/10.1109/JBHI.2017.2750769
State	Published - Jul 2018
Externally published	Yes

All Science Journal Classification (ASJC) codes

Biotechnology
Computer Science Applications
Electrical and Electronic Engineering
Health Information Management

Cite this

Ansari, A. H., Cherian, P. J., Caicedo Dorado, A., Jansen, K., Dereymaeker, A., De Wispelaere, L., Dielman, C., Vervisch, J., Govaert, P., De Vos, M., Naulaers, G., & Van Huffel, S. (2018). Weighted Performance Metrics for Automatic Neonatal Seizure Detection Using Multiscored EEG Data. IEEE Journal of Biomedical and Health Informatics, 22(4), 1114-1123. https://doi.org/10.1109/JBHI.2017.2750769

@article{e148e47421c04dd599c636831997e58c,

title = "Weighted Performance Metrics for Automatic Neonatal Seizure Detection Using Multiscored EEG Data",

abstract = "In neonatal intensive care units, there is a need for around the clock monitoring of electroencephalogram (EEG), especially for recognizing seizures. An automated seizure detector with an acceptable performance can partly fill this need. In order to develop a detector, an extensive dataset labeled by experts is needed. However, accurately defining neonatal seizures on EEG is a challenge, especially when seizure discharges do not meet exact definitions of repetitiveness or evolution in amplitude and frequency. When several readers score seizures independently, disagreement can be high. Commonly used metrics such as good detection rate (GDR) and false alarm rate (FAR) derived from data scored by multiple raters have their limitations. Therefore, new metrics are needed to measure the performance with respect to the different labels. In this paper, instead of defining the labels by consensus or majority voting, popular metrics including GDR, FAR, positive predictive value, sensitivity, specificity, and selectivity are modified such that they can take different scores into account. To this end, 353 hours of EEG data containing seizures from 81 neonates were visually scored by a clinical neurophysiologist, and then processed by an automated seizure detector. The scored seizures were mixed with false detections of an automated seizure detector and were relabeled by three independent EEG readers. Then, all labels were used in the proposed performance metrics and the result was compared with the majority voting technique and showed higher accuracy and robustness for the proposed metrics. Results were confirmed using a bootstrapping test.",

author = "Ansari, {Amir Hossein} and Cherian, {Perumpillichira Joseph} and {Caicedo Dorado}, Alexander and Katrien Jansen and Anneleen Dereymaeker and {De Wispelaere}, Leen and Charlotte Dielman and Jan Vervisch and Paul Govaert and {De Vos}, Maarten and Gunnar Naulaers and {Van Huffel}, Sabine",

note = "Funding Information: Manuscript received March 15, 2017; revised August 24, 2017; accepted September 5, 2017. Date of publication September 10, 2017; date of current version June 29, 2018. The work of A. H. Ansari and S. Van Huffel was supported in part by the Bijzonder Onderzoeksfonds KU Leuven: Center of Excellence PFV/10/002 (OPTEC); in part by the SPARKLE—Sensor-based Platform for the Accurate and Remote monitoring of Kinematics Linked to E-health #: IDO-13-0358; The effect of perinatal stress on the later outcome in preterm babies #: C24/15/036; in part by the TARGID—Development of a novel diagnostic medical device to assess gastric motility #: C32-16-00364; in part by the Fonds voor Wetenschappelijk Onderzoek Vlaanderen (FWO) projects: G.0A5513N (deep brain stimulation); in part by the Agentschap Innoveren & Ondernemen (VLAIO) projects: STW 150466 - OSA+, O&O HBC 2016 0184 eWatch; in part by the imec: Strategic Funding 2017, ICON-HBC.2016.0167 SeizeIT; in part by the Belgian Federal Science Policy Office: IUAP P7/19/ (Dynamical Systems, Control, and Optimization, 2012–2017); in part by the Belgian Foreign Affairs–Development Co-operation: VLIR UOS programs (2013–2019); in part by the European Union{\textquoteright}s Seventh Framework Programme (FP7/2007-2013): EU MC ITN TRANSACT 2012, #316679; in part by the HIP Trial: #260777; in part by the ERASMUS +: NGDIVS 2016-1-SE01-KA203-022114; in part by the European Research Council Advanced Grant #339804 BIOTENSORS. The work of A. Dereymaeker was supported by the IWT PHD grant: TBM 110697-NeoGuard. A. Caicedo Dorado is a Postdoctoral Fellow from the FWO. (Corresponding author: Amir Hossein Ansari.) A. H. Ansari, A. Caicedo Dorado, and S. Van Huffel are with the Department of Electrical Engineering (ESAT), STADIUS, KU Leuven, Leuven 3000, Belgium, and also with the imec, Leuven 3001, Belgium (e-mail: amirhossein.ansari@kuleuven.be; caicedodorado@esat.kuleuven.be; sabine.vanhuffel@esat.kuleuven.be).
Publisher Copyright: {\textcopyright} 2013 IEEE. Copyright: Copyright 2019 Elsevier B.V., All rights reserved.",

year = "2018",

month = jul,

doi = "10.1109/JBHI.2017.2750769",

language = "English (US)",

volume = "22",

pages = "1114--1123",

journal = "IEEE Journal of Biomedical and Health Informatics",

issn = "2168-2194",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

number = "4",

}

Ansari, AH, Cherian, PJ, Caicedo Dorado, A, Jansen, K, Dereymaeker, A, De Wispelaere, L, Dielman, C, Vervisch, J, Govaert, P, De Vos, M, Naulaers, G & Van Huffel, S 2018, 'Weighted Performance Metrics for Automatic Neonatal Seizure Detection Using Multiscored EEG Data', IEEE Journal of Biomedical and Health Informatics, vol. 22, no. 4, pp. 1114-1123. https://doi.org/10.1109/JBHI.2017.2750769

TY - JOUR

T1 - Weighted Performance Metrics for Automatic Neonatal Seizure Detection Using Multiscored EEG Data

AU - Ansari, Amir Hossein

AU - Cherian, Perumpillichira Joseph

AU - Caicedo Dorado, Alexander

AU - Jansen, Katrien

AU - Dereymaeker, Anneleen

AU - De Wispelaere, Leen

AU - Dielman, Charlotte

AU - Vervisch, Jan

AU - Govaert, Paul

AU - De Vos, Maarten

AU - Naulaers, Gunnar

AU - Van Huffel, Sabine

N1 - Funding Information: Manuscript received March 15, 2017; revised August 24, 2017; accepted September 5, 2017. Date of publication September 10, 2017; date of current version June 29, 2018. The work of A. H. Ansari and S. Van Huffel was supported in part by the Bijzonder Onderzoeksfonds KU Leuven: Center of Excellence PFV/10/002 (OPTEC); in part by the SPARKLE—Sensor-based Platform for the Accurate and Remote monitoring of Kinematics Linked to E-health #: IDO-13-0358; The effect of perinatal stress on the later outcome in preterm babies #: C24/15/036; in part by the TARGID—Development of a novel diagnostic medical device to assess gastric motility #: C32-16-00364; in part by the Fonds voor Wetenschappelijk Onderzoek Vlaanderen (FWO) projects: G.0A5513N (deep brain stimulation); in part by the Agentschap Innoveren & Ondernemen (VLAIO) projects: STW 150466 - OSA+, O&O HBC 2016 0184 eWatch; in part by the imec: Strategic Funding 2017, ICON-HBC.2016.0167 SeizeIT; in part by the Belgian Federal Science Policy Office: IUAP P7/19/ (Dynamical Systems, Control, and Optimization, 2012–2017); in part by the Belgian Foreign Affairs–Development Co-operation: VLIR UOS programs (2013–2019); in part by the European Union’s Seventh Framework Programme (FP7/2007-2013): EU MC ITN TRANSACT 2012, #316679; in part by the HIP Trial: #260777; in part by the ERASMUS +: NGDIVS 2016-1-SE01-KA203-022114; in part by the European Research Council Advanced Grant #339804 BIOTENSORS. The work of A. Dereymaeker was supported by the IWT PHD grant: TBM 110697-NeoGuard. A. Caicedo Dorado is a Postdoctoral Fellow from the FWO. (Corresponding author: Amir Hossein Ansari.) A. H. Ansari, A. Caicedo Dorado, and S. Van Huffel are with the Department of Electrical Engineering (ESAT), STADIUS, KU Leuven, Leuven 3000, Belgium, and also with the imec, Leuven 3001, Belgium (e-mail: amirhossein.ansari@kuleuven.be; caicedodorado@esat.kuleuven.be; sabine.vanhuffel@esat.kuleuven.be).
Publisher Copyright: © 2013 IEEE. Copyright: Copyright 2019 Elsevier B.V., All rights reserved.

PY - 2018/7

Y1 - 2018/7

N2 - In neonatal intensive care units, there is a need for around the clock monitoring of electroencephalogram (EEG), especially for recognizing seizures. An automated seizure detector with an acceptable performance can partly fill this need. In order to develop a detector, an extensive dataset labeled by experts is needed. However, accurately defining neonatal seizures on EEG is a challenge, especially when seizure discharges do not meet exact definitions of repetitiveness or evolution in amplitude and frequency. When several readers score seizures independently, disagreement can be high. Commonly used metrics such as good detection rate (GDR) and false alarm rate (FAR) derived from data scored by multiple raters have their limitations. Therefore, new metrics are needed to measure the performance with respect to the different labels. In this paper, instead of defining the labels by consensus or majority voting, popular metrics including GDR, FAR, positive predictive value, sensitivity, specificity, and selectivity are modified such that they can take different scores into account. To this end, 353 hours of EEG data containing seizures from 81 neonates were visually scored by a clinical neurophysiologist, and then processed by an automated seizure detector. The scored seizures were mixed with false detections of an automated seizure detector and were relabeled by three independent EEG readers. Then, all labels were used in the proposed performance metrics and the result was compared with the majority voting technique and showed higher accuracy and robustness for the proposed metrics. Results were confirmed using a bootstrapping test.

AB - In neonatal intensive care units, there is a need for around the clock monitoring of electroencephalogram (EEG), especially for recognizing seizures. An automated seizure detector with an acceptable performance can partly fill this need. In order to develop a detector, an extensive dataset labeled by experts is needed. However, accurately defining neonatal seizures on EEG is a challenge, especially when seizure discharges do not meet exact definitions of repetitiveness or evolution in amplitude and frequency. When several readers score seizures independently, disagreement can be high. Commonly used metrics such as good detection rate (GDR) and false alarm rate (FAR) derived from data scored by multiple raters have their limitations. Therefore, new metrics are needed to measure the performance with respect to the different labels. In this paper, instead of defining the labels by consensus or majority voting, popular metrics including GDR, FAR, positive predictive value, sensitivity, specificity, and selectivity are modified such that they can take different scores into account. To this end, 353 hours of EEG data containing seizures from 81 neonates were visually scored by a clinical neurophysiologist, and then processed by an automated seizure detector. The scored seizures were mixed with false detections of an automated seizure detector and were relabeled by three independent EEG readers. Then, all labels were used in the proposed performance metrics and the result was compared with the majority voting technique and showed higher accuracy and robustness for the proposed metrics. Results were confirmed using a bootstrapping test.

UR - http://www.scopus.com/inward/record.url?scp=85049447625&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85049447625&partnerID=8YFLogxK

U2 - 10.1109/JBHI.2017.2750769

DO - 10.1109/JBHI.2017.2750769

M3 - Article

C2 - 28910781

AN - SCOPUS:85049447625

SN - 2168-2194

VL - 22

SP - 1114

EP - 1123

JO - IEEE Journal of Biomedical and Health Informatics

JF - IEEE Journal of Biomedical and Health Informatics

IS - 4

ER -
