The NMT Scalp EEG Dataset: An Open-Source Annotated Dataset of Healthy and Pathological EEG Recordings for Predictive Modeling

Front Neurosci. 2022 Jan 5:15:755817. doi: 10.3389/fnins.2021.755817. eCollection 2021.

Abstract

Electroencephalogram (EEG) is widely used for the diagnosis of neurological conditions such as epilepsy, neurodegenerative illnesses, and sleep-related disorders. Proper interpretation of EEG recordings requires the expertise of trained neurologists, a resource that is scarce in the developing world. Neurologists spend a significant portion of their time sifting through EEG recordings looking for abnormalities. Most recordings turn out to be completely normal, owing to the low yield of EEG tests. To minimize such wastage of time and effort, automatic algorithms could be used to provide pre-diagnostic screening to separate normal from abnormal EEG. Data-driven machine learning offers a way forward; however, the design and verification of modern machine learning algorithms require properly curated, labeled datasets. To avoid bias, deep learning-based methods must be trained on large datasets from diverse sources. This work presents a new open-source dataset, named the NMT Scalp EEG Dataset, consisting of 2,417 recordings from unique participants spanning almost 625 h. Each recording is labeled as normal or abnormal by a team of qualified neurologists. Demographic information, such as the gender and age of each patient, is also included. Our dataset focuses on the South Asian population. Several existing state-of-the-art deep learning architectures developed for pre-diagnostic screening of EEG are implemented and evaluated on the NMT dataset and referenced against baseline performance on the well-known Temple University Hospital EEG Abnormal Corpus. Generalization of deep learning-based architectures across the NMT and the reference datasets is also investigated. The NMT dataset is being released to increase the diversity of EEG datasets and to overcome the scarcity of accurately annotated, publicly available datasets for EEG research.

Keywords: automated EEG analytics; computational neurology; computer aided diagnosis; convolutional neural networks; deep learning; generalization performance; open-source EEG dataset; pre-diagnostic EEG screening.