NeuroCORD: A Language Model to Facilitate COVID-19-Associated Neurological Disorder Studies

Int J Environ Res Public Health. 2022 Aug 12;19(16):9974. doi: 10.3390/ijerph19169974.

Abstract

COVID-19 can lead to multiple severe outcomes, including neurological and psychological effects. However, manually scanning hundreds of thousands of COVID-19 articles on a regular basis is impractical. To keep our knowledge current, provide sound science to the public, and communicate effectively, an efficient means of following the most recently published data is critical. In this study, we developed a language model, built on state-of-the-art artificial intelligence (AI), that searches abstracts to accurately retrieve articles on COVID-19-associated neurological disorders. We applied this model, NeuroCORD, to CORD-19, the largest benchmark dataset of COVID-19 literature. The model developed on the training set yielded 94% prediction accuracy on the test set, a result subsequently verified by two experts in the field. In addition, when applied to 96,000 unlabeled articles published after 2020, NeuroCORD identified approximately 3% of them as relevant to the study of COVID-19-associated neurological disorders, whereas conventional keyword searching retrieved only 0.5%. In conclusion, NeuroCORD provides an opportunity to profile neurological disorders resulting from COVID-19 rapidly and efficiently, and its general framework could be used to study other emerging COVID-19-related health issues.
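
To make the comparison with conventional keyword searching concrete, the following is a minimal sketch of a keyword baseline and the retrieval-rate arithmetic behind the reported percentages. It is an illustration only, not the authors' method: the keyword list, the sample abstracts, and the helper names (`keyword_match`, `retrieval_rate`, `estimated_count`) are all assumptions, since the abstract does not state which search terms were used.

```python
import re

# Hypothetical keyword list; the abstract does not state the terms actually used.
KEYWORDS = ["neurological", "stroke", "encephalitis", "anosmia"]
PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(k) for k in KEYWORDS) + r")\b",
    re.IGNORECASE,
)


def keyword_match(abstract: str) -> bool:
    """Return True if the abstract contains any keyword as a whole word."""
    return PATTERN.search(abstract) is not None


def retrieval_rate(flags) -> float:
    """Fraction of articles flagged as relevant (0.0 for an empty input)."""
    flags = list(flags)
    return sum(flags) / len(flags) if flags else 0.0


def estimated_count(total: int, rate: float) -> int:
    """Approximate number of articles implied by a retrieval rate."""
    return round(total * rate)


# Made-up sample abstracts, for illustration only.
abstracts = [
    "Ischemic stroke in patients hospitalised with COVID-19.",
    "Vaccine distribution logistics during the pandemic.",
    "Loss of smell (anosmia) as an early symptom.",
    "Mask mandates and community transmission.",
]
flags = [keyword_match(a) for a in abstracts]
print(retrieval_rate(flags))

# Rates reported in the abstract, applied to the 96,000 unlabeled articles.
model_hits = estimated_count(96_000, 0.03)     # approximately 3% (model)
keyword_hits = estimated_count(96_000, 0.005)  # 0.5% (keyword search)
print(model_hits, keyword_hits)
```

Under these figures, approximately 3% of 96,000 corresponds to roughly 2,880 articles, against about 480 for keyword searching, a roughly sixfold difference. A whole-word keyword match of this kind misses abstracts that describe a neurological finding without using any listed term, which is one plausible reason a learned model could retrieve more.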

Keywords: BERT model; COVID-19; information retrieval; language model; machine learning; neurological disorders; text mining.

MeSH terms

  • Artificial Intelligence
  • COVID-19*
  • Humans
  • Language
  • Nervous System Diseases* / epidemiology
  • Nervous System Diseases* / etiology

Grants and funding

This research received no external funding.