Leveraging Data Science to Combat COVID-19: A Comprehensive Review

IEEE Trans Artif Intell. 2020 Sep 2;1(1):85-103. doi: 10.1109/TAI.2020.3020521. eCollection 2020 Aug.

Abstract

COVID-19, an infectious disease caused by the SARS-CoV-2 virus, was declared a pandemic by the World Health Organization (WHO) in March 2020. By mid-August 2020, more than 21 million people had tested positive worldwide. Infections have been growing rapidly and tremendous efforts are being made to fight the disease. In this paper, we attempt to systematise the various COVID-19 research activities leveraging data science. We define data science broadly to encompass the various methods and tools that can be used to store, process, and extract insights from data, including those from artificial intelligence (AI), machine learning (ML), statistics, modeling, simulation, and data visualization. In addition to reviewing the rapidly growing body of recent research, we survey public datasets and repositories that can be used for further work to track COVID-19 spread and mitigation strategies. As part of this, we present a bibliometric analysis of the papers produced in this short span of time. Finally, building on these insights, we highlight common challenges and pitfalls observed across the surveyed works. We have also created a live resource repository at https://github.com/Data-Science-and-COVID-19/Leveraging-Data-Science-To-Combat-COVID-19-A-Comprehensive-Review that we intend to keep updated with the latest resources, including new papers and datasets.

Keywords: Bibliometric analysis; COVID-19; SARS-CoV-2; data science; machine learning; medical image analysis; speech analysis; text mining.