Viral reverse engineering using Artificial Intelligence and big data COVID-19 infection with Long Short-term Memory (LSTM)

Environ Technol Innov. 2021 May:22:101531. doi: 10.1016/j.eti.2021.101531. Epub 2021 Apr 2.

Abstract

This research presents a reverse engineering approach that uses artificial intelligence (AI) and big data to discover the patterns and evolutionary behavior of SARS-CoV-2. We studied five viral families (Orthomyxoviridae, Retroviridae, Filoviridae, Flaviviridae, and Coronaviridae) that emerged over the past one hundred years, in order to capture their similarities, common characteristics, and evolutionary behavior for prediction concerning SARS-CoV-2. We also examine how reverse engineering with AI and big data is efficient and opens wide horizons. The results show that SARS-CoV-2 shares the same most active amino acids (S, L, and T) with these viral families. As is known, this affects the building function of the proteins. We have also devised a mathematical formula for calculating the evolution difference percentage between viruses with respect to their phylogenetic trees. It shows that SARS-CoV-2 has mutated quickly relative to the time since it emerged. AI is used to predict the next evolved instance of SARS-CoV-2 with Long Short-term Memory (LSTM), using the phylogenetic tree data as a corpus. This paper demonstrates the evolved viral instance prediction process on the ORF7a protein of SARS-CoV-2 as the first stage toward predicting the complete mutant virus. Finally, this research focuses on breaking the virus down into its primary factors by reverse engineering with AI and big data, in order to understand viral similarities, patterns, and evolutionary behavior and to predict future mutations of the virus artificially in a systematic and logical way.

Keywords: COVID19; Healthcare; Long Short-term Memory (LSTM); Public health; SARS-CoV-2; Viral reverse engineering.
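
The two measurements the abstract describes, the most frequent amino acids in a protein and a percentage difference between two viral sequences, can be illustrated with a minimal Python sketch. This is not the authors' formula, which the abstract does not state. It assumes a simple position-by-position mismatch percentage over two pre-aligned, equal-length sequences. The function names and the toy sequences are hypothetical and are not real ORF7a data.

```python
from collections import Counter


def top_residues(protein: str, n: int = 3):
    """Return the n most frequent amino acids (one-letter codes) with counts."""
    return Counter(protein.upper()).most_common(n)


def percent_difference(seq_a: str, seq_b: str) -> float:
    """Percentage of positions that differ between two aligned sequences.

    Assumption: both sequences are already aligned and have equal length.
    This is a plain mismatch ratio, not the formula devised in the paper.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    mismatches = sum(a != b for a, b in zip(seq_a, seq_b))
    return 100.0 * mismatches / len(seq_a)


if __name__ == "__main__":
    # Toy sequences invented for illustration only.
    ancestor = "MSLLTSSLTA"
    variant = "MSLLTSSLTV"
    print(top_residues(ancestor))
    print(percent_difference(ancestor, variant))
```

A per-family comparison along the lines described in the abstract would apply `top_residues` to each viral proteome and `percent_difference` to neighboring nodes of a phylogenetic tree. An LSTM trained on the resulting sequence corpus is a separate step that this sketch does not cover.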