Proof of concept of the potential of a machine learning algorithm to extract new information from conventional SARS-CoV-2 rRT-PCR results

Sci Rep. 2023 May 13;13(1):7786. doi: 10.1038/s41598-023-34882-6.

Abstract

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has been and remains one of the major challenges modern society has faced. Over the past few months, large amounts of information have been collected that are only now beginning to be assimilated. The present work investigates whether residual information exists in the large number of positive real-time reverse transcription PCR (rRT-PCR) results among the almost half a million tests performed during the pandemic. This residual information is believed to be closely related to a pattern in the number of amplification cycles needed to detect a positive sample. A database of more than 20,000 positive samples was therefore collected, and two supervised classification algorithms (a support vector machine and a neural network) were trained to locate each sample in time based solely on the number of cycles determined in each individual's rRT-PCR. Overall, this study suggests that rRT-PCR positive samples contain valuable residual information that can be used to identify patterns in the development of the SARS-CoV-2 pandemic. The successful application of supervised classification algorithms to detect these patterns demonstrates the potential of machine learning techniques to aid in understanding the spread of the virus and its variants.
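The classification idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the class definitions, kernel, network architecture, or data, so everything below is an assumption. The cycle values are synthetic, the two "periods" and their distributions are hypothetical, and a plain linear support vector machine trained by stochastic subgradient descent on the hinge loss stands in for the study's models (the neural network is not shown). All function names are illustrative.

```python
import random


def make_synthetic_ct(n_per_period, seed=0):
    """Draw hypothetical cycle counts for two periods (illustrative data only)."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_per_period):
        samples.append((rng.gauss(22.0, 3.0), -1))  # period A: assumed lower cycle counts
        samples.append((rng.gauss(30.0, 3.0), +1))  # period B: assumed higher cycle counts
    rng.shuffle(samples)
    return samples


def train_linear_svm(samples, lam=0.01, eta0=0.1, epochs=50, seed=0):
    """Fit a 1-D linear SVM (w, b) by stochastic subgradient descent on the
    L2-regularized hinge loss. The single feature is standardized first."""
    rng = random.Random(seed)
    mean = sum(x for x, _ in samples) / len(samples)
    std = (sum((x - mean) ** 2 for x, _ in samples) / len(samples)) ** 0.5
    data = [((x - mean) / std, y) for x, y in samples]
    w, b, t = 0.0, 0.0, 0
    for _ in range(epochs):
        rng.shuffle(data)
        for x, y in data:
            t += 1
            eta = eta0 / (1.0 + eta0 * lam * t)  # decaying step size
            margin = y * (w * x + b)
            w *= 1.0 - eta * lam                 # shrinkage from the L2 penalty
            if margin < 1.0:                     # hinge loss is active for this sample
                w += eta * y * x
                b += eta * y
    return w, b, mean, std


def predict(model, ct):
    """Return -1 (period A) or +1 (period B) for one cycle count."""
    w, b, mean, std = model
    return 1 if w * (ct - mean) / std + b >= 0 else -1


train = make_synthetic_ct(500, seed=1)
test = make_synthetic_ct(500, seed=2)
model = train_linear_svm(train)
accuracy = sum(predict(model, ct) == y for ct, y in test) / len(test)
print(f"held-out accuracy on synthetic data: {accuracy:.2f}")
```

With a single numeric feature, a linear classifier reduces to a learned threshold on the cycle count, so the sketch only shows that a shift in the cycle distribution between periods is enough for a supervised model to assign samples to a period. Any accuracy it prints reflects the assumed synthetic distributions, not the study's results.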

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • COVID-19 Testing
  • COVID-19* / diagnosis
  • Humans
  • Machine Learning
  • Reverse Transcriptase Polymerase Chain Reaction
  • SARS-CoV-2* / genetics