Methods for detecting probable COVID-19 cases from large-scale survey data also reveal probable sex differences in symptom profiles

Front Big Data. 2022 Nov 10:5:1043704. doi: 10.3389/fdata.2022.1043704. eCollection 2022.

Abstract

Background: Daily symptom reporting collected via web-based symptom survey tools holds the potential to improve disease monitoring. Such screening tools might not only discriminate between states of acute illness and non-illness, but also use additional demographic information to identify how illnesses may differ across groups, for example by biological sex. These capabilities may play an important role in future disease outbreaks.

Objective: Use data collected via a daily web-based symptom survey tool to develop a Bayesian model that could differentiate between COVID-19 and other illnesses and refine this model to identify illness profiles that differ by biological sex.

Methods: We used daily symptom profiles to plot symptom progressions for COVID-19, influenza (flu), and the common cold. We then built a Bayesian network to discriminate between these three illnesses based on daily symptom reports. We further separated out the COVID-19 cohort into self-reported female and male subgroups to observe any differences in symptoms relating to sex. We identified key symptoms that contributed to a COVID-19 prediction in both males and females using a logistic regression model.
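As a rough illustration of the classification step, the sketch below uses a naive Bayes classifier, the simplest Bayesian network structure, in which a single illness node is the parent of every symptom node. The abstract does not specify the network structure the authors used, so this is an assumption rather than their model. The symptom list, function names, and toy reports are hypothetical and invented purely for illustration.

```python
import math
from collections import defaultdict

# Hypothetical binary symptoms; the study's actual symptom list is not given here.
SYMPTOMS = ["fever", "cough", "loss_of_smell"]


def fit_naive_bayes(reports, labels, alpha=1.0):
    """Estimate P(illness) and P(symptom present | illness) with Laplace smoothing."""
    class_counts = defaultdict(int)
    symptom_counts = defaultdict(lambda: defaultdict(int))
    for report, label in zip(reports, labels):
        class_counts[label] += 1
        for s in SYMPTOMS:
            symptom_counts[label][s] += report[s]
    total = len(labels)
    priors = {c: n / total for c, n in class_counts.items()}
    likelihoods = {
        c: {
            # alpha pseudo-counts for "present" and "absent" avoid zero probabilities
            s: (symptom_counts[c][s] + alpha) / (class_counts[c] + 2 * alpha)
            for s in SYMPTOMS
        }
        for c in class_counts
    }
    return priors, likelihoods


def posterior(report, priors, likelihoods):
    """Return P(illness | symptom report) for every illness via Bayes' rule."""
    log_scores = {}
    for c in priors:
        lp = math.log(priors[c])
        for s in SYMPTOMS:
            p = likelihoods[c][s]
            lp += math.log(p if report[s] else 1.0 - p)
        log_scores[c] = lp
    # Normalise in log space for numerical stability
    m = max(log_scores.values())
    unnorm = {c: math.exp(v - m) for c, v in log_scores.items()}
    z = sum(unnorm.values())
    return {c: v / z for c, v in unnorm.items()}


# Invented toy data (NOT from the study): one dict per daily report.
toy_reports = [
    {"fever": 1, "cough": 1, "loss_of_smell": 1},
    {"fever": 1, "cough": 0, "loss_of_smell": 1},
    {"fever": 1, "cough": 1, "loss_of_smell": 0},
    {"fever": 1, "cough": 1, "loss_of_smell": 0},
    {"fever": 0, "cough": 1, "loss_of_smell": 0},
    {"fever": 0, "cough": 0, "loss_of_smell": 0},
]
toy_labels = ["covid", "covid", "flu", "flu", "cold", "cold"]

priors, likelihoods = fit_naive_bayes(toy_reports, toy_labels)
probs = posterior({"fever": 1, "cough": 0, "loss_of_smell": 1}, priors, likelihoods)
predicted = max(probs, key=probs.get)
```

Fitting the same function separately on female and male reports would give one set of symptom likelihoods per sex, which is one simple way to compare symptom profiles across subgroups; the authors' own sex-specific analysis used a logistic regression model.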

Results: Although the Bayesian model performed only moderately well in identifying a COVID-19 diagnosis (71.6% true positive rate), the model showed promise in being able to differentiate between COVID-19, flu, and the common cold, as well as periods of acute illness vs. non-illness. Additionally, COVID-19 symptoms differed between the biological sexes; specifically, fever was a more important symptom in identifying subsequent COVID-19 infection among males than among females.
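For readers less familiar with the metric, the true positive rate (sensitivity) is the fraction of actual COVID-19 cases that the model labels as COVID-19, that is, TP / (TP + FN). The sketch below computes it from paired labels; the function name and the toy labels are hypothetical and are not the study's data.

```python
def true_positive_rate(actual, predicted, positive="covid"):
    """Fraction of actual positive cases that were predicted positive: TP / (TP + FN)."""
    tp = sum(1 for a, p in zip(actual, predicted) if a == positive and p == positive)
    fn = sum(1 for a, p in zip(actual, predicted) if a == positive and p != positive)
    if tp + fn == 0:
        raise ValueError("no positive cases in `actual`")
    return tp / (tp + fn)


# Invented toy labels (NOT from the study): 3 of 4 COVID-19 cases are caught.
toy_actual = ["covid", "covid", "covid", "covid", "flu", "cold"]
toy_predicted = ["covid", "covid", "covid", "flu", "flu", "covid"]
toy_tpr = true_positive_rate(toy_actual, toy_predicted)
```

Note that the metric ignores false positives (the cold case predicted as COVID-19 above does not change it), so a true positive rate alone does not describe how often the model raises false alarms.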

Conclusion: Web-based symptom survey tools hold promise for identifying illness and may help with coordinated disease outbreak responses. Incorporating demographic factors such as biological sex into predictive models may elucidate important differences in symptom profiles that have implications for disease detection.

Keywords: Bayesian network; infectious disease; mHealth; public health; sex as a biological variable.