Symptom Clustering Patterns and Population Characteristics of COVID-19 Based on Text Clustering Method

Front Public Health. 2022 Feb 4;10:795734. doi: 10.3389/fpubh.2022.795734. eCollection 2022.

Abstract

Background: Individual clinical symptoms of coronavirus disease 2019 (COVID-19) have been widely reported. However, evidence on associations between symptoms remains limited. We sought to explore potential symptom clustering patterns and high-frequency symptom combinations of COVID-19 to improve understanding of the disease.

Methods: In this retrospective cohort study, a total of 1,067 COVID-19 cases were enrolled. Symptom clustering patterns were first explored by a text clustering method. Then, a multinomial logistic regression was applied to reveal the population characteristics of the different symptom groups. In addition, the time intervals between symptom onset and the first visit were analyzed to assess how a longer interval affected the progression of symptoms.
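
The abstract does not name the text clustering algorithm, so the following is only a minimal sketch of one way free-text symptom descriptions could be grouped. It assumes comma-separated symptom terms, Jaccard distance between term sets, and average-linkage agglomerative clustering; the records, function names, and the choice of k are hypothetical and are not taken from the study.

```python
from itertools import combinations

# Hypothetical free-text symptom records (illustrative only, not study data).
records = [
    "fever",
    "fever, dry cough",
    "dry cough",
    "expectoration, sore throat",
    "expectoration, runny nose, sore throat",
    "fatigue, fever, diarrhea, chest tightness",
    "fatigue, diarrhea, chest tightness",
]

def tokenize(text):
    """Split a comma-separated symptom description into a set of terms."""
    return frozenset(t.strip().lower() for t in text.split(",") if t.strip())

def jaccard_distance(a, b):
    """1 - |a & b| / |a | b|; two empty sets are treated as identical."""
    union = a | b
    return 0.0 if not union else 1.0 - len(a & b) / len(union)

def average_linkage(c1, c2, docs):
    """Mean pairwise distance between the members of two clusters."""
    total = sum(jaccard_distance(docs[i], docs[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))

def cluster(texts, k):
    """Agglomerative clustering: merge the two closest clusters until k remain."""
    docs = [tokenize(t) for t in texts]
    clusters = [[i] for i in range(len(docs))]
    while len(clusters) > k:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: average_linkage(clusters[p[0]], clusters[p[1]], docs),
        )
        clusters[a] = clusters[a] + clusters[b]  # a < b, so deleting b is safe
        del clusters[b]
    return [sorted(c) for c in clusters]

# Each group is a list of record indices.
groups = cluster(records, k=3)
```

In this toy input the fever and dry cough records end up together, as do the upper respiratory records and the systemic and gastrointestinal records. A real analysis would need term normalization and a justified choice of the number of clusters.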

Results: Based on text clustering, the symptoms were summarized into four groups. Group 1: no obvious symptoms; Group 2: mainly fever and/or dry cough; Group 3: mainly upper respiratory tract infection symptoms; Group 4: mainly cardiopulmonary, systemic, and/or gastrointestinal symptoms. Apart from Group 1 with no obvious symptoms, the most frequent symptom combinations were fever only (64 cases, 47.8%), followed by dry cough only (42 cases, 31.3%) in Group 2; expectoration only (21 cases, 19.8%), followed by expectoration complicated with fever (10 cases, 9.4%) in Group 3; and fatigue complicated with fever (12 cases, 4.2%), followed by headache complicated with fever (11 cases, 3.8%) in Group 4. People aged 45-64 years were more likely to have symptoms of Group 4 than those aged 65 years or older (odds ratio [OR] = 2.66, 95% CI: 1.21-5.85) and also had longer time intervals between symptom onset and the first visit.
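
As a reading aid for the reported odds ratio, the sketch below back-calculates the standard error of the log odds ratio from the 95% CI. It assumes a Wald-type interval that is symmetric on the log scale, which the abstract does not state; the function name and the 1.96 critical value are illustrative choices.

```python
import math

def se_from_ci(lower, upper, z_crit=1.96):
    """Standard error of log(OR), assuming a Wald interval symmetric on the log scale."""
    return (math.log(upper) - math.log(lower)) / (2 * z_crit)

# Values reported in the abstract for ages 45-64 vs. 65 or older, Group 4.
or_point, ci_low, ci_high = 2.66, 1.21, 5.85

log_or = math.log(or_point)
se = se_from_ci(ci_low, ci_high)

# Under the symmetry assumption, the geometric mean of the bounds
# should reproduce the point estimate.
geometric_mean = math.sqrt(ci_low * ci_high)

# Wald z statistic and two-sided p-value from the normal distribution.
z = log_or / se
p_two_sided = math.erfc(z / math.sqrt(2))

print(f"log(OR) = {log_or:.3f}, SE = {se:.3f}")
print(f"geometric mean of CI bounds = {geometric_mean:.2f}")
print(f"z = {z:.2f}, two-sided p = {p_two_sided:.3f}")
```

The geometric mean of the bounds matches the reported point estimate to two decimals, which is consistent with the log-scale assumption, and the interval excludes 1.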

Conclusions: Symptoms of COVID-19 could be divided into four clustering groups with different symptom combinations. The Group 4 symptoms (i.e., mainly cardiopulmonary, systemic, and/or gastrointestinal symptoms) occurred more frequently in COVID-19 than in influenza. This distinction could help deepen the understanding of the disease. Middle-aged people had a longer time interval before their first medical visit and, from the perspective of medical delay, are a group that deserves more attention.

Keywords: COVID-19; epidemiology; risk factor; symptom clustering patterns; time delay.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Aged
  • Ambulatory Care
  • COVID-19*
  • Cluster Analysis
  • Humans
  • Middle Aged
  • Retrospective Studies
  • SARS-CoV-2