Leveraging Mann-Whitney U test on large-scale genetic variation data for analysing malaria genetic markers

Malar J. 2022 Mar 9;21(1):79. doi: 10.1186/s12936-022-04104-x.

Abstract

Background: Analysing malaria risk across multiple populations is crucial, yet it remains constrained by practical limitations. However, the exponential growth in the diversity and volume of genetic variation data obtained from malaria-infected patients through genome-wide association studies opens up unprecedented opportunities to explore significant differences between genetic markers (risk factors), particularly with respect to the resistance or susceptibility of populations to malaria. Thus, this study proposes using statistical tests to analyse large-scale genetic variation data comprising 20,854 samples from 11 populations across three continents: Africa, Oceania, and Asia.

Methods: Although statistical tests have been used in case-control studies since the 1950s to link risk factors to a particular disease, several challenges remain, including the choice of data type (ordinal vs. non-ordinal) and of test (parametric vs. non-parametric). This study addresses these challenges by adopting the Mann-Whitney U test to analyse large-scale genetic variation data, to explore the statistical significance of markers between populations, and to identify the highly differentiated markers.
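To make the test concrete, the following is a minimal sketch of a two-sided Mann-Whitney U test, not the authors' pipeline. It assumes genotypes are coded as ordinal values (for example 0, 1, or 2 copies of an allele), which the abstract does not specify, and it uses the normal approximation with tie and continuity corrections, which is reasonable only for fairly large samples. The function names and the toy data are hypothetical.

```python
import math
from collections import Counter


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of equal values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # i and j are 0-based sorted positions
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U statistic for x, two-sided p-value via normal approximation)."""
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both samples must be non-empty")
    pooled = list(x) + list(y)
    ranks = average_ranks(pooled)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    n = n1 + n2
    # Tie correction: sum of (t^3 - t) over groups of t equal values.
    tie_sum = sum(t ** 3 - t for t in Counter(pooled).values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_sum / (n * (n - 1)))
    if variance <= 0:
        return u1, 1.0  # all observations identical: no evidence of a shift
    # Continuity-corrected z score, clipped at zero.
    z = max(abs(u1 - n1 * n2 / 2) - 0.5, 0.0) / math.sqrt(variance)
    return u1, min(math.erfc(z / math.sqrt(2)), 1.0)


# Hypothetical genotype codes for one marker in two populations.
population_a = [0, 0, 1, 0, 1, 0, 0, 1, 0, 0]
population_b = [1, 2, 1, 2, 2, 1, 1, 2, 0, 2]
u_stat, p_value = mann_whitney_u(population_a, population_b)
print(f"U = {u_stat}, p = {p_value:.4f}")
```

Because the test works on ranks rather than raw values, it needs only ordinal data and makes no normality assumption, which is why a non-parametric test suits genotype codes of this kind.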

Results: The study found a significant difference in genetic markers between populations (p < 0.01) in all case groups and most control groups. Among the highly differentiated genetic markers, most showed a significant difference (p < 0.01), with p-values varying between populations in both the case and control groups. Moreover, several genetic markers showed very significant differences (p < 0.001) across all populations, whereas others differed only between specific populations. Several genetic markers showed no significant differences between populations.
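The three patterns reported here (differences across all populations, differences between specific populations only, and no differences) can be illustrated with a small sketch that classifies one marker from its pairwise p-values. The thresholds 0.01 and 0.001 come from the abstract; the function names, population labels, and p-values below are hypothetical placeholders, not results from the study.

```python
def significance_tier(p):
    """Map a p-value to the tiers used in the abstract."""
    if p < 0.001:
        return "very significant (p < 0.001)"
    if p < 0.01:
        return "significant (p < 0.01)"
    return "not significant"


def summarise_marker(pairwise_p):
    """Classify one marker from {(pop_a, pop_b): p_value} comparisons."""
    if not pairwise_p:
        raise ValueError("at least one pairwise comparison is required")
    tiers = {pair: significance_tier(p) for pair, p in pairwise_p.items()}
    n_significant = sum(p < 0.01 for p in pairwise_p.values())
    if n_significant == len(pairwise_p):
        pattern = "differs across all population pairs"
    elif n_significant == 0:
        pattern = "no significant difference"
    else:
        pattern = "differs between specific populations only"
    return tiers, pattern


# Placeholder p-values for a single hypothetical marker.
example = {
    ("PopA", "PopB"): 0.0004,
    ("PopA", "PopC"): 0.0300,
    ("PopB", "PopC"): 0.0060,
}
tiers, pattern = summarise_marker(example)
for pair, tier in tiers.items():
    print(pair, tier)
print(pattern)
```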

Conclusions: These findings further support the view that genetic markers contribute differently across populations to malaria resistance or susceptibility, which implies differences in the likelihood of malaria infection. In addition, this study demonstrated the robustness of the Mann-Whitney U test in analysing genetic markers in large-scale genetic variation data, pointing to an alternative method for exploring genetic markers in other complex diseases. The findings hold promise for genetic marker analysis, and the pipeline described in this study can be fully reproduced to analyse new data.

Keywords: Descriptive statistics; Genetic markers; Malaria; Mann–Whitney U test; Single nucleotide polymorphisms.

MeSH terms

  • Genetic Markers
  • Genetic Variation
  • Genome-Wide Association Study*
  • Humans
  • Malaria* / genetics
  • Statistics, Nonparametric

Substances

  • Genetic Markers