Multilevel approach to male fertility by machine learning highlights a hidden link between haematological and spermatogenetic cells

Andrology. 2020 Sep;8(5):1021-1029. doi: 10.1111/andr.12826. Epub 2020 Jun 21.

Abstract

Background: Male infertility is a complex clinical condition requiring an accurate multilevel assessment, to which machine learning technology, which combines large data series in non-linear and highly interactive ways, could be innovatively applied.

Methods: A longitudinal, observational, retrospective big data study was carried out, applying machine learning (ML) to male infertility for the first time. A large database was generated that included all semen samples collected between 2010 and 2016, together with blood biochemical examinations, environmental temperature and air pollutant exposure. First, the database was analysed with principal component analysis and multivariable linear regression. Second, classification analyses were performed, in which patients were classified a priori according to semen parameters. Third, machine learning algorithms were applied in a training phase (80% of the entire database) and in a tuning phase (20% of the data set). Finally, conventional statistical analyses were applied to semen parameters and to the other variables extracted during machine learning.
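The 80%/20% partition described in the Methods can be illustrated with a minimal sketch. The abstract does not state how the split was made, so the random shuffle, the fixed seed, the function name `split_80_20` and the placeholder patient identifiers below are all assumptions for illustration and not the authors' procedure.

```python
import random


def split_80_20(records, seed=0):
    """Shuffle a copy of `records` and split it into 80% / 20% partitions.

    Hypothetical helper: random shuffling with a fixed seed is an assumption,
    the abstract does not describe how the partitions were drawn.
    """
    shuffled = list(records)               # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)  # reproducible shuffle
    cut = (len(shuffled) * 4) // 5         # integer index of the 80% boundary
    return shuffled[:cut], shuffled[cut:]


# Placeholder identifiers, not data from the study.
patient_ids = list(range(1, 101))
training_set, tuning_set = split_80_20(patient_ids, seed=42)
```

Each record lands in exactly one partition, so the model can be fitted on the larger partition while the smaller one is kept apart.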

Results: The final database included 4239 patients, aggregating semen analyses, blood and environmental parameters. Classification analyses were able to recognize oligozoospermic, teratozoospermic and asthenozoospermic patients, as well as patients with altered semen parameters (0.58 accuracy, 0.58 sensitivity and 0.57 specificity). Machine learning algorithms detected three haematological variables, namely lymphocyte number, erythrocyte distribution and mean globular volume, that were significantly related to semen parameters (0.69 accuracy, 0.78 sensitivity and 0.41 specificity).
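For readers less familiar with the three performance measures reported in the Results, the sketch below shows how accuracy, sensitivity and specificity are conventionally computed for a binary classifier. The labels are made-up toy values and the names `BinaryMetrics` and `binary_metrics` are hypothetical; the abstract does not say how the authors derived their figures, so this only illustrates the standard definitions.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    sensitivity: float
    specificity: float


def binary_metrics(y_true, y_pred):
    """Compute standard binary metrics; 1 = altered, 0 = normal (assumed coding)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    return BinaryMetrics(
        accuracy=(tp + tn) / len(pairs),  # share of all cases classified correctly
        sensitivity=tp / (tp + fn),       # share of altered cases detected
        specificity=tn / (tn + fp),       # share of normal cases recognised
    )


# Toy labels for illustration only, not data from the study.
toy_true = [1, 1, 1, 1, 0, 0]
toy_pred = [1, 1, 1, 0, 0, 1]
toy_metrics = binary_metrics(toy_true, toy_pred)
```

Because accuracy averages over both classes, a classifier can combine high sensitivity with low specificity and still show a moderate accuracy when the positive class is the more common one.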

Conclusion: This is the first application of machine learning to male fertility, detecting potential mathematical algorithms able to describe changes in patients' semen characteristics. In this setting, a possible hidden link between testicular and haematopoietic tissues was suggested, based on their similar proliferative properties.

Keywords: big data; infertility; machine learning; male infertility.

Publication types

  • Observational Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adult
  • Algorithms
  • Databases, Factual
  • Environment
  • Erythrocyte Count
  • Female
  • Hematopoiesis
  • Humans
  • Infertility, Male* / blood
  • Longitudinal Studies
  • Lymphocyte Count
  • Machine Learning*
  • Male
  • Retrospective Studies
  • Semen Analysis
  • Spermatogenesis