MAK: a machine learning framework improved genomic prediction via multi-target ensemble regressor chains and automatic selection of assistant traits

Mang Liang; Sheng Cao; Tianyu Deng; Lili Du; Keanning Li; Bingxing An; Yueying Du; Lingyang Xu; Lupei Zhang; Xue Gao; Junya Li; Peng Guo; Huijiang Gao

doi:10.1093/bib/bbad043

Brief Bioinform. 2023 Mar 19;24(2):bbad043. doi: 10.1093/bib/bbad043.

Authors

Mang Liang¹, Sheng Cao¹, Tianyu Deng¹, Lili Du¹, Keanning Li¹, Bingxing An¹, Yueying Du¹, Lingyang Xu¹, Lupei Zhang¹, Xue Gao¹, Junya Li¹, Peng Guo², Huijiang Gao¹

Affiliations

¹ Chinese Academy of Agricultural Sciences Institute of Animal Science.
² Tianjin Agricultural University.

PMID: 36752363
DOI: 10.1093/bib/bbad043

Abstract

Incorporating the genotypic and phenotypic information of correlated traits into a multi-trait model can significantly improve the prediction accuracy of the target trait in animal and plant breeding, as well as in human genetics. However, in most cases, the phenotypes of both the correlated traits and the target trait are missing for the individual to be evaluated, particularly for newborns. Therefore, we propose a machine learning framework, MAK, which improves the prediction accuracy of the target trait by constructing multi-target ensemble regression chains and selecting the assistant traits automatically, and which predicts the genomic estimated breeding values of the target trait using genotypic information only. The prediction ability of MAK was significantly more robust than that of genomic best linear unbiased prediction, BayesB, BayesRR and the multi-trait Bayesian method in four real animal and plant datasets, and MAK ran roughly 100 times faster than BayesB and BayesRR.

Keywords: ensemble regressor chains; genomic prediction; machine learning; multi-target regression; multi-trait.
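
The chain idea named in the abstract can be illustrated with a minimal sketch. This is not the MAK implementation: the abstract does not specify the base learners, the ensembling scheme, or the assistant-trait selection rule, so everything below is an assumption made for illustration. It uses ridge regression as the base learner, bootstrap resampling with a random assistant-trait order for each chain, and simulated data. All function names (`ridge_fit`, `fit_chain`, `fit_ensemble`, and so on) are hypothetical. Automatic selection of assistant traits is not covered.

```python
import numpy as np

def ridge_fit(X, y, lam=1.0):
    """Closed-form ridge regression on centred data; returns (weights, intercept)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    w = np.linalg.solve(Xc.T @ Xc + lam * np.eye(X.shape[1]), Xc.T @ (y - y_mean))
    return w, y_mean - x_mean @ w

def ridge_predict(model, X):
    w, b = model
    return X @ w + b

def fit_chain(G, traits, lam=1.0):
    """Fit one regressor chain. traits is (n, k), columns in chain order, target last."""
    models, features = [], G
    for j in range(traits.shape[1]):
        models.append(ridge_fit(features, traits[:, j], lam))
        # During training, observed phenotypes of earlier links become extra features.
        features = np.column_stack([features, traits[:, j]])
    return models

def predict_chain(models, G):
    """Predict from genotypes only: earlier links supply predicted traits to later ones."""
    features = G
    for model in models:
        pred = ridge_predict(model, features)
        features = np.column_stack([features, pred])
    return pred  # output of the last link, i.e. the target trait

def fit_ensemble(G, assistants, target, n_chains=10, lam=1.0, seed=0):
    """Assumed ensembling scheme: bootstrap rows and shuffle assistant order per chain."""
    rng = np.random.default_rng(seed)
    n, k = assistants.shape
    chains = []
    for _ in range(n_chains):
        rows = rng.integers(0, n, size=n)   # bootstrap sample of individuals
        order = rng.permutation(k)          # random order of assistant traits
        traits = np.column_stack([assistants[rows][:, order], target[rows]])
        chains.append(fit_chain(G[rows], traits, lam))
    return chains

def predict_ensemble(chains, G):
    """Average the target-trait predictions of all chains."""
    return np.mean([predict_chain(c, G) for c in chains], axis=0)

# Simulated data (illustrative only): 250 individuals, 40 markers coded 0/1/2.
rng = np.random.default_rng(1)
G = rng.integers(0, 3, size=(250, 40)).astype(float)
g = G @ rng.normal(size=40)                       # shared genetic signal
assistants = np.column_stack([g + rng.normal(size=250),
                              0.5 * g + rng.normal(size=250)])
target = g + rng.normal(size=250)

# Train on 200 phenotyped individuals, predict 50 with genotypes only.
chains = fit_ensemble(G[:200], assistants[:200], target[:200])
gebv = predict_ensemble(chains, G[200:])
```

In this sketch the individuals being predicted contribute only their genotypes, which matches the setting the abstract describes for newborns: the assistant-trait values that later links need are themselves predicted by earlier links in the chain.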

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Animals
Bayes Theorem
Genomics / methods
Genotype
Humans
Infant, Newborn
Machine Learning
Models, Genetic*
Phenotype
Plant Breeding*