easyPheno: An easy-to-use and easy-to-extend Python framework for phenotype prediction using Bayesian optimization

Bioinform Adv. 2023 Mar 22;3(1):vbad035. doi: 10.1093/bioadv/vbad035. eCollection 2023.

Abstract

Summary: Predicting complex traits from genotypic information is a major challenge in various biological domains. With easyPheno, we present a comprehensive Python framework that enables the rigorous training, comparison and analysis of phenotype predictions for a variety of models, ranging from common genomic selection approaches through classical machine learning to modern deep learning-based techniques. Our framework is easy to use, even for non-programmers, and includes an automatic hyperparameter search using state-of-the-art Bayesian optimization. Moreover, easyPheno provides various benefits for bioinformaticians developing new prediction models: it allows them to quickly integrate novel models and functionalities into a reliable framework and to benchmark them against the integrated prediction models in a comparable setup. In addition, the framework allows newly developed prediction models to be assessed under pre-defined settings using simulated data. We provide detailed documentation with various hands-on tutorials and videos explaining the usage of easyPheno to novice users.
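To illustrate what assessing a model "under pre-defined settings using simulated data" can involve, the following minimal sketch generates additive genotypes and a phenotype with a chosen heritability. It is not easyPheno's simulation code: the function name `simulate_phenotype`, its parameters and the simple additive model are assumptions made purely for illustration.

```python
import random


def simulate_phenotype(n_samples=200, n_snps=50, n_causal=5,
                       heritability=0.7, seed=0):
    """Hypothetical helper: simulate 0/1/2 genotypes and a continuous trait."""
    rng = random.Random(seed)

    # One minor-allele frequency per SNP; each genotype is the sum of two
    # independent allele draws, giving values in {0, 1, 2}.
    freqs = [rng.uniform(0.1, 0.5) for _ in range(n_snps)]
    genotypes = [
        [(rng.random() < f) + (rng.random() < f) for f in freqs]
        for _ in range(n_samples)
    ]

    # Pick the causal SNPs and draw their additive effect sizes.
    causal = rng.sample(range(n_snps), n_causal)
    effects = {j: rng.gauss(0.0, 1.0) for j in causal}
    genetic = [sum(effects[j] * row[j] for j in causal) for row in genotypes]

    # Choose the noise variance so that var_g / (var_g + var_e) equals the
    # requested heritability.
    mean_g = sum(genetic) / n_samples
    var_g = sum((g - mean_g) ** 2 for g in genetic) / n_samples
    var_e = var_g * (1.0 - heritability) / heritability

    phenotype = [g + rng.gauss(0.0, var_e ** 0.5) for g in genetic]
    return genotypes, phenotype, causal


if __name__ == "__main__":
    X, y, causal_snps = simulate_phenotype()
    print(len(X), len(X[0]), sorted(causal_snps))
```

Because the causal SNPs and the heritability are known in advance, a prediction model trained on such data can be checked against the ground truth, which is not possible with real phenotypes.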

Availability and implementation: easyPheno is publicly available at https://github.com/grimmlab/easyPheno and can be easily installed as a Python package via https://pypi.org/project/easypheno/ or using Docker. Comprehensive documentation, including various tutorials complemented by videos, can be found at https://easypheno.readthedocs.io/.

Supplementary information: Supplementary data are available at Bioinformatics Advances online.