Methodology for Good Machine Learning with Multi-Omics Data

Clin Pharmacol Ther. 2024 Apr;115(4):745-757. doi: 10.1002/cpt.3105. Epub 2024 Jan 31.

Abstract

In 2020, Novartis Pharmaceuticals Corporation and the U.S. Food and Drug Administration (FDA) started a 4-year scientific collaboration to approach complex new data modalities and advanced analytics. The scientific question was to find novel radio-genomics-based prognostic and predictive factors for HR+/HER- metastatic breast cancer under a Research Collaboration Agreement. This collaboration has been providing valuable insights to help successfully implement future scientific projects, particularly using artificial intelligence and machine learning. This tutorial aims to provide tangible guidelines for a multi-omics project that includes multidisciplinary expert teams, spanning across different institutions. We cover key ideas, such as "maintaining effective communication" and "following good data science practices," followed by the four steps of exploratory projects, namely (1) plan, (2) design, (3) develop, and (4) disseminate. We break each step into smaller concepts with strategies for implementation and provide illustrations from our collaboration to further give the readers actionable guidance.

MeSH terms

  • Artificial Intelligence*
  • Genomics
  • Humans
  • Machine Learning
  • Multiomics*