Data augmentation and multimodal learning for predicting drug response in patient-derived xenografts from gene expressions and histology images

Front Med (Lausanne). 2023 Mar 7:10:1058919. doi: 10.3389/fmed.2023.1058919. eCollection 2023.

Abstract

Patient-derived xenografts (PDXs) are an appealing platform for preclinical drug studies. A primary challenge in modeling drug response prediction (DRP) with PDXs and neural networks (NNs) is the limited number of drug response samples. We investigate multimodal neural network (MM-Net) and data augmentation for DRP in PDXs. The MM-Net learns to predict response using drug descriptors, gene expressions (GE), and histology whole-slide images (WSIs). We explore whether combining WSIs with GE improves predictions as compared with models that use GE alone. We propose two data augmentation methods which allow us training multimodal and unimodal NNs without changing architectures with a single larger dataset: 1) combine single-drug and drug-pair treatments by homogenizing drug representations, and 2) augment drug-pairs which doubles the sample size of all drug-pair samples. Unimodal NNs which use GE are compared to assess the contribution of data augmentation. The NN that uses the original and the augmented drug-pair treatments as well as single-drug treatments outperforms NNs that ignore either the augmented drug-pairs or the single-drug treatments. In assessing the multimodal learning based on the MCC metric, MM-Net outperforms all the baselines. Our results show that data augmentation and integration of histology images with GE can improve prediction performance of drug response in PDXs.

Keywords: data augmentation; drug response prediction; gene expression; histology whole-slide images; multimodal deep learning; patient-derived xenograft (PDX); preclinical drug studies.

Grants and funding

This has been funded in whole or in part with Federal funding by the NCI-DOE Collaboration established by the U.S. Department of Energy (DOE) and the National Cancer Institute (NCI) of the National Institutes of Health, Cancer Moonshot Task Order No. 75N91019F00134 and under Frederick National Laboratory for Cancer Research Contract 75N91019D00024. This work was performed under the auspices of the U.S. Department of Energy by Argonne National Laboratory under Contract DE-AC02-06-CH11357, Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344, Los Alamos National Laboratory under Contract 89233218CNA000001, and Oak Ridge National Laboratory under Contract DE-AC05-00OR22725. Argonne National Laboratory's work on the Preclinical Modeling High-Performance Computing Artificial Intelligence project was supported in whole or in part by Leidos Biomedical Research, Inc., under PRJ1009868.