Synthetic whole-slide image tile generation with gene expression profile-infused deep generative models

Francisco Carrillo-Perez; Marija Pizurica; Michael G Ozawa; Hannes Vogel; Robert B West; Christina S Kong; Luis Javier Herrera; Jeanne Shen; Olivier Gevaert

doi:10.1016/j.crmeth.2023.100534

Cell Rep Methods. 2023 Jul 19;3(8):100534. doi: 10.1016/j.crmeth.2023.100534. eCollection 2023 Aug 28.

Authors

Francisco Carrillo-Perez¹,², Marija Pizurica¹,³, Michael G Ozawa⁴, Hannes Vogel⁴, Robert B West⁴, Christina S Kong⁴, Luis Javier Herrera², Jeanne Shen⁴, Olivier Gevaert¹,⁵

Affiliations

¹ Stanford Center for Biomedical Informatics Research (BMIR), Stanford University, School of Medicine, 1265 Welch Road, Stanford, CA 94305-547, USA.
² Computer Engineering, Automatics and Robotics Department, University of Granada, C. Periodista Daniel Saucedo Aranda, s/n, Granada, 18014 Granada, Spain.
³ Internet Technology and Data Science Lab (IDLab), Ghent University, Technologiepark-Zwijnaarde 126, Gent, 9052 Gent, Belgium.
⁴ Department of Pathology, Stanford University School of Medicine, 300 Pasteur Dr, Palo Alto, CA 94304, USA.
⁵ Department of Biomedical Data Science, Stanford University, School of Medicine, Medical School Office Building (MSOB), 1265 Welch Road, Stanford, CA 94305-547, USA.

Abstract

In this work, we propose an approach to generate whole-slide image (WSI) tiles by using deep generative models infused with matched gene expression profiles. First, we train a variational autoencoder (VAE) that learns a latent, lower-dimensional representation of multi-tissue gene expression profiles. Then, we use this representation to infuse generative adversarial networks (GANs) that generate lung and brain cortex tissue tiles, resulting in a new model that we call RNA-GAN. Expert pathologists preferred tiles generated by RNA-GAN over tiles generated using traditional GANs, and RNA-GAN needed fewer training epochs to generate high-quality tiles. Finally, RNA-GAN was able to generalize to gene expression profiles outside of the training set, showing imputation capabilities. A web-based quiz in which users try to distinguish real from synthetic tiles is available at https://rna-gan.stanford.edu/, and the code for RNA-GAN is available at https://github.com/gevaertlab/RNA-GAN.
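
The two-stage idea in the abstract (encode an expression profile into a latent vector, then feed that vector to a generator) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the linear encoder, the random weights, all dimensions, and the choice to condition by concatenating the latent with noise are assumptions made only for illustration, and the function names are hypothetical. The abstract does not state how the latent representation is infused into the GAN.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(expr, w_mu, w_logvar):
    """Toy linear VAE encoder: returns mean and log-variance of the latent."""
    return expr @ w_mu, expr @ w_logvar

def reparameterize(mu, logvar, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, I) (reparameterization trick)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * logvar) * eps

def generator_input(latent, noise_dim, rng):
    """Build a generator input by concatenating the expression latent with noise.

    Concatenation is an assumed conditioning scheme, chosen for simplicity.
    """
    noise = rng.standard_normal((latent.shape[0], noise_dim))
    return np.concatenate([latent, noise], axis=1)

# All sizes below are illustrative, not taken from the paper.
n_samples, n_genes, latent_dim, noise_dim = 4, 1000, 32, 64
expr = rng.random((n_samples, n_genes))                      # stand-in expression profiles
w_mu = rng.standard_normal((n_genes, latent_dim)) * 0.01     # untrained, random weights
w_logvar = rng.standard_normal((n_genes, latent_dim)) * 0.01

mu, logvar = encode(expr, w_mu, w_logvar)
z = reparameterize(mu, logvar, rng)
g_in = generator_input(z, noise_dim, rng)
print(g_in.shape)  # (4, 96)
```

In a trained system the encoder weights would come from fitting the VAE on gene expression data, and `g_in` would be passed to a convolutional generator that outputs a tissue tile; neither step is shown here.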

Keywords: artificial intelligence; deep learning; generative adversarial network; generative model; synthetic biomedical data; variational autoencoder.

Publication types

Research Support, Non-U.S. Gov't
Research Support, N.I.H., Extramural

MeSH terms

Brain*
Cerebral Cortex
Learning
RNA
Transcriptome*

Substances

RNA

Grants and funding

R01 CA260271/CA/NCI NIH HHS/United States