Systematic functional interrogation of human pseudogenes using CRISPRi

Genome Biol. 2021 Aug 23;22(1):240. doi: 10.1186/s13059-021-02464-2.

Abstract

Background: The human genome encodes over 14,000 pseudogenes, which are evolutionary relics of protein-coding genes and are commonly considered nonfunctional. Emerging evidence suggests that some pseudogenes may exert important functions. However, the extent to which human pseudogenes are functionally relevant remains unclear. There has been no large-scale characterization of pseudogene function because of technical challenges, including high sequence similarity between pseudogenes and their parent genes, and poor annotation of transcription start sites.

Results: To overcome these technical obstacles, we develop an integrated computational pipeline to design the first genome-wide library of CRISPR interference (CRISPRi) single-guide RNAs (sgRNAs) that target human pseudogene promoter-proximal regions. We perform the first pseudogene-focused CRISPRi screen in luminal A breast cancer cells and reveal approximately 70 pseudogenes that affect breast cancer cell fitness. Among the top hits, we identify a cancer-testis unitary pseudogene, MGAT4EP, that is predominantly localized in the nucleus and interacts with FOXA1, a key regulator in luminal A breast cancer. By enhancing the promoter binding of FOXA1, MGAT4EP upregulates the expression of the oncogenic transcription factor FOXM1. Integrative analyses of multi-omic data from The Cancer Genome Atlas (TCGA) reveal many unitary pseudogenes whose expression is significantly dysregulated and/or associated with overall/relapse-free survival of patients in diverse cancer types.
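
The abstract does not describe the pipeline's internals, so the following is only a minimal sketch of one step such a design pipeline plausibly needs: discarding candidate sgRNAs that would also match the parent gene. It is not the authors' method. The function names, the NGG PAM (which assumes an SpCas9-derived dCas9), the 20-nt protospacer length, and the mismatch threshold are all illustrative assumptions.

```python
# Hypothetical sketch: keep only sgRNA candidates from a pseudogene
# promoter-proximal window that are distinguishable from the parent gene.
# All names and thresholds are assumptions, not taken from the paper.

def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def find_protospacers(seq, length=20):
    """Collect every protospacer of the given length that is followed
    by an NGG PAM, scanning both strands of seq."""
    found = set()
    for strand_seq in (seq, reverse_complement(seq)):
        for i in range(len(strand_seq) - length - 2):
            pam = strand_seq[i + length:i + length + 3]
            if pam[1:] == "GG":
                found.add(strand_seq[i:i + length])
    return found

def hamming(a, b):
    """Count mismatched positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))

def min_mismatches(guide, target):
    """Smallest number of mismatches between guide and any same-length
    window on either strand of target."""
    best = len(guide)
    for strand_seq in (target, reverse_complement(target)):
        for i in range(len(strand_seq) - len(guide) + 1):
            best = min(best, hamming(guide, strand_seq[i:i + len(guide)]))
    return best

def specific_guides(pseudogene_window, parent_seq, min_mm=3):
    """Return candidates that differ from every parent-gene window by
    at least min_mm mismatches (threshold is an assumed value)."""
    return sorted(
        g for g in find_protospacers(pseudogene_window)
        if min_mismatches(g, parent_seq) >= min_mm
    )

if __name__ == "__main__":
    # Toy sequences, invented for illustration only.
    shared = "ACGTACGTACGTACGTACGT"
    unique = "TTTTTCTCTCAAAAATTTTT"
    window = shared + "AGG" + unique + "TGG"
    parent = shared + "AGG"
    for guide in specific_guides(window, parent):
        print(guide)
```

A production pipeline would score off-targets against the whole genome with an aligner rather than a single parent sequence, and would first have to choose the promoter-proximal window from transcription start site evidence, which the abstract notes is poorly annotated for pseudogenes.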

Conclusions: Our study represents the first large-scale characterization of pseudogene function. Our findings suggest the importance of the nuclear function of unitary pseudogenes and underscore their underappreciated roles in human diseases. The functional genomic resources developed here will greatly facilitate the study of human pseudogene function.

Keywords: CRISPR interference; Cancer; FOXA1; FOXM1; GTEx; Luminal A breast cancer; Nucleus; Pseudogene; TCGA; Transcriptional regulation; Unitary pseudogene.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Breast Neoplasms / genetics
  • Cell Nucleus / genetics
  • Cell Proliferation
  • Clustered Regularly Interspaced Short Palindromic Repeats / genetics*
  • Computational Biology
  • Forkhead Box Protein M1 / metabolism
  • Gene Expression Regulation, Neoplastic
  • Hepatocyte Nuclear Factor 3-alpha / metabolism
  • Humans
  • MCF-7 Cells
  • Promoter Regions, Genetic / genetics
  • Protein Binding
  • Pseudogenes / genetics*
  • RNA, Guide, CRISPR-Cas Systems / genetics
  • Reproducibility of Results
  • Up-Regulation / genetics

Substances

  • FOXA1 protein, human
  • FOXM1 protein, human
  • Forkhead Box Protein M1
  • Hepatocyte Nuclear Factor 3-alpha
  • RNA, Guide, CRISPR-Cas Systems