PGNneo: A Proteogenomics-Based Neoantigen Prediction Pipeline in Noncoding Regions

Cells. 2023 Mar 1;12(5):782. doi: 10.3390/cells12050782.

Abstract

Neoantigen-based personalized vaccines hold promise for cancer immunotherapy. A central challenge in neoantigen vaccine design is to identify, rapidly and accurately, the neoantigens in each patient that have vaccine potential. Evidence shows that neoantigens can be derived from noncoding sequences, but few tools are designed specifically to identify neoantigens in noncoding regions. In this work, we describe PGNneo, a proteogenomics-based pipeline for reliably discovering neoantigens derived from noncoding regions of the human genome. PGNneo comprises four modules: (1) noncoding somatic variant calling and HLA typing; (2) peptide extraction and customized database construction; (3) variant peptide identification; and (4) neoantigen prediction and selection. We demonstrated the effectiveness of PGNneo by applying and validating it in two real-world hepatocellular carcinoma (HCC) cohorts. TP53, WWP1, ATM, KMT2C, and NFE2L2, which are frequently mutated genes associated with HCC, were identified in the two cohorts and corresponded to 107 neoantigens from noncoding regions. In addition, we applied PGNneo to a colorectal cancer (CRC) cohort, demonstrating that the tool can be extended to, and verified in, other tumor types. In summary, PGNneo can specifically detect neoantigens derived from noncoding regions in tumors, providing additional immune targets for cancer types with a low tumor mutational burden (TMB) in coding regions. Together with our previous tool, PGNneo can identify neoantigens derived from both coding and noncoding regions and will thus contribute to a complete understanding of the tumor immune target landscape. PGNneo source code and documentation are available on GitHub. To facilitate the installation and use of PGNneo, we provide a Docker container and a GUI.
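As an illustration of the peptide extraction step in module (2), the sketch below enumerates the peptide windows that cover a single substituted residue. It is not taken from the PGNneo source code. The function name `variant_peptides`, the assumption that a translated amino acid sequence is already available, the restriction to single-residue substitutions, and the 8 to 11 residue length range (typical of HLA class I ligands) are all assumptions made for this example.

```python
from typing import Iterable, List


def variant_peptides(
    protein: str,
    pos: int,
    alt: str,
    lengths: Iterable[int] = range(8, 12),
) -> List[str]:
    """Return every window of the given lengths that covers the substituted residue.

    Hypothetical helper for illustration only.
    protein: translated amino acid sequence (assumed to be given)
    pos:     0-based index of the substituted residue
    alt:     single-letter code of the variant residue
    """
    if not 0 <= pos < len(protein):
        raise ValueError("pos lies outside the sequence")
    if len(alt) != 1:
        raise ValueError("alt must be a single residue")

    # Apply the substitution to obtain the variant sequence.
    mutated = protein[:pos] + alt + protein[pos + 1:]

    peptides = []
    for k in lengths:
        # A window starting at s covers pos when s <= pos <= s + k - 1,
        # and it must also fit inside the sequence.
        first = max(0, pos - k + 1)
        last = min(pos, len(mutated) - k)
        for s in range(first, last + 1):
            peptides.append(mutated[s:s + k])
    return peptides


if __name__ == "__main__":
    # Toy sequence, not real data.
    for pep in variant_peptides("ACDEFGHIKLMNPQRSTVWY", pos=10, alt="A"):
        print(pep)
```

In a pipeline of this kind, windows such as these would typically be written to a customized database for matching against mass spectrometry data and then scored for HLA binding. Those later steps are not shown here.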

Keywords: neoantigen; noncoding regions; prediction pipeline; proteogenomics; tumor immunotherapy.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Antigens, Neoplasm
  • Carcinoma, Hepatocellular*
  • Humans
  • Liver Neoplasms*
  • Peptides
  • Proteogenomics*
  • Reproducibility of Results
  • Ubiquitin-Protein Ligases

Substances

  • Antigens, Neoplasm
  • Peptides
  • WWP1 protein, human
  • Ubiquitin-Protein Ligases

Grants and funding

This work was supported by the National Natural Science Foundation of China (31870829) and the Shanghai Municipal Health Commission Collaborative Innovation Cluster Project (2019CXJQ02).