Identification of Novel Genes and Proteoforms in Angiostrongylus costaricensis through a Proteogenomic Approach

Pathogens. 2022 Oct 31;11(11):1273. doi: 10.3390/pathogens11111273.

Abstract

RNA sequencing (RNA-Seq) and mass-spectrometry-based proteomics data are often integrated in proteogenomic studies to assist in the prediction of eukaryote genome features, such as genes, splicing, single-nucleotide (SNVs), and single-amino-acid variants (SAAVs). Most genomes of parasite nematodes are draft versions that lack transcript- and protein-level information and whose gene annotations rely only on computational predictions. Angiostrongylus costaricensis is a roundworm species that causes an intestinal inflammatory disease, known as abdominal angiostrongyliasis (AA). Currently, there is no drug available that acts directly on this parasite, mostly due to the sparse understanding of its molecular characteristics. The available genome of A. costaricensis, specific to the Costa Rica strain, is a draft version that is not supported by transcript- or protein-level evidence. This study used RNA-Seq and MS/MS data to perform an in-depth annotation of the A. costaricensis genome. Our prediction improved the reference annotation with (a) novel coding and non-coding genes; (b) pieces of evidence of alternative splicing generating new proteoforms; and (c) a list of SNVs between the Brazilian (Crissiumal) and the Costa Rica strain. To the best of our knowledge, this is the first time that a multi-omics approach has been used to improve the genome annotation of A. costaricensis. We hope this improved genome annotation can assist in the future development of drugs, kits, and vaccines to treat, diagnose, and prevent AA caused by either the Brazil strain (Crissiumal) or the Costa Rica strain.

Keywords: RNA-Seq; genome annotation; mass spectrometry; ncRNAs; nematode.