The Integration of Data from Different Long-Read Sequencing Platforms Enhances Proteoform Characterization in Arabidopsis

Plants (Basel). 2023 Jan 22;12(3):511. doi: 10.3390/plants12030511.

Abstract

The increasing availability of massive omics data requires improving the quality of reference databases and their annotations. The combination of full-length isoform sequencing (Iso-Seq) with short-read transcriptomics and proteomics has been used successfully to improve proteoform characterization, a major ongoing goal in biology. However, the potential of including Oxford Nanopore Technologies Direct RNA Sequencing (ONT-DRS) data has not been explored. In this paper, we analyzed the impact of combining Iso-Seq- and ONT-DRS-derived data on the identification of proteoforms in Arabidopsis mass spectrometry (MS) proteomics data. To this end, we selected a proteomics dataset from senescent leaves and performed protein searches against three protein databases: AtRTD2 and AtRTD3, built from the transcriptomes of the same names, which are regarded as the most complete and up-to-date available for the species; and a custom hybrid database combining AtRTD3 with publicly available ONT-DRS transcriptomics data generated from Arabidopsis leaves. Our results show that including and combining long-read sequencing data from Iso-Seq and ONT-DRS in a proteogenomic workflow enhances proteoform characterization and discovery in bottom-up proteomics studies. This represents a great opportunity to investigate biological systems at an unprecedented scale, although it poses challenges for current protein search algorithms.
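
The abstract does not describe how the hybrid database was assembled. As a minimal sketch of the general idea only, and not the authors' method, the Python below merges two protein FASTA sources and collapses identical sequences so that each candidate proteoform appears once in the search database. The function names, record identifiers, and toy sequences are all hypothetical, and collapsing by exact sequence identity is an assumption made for illustration.

```python
from typing import Dict, Iterable, Iterator, List, Tuple


def parse_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def merge_protein_databases(
    *databases: Iterable[Tuple[str, str]]
) -> Dict[str, List[str]]:
    """Map each unique protein sequence to every header that encodes it.

    Assumption: two records are the same entry only if their sequences
    are identical after upper-casing and removing a trailing stop '*'.
    """
    merged: Dict[str, List[str]] = {}
    for records in databases:
        for header, sequence in records:
            sequence = sequence.upper().rstrip("*")
            if not sequence:
                continue
            merged.setdefault(sequence, []).append(header)
    return merged


# Toy input with hypothetical identifiers and sequences, not real data.
reference_fasta = [">REF_0001", "MEDQVGFGF", ">REF_0002", "MAASEHRCV"]
nanopore_fasta = [">ONT_0001", "MEDQVGFGF*", ">ONT_0002", "MKTLLVAGG"]

hybrid = merge_protein_databases(
    parse_fasta(reference_fasta), parse_fasta(nanopore_fasta)
)

# Print the merged database as FASTA, joining headers of identical entries.
for sequence, headers in hybrid.items():
    print(">" + "|".join(headers))
    print(sequence)
```

Keeping all source headers for a collapsed sequence preserves the information about which platform supports each entry, which matters when a peptide match is later attributed to a novel proteoform.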

Keywords: Iso-Seq; ONT-DRS; PacBio; long-read; nanopore; protein database; proteoform; proteogenomics; sequencing.

Grants and funding

This research was funded by the Spanish Ministry of Science, Innovation and Universities (MCI-21-PID2020-113896GB-I00). L.G.-C. is supported by the Government of the Principality of Asturias (Spain) through the Severo Ochoa Programme (BP19-146). J.P. is supported by the Juan de la Cierva Incorporación Programme (IJC-2019-040330-I).