Identifying transcript 5' capped ends in Plasmodium falciparum

PeerJ. 2021 Aug 25:9:e11983. doi: 10.7717/peerj.11983. eCollection 2021.

Abstract

Background: The genome of the human malaria parasite Plasmodium falciparum is poorly annotated, particularly with respect to the 5' capped ends of its mRNA transcripts. New approaches are needed to fully catalog P. falciparum transcripts in order to understand gene function and regulation in this organism.

Methods: We developed a transcriptomic method based on next-generation sequencing of complementary DNA (cDNA) enriched for full-length fragments using eIF4E, a 5' cap-binding protein, together with an unenriched control. A DNA sequencing adapter was added after enrichment of full-length cDNA using two different ligation protocols. From the mapped sequence reads, enrichment scores were calculated for all transcribed nucleotides and used to calculate P-values of 5' capped nucleotide enrichment. Sensitivity and accuracy were increased by combining P-values from replicate experiments. Data were obtained for P. falciparum ring, trophozoite and schizont stages of intra-erythrocytic development.
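The abstract does not state how P-values from replicate experiments were combined. As an illustration only, the sketch below uses Fisher's method, one common way to combine independent P-values; the choice of Fisher's method, the function name fisher_combine and the example values are assumptions and are not taken from the study.

```python
import math


def fisher_combine(p_values):
    """Combine independent P-values with Fisher's method.

    Assumption: Fisher's method is used here purely for illustration;
    the abstract does not say which combination method the authors used.
    """
    if not p_values:
        raise ValueError("at least one P-value is required")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("P-values must lie in the interval (0, 1]")

    k = len(p_values)
    # Fisher statistic X = -2 * sum(ln p) follows a chi-square distribution
    # with 2k degrees of freedom under the null; we work with X / 2.
    half_x = -sum(math.log(p) for p in p_values)

    # For an even number of degrees of freedom (2k), the chi-square
    # survival function has the closed form
    #   exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_x / i
        total += term
    return min(1.0, math.exp(-half_x) * total)


# Hypothetical P-values for one nucleotide position in two replicates.
combined = fisher_combine([0.05, 0.05])
```

With two replicates that each give a marginal P-value, the combined value is smaller than either input, which is the sense in which combining replicates can increase sensitivity. The method assumes the replicate tests are independent.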

Results: 5' capped nucleotide signals were mapped to 17,961 non-overlapping P. falciparum genomic intervals. Analysis of the dominant 5' capped nucleotide in these genomic intervals revealed the presence of two groups with distinctive epigenetic features and sequence patterns. A total of 4,512 transcripts were annotated as 5' capped based on the correspondence of 5' end with 5' capped nucleotide annotated from full-length cDNA data.

Discussion: The presence of two groups of 5' capped nucleotides suggests that alternative mechanisms may exist for producing 5' capped transcript ends in P. falciparum. The 5' capped transcripts that are antisense, outside of, or partially overlapping coding regions may be important regulators of gene function in P. falciparum.

Keywords: 5′ capped nucleotide; Full-length cDNA; Malaria; Plasmodium falciparum; Transcriptomics; eIF4E.

Grants and funding

This work was supported by the Platform Technology Management section, National Center for Genetic Engineering and Biotechnology (BIOTEC), Thailand [P1201270 and P1551103, both projects jointly to Philip J. Shaw and Jittima Piriyapongsa]; the Thailand Research Fund [RSA5880064 to Chairat Uthaipibull]; and the National Science and Technology Development Agency (Thailand) [P1300832 to Chairat Uthaipibull, P1850116 (Research Chair Grant) and P1450883 to Sumalee Kamchonwongpaisan]. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.