Analysis of the compositional biases in Plasmodium falciparum genome and proteome using Arabidopsis thaliana as a reference

Gene. 2004 Jul 21;336(2):163-73. doi: 10.1016/j.gene.2004.04.029.

Abstract

Comparative genomic analysis of the malaria causative agent, Plasmodium falciparum, with other eukaryotes for which the complete genome is available revealed that the P. falciparum genome was more similar to the genome of a plant, Arabidopsis thaliana, than to those of other non-apicomplexan taxa. Plant-like sequences are thought to result from horizontal gene transfers after a secondary endosymbiosis involving an algal ancestor. Using the A. thaliana genome and proteome as a reference offers an opportunity to refine our understanding of the extreme compositional bias in the P. falciparum genome, which leads to a proteome-wide amino acid bias. A set of pairs of non-redundant protein homologues was selected using rigorous genome-wide sequence comparison methods. Introducing A. thaliana as a reference provided a means to weigh the magnitude of protein evolutionary divergence in P. falciparum. The correlation of amino acid proportions with evolutionary time supports the hypothesis that, in the P. falciparum proteome, amino acids encoded by GC-rich codons are directionally replaced by amino acids encoded by AT-rich codons. The long-term deviation of codons in malarial sequences appears to be a possible consequence of a genome-wide trinucleotide signature imprinting. Additionally, this study suggests possible working guidelines to improve the accuracy of P. falciparum sequence comparisons for homology searches and phylogenetic studies.
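The link between codon GC content and amino acid usage described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: it only scores each amino acid by the mean GC fraction of its synonymous codons in the standard genetic code, then averages that score over a protein sequence. The function names (codon_gc_by_amino_acid, mean_codon_gc) are hypothetical, and the choice of an unweighted mean over synonymous codons is an assumption made for illustration.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code (NCBI translation table 1), codons ordered
# TTT, TTC, TTA, TTG, TCT, ... with the third base varying fastest.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codon_gc_by_amino_acid():
    """Mean GC fraction of the synonymous codons of each amino acid."""
    gc = {}
    for codon, aa in CODON_TABLE.items():
        if aa == "*":  # skip stop codons
            continue
        gc.setdefault(aa, []).append(sum(base in "GC" for base in codon) / 3)
    return {aa: sum(values) / len(values) for aa, values in gc.items()}


AA_GC = codon_gc_by_amino_acid()


def mean_codon_gc(protein):
    """Average, over residues, of the mean GC fraction of each residue's codons.

    Characters that are not one of the 20 standard amino acids are ignored.
    """
    scores = [AA_GC[aa] for aa in protein.upper() if aa in AA_GC]
    if not scores:
        raise ValueError("no standard amino acids in sequence")
    return sum(scores) / len(scores)
```

For a pair of homologues, comparing mean_codon_gc(pf_seq) with mean_codon_gc(at_seq), where pf_seq and at_seq are placeholder names for the two aligned protein sequences, gives a rough indication of whether the P. falciparum protein has shifted toward residues encoded by AT-rich codons. A lower score means a larger share of residues such as lysine, asparagine, or isoleucine, whose codons are AT-rich.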

Publication types

  • Comparative Study

MeSH terms

  • AT Rich Sequence
  • Algorithms
  • Amino Acids / genetics
  • Animals
  • Arabidopsis / genetics*
  • Arabidopsis Proteins / genetics
  • Base Composition
  • Codon / genetics
  • Databases, Genetic
  • Evolution, Molecular
  • GC Rich Sequence
  • Genome, Plant*
  • Genome, Protozoan*
  • Plasmodium falciparum / genetics*
  • Proteome / genetics*
  • Protozoan Proteins / genetics
  • Sequence Alignment

Substances

  • Amino Acids
  • Arabidopsis Proteins
  • Codon
  • Proteome
  • Protozoan Proteins