Redefining the structural motifs that determine RNA binding and RNA editing by pentatricopeptide repeat proteins in land plants

Plant J. 2016 Feb;85(4):532-47. doi: 10.1111/tpj.13121.

Abstract

The pentatricopeptide repeat (PPR) proteins form one of the largest protein families in land plants. They are characterised by tandem 30-40 amino acid motifs that form an extended binding surface capable of sequence-specific recognition of RNA strands. Almost all PPR proteins are post-translationally targeted to plastids and mitochondria, where they play important roles in post-transcriptional processes including splicing, RNA editing and the initiation of translation. A code describing how PPR proteins recognise their RNA targets promises to accelerate research on these proteins, but making use of this code requires accurate definition and annotation of all of the various nucleotide-binding motifs in each protein. We have used a structural modelling approach to define 10 different variants of the PPR motif found in plant proteins, in addition to the putative deaminase motif that is found at the C-terminus of many RNA-editing factors. We show that the super-helical RNA-binding surface of RNA-editing factors is potentially longer than previously recognised. We used the redefined motifs to develop accurate and consistent annotations of PPR sequences from 109 genomes. We report a high error rate in PPR gene models in many public plant proteomes, due to gene fusions and insertions of spurious introns. These consistently annotated datasets across a wide range of species are valuable resources for future comparative genomics studies, and an essential prerequisite for accurate large-scale computational predictions of PPR targets. We have created a web portal (http://www.plantppr.com) that provides open access to these resources for the community.

Keywords: RNA binding; RNA editing; genome annotation; pentatricopeptide repeat motifs; pentatricopeptide repeat proteins; structural modelling.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Motifs
  • Amino Acid Sequence
  • Embryophyta / genetics*
  • Embryophyta / metabolism
  • Mitochondria / metabolism
  • Models, Molecular
  • Models, Structural*
  • Molecular Sequence Annotation
  • Plant Proteins / chemistry*
  • Plant Proteins / genetics
  • Plant Proteins / metabolism
  • Plastids / metabolism
  • Protein Transport
  • RNA Editing / genetics*
  • RNA Recognition Motif Proteins / chemistry
  • RNA Recognition Motif Proteins / genetics
  • RNA Recognition Motif Proteins / metabolism
  • RNA, Plant / genetics
  • Sequence Alignment

Substances

  • Plant Proteins
  • RNA Recognition Motif Proteins
  • RNA, Plant