Exon Elongation Added Intrinsically Disordered Regions to the Encoded Proteins and Facilitated the Emergence of the Last Eukaryotic Common Ancestor

Mol Biol Evol. 2023 Jan 4;40(1):msac272. doi: 10.1093/molbev/msac272.

Abstract

Most prokaryotic proteins consist of a single structural domain (SD) with few intrinsically disordered regions (IDRs), which by themselves do not adopt stable structures, whereas the typical eukaryotic protein comprises multiple SDs and IDRs. How eukaryotic proteins evolved to differ from prokaryotic proteins has not been fully elucidated. Here, we found that in eight eukaryotes, including vertebrates, invertebrates, a fungus, and plants, the longer the internal exons are, the more frequently they encode IDRs. Based on this observation, we propose the "small bang" model from the proteomic viewpoint: the protoeukaryotic genes had no introns and mostly encoded one SD each, but a majority of them were subsequently divided into multiple exons (step 1). Many exons unconstrained by SDs then elongated to encode IDRs (step 2). The elongated exons encoding IDRs frequently facilitated the acquisition of multiple SDs, giving rise to the last common ancestor of eukaryotes (step 3). One prediction of the model is that long internal exons are mostly unconstrained exons. Analyses of the eight eukaryotes are consistent with this prediction. In support of the model, we identified internal exons that elongated after the rat-mouse divergence and found that the expanded sections are mostly in unconstrained exons and preferentially encode IDRs. The model also predicts that SDs followed by long internal exons tend to have other SDs downstream. This prediction was also verified in all the eukaryotic species analyzed. Our model accounts for the dichotomy between prokaryotic and eukaryotic proteins and proposes a selective advantage conferred by IDRs.
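The central observation, that longer internal exons more often encode IDRs, could be checked with a calculation along the following lines. This is only a minimal sketch and not the authors' pipeline: the function name `idr_fraction_by_length_bin`, the bin edges, and the toy exons are all hypothetical, and the per-residue disorder flags are assumed to come from some disorder predictor that the abstract does not name.

```python
from bisect import bisect_right


def idr_fraction_by_length_bin(exons, bin_edges):
    """Return the fraction of disordered residues in each exon-length bin.

    exons:     iterable of (exon_length_nt, disorder_flags), where
               disorder_flags holds one bool per encoded residue
               (True = predicted disordered). Hypothetical input format.
    bin_edges: ascending exon lengths (nt) separating the bins; an exon
               whose length equals an edge falls in the higher bin.
    """
    n_bins = len(bin_edges) + 1
    disordered = [0] * n_bins
    total = [0] * n_bins
    for length_nt, flags in exons:
        b = bisect_right(bin_edges, length_nt)  # index of the length bin
        disordered[b] += sum(flags)             # True counts as 1
        total[b] += len(flags)
    # None marks a bin that received no residues.
    return [d / t if t else None for d, t in zip(disordered, total)]


# Invented toy exons for illustration only; these are not data from the paper.
toy_exons = [
    (90, [False] * 30),
    (120, [False] * 35 + [True] * 5),
    (300, [False] * 50 + [True] * 50),
    (900, [False] * 60 + [True] * 240),
]

fractions = idr_fraction_by_length_bin(toy_exons, bin_edges=[150, 600])
for label, frac in zip(["<150 nt", "150-599 nt", ">=600 nt"], fractions):
    print(label, frac)
```

Under the trend reported in the abstract, the fraction would be expected to rise from the shortest bin to the longest when real exon annotations and disorder predictions are supplied. The toy input is constructed to show that pattern and carries no evidential weight.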

Keywords: eukaryote; evolution; exon; intrinsically disordered region; intron; protein structure.

MeSH terms

  • Animals
  • Eukaryota* / genetics
  • Evolution, Molecular
  • Exons
  • Intrinsically Disordered Proteins* / genetics
  • Mice
  • Proteins / genetics
  • Proteomics
  • Rats

Substances

  • Proteins
  • Intrinsically Disordered Proteins