Identification of Moonlighting Proteins in Genomes Using Text Mining Techniques

Proteomics. 2018 Nov;18(21-22):e1800083. doi: 10.1002/pmic.201800083. Epub 2018 Oct 10.

Abstract

Moonlighting proteins, an emerging concept in the study of protein function, are proteins that perform two or more independent and distinct functions. An increasing number of moonlighting proteins have been reported in recent years; however, systematic study of the topic has been hindered because the secondary functions of proteins are usually found serendipitously in experiments. Toward systematic identification and study of moonlighting proteins, computational methods have been developed that identify them from several different information sources: database entries, literature, and large-scale omics data. In this study, an overview of methods for finding moonlighting proteins is presented. Then, the literature-mining method DextMP is applied to find new moonlighting proteins in three genomes: Arabidopsis thaliana, Caenorhabditis elegans, and Drosophila melanogaster. Potential moonlighting proteins identified by DextMP are further examined by a two-step manual literature-checking procedure, which finally yielded 13 new moonlighting proteins. The identified moonlighting proteins are categorized into two classes based on how clearly distinct the two functions of each protein are. A few of the identified moonlighting proteins are described in detail. Future directions for improving the DextMP algorithm are also discussed.

Keywords: bifunctional proteins; functional genomics; moonlighting proteins; protein function annotation; text mining.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Animals
  • Arabidopsis / genetics
  • Caenorhabditis elegans / genetics
  • Data Mining / methods*
  • Drosophila melanogaster / genetics
  • Genomics / methods*