Multispecies genome-wide analysis defines the MAP3K gene family in Gossypium hirsutum and reveals conserved family expansions

BMC Bioinformatics. 2019 Mar 14;20(Suppl 2):99. doi: 10.1186/s12859-019-2624-9.

Abstract

Background: Gene families are sets of structurally and evolutionarily related genes, in one or multiple species, that typically share a conserved biological function. As such, the identification and subsequent analysis of entire gene families is widely employed in the evolutionary and functional genomics of both well-established and newly sequenced plant genomes. Currently, plant gene families are typically identified in one of two ways: 1) HMM-profile-based searches using models built on Arabidopsis thaliana genes or 2) coding sequence homology searches using curated databases. Integrated databases containing functionally annotated genes and gene families have been developed for model organisms and several important crops; however, a comprehensive methodology for gene family annotation is currently lacking, which prevents automated annotation of newly sequenced genomes.

Results: This paper proposes a combined approach of homology identification, motif conservation, phylogenomic and integrated gene expression analyses to define gene family structures in multiple plant species. The MAP3K gene families in seven plant species, including two previously unexamined species, Gossypium hirsutum and Zostera marina, were characterized to reveal new insights into their collective function and evolution and to demonstrate the effectiveness of our novel methodology.
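
As a rough illustration of how evidence from separate searches might be combined, the sketch below pools candidate genes from an HMM-profile search and a homology search, then keeps only those whose protein sequence contains a required motif. It is a minimal sketch under stated assumptions, not the paper's pipeline: the function name, the gene identifiers, the sequences and the toy motif pattern are all hypothetical, and the phylogenomic and expression steps are not modelled.

```python
import re


def combine_candidates(hmm_hits, homology_hits, sequences, motif_pattern):
    """Pool candidates from both searches and keep those carrying the motif.

    Returns a dict mapping each retained gene ID to the list of searches
    that reported it. All inputs are hypothetical illustration data.
    """
    motif = re.compile(motif_pattern)
    kept = {}
    # Union of both candidate sets, visited in sorted order for determinism.
    for gene_id in sorted(set(hmm_hits) | set(homology_hits)):
        seq = sequences.get(gene_id, "")
        if not motif.search(seq):
            continue  # no motif match: drop the candidate
        evidence = []
        if gene_id in hmm_hits:
            evidence.append("hmm")
        if gene_id in homology_hits:
            evidence.append("homology")
        kept[gene_id] = evidence
    return kept


# Toy inputs (assumed for illustration only).
sequences = {
    "geneA": "MKGSGAFGTV",
    "geneB": "MKLLPPAA",
    "geneC": "AAGQGSYGRV",
}
hmm_hits = {"geneA", "geneB"}
homology_hits = {"geneA", "geneC"}

# "G.G..G" is a toy regular expression standing in for a conserved motif.
result = combine_candidates(hmm_hits, homology_hits, sequences, "G.G..G")
```

Recording which search supported each gene makes it easy to see where the two identification strategies agree and where only one of them reports a candidate.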

Conclusion: Compared with recent reports, this methodology performs significantly better at identifying and analyzing gene family members across several plant species, including monocots and dicots, diploids and polyploids.

Keywords: Gene collinearity; Gene duplication; Gossypium hirsutum; HMM-profile gene search; MAP3K gene family; Orthologous gene search; Phylogenetic analysis; Sequence motif conservation.

MeSH terms

  • Gene Expression Regulation, Plant / genetics*
  • Genes, Plant / genetics*
  • Genome-Wide Association Study / methods*
  • Gossypium / chemistry*
  • MAP Kinase Kinase Kinase 1 / genetics*

Substances

  • MAP Kinase Kinase Kinase 1
  • MAP3K1 protein, human