Identification of Regulatory SNPs Associated with Vicine and Convicine Content of Vicia faba Based on Genotyping by Sequencing Data Using Deep Learning

Genes (Basel). 2020 Jun 5;11(6):614. doi: 10.3390/genes11060614.

Abstract

Faba bean (Vicia faba) is a grain legume grown globally both for human consumption and as livestock feed. Despite its agro-ecological importance, the use of Vicia faba is severely hampered by the anti-nutritive seed compounds vicine and convicine (V+C). The genes responsible for low V+C content have not yet been identified. In this study, we aim to computationally identify regulatory SNPs (rSNPs), i.e., SNPs in the promoter regions of genes that are thought to govern the V+C content of Vicia faba. For this purpose, we first trained a deep learning model on the gene annotations of seven related species of the Leguminosae family. Applying this model, we predicted putative promoters in a partial genome of Vicia faba that we assembled from genotyping-by-sequencing (GBS) data. Exploiting the synteny between Medicago truncatula and Vicia faba, we identified two rSNPs that are significantly associated with V+C content. In particular, the allele substitutions at these rSNPs result in pronounced changes to the binding sites of the transcription factors (TFs) MYB4, MYB61, and SQUA. Knowledge of these TFs and their rSNPs may enhance our understanding of the regulatory programs controlling V+C content in Vicia faba and could provide new hypotheses for future breeding programs.
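To make concrete the idea that an allele substitution at an rSNP can change a TF binding site, the following is a minimal sketch that scores a short promoter sequence against a position weight matrix before and after a single-base substitution. It is not the authors' pipeline: the motif counts, the example sequence, and the function names (log_odds_matrix, best_window_score, allele_effect) are hypothetical and chosen only for illustration, and the toy matrix does not represent MYB4, MYB61, or SQUA.

```python
import math

# Hypothetical toy position frequency matrix (4 positions, consensus ACGA).
# Illustrative counts only; NOT a real MYB4, MYB61, or SQUA motif.
TOY_PFM = {
    "A": [8, 1, 1, 7],
    "C": [1, 7, 1, 1],
    "G": [0, 1, 7, 1],
    "T": [1, 1, 1, 1],
}


def log_odds_matrix(pfm, pseudocount=1.0, background=0.25):
    """Convert base counts into log2-odds scores against a uniform background."""
    width = len(pfm["A"])
    matrix = []
    for i in range(width):
        total = sum(pfm[b][i] for b in "ACGT") + 4 * pseudocount
        matrix.append(
            {b: math.log2(((pfm[b][i] + pseudocount) / total) / background)
             for b in "ACGT"}
        )
    return matrix


def best_window_score(seq, pwm):
    """Return the highest motif score over all windows of the sequence."""
    width = len(pwm)
    return max(
        sum(pwm[j][seq[i + j]] for j in range(width))
        for i in range(len(seq) - width + 1)
    )


def allele_effect(seq, pos, alt, pwm):
    """Score the reference sequence and the sequence carrying `alt` at `pos`."""
    ref_score = best_window_score(seq, pwm)
    alt_seq = seq[:pos] + alt + seq[pos + 1:]
    alt_score = best_window_score(alt_seq, pwm)
    return ref_score, alt_score, alt_score - ref_score


pwm = log_odds_matrix(TOY_PFM)
# Hypothetical promoter fragment; the C at index 3 is replaced by T.
ref_score, alt_score, delta = allele_effect("TTACGATT", 3, "T", pwm)
print(f"ref={ref_score:.2f} alt={alt_score:.2f} delta={delta:.2f}")
```

A negative delta indicates that the substituted allele weakens the best match to the toy motif. A real analysis would use curated TF matrices and a statistically motivated score threshold.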

Keywords: GBS; Vicia faba; convolutional neural network; promoter; rSNP; vicine/convicine.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Deep Learning
  • Genotype
  • Glucosides / genetics*
  • Polymorphism, Single Nucleotide / genetics
  • Pyrimidinones
  • Regulatory Sequences, Nucleic Acid / genetics*
  • Seeds / genetics
  • Synteny / genetics
  • Transcription Factors / genetics
  • Uridine / analogs & derivatives*
  • Uridine / genetics
  • Vicia faba / genetics*

Substances

  • Glucosides
  • Pyrimidinones
  • Transcription Factors
  • convicine
  • vicine
  • Uridine