New layers in understanding and predicting α-linolenic acid content in plants using amino acid characteristics of omega-3 fatty acid desaturase

Comput Biol Med. 2014 Nov:54:14-23. doi: 10.1016/j.compbiomed.2014.08.019. Epub 2014 Aug 26.

Abstract

α-Linolenic acid (ALA) is the most abundant omega-3 fatty acid in plants. ALA content is highly variable, ranging from 0 to 1% in rice and corn to >50% in perilla and flax. ALA production is strongly correlated with the enzymatic activity of omega-3 fatty acid desaturase. To unravel the mechanisms underlying omega-3 diversity, 895 protein features of omega-3 fatty acid desaturase were compared between plants with high and low omega-3 content. Attribute weighting showed that the enzyme in plants with high omega-3 content has higher values of Lys, Lys-Phe, and Pro-Asn but lower values of aliphatic index, Gly-His, and Pro-Leu. A Random Forest model using the Accuracy criterion, run on the dataset pre-filtered with the Info Gain algorithm, was the best model for distinguishing high omega-3 content, and it did so based on the frequency of Lys-Lys in the structure of fatty acid desaturase. Interestingly, the discriminant function algorithm could predict the level of omega-3 with 75% accuracy using only the six important selected attributes (out of 895 protein attributes) of fatty acid desaturase. We developed "Plant omega3 predictor" to predict α-linolenic acid content from structural features of omega-3 fatty acid desaturase. The software either calculates the 6 key structural protein features from an imported FASTA sequence of omega-3 fatty acid desaturase or uses imported feature values, and then predicts the ALA content using the discriminant function formula. This work unravels a mechanism underpinning omega-3 diversity by identifying the key protein attributes in the structure of omega-3 desaturase, offering a new approach to obtaining higher omega-3 content.
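
The feature-calculation step described in the abstract can be illustrated with a minimal Python sketch. It is not the code of "Plant omega3 predictor". It assumes that the six key features are the six named by attribute weighting (Lys, Lys-Phe, Pro-Asn, aliphatic index, Gly-His, Pro-Leu), that amino acid and dipeptide values are expressed as fractions of residues and of adjacent residue pairs, and that the aliphatic index follows the standard Ikai definition. The function names are hypothetical. The discriminant function coefficients are not given in the abstract, so the prediction step is not reproduced.

```python
from typing import Dict


def parse_fasta(text: str) -> str:
    """Return the first sequence of a FASTA string, uppercased, headers removed."""
    seq_lines = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if seq_lines:  # a second record starts here: stop reading
                break
            continue
        seq_lines.append(line.upper())
    return "".join(seq_lines)


def dipeptide_fraction(seq: str, pair: str) -> float:
    """Fraction of adjacent residue pairs equal to `pair` (overlapping count).

    Assumption: the abstract does not state how dipeptide values are normalised.
    """
    hits = sum(1 for i in range(len(seq) - 1) if seq[i:i + 2] == pair)
    return hits / (len(seq) - 1)


def aliphatic_index(seq: str) -> float:
    """Standard aliphatic index: X(Ala) + 2.9*X(Val) + 3.9*(X(Ile) + X(Leu)),
    where X is the mole percent of each residue."""
    n = len(seq)

    def mole_percent(aa: str) -> float:
        return 100.0 * seq.count(aa) / n

    return (mole_percent("A")
            + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


def six_features(seq: str) -> Dict[str, float]:
    """Compute the six attributes named in the abstract.

    Assumption: these six are the inputs to the discriminant function.
    """
    if len(seq) < 2:
        raise ValueError("sequence must contain at least two residues")
    return {
        "Lys": seq.count("K") / len(seq),
        "Lys-Phe": dipeptide_fraction(seq, "KF"),
        "Pro-Asn": dipeptide_fraction(seq, "PN"),
        "Aliphatic index": aliphatic_index(seq),
        "Gly-His": dipeptide_fraction(seq, "GH"),
        "Pro-Leu": dipeptide_fraction(seq, "PL"),
    }


if __name__ == "__main__":
    # Toy sequence for illustration only; not a real desaturase.
    toy = ">toy_sequence\nKFPNGHPLAV\n"
    for name, value in six_features(parse_fasta(toy)).items():
        print(f"{name}: {value:.4f}")
```

In this sketch the resulting feature values would then be passed to a discriminant function whose coefficients would have to be taken from the full paper or the software itself.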

Keywords: Amino acids; Bioinformatics; Discriminant function; Feature selection; Machine learning: modelling; Omega-3; Prediction; Random Forest model; α-Linolenic acid.

MeSH terms

  • Algorithms*
  • Amino Acid Sequence
  • Amino Acids / chemistry*
  • Amino Acids / metabolism
  • Binding Sites
  • Fatty Acid Desaturases
  • Molecular Sequence Data
  • Plant Proteins / chemistry*
  • Plant Proteins / metabolism
  • Protein Binding
  • Protein Interaction Mapping / methods*
  • Sequence Analysis, Protein / methods*
  • Software*
  • alpha-Linolenic Acid / chemistry*
  • alpha-Linolenic Acid / metabolism

Substances

  • Amino Acids
  • Plant Proteins
  • alpha-Linolenic Acid
  • Fatty Acid Desaturases
  • omega-3 fatty acid desaturase