PredictFP2: A New Computational Model to Predict Fusion Peptide Domain in All Retroviruses

IEEE/ACM Trans Comput Biol Bioinform. 2020 Sep-Oct;17(5):1714-1720. doi: 10.1109/TCBB.2019.2898943. Epub 2019 Feb 12.

Abstract

The fusion peptide (FP) is a pivotal domain for the entry of retroviruses into host cells, where they continue self-replication. This crucial role makes the FP a promising drug target for therapeutic intervention. The FP model proposed in our previous work is relatively inefficient at predicting FPs in retroviruses. In this work, we therefore propose a new computational model to predict FP domains in all retroviruses. It predicts FP domains by recognizing their start and end sites separately with a support vector machine (SVM), combined with knowledge of the hydrophobicity of the subdomain around the furin cleavage site. First, the classification accuracy is 91.91, 91.20, and 89.13 percent under the jackknife, 10-fold cross-validation, and 5-fold cross-validation tests, respectively. Second, the model discovered 69,753 and 493 putative FPs after scanning amino acid sequences and human endogenous retrovirus (HERV) DNA sequences, respectively, neither of which had FP annotations. Subsequently, a statistical analysis of the 69,753 putative FP sequences confirmed that the FP is a hydrophobic domain. Lastly, we mapped the distribution of the 493 putative FP sequences across human chromosomes and HERV families, which suggests that HERV FPs probably have chromosome and family preferences.
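The hydrophobicity cue mentioned in the abstract can be illustrated with a minimal sketch. This is not the PredictFP2 implementation: it assumes the commonly cited furin consensus motif R-X-[K/R]-R and the Kyte-Doolittle hydropathy scale, neither of which the abstract specifies, and the function names, the window length of 8, and the toy sequence are hypothetical choices made only for illustration.

```python
import re

# Kyte-Doolittle hydropathy values (assumed scale; the abstract does not name one).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Assumed furin consensus R-X-[K/R]-R; the lookahead lets overlapping motifs match.
FURIN_MOTIF = re.compile(r"(?=R.[KR]R)")


def furin_cleavage_positions(seq):
    """Return the index of the first residue after each R-X-[K/R]-R motif."""
    return [m.start() + 4 for m in FURIN_MOTIF.finditer(seq)]


def mean_hydropathy(seq, start, window=8):
    """Mean Kyte-Doolittle hydropathy of seq[start:start + window].

    Returns None if the window is empty; raises KeyError on non-standard residues.
    """
    segment = seq[start:start + window]
    if not segment:
        return None
    return sum(KD[aa] for aa in segment) / len(segment)


# Toy sequence (hypothetical): a basic cleavage motif followed by a hydrophobic stretch.
toy_seq = "GGSREKRAVGIGALFLGFLG"
for pos in furin_cleavage_positions(toy_seq):
    print(pos, mean_hydropathy(toy_seq, pos))
```

A strongly positive mean immediately downstream of a candidate cleavage site is consistent with a hydrophobic FP-like segment. In a fuller pipeline, a score like this would be one feature alongside the SVM-based start-site and end-site classifiers rather than a predictor on its own.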

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Amino Acid Sequence
  • Computational Biology / methods*
  • Hydrophobic and Hydrophilic Interactions
  • Protein Domains*
  • Retroviridae / chemistry*
  • Retroviridae / genetics
  • Sequence Analysis, Protein / methods*
  • Support Vector Machine
  • Viral Fusion Proteins / chemistry*
  • Viral Fusion Proteins / genetics

Substances

  • Viral Fusion Proteins