A Survey of Data Mining and Deep Learning in Bioinformatics

J Med Syst. 2018 Jun 28;42(8):139. doi: 10.1007/s10916-018-1003-9.

Abstract

Medical science and health informatics have made great progress recently, and the generation, collection and accumulation of massive data now demand in-depth analytics. Meanwhile, we are entering a period in which novel technologies are starting to analyze and extract knowledge from tremendous amounts of data, bringing limitless potential for information growth. Machine learning and deep learning techniques play an increasingly significant role in the success of bioinformatics exploration from the point of view of biological data, and links between these two families of data analytics techniques and bioinformatics have been emphasized and established in both industry and academia. This survey reviews recent research that uses data mining and deep learning approaches to analyze domain-specific knowledge in bioinformatics. The authors give a brief summary of numerous data mining algorithms used for preprocessing, classification and clustering, as well as of various optimized neural network architectures used in deep learning, and they discuss and compare the advantages and disadvantages of these methods in practical applications in terms of industrial usage. The review aims to provide valuable insights for those who are starting to use data analytics methods in bioinformatics.

Keywords: Bioinformatics; Biomedicine; Data mining; Deep learning; Machine learning.

Publication types

  • Review

MeSH terms

  • Algorithms
  • Bayes Theorem
  • Computational Biology*
  • Data Mining*
  • Machine Learning
  • Surveys and Questionnaires