Inferring Function from Homology

Methods Mol Biol. 2017:1526:23-40. doi: 10.1007/978-1-4939-6613-4_2.

Abstract

Recent technological advances in sequencing and high-throughput DNA cloning have resulted in the generation of vast quantities of biological sequence data. Ideally, the functions of individual genes and proteins predicted by these methods should be assessed experimentally within the context of a defined hypothesis. However, if no hypothesis is known a priori, or the number of sequences to be assessed is large, bioinformatics techniques may be useful in predicting function. This chapter proposes a pipeline of freely available Web-based tools to analyze protein-coding DNA and peptide sequences of unknown function. Information accumulated at each step of the pipeline is used to build a testable hypothesis of function. The following methods are described in detail: (1) annotation of gene function through protein domain detection (SMART and Pfam); (2) sequence similarity methods for homolog detection (BLAST and DELTA-BLAST); (3) comparison of sequences with whole-genome data.
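The idea of accumulating information step by step can be illustrated with a minimal Python sketch. It is not the chapter's own code: the names `translate` and `FunctionHypothesis`, the step labels, and the placeholder findings are all hypothetical, and the sketch assumes the results of each Web-based tool are read off by hand and recorded manually. The only biological fact it encodes is the standard genetic code, used to turn a protein-coding DNA sequence into the peptide that would be submitted to the tools.

```python
from dataclasses import dataclass, field

# Standard genetic code, laid out in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a coding sequence in frame 1, stopping at the first stop codon.

    Codons containing characters other than T, C, A, G become 'X'.
    """
    dna = dna.upper()
    peptide = []
    for pos in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE.get(dna[pos:pos + 3], "X")
        if residue == "*":  # stop codon ends the peptide
            break
        peptide.append(residue)
    return "".join(peptide)


@dataclass
class FunctionHypothesis:
    """Hypothetical container for evidence gathered at each pipeline step."""
    peptide: str
    evidence: list = field(default_factory=list)

    def add(self, step: str, finding: str) -> None:
        # Each finding is entered by hand after inspecting a tool's output.
        self.evidence.append((step, finding))

    def summary(self) -> str:
        lines = [f"Query peptide: {self.peptide}"]
        lines += [f"- {step}: {finding}" for step, finding in self.evidence]
        return "\n".join(lines)


# Usage: the findings below are placeholders, not real results.
hypothesis = FunctionHypothesis(peptide=translate("ATGGCCTAA"))
hypothesis.add("domain detection", "<domains reported by SMART/Pfam>")
hypothesis.add("homolog detection", "<top hits from BLAST/DELTA-BLAST>")
hypothesis.add("genome comparison", "<orthologs/paralogs seen in a genome browser>")
print(hypothesis.summary())
```

Keeping the evidence as an ordered list of (step, finding) pairs mirrors the order in which the three methods are listed, so the summary reads as the reasoning behind the final hypothesis.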

Keywords: BLAST; Comparative genomics; Ensembl; Homology; Orthology; Paralogy; Pfam; Protein domain; SMART; UCSC genome browser.

MeSH terms

  • Computational Biology / methods*
  • Databases, Genetic
  • Genomics / methods*
  • Proteins / chemistry
  • Proteins / genetics
  • Proteins / metabolism*
  • Sequence Alignment / methods*

Substances

  • Proteins