Flexible protein database based on amino acid k-mers

Sci Rep. 2022 Jun 1;12(1):9101. doi: 10.1038/s41598-022-12843-9.

Abstract

Identification of proteins is one of the most computationally intensive steps in genomics studies. It usually relies on aligners that do not accommodate rich information on proteins and that require additional pipeline steps for protein identification. We introduce kAAmer, a protein database engine based on amino-acid k-mers that provides efficient identification of proteins while supporting the incorporation of flexible annotations on these proteins. Moreover, the database is built to be used as a microservice, so that it can be hosted and queried remotely.
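
To make the idea of a k-mer based protein lookup concrete, the following is a minimal sketch in Python. It is not kAAmer's implementation: the function names, the toy sequences, the annotation fields, and the choice of k = 3 are all assumptions made for illustration. It builds an inverted index from amino-acid k-mers to protein identifiers, then ranks proteins by how many distinct query k-mers they share, and looks up free-form annotations for the best hit.

```python
from collections import Counter, defaultdict

K = 3  # illustrative k-mer length (an assumption, chosen small for the toy data)

# Hypothetical toy database: protein ID -> amino-acid sequence.
PROTEINS = {
    "protA": "MKTAYIAKQR",
    "protB": "MSTNPKPQRK",
}

# Hypothetical free-form annotations attached to each protein.
ANNOTATIONS = {
    "protA": {"description": "example protein A"},
    "protB": {"description": "example protein B"},
}


def kmers(sequence, k):
    """Yield every overlapping k-mer of an amino-acid sequence."""
    for i in range(len(sequence) - k + 1):
        yield sequence[i:i + k]


def build_index(proteins, k):
    """Map each k-mer to the set of protein IDs that contain it."""
    index = defaultdict(set)
    for protein_id, sequence in proteins.items():
        for kmer in kmers(sequence, k):
            index[kmer].add(protein_id)
    return index


def query(index, sequence, k):
    """Rank proteins by the number of distinct query k-mers they share."""
    hits = Counter()
    for kmer in set(kmers(sequence, k)):
        for protein_id in index.get(kmer, ()):
            hits[protein_id] += 1
    return hits.most_common()


INDEX = build_index(PROTEINS, K)
RESULTS = query(INDEX, "KTAYIAK", K)
BEST_ID = RESULTS[0][0] if RESULTS else None
BEST_ANNOTATION = ANNOTATIONS.get(BEST_ID)
```

Because identification reduces to exact k-mer lookups rather than alignment, annotations can be returned alongside each hit in the same query. A real engine would additionally need persistent storage, a scoring scheme, and a network interface to act as a remotely queried service; none of those are shown here.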

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Amino Acids*
  • Databases, Protein
  • Sequence Analysis, DNA
  • Software*

Substances

  • Amino Acids