PZLAST: an ultra-fast amino acid sequence similarity search server against public metagenomes

Hiroshi Mori; Hitoshi Ishikawa; Koichi Higashi; Yoshiaki Kato; Toshikazu Ebisuzaki; Ken Kurokawa

doi:10.1093/bioinformatics/btab492

Bioinformatics. 2021 Nov 5;37(21):3944-3946. doi: 10.1093/bioinformatics/btab492.

Authors

Hiroshi Mori¹, Hitoshi Ishikawa², Koichi Higashi¹, Yoshiaki Kato³, Toshikazu Ebisuzaki³, Ken Kurokawa¹

Affiliations

¹ Department of Informatics, National Institute of Genetics, Mishima, Shizuoka 411-8540, Japan.
² PEZY Computing, K. K., 5F Chiyoda Ogawamachi Crosta, Chiyoda-ku, Tokyo 101-0052, Japan.
³ Computational Astrophysics Laboratory, RIKEN, Wako, Saitama, Japan.

Abstract

Summary: Similarity searches of amino acid sequences against public metagenomic data can give users insight into the function of sequences based on the environmental distribution of similar sequences. However, because public metagenomic data exceed terabytes in size, conducting sequence similarity searches against them has required a considerable reduction in either the amount of data or the accuracy of the results. Here, we present PZLAST, an ultra-fast service for highly accurate amino acid sequence similarity searches, which can search a user's amino acid sequences against several terabytes of public metagenomic sequences in ∼10-20 min. PZLAST achieves its search speed by using PEZY-SC2, a Multiple Instruction Multiple Data many-core processor. PZLAST results are summarized by the ontology-based environmental distribution of similar sequences. PZLAST can be used to predict the function of sequences and to mine for homologs of functionally important gene sequences.

Availability and implementation: PZLAST is freely accessible at https://pzlast.riken.jp/meta without requiring registration.

Supplementary information: Supplementary data are available at Bioinformatics online.
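
As an illustration of what summarizing hits by environmental distribution might look like, the following minimal Python sketch counts similar-sequence hits per environment label and converts the counts to fractions. It is not PZLAST's implementation or output format: the hit tuples, the sample identifiers, the environment labels, the `sample_env` mapping and the `environment_distribution` function are all hypothetical, and a flat label per sample stands in for the ontology terms mentioned in the abstract.

```python
from collections import Counter

# Hypothetical similarity-search hits: (query_id, metagenome_sample_id, bit_score).
# These values are invented for illustration and are not real PZLAST output.
hits = [
    ("query1", "sampleA", 210.0),
    ("query1", "sampleB", 180.5),
    ("query1", "sampleC", 95.0),
    ("query1", "sampleD", 60.2),
]

# Hypothetical mapping from metagenome sample to an environment label.
# A real service would use ontology terms; a flat label is assumed here.
sample_env = {
    "sampleA": "human gut",
    "sampleB": "human gut",
    "sampleC": "soil",
    "sampleD": "marine",
}


def environment_distribution(hits, sample_env, min_score=0.0):
    """Return the fraction of hits (with score >= min_score) per environment."""
    counts = Counter(
        sample_env.get(sample, "unknown")
        for _query, sample, score in hits
        if score >= min_score
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {env: n / total for env, n in counts.items()}


dist = environment_distribution(hits, sample_env)
for env, fraction in sorted(dist.items(), key=lambda item: -item[1]):
    print(f"{env}: {fraction:.2f}")
```

Raising `min_score` restricts the summary to stronger hits, which changes the apparent environmental distribution; the threshold is an assumed parameter for this sketch only.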

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Amino Acid Sequence
Computers*
Metagenome*
Metagenomics / methods