ggmotif: An R Package for the extraction and visualization of motifs from MEME software

PLoS One. 2022 Nov 3;17(11):e0276979. doi: 10.1371/journal.pone.0276979. eCollection 2022.

Abstract

MEME (Multiple EM for Motif Elicitation) is the most commonly used tool for identifying motifs in deoxyribonucleic acid (DNA) or protein sequences. However, MEME saves its results in .xml and .txt files, which are difficult to read, visualize, or integrate with widely used phylogenetic tree packages such as ggtree. To overcome this problem, we developed the ggmotif R package, which provides two easy-to-use functions for extracting and visualizing motifs from the result files generated by MEME. ggmotif can extract the location of each motif on its corresponding sequence from the .xml file and visualize it, and the extracted data can be easily integrated with phylogenetic data. In addition, ggmotif can obtain the sequence of each motif from the .txt file and draw its sequence logo with the ggseqlogo function from the ggseqlogo R package. The ggmotif R package is freely available (including examples and vignettes) from GitHub at https://github.com/lixiang117423/ggmotif or from CRAN at https://CRAN.R-project.org/package=ggmotif.
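To illustrate the kind of extraction the abstract describes, the following is a minimal Python sketch that turns motif sites in an XML file into a flat table of locations. It is not ggmotif's code and does not use its API. The XML fragment is a hypothetical, heavily simplified stand-in modeled on MEME output, so the element and attribute names are assumptions. The function name `motif_locations` is also illustrative, as is the choice to compute the end position as start plus motif width.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified XML fragment (assumed structure, for illustration only).
MEME_XML = """
<MEME>
  <training_set>
    <sequence id="sequence_0" name="geneA" length="120"/>
    <sequence id="sequence_1" name="geneB" length="95"/>
  </training_set>
  <motifs>
    <motif id="motif_1" width="8"/>
    <motif id="motif_2" width="6"/>
  </motifs>
  <scanned_sites_summary>
    <scanned_sites sequence_id="sequence_0">
      <scanned_site motif_id="motif_1" position="10"/>
      <scanned_site motif_id="motif_2" position="40"/>
    </scanned_sites>
    <scanned_sites sequence_id="sequence_1">
      <scanned_site motif_id="motif_1" position="5"/>
    </scanned_sites>
  </scanned_sites_summary>
</MEME>
"""


def motif_locations(xml_text):
    """Return one row per motif site: sequence name, motif id, start, end."""
    root = ET.fromstring(xml_text.strip())

    # Lookup tables: sequence id -> name, motif id -> width.
    names = {s.get("id"): s.get("name") for s in root.iter("sequence")}
    widths = {m.get("id"): int(m.get("width")) for m in root.iter("motif")}

    rows = []
    for sites in root.iter("scanned_sites"):
        seq_id = sites.get("sequence_id")
        for site in sites.iter("scanned_site"):
            motif = site.get("motif_id")
            start = int(site.get("position"))
            rows.append({
                "sequence": names[seq_id],
                "motif": motif,
                "start": start,
                # Assumption: end is start plus motif width.
                "end": start + widths[motif],
            })
    return rows


locations = motif_locations(MEME_XML)
for row in locations:
    print(row)
```

A flat table of this shape (one row per sequence and motif, with start and end) is what makes it straightforward to join motif positions to the tips of a phylogenetic tree by sequence name.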

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Sequence
  • Phylogeny
  • Position-Specific Scoring Matrices
  • Software*

Associated data

  • figshare/10.6084/m9.figshare.20098922

Grants and funding

This study was supported by the National Natural Science Foundation of China (31801792) and the Fok Ying Tung Education Foundation (171026).