Sequoia: A Framework for Visual Analysis of RNA Modifications from Direct RNA Sequencing Data

Methods Mol Biol. 2023:2624:127-138. doi: 10.1007/978-1-0716-2962-8_9.

Abstract

Oxford Nanopore-based long-read direct RNA sequencing protocols are increasingly used to study the dynamics of RNA metabolic processes, owing to improved read lengths, increased throughput, decreasing cost, and ease of library preparation. Long-read sequencing enables single-molecule detection of posttranscriptional changes, promising novel insights into the functional roles of RNA. However, fulfilling this potential will require new tools for analyzing and exploring this type of data. Although existing tools allow users to analyze signal information, for example by comparing raw signal traces to a nucleotide sequence, they neither facilitate studying each individual signal instance in each read nor support analysis of signal clusters based on signal similarity. Therefore, we present Sequoia, a visual analytics application that allows users to interactively analyze signals originating from nanopore sequencers and that can readily be extended to both RNA and DNA sequencing datasets. Sequoia combines a Python-based backend with a multi-view graphical interface that allows users to ingest raw nanopore sequencing data in Fast5 format, cluster sequences based on electric-current similarities, and drill down into signals to find attributes of interest. In this tutorial, we illustrate each step involved in running Sequoia and, in the process, dissect the characteristics of the input data. We show how to generate nanopore sequencing-based visualizations by leveraging dimensionality reduction and parameter tuning to separate modified RNA sequences from their unmodified counterparts. Sequoia's interactive features enhance nanopore-based computational methodologies, enabling users to construct rationales and hypotheses and to develop insights about the dynamic nature of RNA through visual analysis. Sequoia is available at https://github.com/dnonatar/Sequoia.
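
The idea of grouping reads by electric-current similarity can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Sequoia's implementation: the traces are synthetic stand-ins for signals that would normally come from Fast5 files, the current levels and the 12 pA modification shift are invented values, the function names (make_trace, resample, two_means) are hypothetical, and plain k-means with k = 2 is swapped in because the abstract does not name the clustering or dimensionality reduction method. The dimensionality reduction step is omitted here.

```python
import math
import random

# Hypothetical mean current levels (pA) for five consecutive k-mers; illustrative only.
BASE_LEVELS = [80.0, 95.0, 70.0, 110.0, 85.0]


def make_trace(rng, modified):
    """Simulate one read's current trace; a modification shifts the middle level."""
    dwell = rng.randint(8, 14)  # samples per level, varies between reads
    trace = []
    for idx, level in enumerate(BASE_LEVELS):
        if modified and idx == 2:
            level += 12.0  # assumed current shift caused by the modification
        trace.extend(level + rng.gauss(0.0, 1.5) for _ in range(dwell))
    return trace


def resample(signal, n_points):
    """Linearly interpolate a variable-length trace onto n_points samples."""
    out = []
    for i in range(n_points):
        pos = i * (len(signal) - 1) / (n_points - 1)
        lo = int(pos)
        hi = min(lo + 1, len(signal) - 1)
        frac = pos - lo
        out.append(signal[lo] * (1.0 - frac) + signal[hi] * frac)
    return out


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def two_means(vectors, n_iter=20):
    """k-means with k=2, seeded deterministically with the two most distant vectors."""
    best, best_d = (0, 1), -1.0
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            d = euclidean(vectors[i], vectors[j])
            if d > best_d:
                best, best_d = (i, j), d
    centers = [list(vectors[best[0]]), list(vectors[best[1]])]
    labels = [0] * len(vectors)
    for _ in range(n_iter):
        # Assign each trace to its nearest center.
        labels = [
            0 if euclidean(v, centers[0]) <= euclidean(v, centers[1]) else 1
            for v in vectors
        ]
        # Recompute each center as the mean of its members.
        for k in (0, 1):
            members = [v for v, lab in zip(vectors, labels) if lab == k]
            if members:
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels


rng = random.Random(0)  # fixed seed so the sketch is reproducible
traces = [make_trace(rng, modified=False) for _ in range(10)]
traces += [make_trace(rng, modified=True) for _ in range(10)]
vectors = [resample(t, 50) for t in traces]  # put all reads on a common length
labels = two_means(vectors)
print(labels)
```

Resampling every trace to a common length is one simple way to compare reads whose dwell times differ; with real data, an alignment-aware distance would likely be a better fit than point-wise Euclidean distance.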

Keywords: Epitranscriptome; Nanopore signal analysis; RNA modifications; Single-molecule sequencing; Visual infrastructure.

MeSH terms

  • High-Throughput Nucleotide Sequencing / methods
  • Nanopores*
  • RNA / genetics
  • Sequence Analysis, DNA / methods
  • Sequence Analysis, RNA
  • Sequoia*
  • Software

Substances

  • RNA