Nanopore Data Analysis: Baseline Construction and Abrupt Change-Based Multilevel Fitting

Anal Chem. 2021 Aug 31;93(34):11710-11718. doi: 10.1021/acs.analchem.1c01646. Epub 2021 Aug 17.

Abstract

Solid-state nanopore technology delivers single-molecule resolution information, and the quality of that information hinges on the ability of the analysis platform to extract the maximum possible number of events and fit them appropriately. In this work, we present an analysis platform with multilevel event fitting capability and four baseline fitting methods that adapt to a wide range of nanopore traces (including those with a step or abrupt changes, where pre-existing platforms fail) to maximize the number of extractable events (a 2× improvement in some cases). The baseline fitting methods, in increasing order of robustness and computational cost, are arithmetic mean, linear fit, Gaussian smoothing, and a combined Gaussian smoothing and regressed mixing method. Performance was tested on current profiles ranging from ultra-stable to vigorously fluctuating; the event count increased with fitting robustness, most prominently for vigorously fluctuating profiles. Turning points of events were clustered using the DBSCAN method, followed by segmentation into preliminary levels based on abrupt changes in the signal level; these levels were then iteratively refined to deduce the final levels of the event. Finally, we show the utility of clustering for multilevel DNA data analysis, followed by the assessment of protein translocation profiles.
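To illustrate why the choice of baseline matters, the first three baseline methods named in the abstract can be sketched as follows. This is a minimal sketch and not the platform's implementation: the function names, the `sigma` kernel width, the edge padding, and the fixed-threshold event counter are all assumptions made for illustration. The fourth method (Gaussian smoothing and regressed mixing) is not attempted because the abstract does not describe it in enough detail.

```python
import numpy as np


def baseline_mean(trace):
    """Arithmetic-mean baseline: a single constant level for the whole trace."""
    return np.full(trace.shape, trace.mean())


def baseline_linear(trace):
    """Linear-fit baseline: a least-squares straight line through the trace."""
    x = np.arange(trace.size)
    slope, intercept = np.polyfit(x, trace, 1)
    return slope * x + intercept


def baseline_gaussian(trace, sigma=50.0):
    """Gaussian-smoothed baseline (sigma in samples; an assumed parameter).

    The trace is padded with its edge values so that the smoothed baseline
    is not pulled toward zero at the two ends.
    """
    radius = int(4 * sigma)
    offsets = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (offsets / sigma) ** 2)
    kernel /= kernel.sum()
    padded = np.pad(trace, radius, mode="edge")
    # "valid" on the padded trace returns exactly trace.size samples.
    return np.convolve(padded, kernel, mode="valid")


def count_events(trace, baseline, threshold):
    """Count contiguous runs where the trace drops more than `threshold`
    below the baseline (a deliberately simple, hypothetical detector)."""
    below = (baseline - trace) > threshold
    rising_edges = np.count_nonzero(below[1:] & ~below[:-1])
    return int(rising_edges) + int(below[0])
```

On a trace whose open-pore current steps from one level to another, a constant baseline tends to merge everything after the step into one long "event", while a locally adaptive baseline can still separate individual blockades on both sides of the step. A smoothed baseline lags right at the step, so the step itself may still register as a spurious detection in this sketch.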

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • DNA
  • Nanopores*
  • Nanotechnology
  • Sequence Analysis, DNA

Substances

  • DNA