Aro: a machine learning approach to identifying single molecules and estimating classification error in fluorescence microscopy images

BMC Bioinformatics. 2015 Mar 27;16:102. doi: 10.1186/s12859-015-0534-z.

Abstract

Background: Recent techniques for tagging and visualizing single molecules in fixed or living organisms and cell lines have been revolutionizing our understanding of the spatial and temporal dynamics of fundamental biological processes. However, fluorescence microscopy images are often noisy, and it can be difficult to distinguish a fluorescently labeled single molecule from background speckle.

Results: We present a computational pipeline to distinguish the true signal of fluorescently labeled molecules from background fluorescence and noise. We test our technique on the challenging case of wide-field epifluorescence microscope image stacks from single-molecule fluorescence in situ hybridization experiments on nematode embryos, where there can be substantial out-of-focus light and structured noise. The software recognizes individual mRNA spots by measuring several features of local intensity maxima and classifying them with a supervised random forest classifier. A key innovation of this software is that, by estimating the probability that each local maximum is a true spot in a statistically principled way, it makes it possible to estimate the error introduced by image classification. This error estimate can be used to assess the quality of the data and to compute a confidence interval for the molecule count, both of which are important for quantitative interpretation of single-molecule experiments.
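As a rough illustration of how per-spot probabilities could translate into an interval on the molecule count, the sketch below treats each candidate local maximum as an independent Bernoulli trial and computes the exact distribution of the total count. This is a minimal sketch under stated assumptions, not the procedure used by Aro: the independence assumption, the function names `count_distribution` and `count_interval`, and the example probabilities are all hypothetical. In practice the probabilities might come from a random forest, for example as the fraction of trees voting "spot".

```python
from typing import List, Tuple


def count_distribution(probs: List[float]) -> List[float]:
    """Exact distribution of the number of true spots.

    Assumes (hypothetically) that candidates are independent, so the
    count follows a Poisson binomial distribution. dist[k] is the
    probability that exactly k candidates are true spots.
    """
    dist = [1.0]  # with no candidates, the count is 0 with certainty
    for p in probs:
        nxt = [0.0] * (len(dist) + 1)
        for k, mass in enumerate(dist):
            nxt[k] += mass * (1.0 - p)  # this candidate is not a spot
            nxt[k + 1] += mass * p      # this candidate is a spot
        dist = nxt
    return dist


def count_interval(probs: List[float], level: float = 0.95) -> Tuple[float, int, int]:
    """Return (expected count, lower bound, upper bound).

    The bounds are the equal-tailed quantiles of the count distribution
    at the requested level.
    """
    dist = count_distribution(probs)
    expected = sum(probs)
    tail = (1.0 - level) / 2.0
    lower = None
    upper = len(dist) - 1  # fallback guards against rounding in the sum
    cumulative = 0.0
    for k, mass in enumerate(dist):
        cumulative += mass
        if lower is None and cumulative >= tail:
            lower = k
        if cumulative >= 1.0 - tail:
            upper = k
            break
    return expected, (lower if lower is not None else 0), upper


if __name__ == "__main__":
    # Hypothetical classifier outputs for six candidate local maxima.
    example_probs = [0.98, 0.95, 0.90, 0.60, 0.15, 0.02]
    mean, low, high = count_interval(example_probs)
    print(f"expected count {mean:.2f}, 95% interval [{low}, {high}]")
```

Under this simplification, an image whose candidates all have probabilities near 0 or 1 yields a narrow interval, while many ambiguous candidates widen it, which is the sense in which an interval width can serve as a measure of image quality.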

Conclusions: The software classifies spots in these images well, achieving >95% AUROC on realistic artificial data, and it outperforms other commonly used techniques on challenging real data. Its interval estimates provide a unique measure of the quality of an image and of confidence in the classification.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Animals
  • Artificial Intelligence
  • Caenorhabditis elegans / embryology
  • Caenorhabditis elegans / metabolism*
  • Computational Biology
  • Embryo, Nonmammalian / cytology
  • Embryo, Nonmammalian / metabolism*
  • Fluorescence
  • Fluorescent Dyes / metabolism*
  • Image Processing, Computer-Assisted / methods*
  • In Situ Hybridization, Fluorescence
  • Microscopy, Fluorescence / methods*
  • Nanotechnology
  • RNA, Messenger / analysis*
  • Software*
  • Staining and Labeling

Substances

  • Fluorescent Dyes
  • RNA, Messenger