AnnoFly: annotating Drosophila embryonic images based on an attention-enhanced RNN model

Bioinformatics. 2019 Aug 15;35(16):2834-2842. doi: 10.1093/bioinformatics/bty1064.

Abstract

Motivation: In the post-genomic era, image-based transcriptomics have received huge attention, because the visualization of gene expression distribution is able to reveal spatial and temporal expression pattern, which is significantly important for understanding biological mechanisms. The Berkeley Drosophila Genome Project has collected a large-scale spatial gene expression database for studying Drosophila embryogenesis. Given the expression images, how to annotate them for the study of Drosophila embryonic development is the next urgent task. In order to speed up the labor-intensive labeling work, automatic tools are highly desired. However, conventional image annotation tools are not applicable here, because the labeling is at the gene-level rather than the image-level, where each gene is represented by a bag of multiple related images, showing a multi-instance phenomenon, and the image quality varies by image orientations and experiment batches. Moreover, different local regions of an image correspond to different CV annotation terms, i.e. an image has multiple labels. Designing an accurate annotation tool in such a multi-instance multi-label scenario is a very challenging task.

Results: To address these challenges, we develop a new annotator for the fruit fly embryonic images, called AnnoFly. Driven by an attention-enhanced RNN model, it can weight images of different qualities, so as to focus on the most informative image patterns. We assess the new model on three standard datasets. The experimental results reveal that the attention-based model provides a transparent approach for identifying the important images for labeling, and it substantially enhances the accuracy compared with the existing annotation methods, including both single-instance and multi-instance learning methods.

Availability and implementation: http://www.csbio.sjtu.edu.cn/bioinf/annofly/.

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Attention
  • Computational Biology*
  • Drosophila melanogaster*
  • Gene Expression
  • Gene Expression Profiling
  • Software