Feature Selection for Topological Proximity Prediction of Single-Cell Transcriptomic Profiles in Drosophila Embryo Using Genetic Algorithm

Genes (Basel). 2020 Dec 28;12(1):28. doi: 10.3390/genes12010028.

Abstract

Single-cell transcriptomics data, when combined with in situ hybridization patterns of specific genes, can help recover the spatial information lost during cell isolation. The Dialogue for Reverse Engineering Assessments and Methods (DREAM) consortium conducted a crowd-sourced competition, the DREAM Single Cell Transcriptomics Challenge (SCTC), to predict the masked locations of single cells using sets of 60, 40 and 20 genes selected from the 84 genes with known in situ patterns in the Drosophila embryo. We applied a genetic algorithm (GA), in combination with the base distance-mapping algorithm DistMap, to identify the genes that carry the most positional and proximity information about the origins of the single cells. The resulting gene selection performed well and was ranked among the top 10 in two of the three sub-challenges. However, the details of the method were not included in the main challenge publication, owing to an intricate aggregation ranking. In this work, we discuss the detailed implementation of the GA and its post-challenge parameterization, with a view to identifying areas where GA-based approaches to gene-set selection for topological association prediction may be made more effective. We believe this work provides additional insights into feature-selection strategies and their relevance to single-cell similarity prediction, and that it will form a strong addendum to the recently published work from the consortium.
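
As a rough illustration of the kind of GA-based gene-set selection described in the abstract, the following Python sketch evolves fixed-size subsets of gene indices. It is not the authors' implementation: the function name `run_ga`, its parameters, the truncation selection, the crossover and mutation operators, and the toy fitness function are all assumptions made for this example. In the challenge setting the fitness would presumably score how well DistMap recovers cell locations from the selected genes; that scoring is not reproduced here and is replaced by a simple stand-in.

```python
import random


def run_ga(n_genes, subset_size, fitness, pop_size=30, generations=40,
           mutation_rate=0.1, seed=0):
    """Search for `subset_size` gene indices (out of `n_genes`) that maximize `fitness`.

    Hypothetical sketch: `fitness` is any callable that maps a frozenset of
    gene indices to a number (higher is better).
    """
    if not 0 < subset_size < n_genes:
        raise ValueError("subset_size must be between 1 and n_genes - 1")
    rng = random.Random(seed)
    all_genes = list(range(n_genes))

    def random_subset():
        return frozenset(rng.sample(all_genes, subset_size))

    def crossover(a, b):
        # Keep the genes both parents share, then fill the remaining
        # slots at random from the genes only one parent carries.
        shared = a & b
        pool = sorted((a | b) - shared)
        rng.shuffle(pool)
        return frozenset(shared | set(pool[:subset_size - len(shared)]))

    def mutate(subset):
        # With probability `mutation_rate`, swap a selected gene for an
        # unselected one, so the subset size never changes.
        chosen = set(subset)
        for gene in sorted(subset):
            if rng.random() < mutation_rate:
                unused = [g for g in all_genes if g not in chosen]
                chosen.remove(gene)
                chosen.add(rng.choice(unused))
        return frozenset(chosen)

    population = [random_subset() for _ in range(pop_size)]
    best = max(population, key=fitness)
    history = [fitness(best)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[:max(2, pop_size // 2)]  # truncation selection
        children = [ranked[0]]                    # elitism: keep the current best
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b)))
        population = children
        best = max(population, key=fitness)
        history.append(fitness(best))
    return sorted(best), history


# Toy stand-in for a real score: pretend every fourth gene is informative
# and count how many informative genes a subset contains.
INFORMATIVE = frozenset(range(0, 84, 4))


def toy_fitness(subset):
    return len(INFORMATIVE & subset)


best_genes, history = run_ga(n_genes=84, subset_size=20, fitness=toy_fitness)
print("selected genes:", best_genes)
print("best fitness per generation:", history)
```

Because the best subset of each generation is carried over unchanged, the best fitness in this sketch never decreases from one generation to the next. Swapping `toy_fitness` for a score derived from a location-prediction method would be the step that ties such a search to the task described above.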

Keywords: DREAM challenge; Drosophila embryo; gene expression pattern; genetic algorithm; single-cell RNA sequencing; spatial organization.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Animals
  • Datasets as Topic
  • Drosophila / genetics*
  • Drosophila / growth & development
  • Embryo, Nonmammalian
  • Embryonic Development / genetics
  • Gene Expression Regulation, Developmental
  • Models, Genetic*
  • RNA-Seq / methods*
  • Single-Cell Analysis / methods*