Lineage EM algorithm for inferring latent states from cellular lineage trees

Bioinformatics. 2020 May 1;36(9):2829-2838. doi: 10.1093/bioinformatics/btaa040.

Abstract

Summary: Phenotypic variability in a cell population can serve as a bet-hedging strategy in an unpredictably changing environment; bacterial persistence is a typical example. To understand how such phenomena are controlled, it is indispensable to identify the phenotype of each cell and how it is inherited. Although recent advances in microfluidic technology provide useful lineage data, these data alone are insufficient to identify cell phenotypes directly. An alternative approach is to infer the phenotype from the lineage data by latent-variable estimation. To do so, however, we must resolve a bias inherent in inference from lineages, known as survivorship bias. In this work, we clarify how survivorship bias distorts statistical estimation. We then propose a latent-variable estimation algorithm for lineage trees that is free of survivorship bias, based on the expectation-maximization (EM) algorithm, which we call the lineage EM algorithm (LEM). LEM provides a statistical method for identifying cell traits that is applicable to various kinds of lineage data.
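
As a rough illustration of the survivorship bias mentioned in the summary, the toy sketch below compares the type composition seen along a single lineage followed forward in time with the composition of the whole growing population. It is not the LEM algorithm and is not taken from the paper's implementation: the two cell types, the switching matrix `T`, the offspring numbers in `daughters`, and the helper `stationary_fraction` are all hypothetical choices made for this example.

```python
# Toy two-type model (all numbers are illustrative assumptions).
# Type 0 = "fast" cell leaving 2 daughters per step,
# type 1 = "slow" cell leaving 1 descendant per step.

def stationary_fraction(matrix, steps=200):
    """Power iteration: repeatedly apply v <- v * matrix and renormalize.

    Returns the long-run composition [fraction of type 0, fraction of type 1].
    """
    v = [0.5, 0.5]
    for _ in range(steps):
        nxt = [sum(v[i] * matrix[i][j] for i in range(2)) for j in range(2)]
        total = sum(nxt)
        v = [x / total for x in nxt]
    return v

# T[i][j]: probability that a daughter of a type-i cell has type j.
T = [[0.9, 0.1],
     [0.1, 0.9]]

# Expected number of descendants left per step by each type.
daughters = [2.0, 1.0]

# M[i][j]: expected number of type-j daughters of a type-i cell.
M = [[daughters[i] * T[i][j] for j in range(2)] for i in range(2)]

# Composition along one lineage followed forward in time (ignores growth).
forward = stationary_fraction(T)

# Composition of the growing population, which is what observed lineages reflect.
population = stationary_fraction(M)

print("fast fraction along a forward lineage:", round(forward[0], 3))
print("fast fraction in the population:", round(population[0], 3))
```

In this toy setting the switching rule alone gives equal proportions of both types, yet the fast type dominates the population because it leaves more descendants. An estimator that treats observed lineages as unbiased samples of the switching rule would therefore be skewed toward the fast type, which is the kind of distortion the summary refers to.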

Availability and implementation: An implementation of LEM is available at https://github.com/so-nakashima/Lineage-EM-algorithm.

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Cell Lineage
  • Phenotype