Addressing annotation and data scarcity when designing machine learning strategies for neurophotonics

Catherine Bouchard; Renaud Bernatchez; Flavie Lavoie-Cardinal

doi:10.1117/1.NPh.10.4.044405

Neurophotonics. 2023 Oct;10(4):044405. doi: 10.1117/1.NPh.10.4.044405. Epub 2023 Aug 24.

Authors

Catherine Bouchard¹,², Renaud Bernatchez¹,², Flavie Lavoie-Cardinal¹,²,³

Affiliations

¹ CERVO Brain Research Centre, Québec, Québec, Canada.
² Université Laval, Institute Intelligence and Data, Québec, Québec, Canada.
³ Université Laval, Département de psychiatrie et de neurosciences, Québec, Québec, Canada.

Abstract

Machine learning has revolutionized the way data are processed, allowing information to be extracted in a fraction of the time it would take an expert. In the field of neurophotonics, machine learning approaches are used to automatically detect and classify features of interest in complex images. One of the key challenges in applying machine learning methods to the field of neurophotonics is the scarcity of available data and the complexity associated with labeling them, which can limit the performance of data-driven algorithms. We present an overview of various strategies, such as weakly supervised learning, active learning, and domain adaptation, that can be used to address the problem of labeled data scarcity in neurophotonics. We discuss the strengths and limitations of each approach and their potential applications to bioimaging datasets. In addition, we highlight how different strategies can be combined to increase model performance on those datasets. The approaches we describe can help to improve the accessibility of machine learning-based analysis when only a limited number of annotated images is available for training and can enable researchers to extract more meaningful insights from small datasets.

Keywords: active learning; domain adaptation; image analysis; machine learning; weakly supervised learning.