Contactless Blood Oxygen Saturation Estimation from Facial Videos Using Deep Learning

Bioengineering (Basel). 2024 Mar 4;11(3):251. doi: 10.3390/bioengineering11030251.

Abstract

Blood oxygen saturation (SpO2) is an essential physiological parameter for evaluating a person's health. While conventional SpO2 measurement devices such as pulse oximeters require skin contact, computer vision technology can enable remote SpO2 monitoring with a regular camera and no skin contact. In this paper, we propose novel deep learning models to measure SpO2 remotely from facial videos and evaluate them on a public benchmark database, VIPL-HR. We use a spatial-temporal representation to encode the SpO2 information recorded by conventional RGB cameras and pass it directly into selected convolutional neural networks to predict SpO2. The best deep learning model achieves a mean absolute error of 1.274% and a root mean squared error of 1.71%, both within the 4% error limit set by the international standard for an approved pulse oximeter. Our results significantly outperform the conventional analytical Ratio of Ratios model for contactless SpO2 measurement. We also report sensitivity analyses of how spatial-temporal representation color spaces, subject scenarios, acquisition devices, and SpO2 ranges influence model performance, together with explainability analyses, to provide more insight into this emerging research field.
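To make the Ratio of Ratios baseline and the two reported error metrics concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes the commonly used linear form SpO2 = a - b * R, where R is the ratio of the normalized pulsatile (AC/DC) components of two color channels. The choice of red and blue channels, the use of standard deviation and mean as AC and DC estimates, and the calibration constants `a` and `b` are all illustrative assumptions that do not appear in the abstract.

```python
import numpy as np


def ratio_of_ratios(red, blue):
    """Return R = (AC_red / DC_red) / (AC_blue / DC_blue).

    Assumption: AC is approximated by the standard deviation and DC by
    the mean of each channel's skin-pixel intensity trace.
    """
    red = np.asarray(red, dtype=float)
    blue = np.asarray(blue, dtype=float)
    return (red.std() / red.mean()) / (blue.std() / blue.mean())


def spo2_from_ratio(r, a=100.0, b=5.0):
    """Linear calibration SpO2 = a - b * R.

    a and b are hypothetical placeholders; real values must be fitted
    against reference pulse oximeter readings.
    """
    return a - b * r


def mae(y_true, y_pred):
    """Mean absolute error, in SpO2 percentage points."""
    diff = np.asarray(y_pred, dtype=float) - np.asarray(y_true, dtype=float)
    return float(np.mean(np.abs(diff)))


def rmse(y_true, y_pred):
    """Root mean squared error, in SpO2 percentage points."""
    diff = np.asarray(y_pred, dtype=float) - np.asarray(y_true, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))


# Synthetic traces: the red channel pulsates twice as strongly as blue.
t = np.linspace(0.0, 2.0 * np.pi, 200, endpoint=False)
red_trace = 100.0 + 2.0 * np.sin(t)
blue_trace = 100.0 + 1.0 * np.sin(t)

r = ratio_of_ratios(red_trace, blue_trace)
estimate = spo2_from_ratio(r)

# Toy reference and predicted SpO2 values, for the metrics only.
reference = [97.0, 98.0, 96.0]
predicted = [96.0, 98.0, 98.0]
print(r, estimate, mae(reference, predicted), rmse(reference, predicted))
```

Under this reading, an estimator satisfies the 4% criterion cited in the abstract when its error on held-out recordings stays below 4 percentage points; the synthetic traces here only exercise the arithmetic and say nothing about real-world accuracy.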

Keywords: blood oxygen saturation measurement; deep learning; facial videos; non-contact monitoring; remote health monitoring.

Grants and funding

This research was partially supported by the Innovation and Technology Commission of Hong Kong (project numbers TSSSU/HKUST/21/10 and PsH/053/22). The authors are also grateful for the support of the HKSTP incubation program.