SPLiT: Single Portrait Lighting Estimation via a Tetrad of Face Intrinsics

IEEE Trans Pattern Anal Mach Intell. 2024 Feb;46(2):1079-1092. doi: 10.1109/TPAMI.2023.3328453. Epub 2024 Jan 8.

Abstract

This paper proposes a novel pipeline to estimate a non-parametric, high-dynamic-range environment map from a single human face image. Lighting-independent and lighting-dependent intrinsic images of the face are first estimated separately in a cascaded network. The influence of face geometry on the two lighting-dependent intrinsics, diffuse shading and specular reflection, is then eliminated by distributing the intrinsics pixel-wise onto spherical representations, using the surface normals as indices. This results in two representations that simulate images of a diffuse sphere and a glossy sphere under the input scene lighting. Taking into account the distinct nature of light sources and ambient terms, we further introduce a two-stage lighting estimator that predicts accurate and realistic lighting from these two representations. Our model is trained with supervision on a large-scale, high-quality synthetic face image dataset. We demonstrate that our method allows accurate and detailed lighting estimation and intrinsic decomposition, outperforming state-of-the-art methods both qualitatively and quantitatively on real face images.
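
The normal-indexed redistribution step can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `splat_to_sphere`, the orthographic sphere parameterization, the camera-space convention (+z toward the viewer), the sphere resolution, and the averaging of values that land on the same sphere pixel are all assumptions made for illustration.

```python
import numpy as np

def splat_to_sphere(values, normals, mask, size=64):
    """Scatter per-pixel intrinsic values onto a sphere image indexed by normals.

    values:  (H, W, C) lighting-dependent intrinsic, e.g. diffuse shading
    normals: (H, W, 3) unit surface normals, camera space, +z toward the viewer
    mask:    (H, W) bool, True on face pixels
    Returns the (size, size, C) sphere image and a (size, size) bool map
    marking which sphere pixels received at least one sample.
    """
    n = normals[mask]                      # (N, 3)
    v = values[mask]                       # (N, C)

    # Keep only normals facing the camera (assumed convention).
    front = n[:, 2] > 0
    n, v = n[front], v[front]

    # Orthographic sphere: normal (x, y) in [-1, 1] maps to (column, row).
    cols = np.rint((n[:, 0] + 1.0) * 0.5 * (size - 1)).astype(int)
    rows = np.rint((1.0 - n[:, 1]) * 0.5 * (size - 1)).astype(int)
    flat = np.clip(rows, 0, size - 1) * size + np.clip(cols, 0, size - 1)

    # Accumulate and average all samples that share a sphere pixel.
    acc = np.zeros((size * size, v.shape[1]))
    cnt = np.zeros(size * size)
    np.add.at(acc, flat, v)
    np.add.at(cnt, flat, 1)
    filled = cnt > 0
    acc[filled] /= cnt[filled, None]

    return acc.reshape(size, size, -1), filled.reshape(size, size)
```

Applied once to the diffuse shading and once to the specular reflection, a routine like this would yield two partially filled sphere images. Because the face covers only part of the normal hemisphere, many sphere pixels stay empty, so the `filled` map is returned alongside the image to let a downstream estimator distinguish observed from unobserved directions.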