PicassoNet: Searching Adaptive Architecture for Efficient Facial Landmark Localization

IEEE Trans Neural Netw Learn Syst. 2023 Dec;34(12):10516-10527. doi: 10.1109/TNNLS.2022.3167743. Epub 2023 Nov 30.

Abstract

Although recent facial landmark localization methods achieve satisfactory accuracy, few of them offer fast inference, which is critical in many real-world facial applications. Existing methods typically employ complicated network structures and predict all keypoints with uniform computation, which is inefficient because individual facial parts may require different amounts of computation to reach their best performance. Taking both accuracy and efficiency into consideration, we propose PicassoNet, a lightweight cascaded facial landmark detector with adaptive computation for each facial part. Unlike conventional cascaded methods, PicassoNet integrates the refinement submodules into a single network using group convolution, where each convolution group predicts the landmarks of one facial part. Note that the groups' structures are flexible during training. A novel grouping search algorithm is then proposed to optimize the group division. By formulating the optimization as a network architecture search (NAS) problem, the grouping search adaptively allocates computation to each group and obtains an efficient structure. In addition, we propose a boundary-aware loss that optimizes along the tangent and normal directions of facial boundaries, instead of along the horizontal and vertical axes as conventional losses (L2, SmoothL1, WingLoss, and so on) do. The novel loss improves the joint locations of predicted keypoints. Experiments on three benchmark datasets, AFLW, 300W, and WFLW, show that the proposed method runs over 6× faster than state-of-the-art methods while achieving comparable accuracy.
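
To make the idea of a tangent/normal loss concrete, the following is a minimal sketch and not the loss defined in the paper, whose exact formulation the abstract does not give. It assumes that the landmarks of one facial boundary are ordered along the contour, that the tangent can be estimated by finite differences on the ground-truth points, and that the two error components are penalized with squared terms and separate weights. The names `unit_tangents`, `boundary_aware_loss`, `w_tangent`, and `w_normal` are hypothetical.

```python
import math


def unit_tangents(contour):
    """Estimate a unit tangent at each point of an ordered contour.

    Assumption: finite differences on the ground-truth points are an
    adequate stand-in for the true boundary direction.
    """
    n = len(contour)
    tangents = []
    for i in range(n):
        # Central difference in the interior, one-sided at the two ends.
        x0, y0 = contour[max(i - 1, 0)]
        x1, y1 = contour[min(i + 1, n - 1)]
        dx, dy = x1 - x0, y1 - y0
        length = math.hypot(dx, dy) or 1.0  # guard against repeated points
        tangents.append((dx / length, dy / length))
    return tangents


def boundary_aware_loss(pred, gt, w_tangent=1.0, w_normal=1.0):
    """Mean weighted squared error measured along and across the boundary.

    pred, gt: sequences of (x, y) landmarks on one facial boundary,
    in the same order. The weights are illustrative assumptions.
    """
    total = 0.0
    for (px, py), (gx, gy), (tx, ty) in zip(pred, gt, unit_tangents(gt)):
        ex, ey = px - gx, py - gy
        e_t = ex * tx + ey * ty    # error component along the boundary
        e_n = -ex * ty + ey * tx   # component across it; normal = (-ty, tx)
        total += w_tangent * e_t ** 2 + w_normal * e_n ** 2
    return total / len(gt)


# Usage: a horizontal boundary with predictions shifted off the contour.
gt = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
pred_off_contour = [(0.0, 0.5), (1.0, 0.5), (2.0, 0.5)]
loss = boundary_aware_loss(pred_off_contour, gt, w_tangent=1.0, w_normal=2.0)
```

With equal weights, this sketch reduces to the ordinary mean squared L2 error, because the tangent and normal form an orthonormal basis. The difference from an axis-aligned loss appears only when the two directions are weighted or shaped differently, for example to penalize drifting off the contour more than sliding along it.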