Face recognition system for set-top box-based intelligent TV

Sensors (Basel). 2014 Nov 18;14(11):21726-49. doi: 10.3390/s141121726.

Abstract

Despite the prevalence of smart TVs, many consumers continue to use conventional TVs with supplementary set-top boxes (STBs) because of the high cost of smart TVs. However, because the processing power of a STB is quite low, the smart TV functionalities that can be implemented in a STB are very limited. Because of this, negligible research has been conducted regarding face recognition for conventional TVs with supplementary STBs, even though many such studies have been conducted with smart TVs. In terms of camera sensors, previous face recognition systems have used high-resolution cameras, cameras with high magnification zoom lenses, or camera systems with panning and tilting devices that can be used for face recognition from various positions. However, these cameras and devices cannot be used in intelligent TV environments because of limitations related to size and cost, and only small, low cost web-cameras can be used. The resulting face recognition performance is degraded because of the limited resolution and quality levels of the images. Therefore, we propose a new face recognition system for intelligent TVs in order to overcome the limitations associated with low resource set-top box and low cost web-cameras. We implement the face recognition system using a software algorithm that does not require special devices or cameras. Our research has the following four novelties: first, the candidate regions in a viewer's face are detected in an image captured by a camera connected to the STB via low processing background subtraction and face color filtering; second, the detected candidate regions of face are transmitted to a server that has high processing power in order to detect face regions accurately; third, in-plane rotations of the face regions are compensated based on similarities between the left and right half sub-regions of the face regions; fourth, various poses of the viewer's face region are identified using five templates obtained during the initial user registration stage and multi-level local binary pattern matching. Experimental results indicate that the recall; precision; and genuine acceptance rate were about 95.7%; 96.2%; and 90.2%, respectively.