Design and Implementation of a Video/Voice Process System for Recognizing Vehicle Parts Based on Artificial Intelligence

Kapyol Kim; Incheol Jeong; Jinsoo Cho

doi:10.3390/s20247339

Sensors (Basel). 2020 Dec 21;20(24):7339. doi: 10.3390/s20247339.

Authors

Kapyol Kim¹, Incheol Jeong¹, Jinsoo Cho¹

Affiliation

¹ Gachon University, Seongnam 1342, Korea.

Abstract

With the recent development of artificial intelligence and of information and communications infrastructure, a new paradigm of online services is emerging. Whereas in the past a service system could only return the service provider's information in response to a user request, information can now be provided by automatically analyzing a particular need, even without a direct user request. This also holds for online used-vehicle sales platforms. In the past, consumers had to judge and classify the quality of information themselves from static data supplied by service and information providers. As a result, consumers in this service field have been exposed to problems such as false sales, fraud, and exaggerated advertising. Despite significant efforts by platform providers, the human resources available for censoring the vast amounts of data uploaded by sellers are limited. Therefore, in this study, an algorithm called YOLOv3+MSSIM Type 2 was developed for automatically censoring used-vehicle sales data on an online platform. To this end, two artificial intelligence systems were also developed: one that automatically analyzes objects in a vehicle video uploaded by a seller, and one that filters vehicle-specific terms and profanity from the seller's video presentation. In the evaluation of the developed system, the average execution time of the proposed YOLOv3+MSSIM Type 2 algorithm was 78.6 ms shorter than that of the pure YOLOv3 algorithm, and the average frame rate improved by 40.22 fps. In addition, the average GPU utilization rate improved by 23.05%, demonstrating the efficiency of the approach.
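
The abstract does not describe how MSSIM and YOLOv3 are combined, so the following is only an illustrative sketch under stated assumptions, not the authors' Type 2 algorithm. It assumes that the mean structural similarity (MSSIM) between a reference frame and the current frame is used to decide whether the detector needs to run again, and that detections are reused for frames that are nearly unchanged. The names `mssim`, `detect_with_gating`, `detector`, and the threshold of 0.9 are hypothetical, and the SSIM here uses simple non-overlapping blocks rather than the Gaussian sliding window of the original SSIM formulation.

```python
import numpy as np


def mssim(a, b, win=8, data_range=255.0):
    """Mean SSIM over non-overlapping win x win blocks of two grayscale frames."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    if a.ndim != 2 or a.shape != b.shape:
        raise ValueError("frames must be 2-D arrays of the same shape")
    h = (a.shape[0] // win) * win
    w = (a.shape[1] // win) * win
    if h == 0 or w == 0:
        raise ValueError("frame is smaller than the window")

    def blocks(x):
        # Crop to a multiple of win, then group pixels by block.
        x = x[:h, :w].reshape(h // win, win, w // win, win)
        return x.transpose(0, 2, 1, 3).reshape(h // win, w // win, win * win)

    pa, pb = blocks(a), blocks(b)
    mu_a, mu_b = pa.mean(-1), pb.mean(-1)
    var_a, var_b = pa.var(-1), pb.var(-1)
    cov = ((pa - mu_a[..., None]) * (pb - mu_b[..., None])).mean(-1)

    # Standard SSIM stabilising constants (K1 = 0.01, K2 = 0.03).
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    ssim = ((2 * mu_a * mu_b + c1) * (2 * cov + c2)) / (
        (mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2)
    )
    return float(ssim.mean())


def detect_with_gating(frames, detector, threshold=0.9):
    """Run `detector` only when a frame differs enough from the reference frame.

    `detector` is a hypothetical callable standing in for a YOLOv3 forward pass.
    Returns the per-frame results and the number of detector calls made.
    """
    results, reference, cached, calls = [], None, None, 0
    for frame in frames:
        if reference is None or mssim(reference, frame) < threshold:
            cached = detector(frame)   # expensive step
            reference = frame          # new reference for later comparisons
            calls += 1
        results.append(cached)         # reuse detections for similar frames
    return results, calls
```

In this sketch the saving comes entirely from skipped detector calls, so the benefit depends on how static the video is and on the chosen threshold; whether this matches the gains reported in the abstract cannot be determined from the abstract alone.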

Keywords: MSSIM; SSIM; YOLO V3; object recognition; speech recognition.