An Optimized DNN Model for Real-Time Inferencing on an Embedded Device

Jungme Park; Pawan Aryal; Sai Rithvick Mandumula; Ritwik Prasad Asolkar

doi:10.3390/s23083992

Sensors (Basel). 2023 Apr 14;23(8):3992. doi: 10.3390/s23083992.

Authors

Jungme Park¹, Pawan Aryal¹, Sai Rithvick Mandumula¹, Ritwik Prasad Asolkar¹

Affiliation

¹ College of Engineering, Kettering University, Flint, MI 48504, USA.

Abstract

For many automotive functions in Advanced Driver Assistance Systems (ADAS) and Autonomous Driving (AD), target objects are detected using state-of-the-art Deep Neural Network (DNN) technologies. However, DNN-based object detection is computationally expensive, which makes it challenging to deploy on a vehicle for real-time inferencing. Low response time and high accuracy are both critical when such a system runs in real time. In this paper, the authors focus on deploying a computer-vision-based object detection system as a real-time service for automotive applications. First, five different vehicle detection systems are developed using transfer learning, which builds on a pre-trained DNN model. The best-performing DNN model improved Precision by 7.1%, Recall by 10.8%, and F1 score by 8.93% compared to the original YOLOv3 model. Next, the developed DNN model is optimized by fusing layers horizontally and vertically so that it can be deployed on the in-vehicle computing device. Finally, the optimized DNN model is deployed on the embedded in-vehicle computing device to run in real time. After optimization, the DNN model runs at 35.082 fps (frames per second) on the NVIDIA Jetson AGX, 19.385 times faster than the unoptimized DNN model. The experimental results demonstrate that the optimized transferred DNN model achieves higher accuracy and faster processing for vehicle detection, both of which are vital for deploying an ADAS system.
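
The throughput figures in the abstract can be turned into per-frame latencies with a little arithmetic. The sketch below is illustrative only and is not taken from the paper: it assumes that "19.385 times faster" means the ratio of optimized to unoptimized frame rate, and the function names are hypothetical. Under that assumption the unoptimized model would run at roughly 1.8 fps, which is about half a second per frame, against under 30 ms per frame after optimization.

```python
# Figures quoted in the abstract.
OPTIMIZED_FPS = 35.082   # frames per second after optimization
SPEEDUP = 19.385         # ASSUMPTION: optimized fps / unoptimized fps


def implied_baseline_fps(optimized_fps: float, speedup: float) -> float:
    """Frame rate of the unoptimized model implied by the speedup ratio."""
    return optimized_fps / speedup


def frame_latency_ms(fps: float) -> float:
    """Average time spent per frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


baseline_fps = implied_baseline_fps(OPTIMIZED_FPS, SPEEDUP)
optimized_latency_ms = frame_latency_ms(OPTIMIZED_FPS)
baseline_latency_ms = frame_latency_ms(baseline_fps)

if __name__ == "__main__":
    print(f"implied unoptimized rate: {baseline_fps:.3f} fps")
    print(f"optimized latency:        {optimized_latency_ms:.1f} ms/frame")
    print(f"unoptimized latency:      {baseline_latency_ms:.1f} ms/frame")
```

If the speedup was instead measured on some other quantity (for example, end-to-end latency including pre-processing), the implied baseline would differ, so these derived numbers should be read as a rough consistency check rather than as reported results.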

Keywords: ADAS; TensorRT; convolutional neural network; deep neural network; embedded devices; object detection; transfer learning.

Grants and funding

This research received no external funding.