Data Fusion for Cross-Domain Real-Time Object Detection on the Edge

Sensors (Basel). 2023 Jul 4;23(13):6138. doi: 10.3390/s23136138.

Abstract

We investigate an edge-computing scenario for robot control in which two similar neural networks run on one computational node. We test the feasibility of replacing them with a single object-detection model (YOLOv5), weighing its reduced computational resource usage against the potentially higher accuracy of independent, specialized models. Our results show that using a single convolutional neural network (for object detection and hand-gesture classification) instead of two separate ones can reduce resource usage by almost 50%. For many classes, we observed an increase in accuracy when using the model trained with more labels. For small datasets (a few hundred instances per label), we found that it is advisable to add labels with many instances from another dataset to increase detection accuracy.
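
Training one detector on two datasets requires a shared class list, so the class indices in each dataset's annotation files have to be remapped. The abstract does not describe how the datasets were fused, so the following is only a minimal sketch of one plausible approach, assuming annotations in the YOLOv5 text format (one object per line: class index, then normalized center x, center y, width, height). The class names and the helper functions merge_class_lists and remap_label_line are hypothetical and are not taken from the paper.

```python
def merge_class_lists(primary, secondary):
    """Build a shared class list and an index remap for each source dataset."""
    merged = list(primary)
    for name in secondary:
        if name not in merged:  # a class present in both datasets keeps one index
            merged.append(name)
    remap_primary = {i: merged.index(n) for i, n in enumerate(primary)}
    remap_secondary = {i: merged.index(n) for i, n in enumerate(secondary)}
    return merged, remap_primary, remap_secondary


def remap_label_line(line, remap):
    """Rewrite the class index of one YOLO-format label line."""
    parts = line.split()
    parts[0] = str(remap[int(parts[0])])
    return " ".join(parts)


# Hypothetical class names, for illustration only.
object_classes = ["cup", "bottle"]
gesture_classes = ["fist", "palm"]

merged, object_map, gesture_map = merge_class_lists(object_classes, gesture_classes)

# A gesture annotation with local class index 1 ("palm") moves to the shared index.
relabeled = remap_label_line("1 0.5 0.5 0.2 0.3", gesture_map)
print(merged)
print(relabeled)
```

With the merged class list, annotation files from both datasets can be placed in a single training set. Whether the gain from the added labels reported above carries over depends on the datasets involved.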

Keywords: edge computing; human-computer interaction; object detection; optimization; visual analysis.

MeSH terms

  • Gestures*
  • Hand
  • Neural Networks, Computer
  • Running*
  • Upper Extremity