Background: In dense crowd images, traditional detection models often suffer from inaccurate counts of multiscale targets and a low recall rate.
Methods: To address these two problems, this paper proposes an MLP-CNN model. Combined with a feature pyramid network (FPN), the model fuses feature maps carrying low-resolution and high-resolution semantic information at low computational cost, which effectively addresses the inaccurate head counts of people at multiple scales. The MLP-CNN "mid-term" fusion model effectively fuses the features of the RGB head image and the RGB-Mask image. Using head RGB-Mask annotations and adaptive Gaussian kernel regression, an enhanced density map is generated, which effectively addresses the low recall of head detection.
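The adaptive Gaussian kernel step can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the common geometry-adaptive formulation in which each annotated head is spread by a Gaussian whose width is proportional to the mean distance to its nearest neighbours. The function name `adaptive_density_map` and the parameters `k`, `beta`, and `default_sigma` are hypothetical, and the RGB-Mask enhancement described above is not modelled.

```python
import math


def adaptive_density_map(points, height, width, k=3, beta=0.3, default_sigma=4.0):
    """Build a density map from head centers given as (row, col) pairs.

    Assumption: sigma_i = beta * mean distance to the k nearest other heads.
    The values of k, beta, and default_sigma are illustrative only.
    """
    density = [[0.0] * width for _ in range(height)]
    for i, (r, c) in enumerate(points):
        # Distances from this head to every other annotated head.
        dists = sorted(
            math.hypot(r - r2, c - c2)
            for j, (r2, c2) in enumerate(points)
            if j != i
        )
        nearest = dists[:k]
        if nearest:
            # Floor sigma at 1 pixel so coincident points stay well defined.
            sigma = max(beta * sum(nearest) / len(nearest), 1.0)
        else:
            sigma = default_sigma  # a single head has no neighbours

        # Evaluate the kernel on a 3-sigma window clipped to the image.
        radius = int(math.ceil(3 * sigma))
        cells, total = [], 0.0
        for y in range(max(0, int(r) - radius), min(height, int(r) + radius + 1)):
            for x in range(max(0, int(c) - radius), min(width, int(c) + radius + 1)):
                w = math.exp(-((y - r) ** 2 + (x - c) ** 2) / (2 * sigma ** 2))
                cells.append((y, x, w))
                total += w

        # Normalize so that each head contributes exactly 1 to the map.
        for y, x, w in cells:
            density[y][x] += w / total
    return density


heads = [(10, 10), (12, 14), (40, 50), (5, 60)]
dmap = adaptive_density_map(heads, 64, 64)
estimated_count = sum(sum(row) for row in dmap)  # close to len(heads)
```

Because every kernel is normalized inside the image, summing the map recovers the number of annotated heads, which is the property that lets a regression network trained on such maps output a count.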
Results: The MLP-CNN model was evaluated on the ShanghaiTech, UCF_CC_50, and UCF-QNRF datasets. The test results show that the proposed method significantly reduces the error, and the recall rate reaches 79.91%.
Conclusion: The MLP-CNN model not only improves the accuracy of crowd counting by density map regression but also improves the detection rate of multiscale head targets in crowds.
Copyright © 2022 Guoyin Ren et al.