Research on Vehicle Recognition Based on Unpacking 3D Bounding Boxes of Monocular Camera in Traffic Scene 2020-01-5196
Currently, most vehicle recognition methods are realized by deep convolutional neural networks (DCNNs) trained directly on images. Due to the perspective distortion and scale changes in images taken by a monocular camera, a large number of multi-scale images are needed for training, and the physical information of vehicles cannot be obtained at the same time. To address these problems, we present a method of vehicle recognition based on unpacking 3D bounding boxes in this paper. Firstly, camera calibration information and geometric constraints are used to build 3D bounding boxes around vehicles in the monocular projection. Then, the 3D bounding boxes are unpacked to obtain 3D normalized spatial data free of perspective distortion. Finally, VGG-16 is chosen as the backbone of our network, whose output is divided into five common vehicle types: hatchback, sedan, SUV, truck, and bus. The experimental results indicate that the accuracy of our method is improved by 8.74% for hatchbacks and 7.49% for sedans with less training data, outperforming traditional end-to-end deep learning methods of vehicle recognition while obtaining physical information simultaneously.
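The "unpacking" step can be read as rectifying each visible face of the 3D bounding box into a normalized, fronto-parallel patch. A minimal sketch of one such face rectification, assuming it is a per-face planar homography solved by the direct linear transform (the corner coordinates and patch size below are hypothetical, and the paper's exact formulation may differ):

```python
import numpy as np

def face_homography(src, dst):
    # Solve the 8-DOF homography H (direct linear transform) that maps
    # the four image-plane corners of one 3D-box face onto a normalized,
    # perspective-free rectangle. h33 is fixed to 1.
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y]); b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y]); b.append(v)
    h = np.linalg.solve(np.asarray(A, float), np.asarray(b, float))
    return np.append(h, 1.0).reshape(3, 3)

# Hypothetical image corners of a vehicle's side face (perspective-skewed)
src = [(120, 310), (480, 260), (500, 420), (110, 450)]
# Target: a 128x64 normalized patch with the distortion removed
dst = [(0, 0), (127, 0), (127, 63), (0, 63)]
H = face_homography(src, dst)

# Each source corner now lands on the normalized rectangle
p = H @ np.array([120, 310, 1.0])
print(p[:2] / p[2])  # approximately (0, 0)
```

In practice the rectified face patches would be resampled with this homography and fed to the VGG-16 backbone in place of the raw, distorted image crops.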
Citation: Wang, W., Tang, X., Tian, S., Zhang, C. et al., "Research on Vehicle Recognition Based on Unpacking 3D Bounding Boxes of Monocular Camera in Traffic Scene," SAE Technical Paper 2020-01-5196, 2020, https://doi.org/10.4271/2020-01-5196.