Monocular 3D Object Detection Based on Uncertainty Prediction of Keypoints

Citation: Chen, M.; Zhao, H.; Liu, P. Monocular 3D Object Detection Based on Uncertainty Prediction of Keypoints. Machines 2022, 10, 19. https://doi.org/10.3390/machines10010019
Academic Editors: Xiaochun Cheng and Daming Shi
Received: 18 November 2021
Accepted: 23 December 2021
Published: 26 December 2021
Publisher's Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Copyright: © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
machines
Article
Monocular 3D Object Detection Based on Uncertainty Prediction of Keypoints
Mu Chen 1,2,3,4,*, Huaici Zhao 1,2,3,* and Pengfei Liu 1,2,4
1 Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, China; liupengfei@sia.cn
2 Institutes for Robotics and Intelligent Manufacturing, Chinese Academy of Sciences, Shenyang 110169, China
3 University of Chinese Academy of Sciences, Beijing 100049, China
4 Key Laboratory of Opto-Electronic Information Processing, Chinese Academy of Sciences, Shenyang 110016, China
* Correspondence: chenmu@sia.cn (M.C.); hczhao@sia.cn (H.Z.)
Abstract: Three-dimensional (3D) object detection is an important task in machine vision, and detecting 3D objects with monocular vision is especially challenging. We observe that most existing monocular methods focus on the design of the feature extraction framework or on embedded geometric constraints, but ignore possible errors in the intermediate stages of the detection pipeline. These errors may be further amplified in subsequent stages. After exploring the existing keypoint-based detection framework, we find that the accuracy of keypoint prediction seriously affects the solution of the 3D object position. Therefore, we propose a novel keypoints uncertainty prediction network (KUP-Net) for monocular 3D object detection. In this work, we design an uncertainty prediction module to characterize the uncertainty in keypoint prediction. The uncertainty is then jointly optimized with the object position. In addition, we adopt position encoding to assist the uncertainty prediction, and use a timing coefficient to optimize the learning process. We evaluate our detector on the KITTI benchmark. At the easy and moderate levels, we achieve 17.26 and 11.78 in $AP_{3D}$, and 23.59 and 16.63 in $AP_{BEV}$, which are higher than those of the latest method, KM3D.
Keywords: keypoints; uncertainty prediction; monocular 3D detection
1. Introduction
Understanding the 3D properties of objects in the real world is critical for vision-based autonomous driving and traffic surveillance systems [1–5]. Compared with the two-dimensional (2D) object detection task, the 3D object detection task involves nine degrees of freedom, in which the length, width, height, and pose of the 3D bounding box need to be detected. Currently, there are three main approaches to 3D object detection: monocular, stereo-based, and LIDAR-based. Among them, the LIDAR-based and stereo-based methods usually achieve higher detection accuracy because they have access to reliable depth information. However, LIDAR systems have the disadvantages of high cost, high energy consumption, and short service life. By contrast, the monocular detection method, which is characterized by low cost and low energy consumption, has received extensive attention and attracted researchers to this field. Therefore, our work focuses on improving monocular 3D object detection techniques.
Monocular 3D object detection takes a single RGB image as input and outputs the pose and dimensions of the object in the real world. Because depth information is missing, this process is ill-posed: ambiguity arises in the inverse projection from the 2D image plane to 3D space. Compared with stereo-based and LIDAR-based methods, the task of monocular 3D object detection is therefore more challenging.
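The depth ambiguity described above can be illustrated with a minimal sketch of the standard pinhole camera model. This is not part of the proposed method. The intrinsics (`FX`, `FY`, `CX`, `CY`), the sample points, and the helper names `project` and `back_project` are hypothetical values and names chosen only for illustration.

```python
# Minimal pinhole-camera sketch (illustrative assumption, not the paper's code).
# It shows that two 3D points on the same viewing ray map to the same pixel,
# so a single image cannot determine depth without extra constraints.

def project(point, fx, fy, cx, cy):
    """Project a 3D point (X, Y, Z) in the camera frame to a pixel (u, v)."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must lie in front of the camera (Z > 0)")
    return (fx * x / z + cx, fy * y / z + cy)


def back_project(pixel, depth, fx, fy, cx, cy):
    """Recover a 3D point from a pixel, given an assumed depth Z."""
    u, v = pixel
    return ((u - cx) * depth / fx, (v - cy) * depth / fy, depth)


# Hypothetical intrinsics, chosen only for this example.
FX, FY, CX, CY = 700.0, 700.0, 600.0, 180.0

near = (1.0, 0.5, 10.0)              # a point 10 m in front of the camera
far = tuple(2.0 * c for c in near)   # same viewing ray, twice as far away

pixel_near = project(near, FX, FY, CX, CY)
pixel_far = project(far, FX, FY, CX, CY)

# Both points land on the same pixel.
print(pixel_near, pixel_far)

# Inverting the projection needs a depth; each depth guess gives a different 3D point.
print(back_project(pixel_near, 10.0, FX, FY, CX, CY))
print(back_project(pixel_near, 20.0, FX, FY, CX, CY))
```

Every depth hypothesis yields a different, equally consistent 3D point for the same pixel, which is why monocular pipelines must rely on learned priors or geometric constraints such as keypoints to resolve the object position.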
Thanks to the powerful feature extraction and parameter regression capabilities of the neural network, some original monocular 3D object detection pipelines [6,7] regress the 3D