Citation: Park, B.-S.; Kim, W.; Kim,
J.-K.; Hwang, E.S.; Kim, D.-W.; Seo,
Y.-H. 3D Static Point Cloud Registration
by Estimating Temporal Human Pose
at Multiview. Sensors 2022, 22, 1097.
https://doi.org/10.3390/s22031097
Academic Editors: Zhihan Lv, Kai Xu
and Zhigeng Pan
Received: 18 December 2021
Accepted: 25 January 2022
Published: 31 January 2022
Publisher’s Note: MDPI stays neutral
with regard to jurisdictional claims in
published maps and institutional affiliations.
Copyright: © 2022 by the authors.
Licensee MDPI, Basel, Switzerland.
This article is an open access article
distributed under the terms and
conditions of the Creative Commons
Attribution (CC BY) license (https://
creativecommons.org/licenses/by/
4.0/).
Article
3D Static Point Cloud Registration by Estimating Temporal
Human Pose at Multiview
Byung-Seo Park 1, Woosuk Kim 1, Jin-Kyum Kim 1, Eui Seok Hwang 2, Dong-Wook Kim 1 and Young-Ho Seo 1,*
1 Department of Electronic Materials Engineering, Kwangwoon University, Kwangwoon-ro 20, Nowon-gu,
Seoul 01897, Korea; bspark@kw.ac.kr (B.-S.P.); kws@kw.ac.kr (W.K.); jkkim@kw.ac.kr (J.-K.K.);
dwkim@kw.ac.kr (D.-W.K.)
2 Yeshcompany, 18, Teheran-ro 43-gil, Gangnam-gu, Seoul 06151, Korea; ushwang@yesh.co.kr
* Correspondence: yhseo@kw.ac.kr
Abstract: This paper proposes a new technique for performing 3D static point cloud registration
after calibrating multi-view RGB-D cameras using a 3D joint set. Consistent feature
points are required to calibrate a multi-view camera, and accurate feature points are necessary to
obtain high-accuracy calibration results. In general, a special tool, such as a chessboard, is used
to calibrate a multi-view camera. In contrast, this paper uses the joints of a human skeleton as feature
points, so that calibration can be performed efficiently without special tools.
We propose an RGB-D-based calibration algorithm that uses the joint coordinates of the 3D joint set
obtained through pose estimation as feature points. Since the human body information captured by each
camera of the multi-view system may be incomplete, the joint set predicted from that image information
may also be incomplete. After efficiently integrating a plurality of incomplete joint sets into
one joint set, the multi-view cameras can be calibrated by using the combined joint set to obtain extrinsic
matrices. To increase the accuracy of calibration, multiple joint sets are used for optimization through
temporal iteration. We show through experiments that it is possible to calibrate a multi-view camera
using a large number of incomplete joint sets.
Keywords: point cloud; 3D registration; RGB-D; joint set; pose estimation
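As an illustration of how corresponding 3D joints seen by two cameras can yield an extrinsic matrix, the following minimal sketch uses the closed-form Kabsch (SVD) rigid alignment. This is a swapped-in, well-known technique chosen for brevity and is not the optimization through temporal iteration proposed in this paper. The function name estimate_rigid_transform and the valid mask are hypothetical. The sketch assumes that joints from several frames are stacked row-wise and that joints missing in either view are flagged as invalid.

```python
import numpy as np


def estimate_rigid_transform(src, dst, valid=None):
    """Return a 4x4 matrix T such that dst is approximately R @ src + t.

    src, dst: (N, 3) arrays of corresponding 3D joints, e.g. joints from
              several frames stacked row-wise (an assumption of this sketch).
    valid:    optional (N,) boolean mask; False marks joints missing in
              either view (hypothetical handling of incomplete joint sets).
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    if valid is None:
        valid = np.ones(len(src), dtype=bool)
    valid = np.asarray(valid, dtype=bool)

    p, q = src[valid], dst[valid]
    if len(p) < 3:
        raise ValueError("need at least three shared joints")

    # Center both point sets on their centroids.
    p_mean, q_mean = p.mean(axis=0), q.mean(axis=0)
    # 3x3 cross-covariance between the centered sets.
    H = (p - p_mean).T @ (q - q_mean)

    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that R is a rotation, not a reflection.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = q_mean - R @ p_mean

    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T
```

In this sketch, src would hold the joints expressed in one camera's coordinate system and dst the same joints in the reference camera's coordinate system. Stacking more frames adds correspondences, which is one simple way to exploit temporal information when each individual joint set is incomplete.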
1. Introduction
Recently, RGB-D sensors (cameras) combining RGB and depth sensors have become
common and are widely used in various fields. An RGB-D camera helps to accurately
and quickly extract the shape of an object and the 3D structure of the surrounding environment.
RGB-D cameras have advanced various fields such as SLAM and navigation [1,2],
tracking [3,4], object recognition and localization [5], pose estimation [6], and 3D model
registration [7]. The color components of an RGB-D camera are obtained using the RGB
camera, whereas depth information is obtained using various methods such
as time-of-flight (ToF) cameras, laser range scanners, and structured-light (SL) sensors [8].
RGB-D cameras include the Azure Kinect of Microsoft [9], the Phoxi 3D of Photoneo [10],
the Zivid Two of Zivid [11], the Helios of Lucid [12], and the RealSense of Intel [13]. These
cameras have various properties (operational time, depth accuracy, cost, sensing method)
according to their intended usage. Since human pose estimation is used for extrinsic calibration,
a laser-based sensing method is not suitable for this study, although it has
a high degree of depth accuracy. Temporal calibration and registration of humans
in motion require a high frame rate for capturing and processing depth maps and RGB images,
so a camera with a long operation time is not suitable for this study either. For reliable and
accurate scene representation using RGB-D cameras, intrinsic calibration of each camera
and extrinsic calibration between two sensors are required. Recently, intrinsic parameter
sets have typically been determined in advance and stored in non-volatile memory
inside the device. In applications that perform imaging using multiple RGB-D cameras,