
Article
Comparative Study of Markerless Vision-Based Gait Analyses for Person Re-Identification
Jaerock Kwon 1,*, Yunju Lee 2 and Jehyung Lee 3
Citation: Kwon, J.; Lee, Y.; Lee, J. Comparative Study of Markerless Vision-Based Gait Analyses for Person Re-Identification. Sensors 2021, 21, 8208. https://doi.org/10.3390/s21248208
Academic Editors: YangQuan Chen, Nunzio Cennamo, M. Jamal Deen, Subhas Mukhopadhyay, Simone Morais and Junseop Lee
Received: 15 October 2021
Accepted: 3 December 2021
Published: 8 December 2021
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Copyright: © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
1 Department of Electrical and Computer Engineering, University of Michigan-Dearborn, Dearborn, MI 48128, USA
2 School of Engineering, Department of Physical Therapy and Athletic Training, Grand Valley State University, Grand Rapids, MI 49504, USA; leeyun@gvsu.edu
3 NAVER Corp., Seongnam-si 13561, Korea; trevor0513@gmail.com
* Correspondence: jrkwon@umich.edu; Tel.: +1-313-583-6590
Abstract: Model-based gait analysis of the kinematic characteristics of the human body has been used to identify individuals. To extract gait features, spatiotemporal changes of anatomical landmarks of the human body in 3D are preferable. Without special lab settings, 2D images are easily acquired by monocular video cameras in real-world settings. The 2D and 3D locations of key joints can be estimated by 2D and 3D pose estimators, so the 3D joint positions can be estimated from 2D image sequences of human gait. Yet it remains challenging to obtain the exact gait features of a person due to viewpoint variance and occlusion of body parts in the 2D images. In this study, we compared two approaches to viewpoint-invariant person re-identification using gait patterns: feature-based and spatiotemporal-based. The first method uses gait features extracted from time-series 3D joint positions to identify an individual. The second method uses a neural network, a Siamese Long Short-Term Memory (LSTM) network, with the 3D spatiotemporal changes of key joint positions in a gait cycle to classify an individual without extracting gait features. To validate and compare these two methods, we conducted experiments with two open datasets, MARS and CASIA-A. The results show that the Siamese LSTM outperforms the gait feature-based approaches by 20% on the MARS dataset and by 55% on the CASIA-A dataset. These results suggest that feature-based gait analysis using 2D and 3D pose estimators is premature. As future work, we suggest developing large-scale human gait datasets and designing accurate 2D and 3D joint position estimators specifically for gait patterns. We expect that the current comparative study and the future work could contribute to rehabilitation studies, forensic gait analysis, and early detection of neurological disorders.
Keywords: markerless; vision-based; machine learning; person re-identification; gait; gait analysis; motion capture; siamese neural networks
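As an illustration of the first method named in the abstract (gait features extracted from time-series 3D joint positions), the following is a minimal sketch, not the authors' implementation. It assumes each frame supplies 3D coordinates for the hip, knee, and ankle of one leg; the dictionary keys, the function names `joint_angle` and `knee_angle_features`, and the choice of the knee angle as the feature are all hypothetical and chosen only for illustration.

```python
import math


def joint_angle(a, b, c):
    """Return the angle in degrees at joint b formed by the 3D points a-b-c."""
    ba = [a[i] - b[i] for i in range(3)]
    bc = [c[i] - b[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(ba, bc))
    norm = math.sqrt(sum(x * x for x in ba)) * math.sqrt(sum(x * x for x in bc))
    if norm == 0.0:
        raise ValueError("degenerate joint: two of the points coincide")
    # Clamp to guard against floating-point drift outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_theta))


def knee_angle_features(frames):
    """Summarize the knee angle over a sequence of frames.

    `frames` is a list of dicts with the hypothetical keys
    'hip', 'knee', and 'ankle', each mapping to an (x, y, z) tuple.
    """
    if not frames:
        raise ValueError("at least one frame is required")
    angles = [joint_angle(f["hip"], f["knee"], f["ankle"]) for f in frames]
    return {
        "min": min(angles),
        "max": max(angles),
        "range": max(angles) - min(angles),
        "mean": sum(angles) / len(angles),
    }


# Usage: two made-up frames, one with a straight leg and one with a bent knee.
example_frames = [
    {"hip": (0.0, 1.0, 0.0), "knee": (0.0, 0.0, 0.0), "ankle": (0.0, -1.0, 0.0)},
    {"hip": (0.0, 1.0, 0.0), "knee": (0.0, 0.0, 0.0), "ankle": (1.0, 0.0, 0.0)},
]
print(knee_angle_features(example_frames))
```

In practice the joint coordinates would come from a 3D pose estimator, and a feature vector of this kind would be one input to a classifier; the coordinates above are invented solely to exercise the functions.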
1. Introduction
Human gait is a noninvasive biometric feature that can be perceived from a distance without requiring contact with subjects [1]. A growing number of studies show that the individuality of a subject is embedded in their gait, comprising spatiotemporal features of the ankle, knee, pelvis, and trunk [2–5]. Yet, extracting dynamic gait patterns is challenging because spatiotemporal gait representations are not easily acquired in real-world settings. In general, two main approaches have been taken for gait-based identification, namely model-free (appearance-based) and model-based methods [6]. Model-free approaches identify an individual using their silhouette, clothes, anthropometrics, etc. Even with very high accuracy, using appearance has many disadvantages. Appearance-based methods must assume that a subject wears clothes distinct from those of others, and they work only within a time frame in which the subject does not change clothes. Using silhouettes or the Gait Energy Image (GEI) [7] can mitigate the aforementioned problem. However, GEI inherently