
Article
FGFF Descriptor and Modified Hu Moment-Based Hand Gesture Recognition
Beiwei Zhang 1,*, Yudong Zhang 2, Jinliang Liu 1 and Bin Wang 1
Citation: Zhang, B.; Zhang, Y.; Liu, J.; Wang, B. FGFF Descriptor and Modified Hu Moment-Based Hand Gesture Recognition. Sensors 2021, 21, 6525. https://doi.org/10.3390/s21196525
Academic Editor: Junseop Lee
Received: 20 August 2021
Accepted: 28 September 2021
Published: 29 September 2021
Copyright: © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
1 School of Information Engineering, Nanjing University of Finance and Economics, Nanjing 210023, China; liujinliang@vip.163.com (J.L.); wangbin@nufe.edu.cn (B.W.)
2 School of Computing and Mathematical Sciences, University of Leicester, Leicester LE1 7RH, UK; yudongzhang@ieee.org
* Correspondence: zhangbeiwei@nufe.edu.cn
Abstract: Gesture recognition has been studied for decades and remains an open problem.
One important reason is that the features used to represent gestures are not sufficient, which may
lead to poor performance and weak robustness. Therefore, this work aims at a comprehensive and
discriminative feature for hand gesture recognition. A distinctive Fingertip Gradient orientation
with Finger Fourier (FGFF) descriptor and modified Hu moments are proposed, using a Kinect
sensor as the platform. Firstly, two algorithms are designed to extract the fingertip-emphasized
features, including the palm center, the fingertips, and their gradient orientations, followed by the
finger-emphasized Fourier descriptor to construct the FGFF descriptors. Then, modified Hu moment
invariants with much lower exponents are introduced to encode contour-emphasized structure in the
hand region. Finally, a weighted AdaBoost classifier is built on finger-earth mover’s distance and SVM
models to perform hand gesture recognition. Extensive experiments were carried out on a ten-gesture
dataset, comparing the proposed algorithm with three benchmark methods to validate its
performance. Encouraging results were obtained in terms of recognition accuracy and efficiency.
Keywords: FGFF descriptor; Hu moment invariants; finger thickness; hand gesture recognition; weighted AdaBoost classifier
1. Introduction
Hand gestures carry rich information and provide a natural and important means for
people to interact in daily life. They have been used as a friendly interface
between humans and computer systems, enabling intuitive and convenient human–computer
interaction, with applications such as intelligent robot control, smart homes, virtual
reality, computer games, and environments that require quiet. In [1], for example, the
authors explored the recognition of handwritten Arabic alphabets by tracking and modeling
the motion of the hand. Accordingly, recent years have witnessed active research interest
in hand gesture recognition and human action recognition.
Traditional vision-based recognition algorithms mainly utilize color or texture
information from a 2D RGB camera, which is typically affected by external factors such
as illumination, skin color, and cluttered backgrounds. Their limitation is the loss of 3D
structure information, which decreases their robustness and accuracy. To improve
robustness and simplify hand localization and segmentation, some researchers suggested
wearing a colored glove or a black belt on the wrist of the gesturing hand [2].
Furthermore, accelerometers, magnetic trackers, and data gloves have been used to obtain
three-dimensional gesture information for easier image processing and 3D motion capture
at the granularity of the fingers. However, these strategies are only suitable for
handling simple gestures. When the gestures become more complex, recognition accuracy
drops. Furthermore, it impedes the invisibility of