Citation: Ramos-Cooper, S.;
Gomez-Nieto, E.; Camara-Chavez, G.
VGGFace-Ear: An Extended Dataset
for Unconstrained Ear Recognition.
Sensors 2022, 22, 1752. https://doi.org/10.3390/s22051752
Academic Editors: M. Jamal Deen,
Subhas Mukhopadhyay, Yangquan
Chen, Simone Morais, Nunzio
Cennamo and Junseop Lee
Received: 14 January 2022
Accepted: 14 February 2022
Published: 23 February 2022
Publisher’s Note: MDPI stays neutral
with regard to jurisdictional claims in
published maps and institutional affiliations.
Copyright: © 2022 by the authors.
Licensee MDPI, Basel, Switzerland.
This article is an open access article
distributed under the terms and
conditions of the Creative Commons
Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Article
VGGFace-Ear: An Extended Dataset for Unconstrained Ear Recognition †
Solange Ramos-Cooper 1, Erick Gomez-Nieto 1 and Guillermo Camara-Chavez 1,2,*
1 Department of Computer Science, Universidad Catolica San Pablo, Arequipa 04001, Peru;
solange.ramos@ucsp.edu.pe (S.R.-C.); emgomez@ucsp.edu.pe (E.G.-N.)
2 Computer Science Department, Federal University of Ouro Preto, Ouro Preto 35400-000, Brazil
* Correspondence: guillermo@ufop.edu.br
† This paper is an extended version of our paper published in: Ramos-Cooper, S.; Camara-Chavez, G.
Ear Recognition in The Wild with Convolutional Neural Networks. In Proceedings of the 2021 XLVII Latin
American Computing Conference (CLEI), Cartago, Costa Rica, 25–29 October 2021.
Abstract:
Recognition using ear images has been an active field of research in recent years. Like faces
and fingerprints, ears have a unique structure that can identify people, and they can be captured
from a distance, without contact, and without the subject’s cooperation. Ears are therefore an
appealing choice for building surveillance, forensic, and security applications. However, many
techniques used in those applications, e.g., convolutional neural networks (CNNs), usually demand
large-scale datasets for training. This work introduces a new dataset of ear images taken under
uncontrolled conditions that present high inter-class and intra-class variability. We built this dataset
using an existing face dataset called VGGFace, which gathers more than 3.3 million images.
In addition, we perform ear recognition using transfer learning with CNNs pretrained on image and
face recognition. Finally, we perform two experiments on two unconstrained datasets and report
our results using Rank-based metrics.
Keywords:
ear recognition; ear biometrics; deep learning; convolutional neural networks;
mask-RCNN; VGGFace; transfer learning
1. Introduction
Identifying people is a persistent issue in society. Areas such as forensic
science, surveillance, and security systems usually demand solutions for this issue. Most
identification systems implement biometrics to fulfill their requirements. A biometric trait
has unique and specific features that make recognizing people possible. Among the most
common physical biometric traits, such as fingerprints, palmprints, hand geometry, iris,
and face, the ear structure is an excellent source for identifying a person without their
cooperation. It provides three meaningful benefits: (i) the outer ear structure does
not drastically change over a person’s lifetime, (ii) it can be captured from a distance, and
(iii) it is unique for everyone, even for identical twins [1]. As a person ages, the ear shape
does not show significant changes between 8 and 70 years [2]. After the age range of
70–79 years, the longitudinal size increases slightly with age; however,
the structure remains relatively constant. Facial expressions do not affect the ear shape
either. The ear image has a uniform color distribution and is completely visible even in
mask-wearing scenarios. Moreover, the ear structure is less prone to injuries than hands or fingers.
These features make the ear structure a stable and reliable source of information for use
in a biometric system.
The appearance of the outer ear is defined by the lobule, tragus, antitragus, helix, antihelix,
concha, navicular fossa, scapha, and some other critical structural parts, as Figure 1
illustrates. These anatomical characteristics differ from person to person and have also
been recognized as a means of personal identification for criminal investigators. In 1890,