Article
EVtracker: An Event-Driven Spatiotemporal Method
for Dynamic Object Tracking
Shixiong Zhang, Wenmin Wang *, Honglei Li and Shenyong Zhang
School of Computer Science and Engineering, Macau University of Science and Technology,
Avenida Wai Long, Taipa, Macau
* Correspondence: wmwang@must.edu.mo
Abstract: An event camera is a novel bio-inspired sensor that effectively compensates for the shortcomings of current frame cameras, such as high latency, low dynamic range, and motion blur. Rather than capturing images at a fixed frame rate, an event camera produces an asynchronous signal by measuring the brightness change of each pixel. Consequently, an appropriate algorithmic framework is required to handle the unique data type of event-based vision. In this paper, we propose a dynamic object tracking framework that uses an event camera to achieve long-term, stable tracking of event objects. A key novelty of our approach is an adaptive strategy that adjusts the spatiotemporal domain of event data. To achieve this, we reconstruct event images from high-speed asynchronous streaming data via online learning. Additionally, we apply a Siamese network to extract features from event data. In contrast to earlier models that extract only hand-crafted features, our method provides a powerful feature description and a more flexible reconstruction strategy for event data. We assess our algorithm in three challenging scenarios: 6-DoF (six degrees of freedom), translation, and rotation. Unlike the fixed cameras of traditional object tracking tasks, all three scenarios involve simultaneous violent rotation and shaking of both the camera and the objects. Results from extensive experiments suggest that our approach achieves superior accuracy and robustness compared with other state-of-the-art methods. Without reducing time efficiency, our method achieves a 30% increase in accuracy over other recent models. Furthermore, the results indicate that event cameras are capable of robust object tracking in situations where conventional cameras perform inadequately, especially super-fast motion and challenging lighting conditions.
Keywords: event-based camera; object tracking; spatiotemporal method
1. Introduction
Event cameras have attracted increasing attention from researchers due to their excellent performance in capturing moving targets [1–4]. An event-based camera, also known as a neuromorphic camera or dynamic vision sensor (DVS), is a new type of sensor that is closer to biological vision than conventional frame-based cameras. It therefore offers advantages such as low power consumption (1 mW), high dynamic range (140 dB), extremely high temporal resolution, and low latency (on the order of microseconds) [5,6].
These capabilities enable event cameras to be widely used in autonomous driving and intelligent transportation, drones, and other fields [7–9]. Nevertheless, compared with the large number of mature applications built on conventional cameras, algorithms and applications for event cameras are still scarce, and significant work remains before event cameras can play a major role in real systems. Fortunately, some computer vision algorithms can be adapted to event cameras, especially algorithms based on video sequences, such as object tracking and optical flow; the sketch below illustrates how.
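To make this adaptation concrete, the following sketch shows one common preprocessing step: accumulating an asynchronous event stream into an image-like frame that frame-based algorithms can consume. This is a minimal illustration under assumed conventions (events as (t, x, y, p) tuples with polarity p in {−1, +1}; the function name is hypothetical), not the authors' implementation.

```python
import numpy as np

# Hypothetical event stream: each event is (t, x, y, p), with timestamp t
# in microseconds, pixel coordinates (x, y), and polarity p in {-1, +1}
# indicating the sign of the brightness change at that pixel.
events = [(10, 3, 5, +1), (25, 3, 5, -1), (90, 7, 2, +1)]

def accumulate_events(events, t_start, t_window, height, width):
    """Sum the polarities of events falling in [t_start, t_start + t_window)
    into a 2D frame, turning the asynchronous stream into an image-like
    input for frame-based tracking algorithms."""
    frame = np.zeros((height, width), dtype=np.int32)
    for t, x, y, p in events:
        if t_start <= t < t_start + t_window:
            frame[y, x] += p
    return frame

# Only events inside the 50-microsecond window contribute to the frame.
frame = accumulate_events(events, t_start=0, t_window=50, height=10, width=10)
```

The choice of t_window is the crux: too short a window yields sparse, noisy frames, while too long a window reintroduces motion blur, which motivates an adaptive spatiotemporal strategy such as the one proposed in this paper.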
As shown in Figure 1, each event pixel (blue dot) captured by an event camera carries its own timestamp and is distributed in a spatiotemporal domain; conventional computer