Data-driven Detection of Engine Faults in Infrequently-driven
Ground Vehicles
Ethan Kohrt, Matthew Moon, Matthew Sullivan, Sri Das, Michael Thurston, and Nenad G. Nenadic
Rochester Institute of Technology, Rochester, NY, 14623, USA
eakgis@rit.edu memgis@rit.edu mrsgis@rit.edu spdgis@rit.edu mgtasp@rit.edu nxnasp@rit.edu
ABSTRACT
We investigated the detection of engine faults in infrequently
driven ground vehicles using data-driven methods based on
neural network autoencoders. Multivariate time-series data
from the infrequently driven vehicles under investigation had
limited coverage of operating conditions. Hence, a considerable
part of this work focused on identifying suitable vehicles and
relevant signals and on pre-processing the data. We trained
autoencoder models on eight vehicles with known faults and
detected faults in six. Four of the faults were detectable un-
der idle conditions and four were detectable under driving
conditions. Model evaluations required human inspection to
distinguish fault detections from other anomalies. We detail
our procedures for pre-processing, model development, and
post-processing, and we include a discussion on our interpre-
tations of the model results.
1. INTRODUCTION
A fault detection system deployed on a ground vehicle pro-
cesses data from vehicle sensors and alerts appropriate stake-
holders (e.g., maintainers, operators, or logisticians) about a
developing fault. Early detection saves resources by reduc-
ing unnecessary maintenance, reducing unplanned downtime,
and lessening the need for redundancy, which in turn allows
for a smaller fleet size. Fault detection systems allow main-
tainers to correct issues before they become severe and can
even prevent injury by detecting the development of a catastrophic
failure before it happens (Arena, Collotta, Luca, Ruggieri,
& Termine, 2022). Such systems are usually only concerned
with detecting faults that are low-frequency and high-severity:
high-frequency faults suggest that a fundamental design change
is needed (or can otherwise be mitigated with regularly scheduled
maintenance), and low-cost events are generally not worth the
expense of developing and installing a detection system.
Ethan Kohrt et al. This is an open-access article distributed under the terms of
the Creative Commons Attribution 3.0 United States License, which permits
unrestricted use, distribution, and reproduction in any medium, provided the
original author and source are credited.
Ground vehicles are complex machines with many interre-
lated systems, and these systems are expected to perform a
wide range of tasks in a variety of environments. This com-
plexity makes it challenging for human experts to establish
indicators that can be used to consistently determine a vehi-
cle’s condition. There are a large number of sensor signals to
take into account, each with varying relevance (Giordano et
al., 2022). Furthermore, faults often manifest in the relation-
ship between signals rather than in the values of an individual
signal, meaning that checking whether a signal has exceeded
its nominal range is insufficient. Instead, one can learn the
patterns of a fault automatically with a data-driven approach.
This involves collecting large amounts of healthy and faulty
operating data and creating a model that can distinguish the
two.
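The point about relationships between signals can be illustrated with a minimal sketch. The two signals, their units, the linear baseline, and the 4-sigma threshold below are all hypothetical choices made for illustration on synthetic data; they are not the signals or models used in this work.

```python
import numpy as np

# Synthetic, illustrative data only: two hypothetical signals that move
# together when the (imaginary) engine is healthy.
n = 500
rpm = np.linspace(800.0, 3000.0, n)           # hypothetical engine speed
pseudo_noise = np.sin(np.arange(n))           # deterministic stand-in for sensor noise
oil_pressure = 10.0 + 0.02 * rpm + pseudo_noise  # hypothetical pressure signal

# Per-signal nominal ranges taken from the healthy data.
rpm_range = (rpm.min(), rpm.max())
pressure_range = (oil_pressure.min(), oil_pressure.max())

# Baseline relationship learned from healthy data: a least-squares line.
slope, intercept = np.polyfit(rpm, oil_pressure, deg=1)
residual_std = np.std(oil_pressure - (slope * rpm + intercept))


def in_range(value, bounds):
    """Classic per-signal check: is the value inside its nominal range?"""
    return bool(bounds[0] <= value <= bounds[1])


def relationship_anomaly(rpm_value, pressure_value, n_sigma=4.0):
    """Flag a sample whose pressure is far from what the baseline predicts."""
    expected = slope * rpm_value + intercept
    return bool(abs(pressure_value - expected) > n_sigma * residual_std)


# A suspicious sample: high speed but low pressure.
# Each value on its own lies inside its nominal range.
sample_rpm, sample_pressure = 2800.0, 30.0

range_check_passes = (
    in_range(sample_rpm, rpm_range) and in_range(sample_pressure, pressure_range)
)
flagged = relationship_anomaly(sample_rpm, sample_pressure)
print(range_check_passes, flagged)
```

In this sketch the per-signal range checks both pass, while the residual against the learned relationship is large enough to flag the sample. A learned model plays the same role as the fitted line here, but over many signals at once.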
Unfortunately, it is rare to find data that is cleanly labeled
as ‘healthy’ or ‘faulty’ because collecting this data often
involves running a machine to failure, which can be prohibitively
expensive and time-consuming (Theissler, Pérez-Velázquez,
Kettelgerdes, & Elger, 2021). Such a procedure
also results in a highly skewed dataset since nearly all of the
data will be ‘healthy,’ and it is usually infeasible to account
for all possible failure modes. For these reasons, a supervised
data-driven classification approach is often untenable for fault
detection.
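A short worked example may help show why a skewed dataset undermines a supervised classifier. The sample counts below are invented for illustration and do not come from the vehicles in this study.

```python
# Hypothetical run-to-failure log: 10,000 samples, of which only the
# last 50 are faulty. These counts are invented for illustration.
labels = [0] * 9950 + [1] * 50  # 0 = healthy, 1 = faulty

# A trivial "classifier" that never predicts a fault.
predictions = [0] * len(labels)

# Accuracy: fraction of samples labeled correctly.
correct = sum(p == y for p, y in zip(predictions, labels))
accuracy = correct / len(labels)

# Recall: fraction of the truly faulty samples that were detected.
true_positives = sum(p == 1 and y == 1 for p, y in zip(predictions, labels))
recall = true_positives / sum(labels)

print(f"accuracy={accuracy:.3f}, recall={recall:.3f}")
# prints "accuracy=0.995, recall=0.000"
```

A model that never reports a fault scores 99.5% accuracy on this hypothetical log while detecting none of the faulty samples, so headline accuracy says little about fault detection when nearly all of the data is healthy.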
The lack of labeled data motivated us to frame the problem
as anomaly detection instead of classification. In anomaly
detection, sensor data known to come from a period of normal
operation is used to model the baseline behavior, and deviation
from that baseline then serves as an indicator of the vehicle's
condition. If the deviation exceeds a chosen threshold, an anomaly
is detected, possibly indicating the development of a fault.
Broadly speaking, prognostics and health management (PHM)
capabilities are, in increasing order: anomaly detection,
diagnostics, and prognostics
(Vachtsevanos, Lewis, Roemer, Hess, & Wu, 2006; Goebel
et al., 2017). While at the lowest level of PHM capability,
anomaly detection is very important in its own right and can,