Data-Driven Remaining Useful Life Estimation: Inference-Based
Versus Direct Prediction
Ark Ifeanyi¹, Mattia Zanotelli², and Jamie Coble³
¹,²,³ University of Tennessee, Knoxville, TN, 37996, USA
aifeanyi@vols.utk.edu
mzanotel@vols.utk.edu
jamie@utk.edu
ABSTRACT
This paper explores the development and application of data-
driven prognostic models for estimating the Remaining Use-
ful Life (RUL) of Nuclear Power Plant (NPP) condensers ex-
periencing tube fouling. Due to the unavailability of run-to-
failure industry sensor data, we utilized simulated data gen-
erated by the Asherah Nuclear Power Plant Simulator (ANS),
initially designed by the International Atomic Energy Agency
(IAEA) and programmed in Simulink for cyber security sim-
ulations. ANS’s adaptability allows it to simulate Pressurized
Water Reactor (PWR) behaviors given a time series of op-
erating conditions and to introduce degradation modules to
mimic fouling effects. Our study compares two primary ap-
proaches applied to data generated by ANS: inference-based
and direct prediction methods. The selected inference-based
approach estimates the health state of the condenser using
a pipeline formed by an Auto-Associative Kernel Regressor
and a Hidden Markov Model (HMM); the HMM then combines
the state estimates with its own parameters to predict
the RUL. The direct prediction method employs a Gradient
Boosting Regressor Decision Tree (GBRDT) to map input
variables directly to RUL. Our findings demonstrate the ef-
ficacy and limitations of each method through the case study,
providing valuable insights for the adoption of data-driven
RUL estimation techniques in industrial and energy applica-
tions.
1. INTRODUCTION
In industrial and energy settings, the ability to predict the Re-
maining Useful Life (RUL) of critical components is essential
for effective maintenance planning and operational efficiency.
Accurate prognostics can lead to significant economic benefits
by minimizing unplanned downtimes, optimizing maintenance
schedules, and extending the service life of equipment
(Elattar et al., 2016). These advantages have spurred considerable
interest in data-driven methods for RUL estimation.
Ark Ifeanyi et al. This is an open-access article distributed under the terms of
the Creative Commons Attribution 3.0 United States License, which permits
unrestricted use, distribution, and reproduction in any medium, provided the
original author and source are credited.
The motivation for exploring data-driven approaches stems
from their potential to leverage vast amounts of operational
data generated by modern industrial systems. Traditional
physics-based models, while useful, often require extensive
domain knowledge and may not fully capture the complex be-
haviors of advanced machinery (Cubillo et al., 2016). In con-
trast, data-driven methods can uncover hidden patterns and
relationships within the data, providing robust and scalable
solutions for RUL estimation (An et al., 2013).
In this context, two primary approaches have emerged:
inference-based and direct prediction methods. Inference-
based approaches involve a two-step pipeline where the
health state of the investigated component or system is first
estimated, and then the RUL is inferred from the estimated
health state (Yu, 2015; Peng et al., 2019; Sankavaram et al.,
2016). This method allows for a detailed understanding of
the degradation process and can be particularly useful when
there is a clear, interpretable path from a healthy state to
failure. On the other hand, direct prediction methods involve
mapping input variables directly to the available true RUL of
components over time (Khelif et al., 2016). This approach
simplifies the modeling process and can provide quick, accu-
rate predictions without the need for intermediate steps. Both
methods hold significant value for the industry. Inference-
based methods offer detailed insights into the degradation
mechanisms, which can be critical for diagnostic purposes
and for improving design and operational strategies. Direct
prediction methods, with their straightforward implementa-
tion and rapid results, are advantageous in scenarios where
quick decision-making is paramount.
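The structural difference between the two approaches can be illustrated with a minimal sketch. It is not the pipeline used in this study: the inference-based branch assumes the health-state probabilities have already been estimated and that degradation follows a left-to-right Markov chain with hypothetical self-transition probabilities, while the direct branch uses a simple nearest-neighbor average as a stand-in for a learned regressor. All function names, parameters, and numbers are illustrative assumptions.

```python
from typing import Sequence


def expected_rul_from_states(state_probs: Sequence[float],
                             stay_probs: Sequence[float]) -> float:
    """Inference-based route, second step: health state -> RUL.

    state_probs[i]: estimated probability of being in non-failed state i.
    stay_probs[i]:  assumed self-transition probability of state i.
    Each state can only persist or advance to the next state, so the
    expected sojourn time in state i is 1 / (1 - stay_probs[i]).
    """
    sojourn = [1.0 / (1.0 - p) for p in stay_probs]
    # Expected steps to failure when currently in state i.
    rul_by_state = [sum(sojourn[i:]) for i in range(len(sojourn))]
    # Weight each state's RUL by the estimated state probability.
    return sum(w * r for w, r in zip(state_probs, rul_by_state))


def knn_direct_rul(query: Sequence[float],
                   train_x: Sequence[Sequence[float]],
                   train_rul: Sequence[float],
                   k: int = 2) -> float:
    """Direct route: map features straight to RUL (k-NN stand-in)."""
    scored = sorted(
        (sum((a - b) ** 2 for a, b in zip(query, x)), rul)
        for x, rul in zip(train_x, train_rul)
    )
    nearest = scored[:k]
    return sum(rul for _, rul in nearest) / len(nearest)


# Hypothetical values: two non-failed states, equally likely at present.
rul_inferred = expected_rul_from_states([0.5, 0.5], [0.75, 0.5])

# Hypothetical run-to-failure data: one feature, RUL labels in time steps.
train_x = [(0.0,), (1.0,), (2.0,), (3.0,)]
train_rul = [30.0, 20.0, 10.0, 0.0]
rul_direct = knn_direct_rul((1.1,), train_x, train_rul, k=2)
```

In the first function the RUL depends on the data only through the intermediate state estimate, which is what makes the degradation path interpretable. In the second, no intermediate quantity exists, and labeled run-to-failure examples are required for training.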
The primary contribution of this paper is to address a com-
mon question among practitioners: which data-driven prog-
nostics approach should I adopt? This study offers a detailed
comparison of two major data-driven methodologies, using a