Assumption-based Design of Hybrid Diagnosis Systems: Analyzing
Model-based and Data-driven Principles
Daniel Jung¹ and Mattias Krysander²
¹,² Department of Electrical Engineering, Linköping University, Linköping, SE-581 83, Sweden
daniel.jung@liu.se
mattias.krysander@liu.se
ABSTRACT
Hybrid diagnosis systems combine model-based and data-
driven methods to leverage their respective strengths and mit-
igate individual weaknesses in fault diagnosis. This paper
proposes a unified framework for analyzing and designing
hybrid diagnosis systems, focusing on the principles under-
lying the computation of diagnoses from observations. The
framework emphasizes the importance of assumptions about
fault modes and their manifestations in the system. The pro-
posed architecture supports both fault decoupling and clas-
sification techniques, allowing for the flexible integration of
model-based residuals and data-driven classifiers. Compara-
tive analysis highlights how classical model-based and pure
data-driven systems are special cases within the proposed hy-
brid framework. The proposed framework emphasizes that
the key factor in categorizing fault diagnosis methods is not
whether they are model-based or data-driven, but rather their
ability to decouple faults, which is crucial for rejecting
diagnoses when fault training data is limited. Future research
directions are suggested to further enhance hybrid fault
diagnosis systems.
Daniel Jung et al. This is an open-access article distributed under the terms of
the Creative Commons Attribution 3.0 United States License, which permits
unrestricted use, distribution, and reproduction in any medium, provided the
original author and source are credited.
1. INTRODUCTION
A diagnosis system can be described as a function that uses
observations from the monitored system to compute diagnoses.
A diagnosis is a statement about the system’s health
that is consistent with the observations. The output from a
diagnosis system is updated over time, as new observations
are collected, and used as input to other functionalities, e.g.,
fault-tolerant control and fault mitigation (Amin & Hasan,
2019), computer-assisted troubleshooting (Pernestål, Nyberg,
& Warnquist, 2012), and prognostics (Zio, 2022). Thus,
the diagnosis system must draw reliable conclusions about
the system’s health at every time instant, even when there
are classification ambiguities, so that appropriate
countermeasures can be taken. Drawing the wrong conclusions about
detected abnormal behavior can be both hazardous and expensive.
This can be summarized in the following design principles
when developing a diagnosis system:
1. To avoid drawing the wrong conclusion about the sys-
tem’s health, the diagnosis system must not falsely reject
the true diagnosis candidate.
2. The diagnosis system should be as precise as possible,
rejecting diagnosis candidates that are not consistent with
system operation.
3. Faults should be detected and isolated at an early stage
for the system to act accordingly.
Designing a diagnosis system that fulfills these objectives is a
nontrivial problem and requires a good understanding of the
behavior of the system to be monitored and the characteristics
of the faults to be diagnosed.
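The first two principles can be illustrated with a minimal sketch of a diagnosis system as a function from observations to a set of diagnosis candidates. The fault modes (`NF`, `f1`, `f2`), the test names (`t1`, `t2`), and the sensitivity table are hypothetical and chosen only for illustration; they are not taken from the paper. The sketch assumes that an alarming test can only be explained by the modes it is sensitive to, while a silent test rejects nothing, since a fault may be present without yet being visible.

```python
from typing import Dict, FrozenSet, Set

# Hypothetical fault modes: "NF" is the no-fault mode, "f1" and "f2" are faults.
MODES: Set[str] = {"NF", "f1", "f2"}

# Hypothetical sensitivities: the modes that can make each test alarm.
SENSITIVE_TO: Dict[str, FrozenSet[str]] = {
    "t1": frozenset({"f1"}),
    "t2": frozenset({"f1", "f2"}),
}


def diagnose(alarms: Dict[str, bool]) -> Set[str]:
    """Return the modes that remain consistent with the test outcomes."""
    candidates = set(MODES)
    for test, fired in alarms.items():
        if fired:
            # An alarm rejects every mode the test is not sensitive to.
            candidates &= SENSITIVE_TO[test]
        # A silent test rejects nothing, so the true mode is never
        # discarded merely because a fault has not shown up yet.
    return candidates


if __name__ == "__main__":
    print(sorted(diagnose({"t1": False, "t2": True})))
```

Under these assumptions, adding tests with different sensitivities can only shrink the candidate set (principle 2), and the true mode is retained as long as each alarm is caused by a mode listed for that test (principle 1).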
1.1. Fault Diagnosis Methods
Because of its industrial and scientific relevance, the fault
diagnosis problem has been approached in many commu-
nities. Two common approaches are referred to as model-
based diagnosis and data-driven diagnosis (Jung, Ng, Frisk, &
Krysander, 2018). In model-based diagnosis, a mathematical
model of the system derived from physical insights is used to
detect inconsistencies between model predictions and observations,
mainly by designing residual generators. Fault isolation is then
performed by matching residual patterns with different fault
hypotheses derived from model analysis (Travé-Massuyès, 2014).
Data-driven fault diagnosis refers to methods that use historical
data from different fault scenarios to learn the relation between
observations and class labels (diagnoses). A common data-driven
approach is to formulate a classification problem where fault
diagnosis is a matter of determining which class label best explains
the observations based on previous (training) data (Qin, 2012).
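As a rough illustration of the two approaches, the sketch below places a model-based residual generator next to a data-driven classifier. Everything specific in it is an assumption made for the example: the static model y = 2u, the alarm threshold, the label `sensor_bias`, the training samples, and the choice of a nearest-centroid classifier. None of these come from the cited works.

```python
from math import dist
from statistics import fmean
from typing import Dict, List, Sequence, Tuple


def residual(u: float, y: float, gain: float = 2.0) -> float:
    """Model-based: measurement minus the prediction of the assumed model y = gain * u."""
    return y - gain * u


def alarm(r: float, threshold: float = 0.5) -> bool:
    """Flag an inconsistency when the residual magnitude exceeds the threshold."""
    return abs(r) > threshold


def fit_centroids(
    data: Sequence[Tuple[Sequence[float], str]]
) -> Dict[str, List[float]]:
    """Data-driven: learn one mean observation vector per class label."""
    grouped: Dict[str, List[Sequence[float]]] = {}
    for x, label in data:
        grouped.setdefault(label, []).append(x)
    return {
        label: [fmean(column) for column in zip(*samples)]
        for label, samples in grouped.items()
    }


def classify(x: Sequence[float], centroids: Dict[str, List[float]]) -> str:
    """Assign the label whose centroid is closest to the observation."""
    return min(centroids, key=lambda label: dist(x, centroids[label]))


# Hypothetical training data: observations (u, y) with class labels.
TRAINING = [
    ([1.0, 2.0], "NF"),
    ([2.0, 4.1], "NF"),
    ([1.0, 3.0], "sensor_bias"),
    ([2.0, 5.1], "sensor_bias"),
]

if __name__ == "__main__":
    centroids = fit_centroids(TRAINING)
    print(alarm(residual(1.0, 3.0)), classify([1.5, 3.9], centroids))
```

The residual generator needs no fault data, only the model, whereas the classifier can only return labels that appear in its training set.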
Still, the main principle of both model-based and data-driven
methods is to compare measurements with expected, or pre-