Citation: Gao, Y.; Wang, M.; Zhang, G.; Zhou, L.; Luo, J.; Liu, L. Cluster-Based Ensemble Learning Model for Aortic Dissection Screening. Int. J. Environ. Res. Public Health 2022, 19, 5657. https://doi.org/10.3390/ijerph19095657
Academic Editors: Keun Ho Ryu and Nipon Theera-Umpon
Received: 22 March 2022
Accepted: 29 April 2022
Published: 6 May 2022
Copyright: © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
International Journal of Environmental Research and Public Health
Article
Cluster-Based Ensemble Learning Model for Aortic Dissection Screening
Yan Gao 1, Min Wang 1, Guogang Zhang 2, Lingjun Zhou 1, Jingming Luo 2 and Lijue Liu 1,*
1 School of Automation, Central South University, Changsha 410083, China; gaoyan@csu.edu.cn (Y.G.); 214611109@csu.edu.cn (M.W.); csu_0918@163.com (L.Z.)
2 Xiangya School of Medicine, Central South University, Changsha 410083, China; zhangguogang@csu.edu.cn (G.Z.); jmluo0618@139.com (J.L.)
* Correspondence: ljliu@csu.edu.cn
Abstract: Aortic dissection (AD) is a rare, high-risk cardiovascular disease with high mortality. Because its clinical manifestations are complex and variable, it is easily missed or misdiagnosed. In this paper, we propose a clustering-based ensemble learning model, Cluster Random undersampling Smote–Tomek Bagging (CRST-Bagging), to help clinicians screen for AD patients in the early phase and thereby save lives. Within this model, the CRST method combines the advantages of Kmeans++ and the Smote–Tomek sampling method to handle an extremely imbalanced AD dataset, and the Bagging algorithm is then used to predict AD patients. We collected routine examination data of AD patients and other cardiovascular patients from Xiangya Hospital to build the AD dataset. The effectiveness of the CRST method in resampling was verified by experiments on the original AD dataset. Our model was compared with RUSBoost and SMOTEBagging on the original dataset and a test dataset. The results show that our model performed better. On the test dataset, our model’s precision and recall were 83.6% and 80.7%, respectively, and its F1-score was 82.1%, which is 4.8% and 1.6% higher than those of RUSBoost and SMOTEBagging, respectively. These results demonstrate the model’s effectiveness in AD screening.
Keywords: aortic dissection; imbalanced data; screening; clustering; bagging
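As a quick consistency check on the figures in the abstract, the F1-score can be recomputed as the harmonic mean of the stated precision and recall. The snippet below is only an illustrative sketch: the helper name `f1_score` is introduced here for the example and is not taken from the paper, and the inputs are the rounded percentages quoted above.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        # Convention: F1 is taken as 0 when both precision and recall are 0.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded values reported in the abstract for the test dataset.
precision = 0.836
recall = 0.807

f1 = f1_score(precision, recall)
print(f"F1 = {f1:.3f}")  # prints "F1 = 0.821"
```

The recomputed value agrees with the reported F1-score of 82.1% to the stated precision.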
1. Introduction
Aortic dissection (AD) is a medial rupture caused by intramural hemorrhage, which
leads to the separation of the aortic wall layer, followed by the separation of the true and
false lumen [1]. AD is a dangerous cardiovascular disease with many complications and
high mortality. Mortality can reach as high as 50% within 48 h of onset and 60–70% within
a week [2,3]. Rapid diagnosis is very important for the treatment of AD.
However, the clinical manifestations of AD are complex and variable. AD patients
often lack specific symptoms and signs, and the location, lesion degree and scale of AD
vary from case to case. Clinicians tend to rely on the common symptoms of AD, such as
chest pain and back pain, to diagnose it. For patients without pain, atypical symptoms
make the diagnosis more difficult. Thus, AD is easily missed or misdiagnosed [4]. More
than one-third of actual AD cases are missed [5–7], and the rate at which acute
aortic syndrome is missed in the emergency room is close to 80% [8]. The rarity of AD
is also one of the reasons for the high rate of missed diagnosis. The incidence of AD is
about 11.9 cases per 100,000 people [9], and the incidence of AD in the emergency room is
5.93–24.92 cases per 100,000 people [10]. With the popularization of imaging technologies,
such as computerized tomography angiography (CTA) and magnetic resonance imaging
(MRI), the diagnosis rate of AD has increased significantly [4].
In rural and remote areas of China, many hospitals lack medical imaging equipment.
However, routine examinations are available in every hospital. Because of AD’s rarity,
many clinicians in rural or remote Chinese hospitals have less experience diagnosing AD