Citation: Li, T.; Zhang, B.; Lv, H.; Hu, S.; Xu, Z.; Tuergong, Y. CAttSleepNet: Automatic End-to-End Sleep Staging Using Attention-Based Deep Neural Networks on Single-Channel EEG. Int. J. Environ. Res. Public Health 2022, 19, 5199. https://doi.org/10.3390/ijerph19095199
Academic Editors: Oliver Faust and Shang-Ming Zhou
Received: 12 March 2022
Accepted: 22 April 2022
Published: 25 April 2022
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Copyright: © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
International Journal of Environmental Research and Public Health
Article
CAttSleepNet: Automatic End-to-End Sleep Staging Using
Attention-Based Deep Neural Networks on Single-Channel EEG
Tingting Li 1, Bofeng Zhang 2,3,*, Hehe Lv 1, Shengxiang Hu 1, Zhikang Xu 1 and Yierxiati Tuergong 3
1 School of Computer Engineering and Science, Shanghai University, Shanghai 200444, China; gogogoit@shu.edu.cn (T.L.); hhlv@shu.edu.cn (H.L.); shengxianghu@shu.edu.cn (S.H.); xuzhikangnba@shu.edu.cn (Z.X.)
2 School of Computer and Communication Engineering, Shanghai Polytechnic University, Shanghai 201209, China
3 School of Computer Science and Technology, Kashi University, Kashi 844008, China; erxat@ksu.edu.cn
* Correspondence: bfzhang@sspu.edu.cn
Abstract: Accurate sleep staging results can be used to measure sleep quality, providing a reliable basis for the prevention and diagnosis of sleep-related diseases. The key to sleep staging is the feature representation of EEG signals. Existing approaches rarely consider local features during feature extraction and fail to distinguish the importance of critical and non-critical local features. We propose a model for automatic sleep staging with single-channel EEG, named CAttSleepNet. We add an attention module to the convolutional neural network (CNN) that learns the weights of local sequences of EEG signals by exploiting intra-epoch contextual information. Then, a two-layer bidirectional long short-term memory (Bi-LSTM) network is used to encode the global correlations of successive epochs. The feature representations of EEG signals are therefore enhanced by both local and global context correlations. Experimental results on two real-world sleep datasets indicate that the CAttSleepNet model outperforms existing models. Moreover, ablation experiments demonstrate the validity of the proposed attention module.
Keywords: sleep staging; convolutional neural network; attention mechanism; bidirectional long short-term memory; EEG
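As a rough illustration of the attention idea summarized in the abstract (weighting local sequences within one epoch so that critical segments contribute more), the following Python sketch applies a softmax over per-segment scores and pools the segment features with the resulting weights. It is a minimal sketch under stated assumptions, not the CAttSleepNet implementation: the dot-product scoring, the `query` vector, and the toy feature values are all hypothetical, since this part of the paper does not specify the module's formulation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(local_features, query):
    """Weight local feature vectors and sum them into one epoch-level vector.

    local_features: list of equal-length feature vectors, one per local
        segment of an epoch (stand-ins for CNN outputs; assumption).
    query: scoring vector (hypothetical; in a real model it would be learned).
    """
    # One relevance score per local segment (dot product with the query).
    scores = [sum(f * q for f, q in zip(feat, query)) for feat in local_features]
    weights = softmax(scores)
    dim = len(local_features[0])
    # Weighted sum of the local features, dimension by dimension.
    pooled = [
        sum(w * feat[d] for w, feat in zip(weights, local_features))
        for d in range(dim)
    ]
    return weights, pooled


# Toy example: the second segment has the strongest response.
local_features = [[0.1, 0.0], [2.0, 1.0], [0.2, 0.1]]
query = [1.0, 1.0]
weights, pooled = attention_pool(local_features, query)
```

In this toy example the second segment receives the largest weight, so the pooled vector is dominated by its features; in a trained network, the scores would come from learned parameters rather than a fixed query.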
1. Introduction
As an important physiological activity, high-quality sleep effectively restores people’s physical and mental strength, while long-term sleep deprivation or disordered sleep can seriously affect physical and emotional health. Certain diseases, such as Parkinson’s disease and Alzheimer’s disease, have been shown to be strongly associated with sleep disorders or abnormalities [1,2]. Detailed scoring of sleep stages is therefore important for improving sleep quality and preventing diseases caused by sleep disorders. In the process of sleep staging, sleep experts divide the polysomnography (PSG) recording into 30 s epochs and label each epoch with its sleep stage according to the Rechtschaffen and Kales (R&K) [3] and American Academy of Sleep Medicine (AASM) [4] guidelines. Sleep specialists usually label an epoch by analyzing contextual information to find important sleep-related events, such as low-amplitude mixed-frequency (LAMF) activity and K-complexes. However, manual sleep staging is time-consuming and complex, and the staging results produced by different sleep experts sometimes vary.
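The division of a recording into 30 s epochs described above can be sketched as follows. This is only an illustrative sketch: the function name `split_into_epochs`, the 100 Hz sampling rate, and the choice to drop a trailing partial epoch are assumptions made for the example, not details taken from the paper.

```python
def split_into_epochs(signal, sampling_rate_hz, epoch_seconds=30):
    """Split a 1-D signal into consecutive, non-overlapping epochs.

    Any trailing samples that do not fill a whole epoch are dropped
    (an assumption of this sketch).
    """
    samples_per_epoch = int(sampling_rate_hz * epoch_seconds)
    n_epochs = len(signal) // samples_per_epoch
    return [
        signal[i * samples_per_epoch:(i + 1) * samples_per_epoch]
        for i in range(n_epochs)
    ]


# Toy example: 95 s of a flat signal sampled at an assumed 100 Hz.
signal = [0.0] * (100 * 95)
epochs = split_into_epochs(signal, sampling_rate_hz=100)
```

Each resulting epoch is the unit that a sleep expert, or an automatic classifier, assigns a single sleep stage.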
Recently, a growing number of researchers have applied artificial intelligence techniques, such as machine learning and deep learning, to the problem of sleep staging. Machine learning-based methods usually first extract candidate features from physiological signals (i.e., EEG, EOG, and EMG) [5–8]. Then, a feature selection algorithm is used to select the more representative signal features. Finally, a classifier categorizes sleep stages according to the selected features. Although these approaches have achieved some success, they still have some problems. For instance, selecting the most