Abstract
Microsleeps are lapses involving a complete and unintentional sleep-related loss of
consciousness lasting up to 15 s. They are accompanied by partial to full eye closure and head nodding.
Microsleeps can lead to accidents with potentially fatal consequences, not only for the affected
person but also for people around them. Thus, the detection and, ideally, prediction of microsleeps
in subjects working in high-hazard environments is imperative for the well-being of everyone in
those environments.
Microsleeps are often subtle, and subjects are often unaware of them. They are also not
restricted to sleep-deprived subjects and often occur when subjects are carrying out
monotonous tasks. For these reasons, detecting and predicting microsleeps poses a significant
challenge. There are several means of measuring the brain activity of a subject during a task,
such as functional magnetic resonance imaging (fMRI) and electroencephalography (EEG). The brain
activity thus recorded is processed to obtain useful information, or features, that can be used
to identify microsleeps. This process of detecting and predicting microsleeps needs to be
automated if it is to be deployed in real time. Machine learning is a field that aids this
automation. A recent development in machine learning is the Deep Neural Network (DNN), in which,
unlike conventional Artificial Neural Networks (ANNs), more depth can be added to the neural
layers, helping DNNs learn naturally with less manual intervention.
The objective of this project was to use DNNs to detect and predict microsleeps during
sustained-attention tasks. DNN model(s) formed the basis of a system that can alert the subject
involved in such situations, helping them stay awake and focused on their tasks. Two previous
studies – Study A and Study C – were used in this work.
Deep learning (DL) approaches implemented for the detection and prediction of microsleeps
were the Convolutional Autoencoder (CAE), Convolutional Neural Network (CNN), Long Short-Term
Memory (LSTM), and Bi-directional LSTM (BiLSTM). Several types of EEG representations,
specifically time-domain signal, log-power spectral 2D maps, and pseudo-3D
power-spectral maps, were fed to the deep learning models.
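As a rough illustration of one of these representations, the sketch below turns a multichannel EEG segment into a log-power spectral 2D map (channels × frequency bins) using Welch's method. The channel count, segment length, and sampling rate are illustrative assumptions, not the parameters used in the studies.

```python
# Sketch: converting a multichannel EEG segment into a log-power spectral
# 2D map (channels x frequency bins). All parameters here are assumed,
# for illustration only.
import numpy as np
from scipy.signal import welch

fs = 256          # assumed sampling rate (Hz)
n_channels = 16   # assumed electrode count
segment = np.random.randn(n_channels, 2 * fs)  # synthetic 2-s EEG segment

# Welch power spectral density per channel, then a log transform so the
# map's dynamic range is better suited to a convolutional network input.
freqs, psd = welch(segment, fs=fs, nperseg=fs)
log_power_map = np.log10(psd + 1e-12)          # shape: (channels, freq bins)

print(log_power_map.shape)  # (16, 129)
```

A stack of such maps over successive windows gives the pseudo-3D power-spectral representation mentioned above.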
In Study A, EEG with vertical and horizontal EOG in parallel resulted in the best performance
for both microsleep state detection and prediction. The best performance yielded phi = 0.53
(AUCROC = 0.97; AUCPR = 0.63) for microsleep state detection and phi = 0.47 (AUCROC = 0.95;
AUCPR = 0.49) for microsleep state prediction with a prediction time g = 1.0 s.
In comparison, EEG alone yielded a phi of 0.48 for microsleep state detection.
For microsleep onset detection (g = 0) and prediction (g = 1.0 s), the best performance
resulted in a phi = 0.10 (AUCROC = 0.94; AUCPR = 0.09) and a phi = 0.08 (AUCROC = 0.93;
AUCPR = 0.08), respectively.
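The phi coefficient reported throughout is the binary (Matthews) correlation between predicted and true class labels. A minimal sketch of its computation, on synthetic labels chosen purely for illustration:

```python
# Sketch: phi (Matthews) correlation coefficient for binary microsleep
# labels. The labels below are synthetic, for illustration only.
import numpy as np

def phi_coefficient(y_true, y_pred):
    """phi = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    tp = np.sum(y_true & y_pred)
    tn = np.sum(~y_true & ~y_pred)
    fp = np.sum(~y_true & y_pred)
    fn = np.sum(y_true & ~y_pred)
    denom = np.sqrt(float((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)))
    return (tp * tn - fp * fn) / denom if denom else 0.0

y_true = [1, 1, 0, 0, 1, 0, 0, 0]   # 1 = microsleep, 0 = responsive
y_pred = [1, 0, 0, 0, 1, 0, 1, 0]
print(round(phi_coefficient(y_true, y_pred), 3))  # 0.467
```

Unlike accuracy, phi remains informative under the heavy class imbalance typical of microsleep data, which is why it is reported alongside AUCROC and AUCPR.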
In Study C, using the vertical EOG alone as input to the CNN-series network with weighted
cross-entropy as the classifier gave the best overall performance for both microsleep state
and onset detection. The best performance yielded phi = 0.32 (AUCROC = 0.77; AUCPR = 0.42)
for microsleep state detection and phi = 0.11 (AUCROC = 0.74; AUCPR = 0.06) for microsleep
state prediction with a prediction time g = 1.0 s.
Our study and experimentation have provided the following insights.
• In a CNN, every network layer acts as a detection filter, looking for the presence of
specific features or patterns. The CNN looks for simple features in the first few layers and
for increasingly complex, subtle, and abstract features in the later layers. This attribute
helped the CNN learn the complex and subtle nature of microsleeps. It also resulted in a
superior end-to-end solution for detecting and predicting microsleep states compared to
feeding CNN-extracted features to SVM and LDA classifiers.
• The addition of features extracted from EOG significantly improved the performance of both
detection and prediction of microsleeps (both states and onsets).
• Initialising the CNN with CAE-based weights yielded slightly better state-detection
performance (phi: 0.51 to 0.53) than initialising the CNN with narrow-normal weights.
• When the CNN features were visualised in Study A, it was found that the CNN was extracting
features mainly from the delta and theta bands to distinguish microsleeps from responsives.
It was also examining differences in activity in the pre-frontal, parietal, and occipital
regions.
• In Study C, the CNN was examining the alpha and beta bands to distinguish microsleeps from
responsives. Spatially, the CNN was examining differences in activity across all regions of
the brain.
• Overall, the CNN did not outperform traditional machine learning approaches for microsleep
detection and prediction when EEG alone was given as input.
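The first insight above, that each convolutional layer acts as a detection filter, can be illustrated with a single hand-set 1D filter sliding over a synthetic signal. In a trained CNN these filter weights are learned from data rather than set by hand; this is only a toy sketch of the mechanism.

```python
# Sketch: one 1D convolutional filter acting as a pattern detector.
# The kernel weights are set by hand here; a CNN learns them from data.
import numpy as np

signal = np.zeros(20)
signal[10:] = 1.0                     # a step (the "pattern") at index 10

kernel = np.array([-1.0, 0.0, 1.0])   # rising-edge-detecting filter

# Valid cross-correlation, as a conv layer computes it (no kernel flip).
response = np.array([np.dot(signal[i:i + 3], kernel)
                     for i in range(len(signal) - 2)])

# The filter responds most strongly where its window first covers the step.
print(int(np.argmax(response)))  # 8
```

Stacking many such learned filters, layer on layer, is what lets a CNN progress from simple edges in early layers to the complex, subtle patterns that characterise microsleeps in later ones.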