This study examines the effectiveness of various learning methods for
improving Transformer models, focusing particularly on the Gated Residual
Network Transformer (GRN-Transformer) in the context of pediatric intensive
care units (PICUs) with limited data availability. Our findings indicate that
Transformers trained via supervised learning are less effective than
MLP, CNN, and LSTM networks in such environments. However, leveraging unsupervised
and self-supervised learning on unannotated data, with subsequent fine-tuning
on annotated data, notably enhances Transformer performance, although not to
the level of the GRN-Transformer. Central to our research is the analysis of
different activation functions for the Gated Linear Unit (GLU), a crucial
element of the GRN structure. We also employ Mutual Information Neural
Estimation (MINE) to evaluate the GRN's contribution. Additionally, the study
examines the effects of integrating GRN within the Transformer's Attention
mechanism versus using it as a separate intermediary layer. Our results
highlight that GLU with sigmoid activation stands out, achieving 0.98 accuracy,
0.91 precision, 0.96 recall, and 0.94 F1 score. The MINE analysis supports the
hypothesis that GRN enhances the mutual information between the hidden
representations and the output. Moreover, the use of GRN as an intermediate
filter layer proves more beneficial than incorporating it within the Attention
mechanism. In summary, this research clarifies how the GRN bolsters the
GRN-Transformer's performance, enabling it to surpass the other learning
approaches examined. These findings offer a promising avenue for adopting
sophisticated models like Transformers in data-constrained environments, such
as photoplethysmogram (PPG) artifact detection in PICU settings.
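To make the gating mechanism concrete: a Gated Residual Network is commonly built as a dense layer with ELU activation, a second dense layer, and a GLU whose sigmoid gate modulates the value branch element-wise, wrapped in a residual connection with layer normalization. The sketch below is a minimal NumPy illustration under that assumption, not the authors' implementation; all weight names and dimensions are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def elu(x, alpha=1.0):
    return np.where(x > 0, x, alpha * (np.exp(x) - 1))

def layer_norm(x, eps=1e-5):
    # Normalize each feature vector to zero mean and unit variance.
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def glu(h, Wv, bv, Wg, bg):
    # GLU with sigmoid activation: value branch gated element-wise
    # by a sigmoid gate, as highlighted in the abstract.
    return (h @ Wv + bv) * sigmoid(h @ Wg + bg)

def grn(x, p):
    # Dense -> ELU -> dense -> GLU, then residual add + layer norm.
    h = elu(x @ p["W1"] + p["b1"])
    h = h @ p["W2"] + p["b2"]
    return layer_norm(x + glu(h, p["Wv"], p["bv"], p["Wg"], p["bg"]))

# Toy weights for illustration (hypothetical sizes: 8 features, 16 hidden).
rng = np.random.default_rng(0)
d, hdim = 8, 16
params = {
    "W1": rng.normal(size=(d, hdim)) * 0.1, "b1": np.zeros(hdim),
    "W2": rng.normal(size=(hdim, d)) * 0.1, "b2": np.zeros(d),
    "Wv": rng.normal(size=(d, d)) * 0.1, "bv": np.zeros(d),
    "Wg": rng.normal(size=(d, d)) * 0.1, "bg": np.zeros(d),
}
x = rng.normal(size=(4, d))   # batch of 4 hidden representations
y = grn(x, params)            # same shape as input: (4, 8)
```

Used as an intermediate filter layer, such a block sits between the Transformer's sub-layers rather than inside the Attention computation, which is the configuration the study found more beneficial.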
Disciplines :
Computer science
Author, co-author :
LE, Thanh-Dung ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > SigCom
Language :
English
Title :
Transformer Meets Gated Residual Networks To Enhance Photoplethysmogram Artifact Detection Informed by Mutual Information Neural Estimation