Working paper (E-prints, Working papers and Research blog)
Numerical Attributes Learning for Cardiac Failure Diagnostic from Clinical Narratives -- A LESA-CamemBERT-bio Approach
Aser Lompo, Boammani; LE, Thanh-Dung
2024
 

Files


Full Text
08982080.pdf
Author postprint (8.03 MB)

Details



Keywords :
eess.SP
Abstract :
Medical records created by healthcare professionals upon patient admission are rich in details critical for diagnosis. Yet their potential is not fully realized because of obstacles such as complex medical language, the inadequate comprehension of medical numerical data by state-of-the-art Large Language Models (LLMs), and the limitations imposed by small annotated training datasets. This research aims to classify numerical values extracted from medical documents into seven distinct physiological categories using CamemBERT-bio. Previous studies suggested that transformer-based models might not perform as well as traditional NLP models on such tasks. To enhance CamemBERT-bio's performance, we introduce two main innovations: integrating keyword embeddings into the model and adopting a number-agnostic strategy that excludes all numerical data from the text. The label embedding technique refines the attention mechanism, while training on a 'numerical-blind' dataset aims to bolster context-centric learning. Another key component of our research is determining the criticality of the extracted numerical data. To achieve this, we use a simple approach that checks whether each value falls within the established standard range. Our findings are encouraging, showing substantial improvements in the effectiveness of CamemBERT-bio, which surpasses conventional methods with an F1 score of 0.89. This represents a relative increase of more than 20% over the 0.73 F1 score of traditional approaches and of roughly 9% over the 0.82 F1 score of state-of-the-art approaches.
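As a rough illustration of the number-agnostic strategy and the range-based criticality check described in the abstract, the Python sketch below masks numeric literals in a note and flags a value that falls outside a reference interval. The mask token `<NUM>`, the attribute name `attribute_a`, the interval bounds, and the function names `blind_numbers` and `is_critical` are all assumptions made for illustration; none of them comes from the paper, and the bounds are placeholders rather than clinical reference values.

```python
import re

# Integers and decimals, accepting "." or "," as the decimal mark
# (assumption: French notes may write 92,5). The \b anchors keep digits
# that are glued to letters, such as the "2" in "SpO2", out of the match.
NUMBER_PATTERN = re.compile(r"\b\d+(?:[.,]\d+)?\b")

# Hypothetical placeholder token; the paper's actual choice is not stated here.
MASK_TOKEN = "<NUM>"

# Hypothetical attribute name and bounds, for illustration only.
# These are NOT clinical reference values.
PLACEHOLDER_RANGES = {
    "attribute_a": (60.0, 100.0),
}


def blind_numbers(text):
    """Return the text with numbers masked, plus the masked values in order."""
    values = [
        float(match.group().replace(",", "."))
        for match in NUMBER_PATTERN.finditer(text)
    ]
    blinded = NUMBER_PATTERN.sub(MASK_TOKEN, text)
    return blinded, values


def is_critical(attribute, value, ranges=PLACEHOLDER_RANGES):
    """Flag a value as critical when it lies outside its reference interval."""
    low, high = ranges[attribute]
    return not (low <= value <= high)


if __name__ == "__main__":
    blinded, values = blind_numbers("FC 120 bpm, SpO2 92,5 %")
    print(blinded)
    print(values)
    print(is_critical("attribute_a", values[0]))
```

In this sketch the classifier would only ever see the blinded text, so any decision about which physiological category a value belongs to has to come from the surrounding words; the extracted values are kept aside and used only for the range check.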
Disciplines :
Computer science
Author, co-author :
Aser Lompo, Boammani
LE, Thanh-Dung  ;  University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > SigCom
Language :
English
Commentary :
In preparation for submission to IEEE for possible publication
Available on ORBilu :
since 03 September 2024