The rise in Natural Language Processing usage is driven by the effectiveness of transformer-based models in understanding contextual information. While generative models like GPT have garnered attention, BERT-based models have received less focus. Despite their power, GPT-based models face challenges in classification tasks, mainly due to hallucinations. BERT-based models, by contrast, are proficient at classification but are restricted to inputs of at most 512 tokens.
In this context, we propose a BERT-based architecture for long-text predictions, featuring an integration network designed to address the challenge of processing lengthy textual inputs. Within this architecture, an LSTM integration layer demonstrated superior performance for models processing only a few pages, while concatenation layers atop BERT yielded improved results for handling both short and very long text sequences. Two tests were conducted to evaluate the effectiveness of this architecture using company documents sourced from the Luxembourg Business Registers. The first test focused on predicting bankruptcy risk based on extensive text sequences extracted from annexes in Annual Accounts. Our approach exhibited strong predictive abilities in this task, boosting accuracy from 5% to 80% while maintaining a similar precision of 73%. In the second test, which involved analyzing modification reports, the model excelled at predicting page types based on the entire content, achieving a significant increase of approximately 23% in F1 Score compared to the BERT base model.
In summary, the proposed BERT-based architecture offers a versatile solution for handling long text sequences in various domains, providing valuable support for categorization and prediction tasks across different industries and applications.
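The general idea of encoding a long text in pieces and then integrating the per-piece representations can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not say how the text is split, so fixed-size consecutive chunks are assumed; `toy_encoder` is a hypothetical stand-in for a BERT encoder that returns one vector per chunk; and `max_chunks` and `dim` are illustrative parameters. Only the concatenation variant of the integration step is shown; the LSTM variant would instead consume the chunk vectors as a sequence.

```python
from typing import Callable, List, Sequence

MAX_TOKENS = 512  # BERT's input limit, counting the [CLS] and [SEP] special tokens


def split_into_chunks(tokens: Sequence[str], max_tokens: int = MAX_TOKENS) -> List[List[str]]:
    """Split a token sequence into consecutive chunks that each fit the encoder."""
    body = max_tokens - 2  # reserve two positions for [CLS] and [SEP]
    if body <= 0:
        raise ValueError("max_tokens must be greater than 2")
    return [
        ["[CLS]", *tokens[i:i + body], "[SEP]"]
        for i in range(0, len(tokens), body)
    ]


def integrate_by_concatenation(
    chunk_vectors: Sequence[Sequence[float]], max_chunks: int, dim: int
) -> List[float]:
    """Concatenate per-chunk vectors into one fixed-size feature vector.

    Documents with more than max_chunks chunks are truncated; shorter
    documents are zero-padded so a downstream classifier sees a constant size.
    """
    flat: List[float] = []
    for vec in chunk_vectors[:max_chunks]:
        flat.extend(vec)
    flat.extend([0.0] * (max_chunks * dim - len(flat)))
    return flat


def toy_encoder(chunk: Sequence[str]) -> List[float]:
    """Hypothetical stand-in for BERT: maps a chunk to a 2-dimensional vector."""
    return [float(len(chunk)), float(len(set(chunk)))]


def encode_document(
    tokens: Sequence[str],
    encoder: Callable[[Sequence[str]], List[float]],
    max_chunks: int,
    dim: int,
) -> List[float]:
    """Chunk the document, encode each chunk, then integrate the chunk vectors."""
    chunks = split_into_chunks(tokens)
    return integrate_by_concatenation([encoder(c) for c in chunks], max_chunks, dim)


# A 1200-token document becomes three chunks (510 + 510 + 180 body tokens).
document = [f"w{i}" for i in range(1200)]
features = encode_document(document, toy_encoder, max_chunks=4, dim=2)
```

In a real setting the encoder would be a fine-tuned BERT model and the fixed-size feature vector would feed a classification head; the zero-padding choice here is one simple way to handle documents of different lengths.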
Research center :
Interdisciplinary Centre for Security, Reliability and Trust (SnT) > SEDAN - Service and Data Management in Distributed Systems ; NCER-FT - FinTech National Centre of Excellence in Research
Disciplines :
Computer science
Author, co-author :
BLANCO, Braulio ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > SEDAN
BRORSSON, Mats Håkan ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > SEDAN
External co-authors :
yes
Language :
English
Title :
A Novel Architecture for Long-Text Predictions Using BERT-Based Models