Abstract:
Time series forecasting (TSF) is indispensable for decision-making under uncertainty. This thesis systematically explores Transformer-based TSF across four key dimensions: Data, Model, Application, and Evaluation, to facilitate research in this field.
The evolution of artificial intelligence (AI) has been shaped by advances in both model sophistication and the emergence of the data-centric paradigm, which highlights the critical role of high-quality data in the machine learning (ML) pipeline. Among recent innovations, the Transformer architecture has shown remarkable performance across domains such as natural language processing (NLP), computer vision (CV), and time series forecasting (TSF). Chapter 3 (Data) of this thesis bridges the gap between Transformer-based TSF and data-centric AI through a structured literature review and taxonomy, surveying recent research and the solutions it proposes at this intersection, with the aim of laying a foundation for future work in this area.
Previous research shows that, in multivariate time series forecasting, Transformers handle complex, noisy datasets well but often suffer from redundant features and heavy computational demands. To address these challenges, we introduce a novel framework that integrates Principal Component Analysis (PCA) to streamline inputs, enhancing both accuracy and efficiency. Evaluated across five state-of-the-art (SOTA) models and four real-world datasets, the PCA+Crossformer variant reduces mean squared error (MSE) by 33.3% and runtime by 49.2%. Notably, the framework achieves up to 86.9% runtime reduction on the Traffic dataset, demonstrating its practical value in real-world applications. Chapter 4 (Model) details this PCA-enhanced Transformer framework, establishing a foundation for subsequent architectural innovations.
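To make the preprocessing step concrete, the sketch below shows a minimal, illustrative PCA reduction of a multivariate series before it is fed to a forecaster. It assumes scikit-learn's PCA and a 0.95 explained-variance threshold; the synthetic data, the threshold, and the `reduce_channels` helper are placeholders rather than the thesis's exact configuration.

```python
# Illustrative sketch: PCA-based channel reduction before a Transformer forecaster.
# Assumptions (not from the thesis): scikit-learn PCA, a 0.95 explained-variance
# threshold, and synthetic data standing in for a multivariate dataset.
import numpy as np
from sklearn.decomposition import PCA

def reduce_channels(train: np.ndarray, test: np.ndarray, var_ratio: float = 0.95):
    """Fit PCA on the training split only, then project both splits.

    train, test: arrays of shape (time_steps, n_channels).
    Returns the reduced splits of shape (time_steps, n_components) and the PCA.
    """
    pca = PCA(n_components=var_ratio)    # keep components covering var_ratio of variance
    train_reduced = pca.fit_transform(train)
    test_reduced = pca.transform(test)   # reuse the training projection to avoid leakage
    return train_reduced, test_reduced, pca

# Synthetic example: a few latent factors mixed into 862 correlated channels
# (862 matches the Traffic dataset's channel count) collapse to far fewer components.
rng = np.random.default_rng(0)
latent = rng.standard_normal((1000, 10))
mixing = rng.standard_normal((10, 862))
series = latent @ mixing + 0.1 * rng.standard_normal((1000, 862))
train_red, test_red, pca = reduce_channels(series[:800], series[800:])
print(train_red.shape, pca.n_components_)   # reduced inputs for the downstream forecaster
```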
Beyond standard forecasting tasks, we extend our analysis to credit default swaps (CDS). While sophisticated models such as Transformers, gradient boosting machines (GBM), and extreme gradient boosting (XGBoost) offer strong performance, interpretability remains vital for high-stakes financial decisions. Leveraging explainability tools and hyperparameter optimization via high-performance computing (HPC), our experiments show that fine-tuned XGBoost offers the best balance between accuracy and interpretability. To further enhance trust in AI-driven decisions, we also include a Trustworthy AI (TAI) framework. Chapter 5 (Application) presents both quantitative metrics and qualitative insights, alongside Transformer-based TSF in the CDS context.
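As an illustration of pairing a boosted-tree model with an explainability tool, the sketch below trains an XGBoost classifier on synthetic data and attributes its predictions with SHAP. The features, labels, and hyperparameters are hypothetical stand-ins for the CDS setting, not the thesis's actual data or tuned configuration.

```python
# Illustrative sketch: a tuned XGBoost model paired with SHAP attributions.
# Assumptions (not from the thesis): the `xgboost` and `shap` packages, a synthetic
# binary-classification stand-in for the CDS data, and placeholder hyperparameters.
import numpy as np
import shap
import xgboost as xgb
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.standard_normal((2000, 12))                        # 12 placeholder features
y = (X[:, 0] + 0.5 * X[:, 3] + rng.normal(0, 0.5, 2000) > 0).astype(int)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

model = xgb.XGBClassifier(
    n_estimators=300, max_depth=4, learning_rate=0.05,     # values a tuner might select
    eval_metric="logloss",
)
model.fit(X_train, y_train)

# TreeExplainer attributes each prediction to the input features, giving the
# per-feature contributions that support interpretability in high-stakes settings.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_test)
print("accuracy:", model.score(X_test, y_test))
print("mean |SHAP| per feature:", np.abs(shap_values).mean(axis=0).round(3))
```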
Finally, to address the current lack of a unified hyperparameter optimization (HPO) pipeline for Transformer-based TSF, we present a generalizable HPO framework. Validated on benchmark datasets and extended to recent models such as Mamba and TimeMixer, this pipeline offers practical guidance for efficient model tuning. All code and results are publicly released to promote transparency and further innovation. Chapter 6 (Evaluation) provides open-source HPO tools and best practices for model tuning and assessment, aiming to facilitate model selection and usage in practice.
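A minimal sketch of what such an HPO loop can look like is given below, assuming Optuna as the search backend; the search space, the `train_and_validate` surrogate, and the trial budget are illustrative placeholders rather than the pipeline's actual components.

```python
# Illustrative sketch of a generic HPO loop for a time series forecasting model.
# Assumptions (not from the thesis): Optuna as the search backend and a toy
# surrogate objective; `train_and_validate` is a hypothetical stand-in for
# training a Transformer/Mamba/TimeMixer variant and returning validation MSE.
import optuna

def train_and_validate(d_model: int, n_layers: int, lr: float, dropout: float) -> float:
    """Hypothetical placeholder: train the forecaster and return validation MSE."""
    # A deterministic surrogate so the sketch runs end to end without training.
    return (d_model - 256) ** 2 / 1e5 + (lr - 1e-3) ** 2 * 1e4 + 0.01 * n_layers + dropout

def objective(trial: optuna.Trial) -> float:
    d_model = trial.suggest_categorical("d_model", [128, 256, 512])
    n_layers = trial.suggest_int("n_layers", 1, 4)
    lr = trial.suggest_float("lr", 1e-4, 1e-2, log=True)
    dropout = trial.suggest_float("dropout", 0.0, 0.3)
    return train_and_validate(d_model, n_layers, lr, dropout)

study = optuna.create_study(direction="minimize")    # minimize validation MSE
study.optimize(objective, n_trials=50)
print("best params:", study.best_params, "best MSE:", study.best_value)
```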
Commentary:
In reference to IEEE copyrighted material which is used with permission in this thesis, the IEEE does not endorse any of University of Luxembourg's products or services. Internal or personal use of this material is permitted. If interested in reprinting/republishing IEEE copyrighted material for advertising or promotional purposes or for creating new collective works for resale or redistribution, please go to http://www.ieee.org/publications_standards/publications/rights/rights_link.html to learn how to obtain a License from RightsLink.
If applicable, University Microfilms and/or ProQuest Library, or the Archives of Canada may supply single copies of the dissertation.