Keywords :
Machine learning, Biomarkers, Clinical cohort, Cancer, Geriatrics, Frailty, Chronobiology
Abstract :
[en] This dissertation illustrates how computational biology can be applied in interdisciplinary settings to identify prognostic biomarkers in clinical cohort and cancer cell line data, and highlights the potential for integrating these methodologies into modern systems biology curricula. While the first two chapters focus on applied computational biology, the third chapter explores the integration of these machine learning approaches into current systems biology education.
The first chapter, entitled ‘Biomarker detection in clinical cohort data using machine learning’, showcases how data-driven computational biology can be applied to exploratory and hypothesis-generating research in biomedical clinical cohort data. In the context of the geriatric condition of frailty, post-hoc interpretable machine learning applications reveal that men and women show distinct frailty phenotype profiles, linked to body composition in men and physiological anomalies in women. Indeed, (pre-)frailty prediction performance improved with sex-specific machine learning models. These models revealed that the physical frailty profile in men is characterised by high fat mass and low lean body mass, whereas the physical frailty profile in women is more closely linked to vitamin D deficiency and increased blood concentrations of monocytes, leukocytes and eosinophils. Furthermore, post-hoc analysis indicates that combinations of such features, rather than single markers, best capture these sex-specific pre-frailty patterns. These findings subsequently led to follow-up research validating and further investigating these physical pre-frailty patterns in a Luxembourgish Parkinson’s Disease study.
The second chapter, ‘Drug sensitivity prediction for time-of-day cancer treatment profiling’, concentrates on hypothesis-driven approaches to predict time-dependent drug sensitivity in cancer cell line expression data. The projects in this chapter underscore circadian dynamics as a critical factor influencing overall cancer drug responsiveness. Our approaches contributed significantly to the development and validation of a robust quantitative phenotyping platform to evaluate drug timing effects and predict drug sensitivity, resulting in the introduction of the chronotherapeutic index and the chronosensitivity index to assess the timing effect and sensitivity of cancer drugs. Additionally, these applications help leverage circadian characteristics to stratify cancer cell lines into new subtypes with high predictive value, here in the context of triple negative breast cancer and neuroblastoma. For example, new circadian-related subtypes were identified in triple negative breast cancer, separating cell lines into unstable, weak, dysfunctional, and functional circadian states. Overall, these contributions helped build an interdisciplinary and translational framework in which cellular clock phenotypes could effectively shape chronotherapy design in oncological treatments. The projects presented in this chapter were initiated and led by the Granada Lab of the Charité Comprehensive Cancer Center of the Medical University of Berlin, in close collaboration with the Systems Biology and Epigenetics Team of the Department of Life Sciences and Medicine at the University of Luxembourg.
Finally, the third chapter focuses on ‘Machine learning integration in systems biology education’. This chapter lays out the status quo of machine learning in current systems biology curricula. It emphasises the importance of interdisciplinary collaboration and of integrating computational biology in order to respond to the current opportunities and challenges in this field of study. In a recently published review, we found that systems biology education must combine deep biological knowledge with computational and technological methods, yet current graduate programs still struggle to deliver this integration effectively. Insufficient exposure to multimodal data integration (e.g., clinical cohort data and cell line data coupled with machine learning approaches) compounds this shortcoming. As a result, we concluded that without early and sustained institutional commitment, the field risks producing graduates underprepared for the translational bioinformatics and precision systems applications anticipated to shape its future. One way to mitigate such consequences is the careful design of adaptive and interdisciplinary educational material that can be used in classrooms to, for example, predict drug targets and candidate drugs for repurposed cancer therapies in the context of metabolic modelling, machine learning, and expression data.
In conclusion, this dissertation shows how computational biology can drive discovery in both research and education. From identifying prognostic biomarkers in geriatric conditions to shaping cancer treatment strategies, and from data integration to curriculum design, it underscores the power and necessity of bridging biology and machine learning in today’s scientific landscape.
Institution :
Unilu - University of Luxembourg [Faculty of Science, Technology, and Medicine (FSTM)], Belval, Luxembourg