References of "Rauschenberger, Armin 50034419"
Peer Reviewed
Ten Quick Tips for Biomarker Discovery and Validation Analyses Using Machine Learning
Diaz-Uriarte, R.; Gómez de Lope, Elisa UL; Giugno, R. et al

in PLoS Computational Biology (in press)

High-throughput experimental methods for biosample profiling and growing collections of clinical and health record data provide ample opportunities for biomarker discovery and medical decision support. However, many of the new data types, including single-cell omics and high-resolution cellular imaging data, also pose particular challenges for data analysis. A high dimensionality of the data in relation to small numbers of available samples, influences of additive and multiplicative noise, large numbers of uninformative or redundant data features, outliers, confounding factors and imbalanced sample group numbers are all common characteristics of current biomedical data collections. While first successes have been achieved in developing clinical decision support tools using multifactorial omics data, there is still an unmet need and great potential for earlier, more accurate and robust diagnostic and prognostic tools for many complex diseases. Here, we provide a set of broadly applicable tips to address some of the most common pitfalls and limitations for biomarker signature development, including supervised and unsupervised machine learning, feature selection and hypothesis testing approaches. In contrast to previous guidelines discussing detailed aspects of quality control, statistics or study reporting, we give a broader overview of the typical challenges and sort the quick tips to address them chronologically by the study phase (starting with study design, then covering consecutive phases of biomarker signature discovery and validation, see also the overview in Fig. 1). While these tips are not comprehensive, they are chosen to cover what we consider as the most frequent, significant, and practically relevant issues and risks in biomarker development. 
By pointing the reader to further relevant literature on the covered aspects of biomarker discovery and validation, we hope to provide an initial guideline and entry point into the more detailed technical and application-specific aspects of this field.

Peer Reviewed
A powerful global test for spliceQTL effects
de Menezes, Renee X.; Rauschenberger, Armin UL; ’t Hoen, Peter A. C. et al

in Biometrical Journal (in press)

Statistical methods to test for effects of SNPs on exon inclusion exist, but often rely on testing associations between multiple exon-SNP pairs, sometimes followed by summarization of results at the gene level. Such approaches require heavy multiple testing correction, and detect mostly events with large effect sizes. We propose here a test to find spliceQTL effects which takes all exons and all SNPs into account simultaneously. For any chosen gene, this score-based test looks for association between the set of exon expressions and the set of SNPs, via a random-effects model framework. It is efficient to compute, and can be used if the number of SNPs is larger than the number of samples. In addition, the test is powerful to detect effects that are relatively small for individual exon-SNP pairs, but are observed for many pairs. Furthermore, test results are more often replicated across datasets than pairwise testing results. This is partly because our test is more robust to exon-SNP pair-specific effects, which do not extend to multiple pairs within the same gene. We conclude that the test we propose here offers more power and better replicability in the search for spliceQTL effects.

Peer Reviewed
Retrograde Procedural Memory in Parkinson’s Disease: A Cross-Sectional, Case-Control Study
Pauly, Laure UL; Pauly, Claire UL; Hansen, Maxime UL et al

in Journal of Parkinson's Disease (2022)

Background: The analysis of procedural memory is particularly relevant in neurodegenerative disorders like Parkinson’s disease, due to the central role of the basal ganglia in procedural memory. It has been shown that anterograde procedural memory, the ability to learn a new skill, is impaired in Parkinson’s disease. However, retrograde procedural memory, the long-term retention and execution of skills learned in earlier life stages, has not yet been systematically investigated in Parkinson’s disease. Objective: This study aims to investigate retrograde procedural memory in people with Parkinson’s disease. We hypothesized that retrograde procedural memory is impaired in people with Parkinson’s disease compared to an age- and gender-matched control group. Methods: First, we developed the CUPRO evaluation system, an extended evaluation system based on the Cube Copying Test, to distinguish the cube copying procedure, representing functioning of retrograde procedural memory, and the final result, representing the visuo-constructive abilities. Development of the evaluation system included tests of discriminant validity. Results: Comparing people with typical Parkinson’s disease (n = 201) with age- and gender-matched control subjects (n = 201), we identified cube copying performance to be significantly impaired in people with Parkinson’s disease (p = 0.008). No significant correlation was observed between retrograde procedural memory and disease duration. Conclusion: We demonstrated lower cube copying performance in people with Parkinson’s disease compared to control subjects, which suggests an impaired functioning of retrograde procedural memory in Parkinson’s disease.

Peer Reviewed
Retrograde Procedural Memory in Parkinson's Disease
Pauly, Laure UL; Pauly, Claire UL; Hansen, Maxime UL et al

Poster (2022, March)

The retrograde procedural memory in people with Parkinson’s disease with or without freezing of gait – a cross-sectional study
Pauly, Laure UL; Rauschenberger, Armin UL; Pauly, Claire UL et al

Poster (2021, September 17)

Objective: To investigate the retrograde procedural memory in people with typical Parkinson’s disease (PwP) with or without freezing of gait (FOG). We hypothesized that the retrograde procedural memory is more strongly impaired in patients with FOG (FOG+) than in patients without FOG (FOG-). Background: Given that cognitive functions, like executive control and automaticity, are crucial for mobility, it is of great importance to gain a deeper knowledge of the cognitive impairment that may interfere with walking and cause gait disturbances in PwP, i.e. FOG. The integrity of retrograde procedural memory, the ability to execute skills that have been learned in earlier life stages, is essential for a person’s ability to complete routine, procedural activities like walking. As FOG is characterized as a de-automatization disorder, we hypothesized an impairment of the retrograde procedural memory in patients with FOG. Methods: A total of 194 patients from the Luxembourg Parkinson’s study were included in the cross-sectional study. All patients were assigned to the FOG+ / FOG- groups based on a semi-structured interview conducted by a study physician. The extended evaluation system of the cube copying test was applied to evaluate both the cube-drawing procedure, representing the retrograde procedural memory, and the final result, representing the visuo-constructive abilities (Pauly et al., 2020, MDS abstract). We compared the cube copying performance of n=97 FOG+ with n=97 age-, gender- and education-matched FOG-. Results: FOG+ scored lower on the cube copying procedure compared to the FOG- (p=0.027), which is suggestive of an impaired retrograde procedural memory in FOG+. No significant differences in the visuo-constructional abilities were detected (p=0.945).
Conclusion: In line with FOG being considered a de-automatization of walking, a skill acquired in earlier life stages, the present results suggest that PwP with FOG have an impaired retrograde procedural memory in comparison to PwP without FOG. The results lend support to the ability of the extended evaluation system of the cube copying test to assess impaired retrograde procedural memory and help improve our understanding of behavioral symptoms in PwP.

Peer Reviewed
Fast cross-validation for multi-penalty ridge regression
van de Wiel, Mark A.; van Nee, Mirrelijn M.; Rauschenberger, Armin UL

in Journal of Computational and Graphical Statistics (2021), 30(4), 835-847

Prediction based on multiple high-dimensional data types needs to account for the potentially strong differences in predictive signal. Ridge regression is a simple, yet versatile and interpretable model for high-dimensional data that has challenged the predictive performance of many more complex models and learners, in particular in dense settings. Moreover, it allows using a specific penalty per data type to account for differences between those. Then, the largest challenge for multi-penalty ridge is to optimize these penalties efficiently in a cross-validation (CV) setting, in particular for GLM and Cox ridge regression, which require an additional loop for fitting the model by iterative weighted least squares (IWLS). Our main contribution is a computationally very efficient formula for the multi-penalty, sample-weighted hat-matrix, as used in the IWLS algorithm. As a result, nearly all computations are in the low-dimensional sample space. We show that our approach is several orders of magnitude faster than more naive ones. We developed a very flexible framework that includes prediction of several types of response, allows for unpenalized covariates, can optimize several performance criteria and implements repeated CV. Moreover, extensions to pair data types and to allow a preferential order of data types are included and illustrated on several cancer genomics survival prediction problems. The corresponding R-package, multiridge, serves as a versatile standalone tool, but also as a fast benchmark for other more complex models and multi-view learners.
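
To illustrate why the computations can stay in the sample space, the following sketch works through the plain linear, unweighted case only. This is a simplification: the formula described in the abstract covers the sample-weighted hat matrix used in IWLS, which is not reproduced here. All notation (X_b, \lambda_b, K, H) is assumed for this sketch and is not taken from the abstract.

```latex
% Assumed setup: n samples, B data types, X_b is the n x p_b matrix of
% data type b with its own penalty \lambda_b > 0, and X = [X_1, ..., X_B].
\hat{\beta} = (X^\top X + \Lambda)^{-1} X^\top y,
\qquad
\Lambda = \operatorname{diag}(\lambda_1 I_{p_1}, \dots, \lambda_B I_{p_B})

% Penalty-weighted Gram matrix in the n-dimensional sample space:
K = X \Lambda^{-1} X^\top = \sum_{b=1}^{B} \lambda_b^{-1} X_b X_b^\top

% Hat matrix, rewritten with the push-through identity so that only
% n x n matrices appear:
H = X (X^\top X + \Lambda)^{-1} X^\top = K (K + I_n)^{-1},
\qquad
\hat{y} = H y
```

Under these assumptions, each n x n block X_b X_b^\top can be computed once; trying a new set of penalties then only requires re-weighting and inverting n x n matrices, independent of the number of features.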

Peer Reviewed
Biomarker discovery studies for patient stratification using machine learning analysis of omics data: a scoping review
Glaab, Enrico UL; Rauschenberger, Armin UL; Banzi, Rita et al

in BMJ Open (2021), 11(12), 053674

Objective: To review biomarker discovery studies using omics data for patient stratification which led to clinically validated FDA-cleared tests or laboratory developed tests, in order to identify common characteristics and derive recommendations for future biomarker projects. Design: Scoping review. Methods: We searched PubMed, EMBASE and Web of Science to obtain a comprehensive list of articles from the biomedical literature published between January 2000 and July 2021, describing clinically validated biomarker signatures for patient stratification, derived using statistical learning approaches. All documents were screened to retain only peer-reviewed research articles, review articles or opinion articles, covering supervised and unsupervised machine learning applications for omics-based patient stratification. Two reviewers independently confirmed the eligibility. Disagreements were solved by consensus. We focused the final analysis on omics-based biomarkers which achieved the highest level of validation, that is, clinical approval of the developed molecular signature as a laboratory developed test or FDA approved tests. Results: Overall, 352 articles fulfilled the eligibility criteria. The analysis of validated biomarker signatures identified multiple common methodological and practical features that may explain the successful test development and guide future biomarker projects. These include study design choices to ensure sufficient statistical power for model building and external testing, suitable combinations of non-targeted and targeted measurement technologies, the integration of prior biological knowledge, strict filtering and inclusion/exclusion criteria, and the adequacy of statistical and machine learning methods for discovery and validation. 
Conclusions: While most clinically validated biomarker models derived from omics data have been developed for personalised oncology, first applications for non-cancer diseases show the potential of multivariate omics biomarker design for other complex disorders. Distinctive characteristics of prior success stories, such as early filtering and robust discovery approaches, continuous improvements in assay design and experimental measurement technology, and rigorous multicohort validation approaches, enable the derivation of specific recommendations for future studies.

Peer Reviewed
Predicting correlated outcomes from molecular data
Rauschenberger, Armin UL; Glaab, Enrico UL

in Bioinformatics (2021), 37(21), 3889-3895

Motivation: Multivariate (multi-target) regression has the potential to outperform univariate (single-target) regression at predicting correlated outcomes, which frequently occur in biomedical and clinical research. Here we implement multivariate lasso and ridge regression using stacked generalisation. Results: Our flexible approach leads to predictive and interpretable models in high-dimensional settings, with a single estimate for each input-output effect. In the simulation, we compare the predictive performance of several state-of-the-art methods for multivariate regression. In the application, we use clinical and genomic data to predict multiple motor and non-motor symptoms in Parkinson’s disease patients. We conclude that stacked multivariate regression, with our adaptations, is a competitive method for predicting correlated outcomes. Availability and Implementation: The R package joinet is available on GitHub (https://github.com/rauschenberger/joinet) and CRAN (https://CRAN.R-project.org/package=joinet).

Peer Reviewed
Predictive and interpretable models via the stacked elastic net
Rauschenberger, Armin UL; Glaab, Enrico UL; van de Wiel, Mark

in Bioinformatics (2021), 37(14), 2012-2016

Motivation: Machine learning in the biomedical sciences should ideally provide predictive and interpretable models. When predicting outcomes from clinical or molecular features, applied researchers often want to know which features have effects, whether these effects are positive or negative, and how strong these effects are. Regression analysis includes this information in the coefficients but typically renders less predictive models than more advanced machine learning techniques. Results: Here we propose an interpretable meta-learning approach for high-dimensional regression. The elastic net provides a compromise between estimating weak effects for many features and strong effects for some features. It has a mixing parameter to weight between ridge and lasso regularisation. Instead of selecting one weighting by tuning, we combine multiple weightings by stacking. We do this in a way that increases predictivity without sacrificing interpretability. Availability and Implementation: The R package starnet is available on GitHub: https://github.com/rauschenberger/starnet.
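
The idea of combining several ridge/lasso weightings by stacking can be sketched as follows. This is a minimal illustration and not the starnet implementation: the function names, the fixed penalty `lam`, the three-value grid of mixing parameters, the simulated data, and the non-negative least-squares meta-learner are all assumptions made for this example. It uses only the Python standard library.

```python
import random

def soft_threshold(z, g):
    """Soft-thresholding operator (the lasso part of the coordinate update)."""
    return z - g if z > g else z + g if z < -g else 0.0

def elastic_net(X, y, lam, alpha, n_iter=100):
    """Coordinate descent, without intercept, for
    (1/2n)*||y - X b||^2 + lam*(alpha*||b||_1 + (1 - alpha)/2*||b||_2^2).
    alpha = 1 gives the lasso, alpha = 0 gives ridge."""
    n, p = len(X), len(X[0])
    beta, resid = [0.0] * p, list(y)
    for _ in range(n_iter):
        for j in range(p):
            col = [row[j] for row in X]
            # Correlation of feature j with the partial residual.
            rho = sum(c * (r + c * beta[j]) for c, r in zip(col, resid)) / n
            denom = sum(c * c for c in col) / n + lam * (1 - alpha)
            new = soft_threshold(rho, lam * alpha) / denom if denom > 0 else 0.0
            resid = [r - c * (new - beta[j]) for c, r in zip(col, resid)]
            beta[j] = new
    return beta

def predict(X, beta):
    return [sum(x * b for x, b in zip(row, beta)) for row in X]

def nonneg_least_squares(Z, y, n_iter=200):
    """Assumed meta-learner: least squares with weights constrained to >= 0,
    solved by coordinate descent."""
    m = len(Z[0])
    w, resid = [0.0] * m, list(y)
    for _ in range(n_iter):
        for k in range(m):
            col = [row[k] for row in Z]
            ss = sum(c * c for c in col)
            if ss == 0:
                continue
            new = max(0.0, sum(c * (r + c * w[k]) for c, r in zip(col, resid)) / ss)
            resid = [r - c * (new - w[k]) for c, r in zip(col, resid)]
            w[k] = new
    return w

def stacked_elastic_net(X, y, lam=0.1, alphas=(0.0, 0.5, 1.0), n_folds=5):
    """Stack elastic nets with different mixing parameters."""
    n, p = len(X), len(X[0])
    folds = [i % n_folds for i in range(n)]
    Z = [[0.0] * len(alphas) for _ in range(n)]  # out-of-fold predictions
    for f in range(n_folds):
        train = [i for i in range(n) if folds[i] != f]
        test = [i for i in range(n) if folds[i] == f]
        X_train, y_train = [X[i] for i in train], [y[i] for i in train]
        for k, a in enumerate(alphas):
            b = elastic_net(X_train, y_train, lam, a)
            for i, pred in zip(test, predict([X[i] for i in test], b)):
                Z[i][k] = pred
    # Meta-learner: one non-negative weight per mixing parameter.
    weights = nonneg_least_squares(Z, y)
    # Refit base learners on all data and pool their coefficients,
    # so the stacked model still has a single coefficient per feature.
    betas = [elastic_net(X, y, lam, a) for a in alphas]
    beta = [sum(w * b[j] for w, b in zip(weights, betas)) for j in range(p)]
    return beta, weights

def simulate(n=60, p=8, seed=0):
    """Hypothetical data: two informative features, six noise features."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
    y = [2.0 * row[0] - 1.0 * row[1] + rng.gauss(0, 0.5) for row in X]
    return X, y

X, y = simulate()
beta, weights = stacked_elastic_net(X, y)
print([round(b, 2) for b in beta], [round(w, 2) for w in weights])
```

Because the base learners are all linear in the same features, pooling their coefficients with the stacking weights yields a single linear model, which is one way to keep the interpretability the abstract refers to.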

Peer Reviewed
Sparse classification with paired covariates
Rauschenberger, Armin UL; Ciocănea-Teodorescu, Iuliana; Jonker, Marianne A. et al

in Advances in Data Analysis and Classification (2020), 14

This paper introduces the paired lasso: a generalisation of the lasso for paired covariate settings. Our aim is to predict a single response from two high-dimensional covariate sets. We assume a one-to-one correspondence between the covariate sets, with each covariate in one set forming a pair with a covariate in the other set. Paired covariates arise, for example, when two transformations of the same data are available. It is often unknown which of the two covariate sets leads to better predictions, or whether the two covariate sets complement each other. The paired lasso addresses this problem by weighting the covariates to improve the selection from the covariate sets and the covariate pairs. It thereby combines information from both covariate sets and accounts for the paired structure. We tested the paired lasso on more than 2000 classification problems with experimental genomics data, and found that for estimating sparse but predictive models, the paired lasso outperforms the standard and the adaptive lasso. The R package palasso is available from CRAN.

Peer Reviewed
Testing for association between RNA-Seq and high-dimensional data
Rauschenberger, Armin UL; Jonker, Marianne A.; van de Wiel, Mark A. et al

in BMC Bioinformatics (2016), 17

Background: Testing for association between RNA-Seq and other genomic data is challenging due to high variability of the former and high dimensionality of the latter. Results: Using the negative binomial distribution and a random-effects model, we develop an omnibus test that overcomes both difficulties. It may be conceptualised as a test of overall significance in regression analysis, where the response variable is overdispersed and the number of explanatory variables exceeds the sample size. Conclusions: The proposed test can detect genetic and epigenetic alterations that affect gene expression. It can examine complex regulatory mechanisms of gene expression. The R package globalSeq is available from Bioconductor.
