Article (Scientific journals)
Fast cross-validation for multi-penalty ridge regression
van de Wiel, Mark A.; van Nee, Mirrelijn M.; Rauschenberger, Armin
2021. In Journal of Computational and Graphical Statistics, 30 (4), p. 835-847
Peer reviewed
 

Files


Full Text
2005.09301.pdf
Author preprint (1.76 MB)

Details



Abstract :
Prediction based on multiple high-dimensional data types needs to account for the potentially strong differences in predictive signal. Ridge regression is a simple, yet versatile and interpretable model for high-dimensional data that has challenged the predictive performance of many more complex models and learners, in particular in dense settings. Moreover, it allows using a specific penalty per data type to account for differences between those. Then, the largest challenge for multi-penalty ridge is to optimize these penalties efficiently in a cross-validation (CV) setting, in particular for GLM and Cox ridge regression, which require an additional loop for fitting the model by iterative weighted least squares (IWLS). Our main contribution is a computationally very efficient formula for the multi-penalty, sample-weighted hat-matrix, as used in the IWLS algorithm. As a result, nearly all computations are in the low-dimensional sample space. We show that our approach is several orders of magnitude faster than more naive ones. We developed a very flexible framework that includes prediction of several types of response, allows for unpenalized covariates, can optimize several performance criteria and implements repeated CV. Moreover, extensions to pair data types and to allow a preferential order of data types are included and illustrated on several cancer genomics survival prediction problems. The corresponding R-package, multiridge, serves as a versatile standalone tool, but also as a fast benchmark for other more complex models and multi-view learners.
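The sample-space idea in the abstract can be illustrated with a minimal Python/NumPy sketch. It covers only the simplest case: linear ridge, no sample weights, no unpenalized covariates. It is not the multiridge API and not the paper's weighted formula. The function names, block sizes, and penalty values are hypothetical, chosen for illustration. The sketch relies on the standard identity that, with one penalty per data block, the fitted values equal K (K + I)^{-1} y, where K is the sum of the n x n block Gram matrices, each divided by its penalty.

```python
import numpy as np


def sample_space_fit(blocks, y, lambdas):
    """Fitted values of linear multi-penalty ridge, computed in sample space.

    Hypothetical helper for illustration; unweighted, no unpenalized covariates.
    """
    n = y.shape[0]
    # Penalty-weighted sum of n x n Gram matrices, one per data type.
    K = sum(X_b @ X_b.T / lam for X_b, lam in zip(blocks, lambdas))
    # Hat matrix is K (K + I)^{-1}; solve a linear system instead of inverting.
    return K @ np.linalg.solve(K + np.eye(n), y)


def naive_fit(blocks, y, lambdas):
    """Same fitted values via the p x p normal equations, for comparison."""
    X = np.hstack(blocks)
    # One penalty value per covariate, repeated within each block.
    penalties = np.concatenate(
        [np.full(X_b.shape[1], lam) for X_b, lam in zip(blocks, lambdas)]
    )
    beta = np.linalg.solve(X.T @ X + np.diag(penalties), X.T @ y)
    return X @ beta


# Simulated data (assumed sizes): 20 samples, two data types with 200 and 300 covariates.
rng = np.random.default_rng(0)
n = 20
blocks = [rng.standard_normal((n, 200)), rng.standard_normal((n, 300))]
y = rng.standard_normal(n)
lambdas = [5.0, 50.0]

fast = sample_space_fit(blocks, y, lambdas)
slow = naive_fit(blocks, y, lambdas)
print(np.allclose(fast, slow))
```

In this sketch the block Gram matrices do not depend on the penalties, so they could be computed once and reused for every candidate penalty vector during cross-validation; only an n x n system then has to be solved per candidate.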
Disciplines :
Mathematics
Author, co-author :
van de Wiel, Mark A.; Amsterdam University Medical Centers > Department of Epidemiology and Data Science ; University of Cambridge > MRC Biostatistics Unit
van Nee, Mirrelijn M.; Amsterdam University Medical Centers > Epidemiology and Data Science
Rauschenberger, Armin; University of Luxembourg > Luxembourg Centre for Systems Biomedicine (LCSB) > Biomedical Data Science
External co-authors :
yes
Language :
English
Title :
Fast cross-validation for multi-penalty ridge regression
Publication date :
2021
Journal title :
Journal of Computational and Graphical Statistics
Volume :
30
Issue :
4
Pages :
835-847
Peer reviewed :
Peer reviewed
Available on ORBilu :
since 04 January 2021

Statistics


Number of views
235 (65 by Unilu)
Number of downloads
65 (11 by Unilu)

Scopus citations®
7
Scopus citations® without self-citations
4
OpenCitations
4
WoS citations
9
