Bernstein’s inequality; LASSO estimation; Low rank estimation; Quadratic variation; Rank recovery; Realized variance; Shrinkage estimator; Statistics and Probability
Abstract :
In this paper, we develop a penalized realized variance (PRV) estimator of the quadratic variation (QV) of a high-dimensional continuous Itô semimartingale. We adapt the principal idea of regularization from linear regression to covariance estimation in a continuous-time high-frequency setting. We show that under a nuclear norm penalization, the PRV is computed by soft-thresholding the eigenvalues of realized variance (RV). It therefore encourages sparsity of singular values or, equivalently, low rank of the solution. We prove that our estimator is minimax optimal up to a logarithmic factor. We derive a concentration inequality, which reveals that the rank of PRV is, with high probability, the number of non-negligible eigenvalues of the QV. Moreover, we provide the associated non-asymptotic analysis for the spot variance. We suggest an intuitive data-driven subsampling procedure to select the shrinkage parameter. Our theory is supplemented by a simulation study and an empirical application. The PRV detects about three to five factors in the equity market, with a notable rank decrease during times of distress in financial markets. This is consistent with most standard asset pricing models, where a limited number of systematic factors driving the cross-section of stock returns are perturbed by idiosyncratic errors, rendering the QV, and also the RV, of full rank.
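The soft-thresholding step described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the simulated three-factor data, and the threshold `lam` are all assumptions made for illustration. In particular, the threshold is applied to the eigenvalues directly, whereas the exact scaling between the penalty parameter and the threshold depends on how the penalized objective is normalized, and the paper selects the shrinkage parameter by subsampling rather than fixing it by hand.

```python
import numpy as np


def realized_variance(increments):
    """Realized variance: sum of outer products of the rows of an (n x d) array."""
    return increments.T @ increments


def penalized_realized_variance(rv, lam):
    """Soft-threshold the eigenvalues of a symmetric matrix `rv` at level `lam`.

    `lam` is a hypothetical threshold chosen by hand in this sketch.
    """
    eigvals, eigvecs = np.linalg.eigh(rv)        # eigenvalues in ascending order
    shrunk = np.maximum(eigvals - lam, 0.0)      # soft-thresholding
    return (eigvecs * shrunk) @ eigvecs.T        # rebuild the shrunken matrix


# Illustrative data (an assumption, not from the paper): d assets driven by
# r common factors plus small idiosyncratic noise, observed n times on [0, 1].
rng = np.random.default_rng(0)
d, r, n = 10, 3, 1000
dt = 1.0 / n
loadings = rng.standard_normal((d, r))
factors = rng.standard_normal((n, r)) * np.sqrt(dt)
noise = 0.05 * rng.standard_normal((n, d)) * np.sqrt(dt)
increments = factors @ loadings.T + noise

rv = realized_variance(increments)
lam = 0.05
prv = penalized_realized_variance(rv, lam)

print("rank of RV: ", np.linalg.matrix_rank(rv))
print("rank of PRV:", np.linalg.matrix_rank(prv))
```

Eigenvalues of RV that fall below `lam` are set to zero, so the rank of the output equals the number of eigenvalues exceeding the threshold, which is the mechanism behind the rank-recovery statement in the abstract.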
Disciplines :
Mathematics
Author, co-author :
Christensen, Kim; Department of Economics and Business Economics, Aarhus University, Aarhus, Denmark
Nielsen, Mikkel Slot; Department of Mathematics, Aarhus University, Aarhus, Denmark
PODOLSKIJ, Mark ; University of Luxembourg > Faculty of Science, Technology and Medicine (FSTM) > Department of Mathematics (DMATH)
External co-authors :
yes
Language :
English
Title :
High-dimensional estimation of quadratic variation based on penalized realized variance
H2020 - 815703 - STAMFORD - Statistical Methods For High Dimensional Diffusions
Name of the research project :
Statistical Methods For High Dimensional Diffusions
Funders :
FP7 Ideas: European Research Council; Danmarks Frie Forskningsfond; Union Européenne
Funding number :
815703
Funding text :
Christensen and Nielsen were supported by the Independent Research Fund Denmark under grants 1028-00030B and 9056-00011B. Podolskij acknowledges funding from the ERC Consolidator Grant 815703 “STAMFORD: Statistical Methods for High Dimensional Diffusions”.