Abstract :
This paper proposes a gradient-based Markov chain Monte Carlo (MCMC) method for sampling from posterior distributions with nonsmooth potential functions. Following the Bayesian paradigm, the potential function is the sum of two convex functions, one of which is smooth. We first approximate the potential function by the so-called forward-backward envelope, a real-valued smooth function with the same critical points as the original potential. We then combine this smoothing technique with the unadjusted Langevin algorithm (ULA), leading to a smoothing ULA, called SULA. We establish non-asymptotic convergence results for SULA under mild assumptions on the original potential function, and we report numerical results demonstrating the promising performance of SULA on both synthetic and real chemoinformatics data.
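To make the construction concrete, here is a minimal sketch of SULA for a Bayesian lasso-type target, assuming f(x) = (1/2)||Ax - b||^2 as the smooth term and g(x) = lam*||x||_1 as the nonsmooth term, so that prox_{gamma*g} is soft-thresholding and, since f is quadratic, the forward-backward envelope gradient reduces to grad phi_gamma(x) = (I - gamma*A^T A)(x - T_gamma(x))/gamma, where T_gamma(x) = prox_{gamma*g}(x - gamma*grad f(x)). The model, the helper names (fbe_gradient, sula), and the step sizes below are illustrative assumptions, not the paper's exact algorithmic parameters or experimental setup.

```python
import numpy as np

def soft_threshold(z, tau):
    # Proximal operator of tau * ||.||_1 (soft-thresholding).
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def fbe_gradient(x, A, b, lam, gamma):
    # Gradient of the forward-backward envelope of U = f + g, for
    # f(x) = 0.5*||Ax - b||^2 (smooth) and g(x) = lam*||x||_1 (nonsmooth).
    grad_f = A.T @ (A @ x - b)
    z = soft_threshold(x - gamma * grad_f, gamma * lam)  # forward-backward step T_gamma(x)
    r = (x - z) / gamma                                  # forward-backward residual
    # grad phi_gamma(x) = (I - gamma * Hess f(x)) r, with Hess f = A^T A here.
    return r - gamma * (A.T @ (A @ r))

def sula(A, b, lam, gamma, step, n_iter, rng):
    # Smoothing ULA: a standard ULA iteration driven by the gradient of the
    # smooth FBE surrogate instead of the (nonexistent) gradient of U.
    x = np.zeros(A.shape[1])
    samples = np.empty((n_iter, x.size))
    for k in range(n_iter):
        noise = rng.standard_normal(x.size)
        x = x - step * fbe_gradient(x, A, b, lam, gamma) + np.sqrt(2.0 * step) * noise
        samples[k] = x
    return samples

# Example usage on synthetic data; choosing gamma < 1/L (L the Lipschitz
# constant of grad f) keeps the envelope well defined with the same critical
# points as the original potential.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 20))
b = A @ rng.standard_normal(20) + 0.1 * rng.standard_normal(50)
gamma = 0.9 / np.linalg.norm(A.T @ A, 2)
chain = sula(A, b, lam=1.0, gamma=gamma, step=1e-3, n_iter=5000, rng=rng)
```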
Disciplines :
Mathematics
Author, co-author :
GHADERI, Susan; University of Luxembourg; Department of Electrical Engineering ESAT-STADIUS, KU Leuven, Leuven, Belgium
AHOOKHOSH, Masoud; University of Luxembourg; Department of Mathematics, University of Antwerp, Antwerp, Belgium
Arany, Adam; Department of Electrical Engineering ESAT-STADIUS, KU Leuven, Leuven, Belgium
SKUPIN, Alexander; University of Luxembourg > Luxembourg Centre for Systems Biomedicine (LCSB) > Integrative Cell Signalling; Department of Neuroscience, University of California San Diego, La Jolla, United States
Patrinos, Panagiotis; Department of Electrical Engineering ESAT-STADIUS, KU Leuven, Leuven, Belgium
Moreau, Yves; Department of Electrical Engineering ESAT-STADIUS, KU Leuven, Leuven, Belgium
External co-authors :
yes
Language :
English
Title :
Smoothing unadjusted Langevin algorithms for nonsmooth composite potential functions
Funding :
This project was partially funded by the Research Council, KU Leuven (Symbiosis3 (C14/18/092); Federated cloud-based Artificial Intelligence-driven platform for liquid biopsy analyses (C3/20/100); CELSA Active Learning (CELSA/21/019); CELSA-HIDUCTION (CELSA/17/032)); by the Flemish Government (FWO: SBO (S003422N), Elixir Belgium (I002819N), and SB and postdoctoral grants; the AI Research Program; VLAIO PM: Augmenting Therapeutic Effectiveness through Novel Analytics (HBC.2019.2528)); and by the EU project "MELLODDY", which has received funding from the Innovative Medicines Initiative 2 Joint Undertaking under grant agreement No. 831472; this Joint Undertaking receives support from the European Union's Horizon 2020 research and innovation programme and EFPIA. Yves Moreau and Susan Ghaderi are affiliated with Leuven.AI - KU Leuven Institute for AI, B-3000 Leuven, Belgium. The first author was supported by an FWO junior postdoctoral fellowship [12AK924N]. The second author was partially supported by the Research Foundation Flanders (FWO) grant G081222N and by the UA BOF DocPRO4 project (ID 46929). The authors are also grateful to the associate editor and the anonymous referees for their helpful comments and suggestions, which improved the quality of the paper.