Credit risk assessment is a standard procedure for financial institutions (FIs) when estimating their credit risk exposure. It involves gathering and processing quantitative and qualitative datasets to estimate whether an individual or entity will be able to make future required payments. To ensure effective processing of this data, FIs increasingly use machine learning methods. Large FIs often have more powerful models because they can access larger datasets. In this paper, we present a Federated Learning prototype that allows smaller FIs to compete by cooperatively training a machine learning model that combines key data derived from several smaller datasets. We test our prototype on a historical mortgage dataset and empirically demonstrate the benefits of Federated Learning for smaller FIs. We conclude that smaller FIs can expect a significant performance increase in their credit risk assessment models by using collaborative machine learning.
Research center :
- Interdisciplinary Centre for Security, Reliability and Trust (SnT) > FINATRAX - Digital Financial Services and Cross-organizational Digital Transformations
- ULHPC - University of Luxembourg: High Performance Computing
Disciplines :
- Finance
- Business & economic sciences: Multidisciplinary, general & others
- Engineering, computing & technology: Multidisciplinary, general & others
- Computer science
Author, co-author :
Lee, Chul Min ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SnT) > FINATRAX
Delgado Fernandez, Joaquin ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SnT) > FINATRAX
Potenciano Menci, Sergio ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SnT) > FINATRAX
Rieger, Alexander ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SnT) > FINATRAX
Fridgen, Gilbert ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SnT) > FINATRAX
External co-authors :
no
Language :
English
Title :
Federated Learning for Credit Risk Assessment
Publication date :
03 January 2023
Event name :
56th Hawaii International Conference on System Sciences
Event organizer :
University of Hawaii
Event place :
Maui, Hawaii, United States
Event date :
from 03 January 2023 to 06 January 2023
Audience :
International
Main work title :
Proceedings of the 56th Hawaii International Conference on System Sciences