artificial intelligence; data sharing challenges; federated learning; eGovernment
Abstract :
Intergovernmental collaboration is needed to address global problems. Modern solutions to these problems often include data-driven methods like artificial intelligence (AI), which require large amounts of data to perform well. As AI emerges as a central catalyst in deriving effective solutions for global problems, the infrastructure that supports its data needs becomes crucial. However, data sharing between governments is often constrained by socio-technical barriers such as data privacy concerns, data sovereignty issues, and the risk of information misuse. Federated learning (FL) presents a promising solution as a decentralized AI methodology, enabling the use of data from multiple silos without central aggregation of that data. Instead of sharing raw data, governments can train their own models locally and share only the model parameters with a central server, which aggregates them into an overall model superior to those trained on a single silo. By conducting a structured literature review, we show how major intergovernmental data-sharing challenges listed by the Organisation for Economic Co-operation and Development can be overcome by utilizing FL. Furthermore, we provide a tangible resource, an FL implementation linked to the Ukrainian refugee crisis, for researchers and policymakers who want to apply FL in cases where data cannot be shared. By enhancing AI while maintaining privacy, FL thus allows governments to collaboratively address global problems, positively impacting governments and citizens.
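The parameter-sharing scheme described in the abstract can be illustrated with a minimal sketch. It assumes weighted parameter averaging (commonly called federated averaging, or FedAvg) and a toy one-dimensional linear model; the function names `local_update`, `federated_average`, and `run_rounds`, the learning rate, and the number of rounds are all hypothetical choices for illustration and are not taken from the authors' implementation.

```python
from typing import List, Sequence, Tuple

Weights = List[float]         # [slope, intercept] of a 1-D linear model
Sample = Tuple[float, float]  # (x, y) pair held privately by one silo


def local_update(weights: Weights, data: Sequence[Sample],
                 lr: float = 0.1, epochs: int = 50) -> Weights:
    """Run gradient descent on one silo's private data; return new parameters."""
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        # Gradients of the mean squared error of the model y = w * x + b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return [w, b]


def federated_average(client_weights: List[Weights],
                      client_sizes: List[int]) -> Weights:
    """Server step: average parameters, weighted by each silo's sample count."""
    total = sum(client_sizes)
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(len(client_weights[0]))
    ]


def run_rounds(silos: List[Sequence[Sample]], rounds: int = 30) -> Weights:
    """Alternate local training and server aggregation; raw data never moves."""
    global_weights = [0.0, 0.0]
    for _ in range(rounds):
        # Each silo starts from the current global model and trains locally.
        updates = [local_update(global_weights, data) for data in silos]
        # Only the resulting parameters are sent to the server.
        global_weights = federated_average(updates, [len(d) for d in silos])
    return global_weights
```

In this sketch, calling `run_rounds` on two hypothetical silos whose samples both follow y = 2x + 1 should move the shared parameters toward a slope of 2 and an intercept of 1, although the server only ever receives parameter lists. Parameters can still leak information about the underlying data, so a real deployment would likely combine this with protections such as secure aggregation or differential privacy.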
Research center :
Interdisciplinary Centre for Security, Reliability and Trust (SnT) > FINATRAX - Digital Financial Services and Cross-organizational Digital Transformations
Disciplines :
Engineering, computing & technology: Multidisciplinary, general & others
Computer science
Author, co-author :
Sprenkamp, Kilian
DELGADO FERNANDEZ, Joaquin ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SnT) > FINATRAX
Eckhardt, Sven
Zavolokina, Liudmila
External co-authors :
yes
Language :
English
Title :
Overcoming intergovernmental data sharing challenges with federated learning
Publication date :
2024
Journal title :
Data and Policy
eISSN :
2632-3249
Publisher :
Cambridge University Press (CUP)
Volume :
6
Issue :
27
Peer reviewed :
Peer Reviewed verified by ORBi
Focus Area :
Computational Sciences
Security, Reliability and Trust
H2020 - 814654 - MDOT - Medical Device Obligations Taskforce
FnR Project :
FNR13342933 - Paypal-fnr Pearl Chair In Digital Financial Services, 2019 (01/01/2020-31/12/2024) - Gilbert Fridgen
Funders :
FNR - Fonds National de la Recherche
EC - European Commission
Union Européenne
Funding text :
We thank the University of Zurich and the Digital Society Initiative for (partially) funding this study under the Digitalization Initiative of the Zurich Higher Education Institutions postdoc fellowship of L.Z. Further, this work has been supported by the European Union (EU) within its Horizon 2020 program, project MDOT (Medical Device Obligations Taskforce), grant agreement 814654, and by PayPal and the Luxembourg National Research Fund FNR (P17/IS/13342933/PayPal-FNR/Chair in DFS/Gilbert Fridgen).