Bibliometrics; GPT3.5; Large Language Models (LLMs); Llama 2; PaLM 2; Sustainable Development Goals (SDGs); United Nations (UN); Accurate measurement; Bibliometric; Gpt3.5; Language model; Large language model; Sustainable development goal; United nation; United Nations; Computer Science Applications; Artificial Intelligence; Computer Networks and Communications; Hardware and Architecture; Information Systems and Management; Education; Health (social science)
Abstract:
[en] The United Nations defined a set of 17 Sustainable Development Goals (SDGs) that all states must translate into concrete actions. Methods are therefore needed to evaluate progress towards achieving those goals. However, measuring each individual action accurately is not possible. As a result, many methods rely on analyzing textual documentation, such as reports or publications, to identify and understand an entity's contributions to the different SDGs. Existing solutions are based on queries composed of a mostly manually fixed set of keywords. The exhaustiveness of these queries depends strongly on the datasets used to build them, but also on personal interpretations of the SDGs. To remedy this situation, we propose extending an initial set of manually validated keywords by using three major Large Language Models to generate and aggregate synonyms. For validation purposes, we rely on the OSDG Community Dataset, which contains labelled text extracts together with their associated SDGs.
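The keyword-extension idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the model names, the synonym lists, the `min_votes` agreement threshold, and the substring-matching rule are all assumptions made for illustration, since the abstract does not specify how synonyms are aggregated or how extracts are matched.

```python
from collections import Counter


def aggregate_synonyms(seed_keywords, suggestions_per_model, min_votes=2):
    """Extend seed keywords with synonyms proposed by at least `min_votes` models.

    The voting rule is an assumption; the abstract only says that synonyms
    from three models are generated and aggregated.
    """
    votes = Counter()
    for suggestions in suggestions_per_model.values():
        # Normalize terms and count each one at most once per model.
        votes.update({term.strip().lower() for term in suggestions})
    extended = {kw.strip().lower() for kw in seed_keywords}
    extended.update(term for term, count in votes.items() if count >= min_votes)
    return extended


def matches_sdg(text, keywords):
    """Return True if any keyword occurs in the text (naive substring match)."""
    lowered = text.lower()
    return any(kw in lowered for kw in keywords)


# Hypothetical inputs: one validated seed keyword and invented model outputs.
seed = ["poverty"]
suggestions = {
    "model_a": ["Destitution", "deprivation", "low income"],
    "model_b": ["deprivation", "low income", "hardship"],
    "model_c": ["low income", "penury"],
}

keywords = aggregate_synonyms(seed, suggestions)
print(sorted(keywords))
print(matches_sdg("Households with low income lack access to credit.", keywords))
```

In a validation setting such as the one the abstract mentions, the predicted match for each labelled extract would be compared against the SDG label stored in the dataset; that comparison is left out here because the dataset's schema is not described in the abstract.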