Keywords :
Bibliometrics; GPT3.5; Large Language Models (LLMs); Llama 2; PaLM 2; Sustainable Development Goals (SDGs); United Nations (UN); Accurate measurement; Bibliometric; Gpt3.5; Language model; Large language model; Sustainable development goal; United nation; United Nations; Computer Science Applications; Artificial Intelligence; Computer Networks and Communications; Hardware and Architecture; Information Systems and Management; Education; Health (social science)
Abstract :
[en] The United Nations defined a set of 17 Sustainable Development Goals (SDGs) that all states must translate into concrete actions. Methods are therefore needed to evaluate progress towards achieving those goals. However, evaluating each individual action with accurate measurements is not possible. Consequently, many methods rely on analyzing textual documentation, such as reports or publications, to identify and understand the contributions of an entity to the different SDGs. Existing solutions are based on queries composed of a set of keywords that is mostly fixed manually. The exhaustiveness of these queries depends strongly on the datasets used to build them, and also on personal interpretations of the SDGs. To remedy this situation, we propose to extend an initial set of manually validated keywords by using three major Large Language Models to generate and aggregate synonyms. For validation purposes, we rely on the OSDG Community Dataset, which contains labelled text extracts alongside the associated SDGs.
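The keyword-expansion idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the vote threshold `min_votes`, the function names, the placeholder model names, and the hard-coded synonym lists are all assumptions made for illustration, and no LLM is actually queried.

```python
import re
from collections import Counter


def aggregate_synonyms(seed_keywords, model_outputs, min_votes=1):
    """Merge synonym proposals from several models, one list per seed keyword.

    model_outputs maps a model name to {seed keyword: [proposed synonyms]}.
    A synonym is kept if at least min_votes models proposed it (assumed rule).
    """
    expanded = {}
    for seed in seed_keywords:
        votes = Counter()
        for proposals in model_outputs.values():
            # Normalise case and whitespace; count each model once per synonym.
            unique = {s.strip().lower() for s in proposals.get(seed, [])}
            unique.discard("")
            votes.update(unique)
        kept = {s for s, n in votes.items() if n >= min_votes}
        kept.add(seed.strip().lower())  # the validated seed is always retained
        expanded[seed] = sorted(kept)
    return expanded


def matches_any(text, keywords):
    """Return True if any keyword occurs in text as a whole word or phrase."""
    lowered = text.lower()
    return any(
        re.search(r"\b" + re.escape(k) + r"\b", lowered) for k in keywords
    )


# Hypothetical synonym proposals; real ones would come from the three LLMs.
model_outputs = {
    "model_a": {"poverty": ["destitution", "deprivation"]},
    "model_b": {"poverty": ["Deprivation", "hardship"]},
    "model_c": {"poverty": ["deprivation", "destitution "]},
}

union = aggregate_synonyms(["poverty"], model_outputs, min_votes=1)
majority = aggregate_synonyms(["poverty"], model_outputs, min_votes=2)

print(union["poverty"])
print(majority["poverty"])
print(matches_any("Programs to reduce destitution in rural areas.",
                  majority["poverty"]))
```

With `min_votes=1` the aggregation is a plain union of all proposals; raising the threshold trades recall for precision. Against a labelled corpus such as the OSDG Community Dataset, a function like `matches_any` could be applied to each text extract and the result compared with its SDG label.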