Keyphrase extraction is the task of identifying a set of phrases that best represent a natural language document. It is a fundamental and challenging task that helps publishers index documents and recommend relevant ones to readers. In this article, we introduce C-Rank, a novel unsupervised approach that automatically extracts keyphrases from single documents by using concept linking. Our method uses Babelfy to identify candidate keyphrases, which are weighted based on heuristics and on their centrality inside a co-occurrence graph where keyphrases appear as vertices. It improves on the results obtained by graph-based techniques without requiring training or user-supplied background data. Evaluations are performed on the SemEval and INSPEC datasets, producing results competitive with state-of-the-art tools. Furthermore, C-Rank generates intermediate structures with semantically annotated data that can be used to analyze larger textual collections, which might improve domain understanding and enrich textual representation methods.
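The graph-based weighting mentioned in the abstract can be illustrated with a minimal sketch. It is not the C-Rank implementation: the candidate phrases are hard-coded here (in C-Rank they would come from Babelfy), co-occurrence is assumed to mean "appears in the same sentence", plain degree centrality stands in for whatever centrality measure and heuristics the authors use, and all function names are hypothetical.

```python
from itertools import combinations


def cooccurrence_graph(sentences):
    """Build an undirected graph linking phrases that share a sentence."""
    graph = {}
    for phrases in sentences:
        unique = sorted(set(phrases))
        for phrase in unique:
            # Make sure isolated phrases still appear as vertices.
            graph.setdefault(phrase, set())
        for a, b in combinations(unique, 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def degree_centrality(graph):
    """Fraction of the other vertices that each vertex is connected to."""
    n = len(graph)
    if n < 2:
        return {vertex: 0.0 for vertex in graph}
    return {vertex: len(neighbours) / (n - 1)
            for vertex, neighbours in graph.items()}


def rank_keyphrases(sentences):
    """Order candidates by centrality, breaking ties alphabetically."""
    scores = degree_centrality(cooccurrence_graph(sentences))
    return sorted(scores, key=lambda vertex: (-scores[vertex], vertex))


# Assumed input: candidate phrases per sentence (illustrative data only).
sentences = [
    ["keyphrase extraction", "concept linking"],
    ["keyphrase extraction", "co-occurrence graph"],
    ["keyphrase extraction", "centrality", "co-occurrence graph"],
]
ranking = rank_keyphrases(sentences)
print(ranking)
```

In this toy input, the phrase that co-occurs with every other candidate ranks first, which is the intuition behind using centrality as a relevance signal.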
Disciplines:
Computer science
Author, co-author:
DALLE LUCCA TOSI, Mauro ; University of Luxembourg > Faculty of Science, Technology and Medicine (FSTM) > Department of Computer Science (DCS)
Reis, Julio Cesar Dos
External co-authors:
yes
Document language:
English
Title:
C-rank: a concept linking approach to unsupervised keyphrase extraction
Publication date:
2019
Event name:
13th International Conference on Metadata and Semantics Research
Event organizer:
Springer
Event date:
from 28-10-2019 to 31-10-2019
Event scope:
International
Main work title:
Research Conference on Metadata and Semantics Research