Article (Scientific journals)
Keyphrase extraction from single textual documents based on semantically defined background knowledge and co-occurrence graphs
Dalle Lucca Tosi, Mauro; Reis, Julio Cesar Dos
2021 In International Journal of Metadata, Semantics and Ontologies, 15 (2), p. 121-132
Peer reviewed
 

Files


Full Text
2021_IJMSO.pdf
Publisher postprint (539.08 kB)

All documents in ORBilu are protected by a user license.

Details



Abstract :
Keyphrase extraction is a fundamental and challenging task that aims to automatically extract a set of keyphrases from textual documents. Keyphrases are short phrases of one or more terms that best represent a document and its main topics; they help publishers index documents and help readers identify the most relevant ones. In this article, we extend our research on C-Rank, an unsupervised approach that automatically extracts keyphrases from single documents. C-Rank uses concept linking to identify the concepts that a document shares with an external background knowledge base. Those concepts serve as candidate keyphrases and are modeled in a co-occurrence graph, from which keyphrases are extracted based on heuristics and on their centrality in the graph. We advance our study of C-Rank by evaluating it with different concept-linking approaches: Babelfy and DBpedia Spotlight. The evaluation was performed on five gold-standard datasets covering distinct types of data: academic articles, academic abstracts, and news articles. Experimental comparison with other existing unsupervised approaches indicates that C-Rank achieves state-of-the-art results when extracting keyphrases from scientific documents.
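The general idea of modeling candidate keyphrases in a co-occurrence graph and ranking them by centrality can be illustrated with a minimal sketch. This is not the C-Rank implementation: the candidate list is supplied by hand instead of coming from a concept linker such as Babelfy or DBpedia Spotlight, co-occurrence is counted per sentence, and weighted degree is used as the centrality measure. All three choices are assumptions made for illustration, as are the function names and the sample sentences.

```python
from collections import defaultdict
from itertools import combinations


def build_cooccurrence_graph(sentences, candidates):
    """Connect two candidates each time they appear in the same sentence.

    Assumption: naive case-insensitive substring matching stands in for
    the concept-linking step described in the abstract.
    """
    graph = defaultdict(lambda: defaultdict(int))
    for sentence in sentences:
        text = sentence.lower()
        present = sorted(c for c in candidates if c.lower() in text)
        for a, b in combinations(present, 2):
            graph[a][b] += 1  # edge weight = number of shared sentences
            graph[b][a] += 1
    return graph


def rank_by_weighted_degree(graph, candidates):
    """Rank candidates by the sum of their edge weights (an assumed
    centrality measure), breaking ties alphabetically."""
    scores = {c: sum(graph.get(c, {}).values()) for c in candidates}
    return sorted(candidates, key=lambda c: (-scores[c], c))


# Hypothetical input: sentences and hand-picked candidate phrases.
sentences = [
    "Keyphrase extraction relies on a co-occurrence graph.",
    "Centrality in the co-occurrence graph ranks candidates for keyphrase extraction.",
    "News articles were one dataset type.",
]
candidates = ["keyphrase extraction", "co-occurrence graph", "centrality", "news"]

graph = build_cooccurrence_graph(sentences, candidates)
ranking = rank_by_weighted_degree(graph, candidates)
print(ranking)
```

A candidate that never co-occurs with another one, such as "news" here, receives a score of zero and ends up last. A real system would add the filtering heuristics the abstract mentions before selecting the final keyphrases.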
Disciplines :
Computer science
Author, co-author :
Dalle Lucca Tosi, Mauro; University of Luxembourg > Faculty of Science, Technology and Medicine (FSTM) > Department of Computer Science (DCS)
Reis, Julio Cesar Dos
External co-authors :
yes
Language :
English
Title :
Keyphrase extraction from single textual documents based on semantically defined background knowledge and co-occurrence graphs
Publication date :
2021
Journal title :
International Journal of Metadata, Semantics and Ontologies
Publisher :
Inderscience Publishers (IEL)
Volume :
15
Issue :
2
Pages :
121-132
Peer reviewed :
Peer reviewed
Available on ORBilu :
since 06 September 2022

