Paper published in a book (Scientific congresses, symposiums and conference proceedings)
A Comparative Study of Sentence Embeddings for Unsupervised Extractive Multi-document Summarization
LAMSIYAH, Salima; SCHOMMER, Christoph
2023, in Artificial Intelligence and Machine Learning
Peer reviewed
 

Details



Keywords :
Unsupervised Multi-Document Summarization; Sentence Embeddings; Transfer Learning; Contrastive Learning; Coreference Resolution
Abstract :
Obtaining large-scale, high-quality training data for multi-document summarization (MDS) tasks is time-consuming and resource-intensive; hence, supervised models can only be applied to limited domains and languages. In this paper, we introduce unsupervised extractive methods for both generic and query-focused MDS tasks, intending to produce a relevant summary from a collection of documents without using labeled training data or domain knowledge. More specifically, we leverage transfer learning from recent sentence embedding models to encode the input documents into rich semantic representations. Moreover, we use a coreference resolution system to resolve broken pronominal coreference expressions in the generated summaries, aiming to improve their cohesion and textual quality. Furthermore, we provide a comparative analysis of several existing sentence embedding models in the context of unsupervised extractive multi-document summarization. Experiments on the standard DUC'2004-2007 datasets demonstrate that the proposed methods are competitive with previous unsupervised methods and are even comparable to recent supervised deep learning-based methods. The empirical results also show that the SimCSE embedding model, based on contrastive learning, achieves substantial improvements over strong sentence embedding models. Finally, the newly introduced coreference resolution step is shown to bring a noticeable improvement to the unsupervised extractive MDS task.
Disciplines :
Computer science
Author, co-author :
LAMSIYAH, Salima  ;  University of Luxembourg > Faculty of Science, Technology and Medicine (FSTM) > Department of Computer Science (DCS)
SCHOMMER, Christoph  ;  University of Luxembourg > Faculty of Science, Technology and Medicine (FSTM) > Department of Computer Science (DCS)
External co-authors :
no
Language :
English
Title :
A Comparative Study of Sentence Embeddings for Unsupervised Extractive Multi-document Summarization
Publication date :
2023
Event name :
Artificial Intelligence and Machine Learning: 34th Joint Benelux Conference, BNAIC/Benelearn 2022
Event date :
November 7 – November 9, 2022
Audience :
International
Main work title :
Artificial Intelligence and Machine Learning
Publisher :
Springer Nature Switzerland, Cham
ISBN/EAN :
978-3-031-39144-6
Pages :
78–95
Peer reviewed :
Peer reviewed
Available on ORBilu :
since 16 September 2023
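
Illustrative sketch :
The abstract describes encoding sentences with an embedding model and selecting a summary without labeled data, for both generic and query-focused settings. The Python sketch below shows one generic way such a pipeline could look; it is not the authors' implementation, and the record does not specify their scoring or selection procedure. Every name in it is hypothetical: `embed` is a toy bag-of-words stand-in for a real sentence embedding model such as SimCSE, and `summarize`, `max_sentences`, and `redundancy_threshold` are chosen only for illustration.

```python
import math
import re
from collections import Counter


def embed(sentence):
    """Toy stand-in for a sentence embedding model: a bag-of-words count vector.

    Assumption: a real system would call a pretrained encoder here instead.
    """
    return Counter(re.findall(r"[a-z0-9]+", sentence.lower()))


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def summarize(documents, max_sentences=2, redundancy_threshold=0.6, query=None):
    """Pick the sentences closest to a reference vector, skipping near-duplicates."""
    # Naive sentence splitting on terminal punctuation (hypothetical choice).
    sentences = [s.strip() for doc in documents
                 for s in re.split(r"(?<=[.!?])\s+", doc) if s.strip()]
    vectors = [embed(s) for s in sentences]

    # Generic setting: compare against the sum of all sentence vectors.
    # Query-focused setting: compare against the embedded query instead.
    if query is not None:
        reference = embed(query)
    else:
        reference = Counter()
        for v in vectors:
            reference.update(v)

    ranked = sorted(range(len(sentences)),
                    key=lambda i: cosine(vectors[i], reference), reverse=True)

    chosen = []
    for i in ranked:
        if len(chosen) == max_sentences:
            break
        # Keep a sentence only if it is not too similar to one already chosen.
        if all(cosine(vectors[i], vectors[j]) < redundancy_threshold for j in chosen):
            chosen.append(i)

    # Return the selected sentences in their original order.
    return [sentences[i] for i in sorted(chosen)]
```

Calling `summarize(docs)` covers the generic setting, and `summarize(docs, query="...")` the query-focused one. Swapping `embed` for a pretrained sentence encoder is the step the abstract's comparison is about; the coreference resolution post-processing it mentions is not sketched here.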
