Doctoral thesis (Dissertations and theses)
Detection of Sentiment in Luxembourgish User Comments
Gierschek, Daniela
2022
 

Files


Full Text
PhD_Dissertation_Daniela_Gierschek.pdf
Author postprint (2.52 MB)

All documents in ORBilu are protected by a user license.


Details



Keywords :
Computational Linguistics; Luxembourgish; Linguistics; Sentiment
Abstract :
[en] Sentiment is all around us in everyday life. It can be found in blog posts, social media comments, text messages and many other places where people express themselves. Sentiment analysis is the task of automatically detecting those sentiments, attitudes or opinions in written text. This research presents the first sentiment analysis solution for the low-resource language Luxembourgish, developed using a large corpus of user comments published on the RTL Luxembourg website www.rtl.lu. Various resources were created for this purpose to lay the foundation for further sentiment research in Luxembourgish. A Luxembourgish sentiment lexicon and an annotation tool were built as external resources that can be used for collecting and enlarging training data for sentiment analysis tasks. Additionally, a corpus consisting mainly of sentences from user comments was annotated with negative, neutral and positive labels. This corpus was then automatically translated into English and German. Afterwards, diverse text representations such as word2vec, tf-idf and one-hot encoding were applied to the three versions of the corpus of labeled sentences to train different machine learning models. Furthermore, one part of the experimental setup leveraged linguistic features in the classification process in order to study their impact on sentiment expressions. By following such a broad strategy, this thesis not only lays the basis for sentiment analysis of Luxembourgish texts but also aims to give recommendations for conducting sentiment detection research for other low-resource languages. It is demonstrated that creating new resources for a low-resource language is an intensive task that should be carefully planned if it is to outperform working with translations into a high-resource target language such as English or German.
Disciplines :
Computer science
Arts & humanities: Multidisciplinary, general & others
Languages & linguistics
Author, co-author :
Gierschek, Daniela; University of Luxembourg > Faculty of Humanities, Education and Social Sciences (FHSE)
Language :
English
Title :
Detection of Sentiment in Luxembourgish User Comments
Defense date :
25 February 2022
Number of pages :
152
Institution :
Unilu - University of Luxembourg, Esch-sur-Alzette, Luxembourg
Degree :
Docteur en Sciences du Langage
Promotor :
President :
Jury member :
Plank, Barbara
Kralj Novak, Petra
Schommer, Christoph  
Available on ORBilu :
since 09 March 2022

Statistics


Number of views
220 (21 by Unilu)
Number of downloads
374 (9 by Unilu)

