The LuNa Open Toolbox for the Luxembourgish Language

SIRAJZADE, Joshgun; SCHOMMER, Christoph

Paper published in a book (Scientific congresses, symposiums and conference proceedings)

2019 • In Perner, Petra (Ed.) Advances in Data Mining, Applications and Theoretical Aspects, Poster Proceedings 2019

Permalink
https://hdl.handle.net/10993/40407

Files

Full Text

CRC_industrial_paper_84.pdf

Author preprint (1.41 MB)

All documents in ORBilu are protected by a user license.

Details

Keywords :

Luxembourgish language; POS-Tagging; Topic Modeling; Sentiment Analysis; Text Preparation; XML-Database

Abstract :

[en] Despite some recent work, research on the processing of Luxembourgish is still largely in its infancy. While a rich variety of linguistic processing tools exists, especially for English, these software tools offer little support for the Luxembourgish language. LuNa (a Tool for Luxembourgish National Corpus) is an Open Toolbox that allows researchers to annotate a text corpus written in the Luxembourgish language and to build and query an annotated corpus. The aim of the paper is to demonstrate the components of the system and its usage for Machine Learning applications such as Topic Modelling and Sentiment Detection. LuNa relies on an XML database to store the data and to define the XML schema, and it offers a Graphical User Interface (GUI) for linguistic data preparation steps such as tokenization, Part-Of-Speech tagging, and morphological analysis, to name a few.
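
To make the pipeline named in the abstract more concrete, the following is a minimal sketch of tokenizing a sentence, attaching Part-Of-Speech tags, storing the result as XML, and querying it. It is not LuNa's implementation: the element and attribute names (`sentence`, `token`, `pos`), the regular-expression tokenizer, and the lookup-table "tagger" are all assumptions made for illustration, and the record does not describe LuNa's actual schema or tagging method.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical tokenizer: runs of word characters, or single punctuation marks.
TOKEN_RE = re.compile(r"\w+|[^\w\s]")


def tokenize(text):
    """Split text into word and punctuation tokens."""
    return TOKEN_RE.findall(text)


def annotate(text, tags=None):
    """Build a <sentence> element with one <token> child per token.

    `tags` is a plain dict standing in for a real POS tagger (assumption).
    """
    tags = tags or {}
    sentence = ET.Element("sentence")
    for index, tok in enumerate(tokenize(text)):
        element = ET.SubElement(sentence, "token", id=str(index))
        element.text = tok
        if tok in tags:
            element.set("pos", tags[tok])
    return sentence


def query_by_pos(sentence, pos):
    """Return the text of all tokens carrying the given POS tag."""
    return [t.text for t in sentence.findall(f"./token[@pos='{pos}']")]


# Illustrative tags only; a real system would use a trained tagger.
sentence = annotate("Ech sinn doheem.", {"Ech": "PRON", "sinn": "VERB"})
print(ET.tostring(sentence, encoding="unicode"))
print(query_by_pos(sentence, "VERB"))
```

Keeping annotations as attributes on token elements means a query is a short path expression. An XML database would typically expose the same idea through XPath or XQuery over a stored collection rather than an in-memory tree.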

Disciplines :

Languages & linguistics
Computer science

Author, co-author :

SIRAJZADE, Joshgun ; University of Luxembourg > Faculty of Science, Technology and Communication (FSTC) > Computer Science and Communications Research Unit (CSC)

SCHOMMER, Christoph ; University of Luxembourg > Faculty of Science, Technology and Communication (FSTC) > Computer Science and Communications Research Unit (CSC)

External co-authors :

Language :

English

Title :

The LuNa Open Toolbox for the Luxembourgish Language

Publication date :

2019

Event name :

19th Industrial Conference on Data Mining, ICDM 2019

Event place :

New York, United States

Event date :

from 17-07-2019 to 21-07-2019

Audience :

International

Main work title :

Advances in Data Mining, Applications and Theoretical Aspects, Poster Proceedings 2019

Editor :

Perner, Petra

Publisher :

ibai publishing, Leipzig, Germany

ISBN/EAN :

978-3-942952-61-3

Focus Area :

Computational Sciences

Available on ORBilu :

since 17 September 2019

Statistics

Number of views

715 (35 by Unilu)

Number of downloads

452 (37 by Unilu)
