Reference : DSCo: A Language Modeling Approach for Time Series Classification
Scientific congresses, symposiums and conference proceedings : Paper published in a book
Engineering, computing & technology : Computer science
Computational Sciences
http://hdl.handle.net/10993/26733
DSCo: A Language Modeling Approach for Time Series Classification
English
Li, Daoyuan mailto [University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > >]
Li, Li mailto [University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > >]
Bissyande, Tegawendé François D Assise mailto [University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > >]
Klein, Jacques mailto [University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > Computer Science and Communications Research Unit (CSC) >]
Le Traon, Yves mailto [University of Luxembourg > Faculty of Science, Technology and Communication (FSTC) > Computer Science and Communications Research Unit (CSC) >]
Jul-2016
12th International Conference on Machine Learning and Data Mining (MLDM 2016)
Yes
International
12th International Conference on Machine Learning and Data Mining (MLDM 2016)
from 16-07-2016 to 21-07-2016
New York
[en] Time Series ; Time Series Classification ; Language Modeling
[en] Time series data are abundant in various domains and are often characterized as large in size and high in dimensionality, leading to storage and processing challenges. Symbolic representation of time series – which transforms numeric time series data into texts – is a promising technique to address these challenges. However, these techniques are essentially lossy compression functions and information are partially lost during transformation. To that end, we bring up a novel approach named Domain Series Corpus (DSCo), which builds per-class language models from the symbolized texts. To classify unlabeled samples, we compute the fitness of each symbolized sample against all per-class models and choose the class represented by the model with the best fitness score. Our work innovatively takes advantage of mature techniques from both time series mining and NLP communities. Through extensive experiments on an open dataset archive, we demonstrate that it performs similarly to approaches working with original uncompressed numeric data.
Researchers ; Professionals ; Students ; General public
http://hdl.handle.net/10993/26733

File(s) associated to this reference

Fulltext file(s):

FileCommentaryVersionSizeAccess
Open access
paper.pdfAuthor preprint345.66 kBView/Open

Bookmark and Share SFX Query

All documents in ORBilu are protected by a user license.