Time series mining has become essential for extracting knowledge from the abundant data produced in many application domains. Compression techniques are used to overcome the storage and processing challenges that this data poses. In this paper, we investigate how much performance time series classification approaches lose or gain when fed with lossy-compressed data. This empirical study is essential for reassuring practitioners, and it also provides insight into how compression techniques can reduce noise in time series data. From a knowledge engineering perspective, we show that time series may be compressed by 90% using discrete wavelet transforms and still achieve remarkable classification accuracy, and that residual details left by popular wavelet compression techniques can sometimes even help achieve higher classification accuracy than the raw time series data, as they better capture essential local features.
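To make the compression idea in the abstract concrete, the following is a minimal sketch of a multi-level discrete wavelet transform in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the Haar wavelet, the three decomposition levels, the synthetic sine series, and the helper names `haar_step` and `haar_compress` are all chosen here for the example. Each level halves the series, so the approximation coefficients form the compressed representation and the detail coefficients are the residual details.

```python
import math


def haar_step(signal):
    """One level of the Haar DWT: returns (approximation, detail) coefficients."""
    signal = list(signal)
    if len(signal) % 2:
        # Assumption for this sketch: pad odd-length input by repeating the last sample.
        signal.append(signal[-1])
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_compress(signal, levels):
    """Apply `levels` Haar steps; return the final approximation and per-level details."""
    approx, details = list(signal), []
    for _ in range(levels):
        approx, d = haar_step(approx)
        details.append(d)
    return approx, details


# Hypothetical input: a synthetic series of 128 samples.
series = [math.sin(2 * math.pi * t / 64) for t in range(128)]
approx, details = haar_compress(series, levels=3)

reduction = 1 - len(approx) / len(series)
print(f"kept {len(approx)} of {len(series)} values ({reduction:.1%} reduction)")
```

With three levels, 128 samples shrink to 16 approximation coefficients, a reduction of 87.5%; a fourth level would keep 8 values (93.75%). A classifier could then be trained on `approx`, on one of the `details` lists, or on the raw series to compare accuracy, which is the kind of comparison the study reports.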
Disciplines:
Computer science
Author, co-author:
LI, Daoyuan ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT)
KLEIN, Jacques ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > Computer Science and Communications Research Unit (CSC)
LE TRAON, Yves ; University of Luxembourg > Faculty of Science, Technology and Communication (FSTC) > Computer Science and Communications Research Unit (CSC)
External co-authors:
no
Document language:
English
Title:
Time Series Classification with Discrete Wavelet Transformed Data: Insights from an Empirical Study
Publication date:
July 2016
Event name:
The 28th International Conference on Software Engineering and Knowledge Engineering (SEKE 2016)
Event date:
from 01-07-2016 to 03-07-2016
Event scope:
International
Main work title:
The 28th International Conference on Software Engineering and Knowledge Engineering (SEKE 2016)