Reference : Self-adaptive Change Detection in Streaming Data with Non-stationary Distribution
Scientific congresses, symposiums and conference proceedings : Paper published in a book
Engineering, computing & technology : Computer science
http://hdl.handle.net/10993/16012
English
Zhang, Xiangliang [MCSE, King Abdullah University of Science and Technology, Saudi Arabia]
Wang, Wei [University of Luxembourg > Faculty of Science, Technology and Communication (FSTC) > Computer Science and Communications Research Unit (CSC) ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT)]
2010
Advanced Data Mining and Applications
Springer
Lecture Notes in Computer Science, 6440
334-345
No
978-3-642-17315-8
Berlin
Germany
The 6th International Conference on Advanced Data Mining and Applications (ADMA'2010)
19-21 November 2010
Chongqing
China
[en] Change detection ; Data stream ; Self-adaptive parameter setting ; Non-stationary distribution
[en] Non-stationary distribution, in which the data distribution evolves over time, is a common issue in many application fields, e.g., intrusion detection and grid computing. Detecting changes in massive streaming data with a non-stationary distribution helps to raise alarms on anomalies, clean out noise, and report new patterns. In this paper, we employ a novel approach for detecting changes in streaming data, with the purpose of improving the quality of data stream modeling. By observing outliers, this change detection approach uses a weighted standard deviation to monitor the evolution of the distribution of data streams. A cumulative statistical test, Page-Hinkley, is employed to collect evidence of changes in distribution. The parameter used for reporting changes is self-adaptively adjusted according to the distribution of the data streams, rather than set to a fixed empirical value. The self-adaptability of the approach enhances the effectiveness of modeling data streams by catching changes in distribution in a timely manner. We validated the approach on an online clustering framework with the benchmark KDD Cup 1999 intrusion detection data set as well as with a real-world grid data set. The validation results demonstrate its better performance, achieving higher accuracy and a lower percentage of outliers compared with other change detection approaches.
10.1007/978-3-642-17316-5_33
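
For readers unfamiliar with the Page-Hinkley test named in the abstract, the following is a minimal sketch of the textbook form of that test for an upward shift in the mean. It is not the paper's method: the self-adaptive threshold and the weighted standard deviation over outliers are not reproduced here. The class name PageHinkley, the helper first_alarm, and the default values of delta and threshold are assumptions chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass
class PageHinkley:
    """Textbook Page-Hinkley test for an upward shift in the mean.

    Illustrative sketch with a FIXED threshold; the paper adapts the
    threshold to the stream, which is not modeled here.
    """
    delta: float = 0.05      # tolerated drift magnitude (assumed value)
    threshold: float = 5.0   # fixed alarm threshold lambda (assumed value)
    n: int = 0               # number of observations seen
    mean: float = 0.0        # running mean of the observations
    cum: float = 0.0         # m_t: cumulative deviation from the running mean
    cum_min: float = 0.0     # M_t: smallest cumulative deviation so far

    def update(self, x: float) -> bool:
        """Feed one observation; return True if a change is signalled."""
        self.n += 1
        self.mean += (x - self.mean) / self.n      # incremental mean
        self.cum += x - self.mean - self.delta     # accumulate evidence
        self.cum_min = min(self.cum_min, self.cum)
        # PH_t = m_t - M_t; signal when it exceeds the threshold.
        return self.cum - self.cum_min > self.threshold


def first_alarm(stream, detector):
    """Return the index of the first signalled change, or None."""
    for i, x in enumerate(stream):
        if detector.update(x):
            return i
    return None


if __name__ == "__main__":
    # Synthetic stream: the mean jumps from 0 to 3 at index 50.
    stream = [0.0] * 50 + [3.0] * 50
    print(first_alarm(stream, PageHinkley()))
```

With a fixed threshold, the trade-off between false alarms and detection delay has to be tuned by hand for each stream, which is the limitation the abstract says the self-adaptive parameter is meant to address.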