Reference : Self-adaptive Change Detection in Streaming Data with Non-stationary Distribution
Scientific congresses, symposiums and conference proceedings : Paper published in a book
Engineering, computing & technology : Computer science
Zhang, Xiangliang [MCSE, King Abdullah University of Science and Technology, Saudi Arabia]
Wang, Wei [University of Luxembourg > Faculty of Science, Technology and Communication (FSTC) > Computer Science and Communications Research Unit (CSC) > ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT)]
Advanced Data Mining and Applications
Lecture Notes in Computer Science, 6440
The 6th International Conference on Advanced Data Mining and Applications (ADMA'2010)
19-21 November 2010
[en] Change detection ; Data stream ; Self-adaptive parameter setting ; Non-stationary distribution
[en] Non-stationary distribution, in which the data distribution evolves over time, is a common issue in many application fields, e.g., intrusion detection and grid computing. Detecting changes in massive streaming data with a non-stationary distribution helps to raise alarms on anomalies, clean out noise, and report new patterns. In this paper, we present a novel approach for detecting changes in streaming data, with the aim of improving the quality of data stream modeling. By observing the outliers, the approach uses a weighted standard deviation to monitor the evolution of the data stream distribution. A cumulative statistical test, Page-Hinkley, is employed to collect evidence of changes in distribution. The parameter used for reporting changes is adjusted self-adaptively according to the distribution of the data stream, rather than set to a fixed empirical value. This self-adaptability makes data stream modeling more effective by catching distribution changes in a timely manner. We validated the approach in an online clustering framework on the benchmark KDD Cup 1999 intrusion detection data set as well as on a real-world grid data set. The validation results show higher accuracy and a lower percentage of outliers compared with other change detection approaches.
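
For readers unfamiliar with the Page-Hinkley test named in the abstract, the following is a minimal sketch of the textbook form of the test, assuming a fixed threshold. It does not reproduce the paper's self-adaptive threshold setting or its weighted standard deviation of outliers; the monitored value `x` is a generic stream value. The names `PageHinkley`, `delta`, `threshold`, and `first_change` are illustrative and do not come from the paper.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class PageHinkley:
    """Textbook Page-Hinkley test for an upward shift in the mean.

    Assumption: `threshold` is a fixed constant here, whereas the paper
    adjusts its reporting parameter self-adaptively.
    """
    delta: float = 0.005     # tolerated magnitude of fluctuation
    threshold: float = 5.0   # fixed detection threshold (lambda)
    n: int = 0               # number of values seen so far
    mean: float = 0.0        # running mean of the stream
    cum: float = 0.0         # cumulative deviation m_t
    cum_min: float = 0.0     # running minimum M_t of m_t

    def update(self, x: float) -> bool:
        """Consume one value; return True if a change is signalled."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum += x - self.mean - self.delta
        self.cum_min = min(self.cum_min, self.cum)
        return self.cum - self.cum_min > self.threshold


def first_change(stream: Iterable[float], detector: PageHinkley) -> Optional[int]:
    """Return the 0-based index of the first signalled change, or None."""
    for i, x in enumerate(stream):
        if detector.update(x):
            return i
    return None


if __name__ == "__main__":
    # Synthetic stream: the mean jumps from 0 to 1 at index 50.
    stream = [0.0] * 50 + [1.0] * 50
    print(first_change(stream, PageHinkley()))
```

In this sketch a larger `threshold` lowers the false alarm rate but delays detection, which is the trade-off a fixed empirical value has to settle in advance.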

File(s) associated to this reference

Fulltext file(s):

Limited access
Wang.pdf (Publisher postprint, 363.82 kB)
