References of "Zhang, Xiangliang"
     in
Bookmark and Share    
Full Text
See detailHigh-speed Web Attack Detection through Extracting Exemplars from HTTP Traffic
Wang, Wei UL; Zhang, Xiangliang

in Proceedings of the 2011 ACM Symposium on Applied Computing (2011)

In this work, we propose an effective method for high-speed web attack detection by extracting exemplars from HTTP traffic before the detection model is built. The smaller set of exemplars keeps valuable ... [more ▼]

In this work, we propose an effective method for high-speed web attack detection by extracting exemplars from HTTP traffic before the detection model is built. The smaller set of exemplars keeps valuable information of the original traffic while it significantly reduces the size of the traffic so that the detection remains effective and improves the detection efficiency. The Affinity Propagation (AP) is employed to extract the exemplars from the HTTP traffic. K-Nearest Neighbor(K-NN) and one class Support Vector Machine (SVM) are used for anomaly detection. To facilitate comparison, we also employ information gain to select key attributes (a.k.a. features) from the HTTP traffic for web attack detection. Two large real HTTP traffic are used to validate our methods. The extensive test results show that the AP based exemplar extraction significantly improves the real-time performance of the detection compared to using all the HTTP traffic and achieves a more robust detection performance than information gain based attribute selection for web attack detection. [less ▲]

Detailed reference viewed: 61 (0 UL)
Full Text
See detailSelf-adaptive Change Detection in Streaming Data with Non-stationary Distribution
Zhang, Xiangliang; Wang, Wei UL

in Advanced Data Mining and Applications (2010)

Non-stationary distribution, in which the data distribution evolves over time, is a common issue in many application fields, e.g., intrusion detection and grid computing. Detecting the changes in massive ... [more ▼]

Non-stationary distribution, in which the data distribution evolves over time, is a common issue in many application fields, e.g., intrusion detection and grid computing. Detecting the changes in massive streaming data with a non-stationary distribution helps to alarm the anomalies, to clean the noises, and to report the new patterns. In this paper, we employ a novel approach for detecting changes in streaming data with the purpose of improving the quality of modeling the data streams. Through observing the outliers, this approach of change detection uses a weighted standard deviation to monitor the evolution of the distribution of data streams. A cumulative statistical test, Page-Hinkley, is employed to collect the evidence of changes in distribution. The parameter used for reporting the changes is self-adaptively adjusted according to the distribution of data streams, rather than set by a fixed empirical value. The self-adaptability of the novel approach enhances the effectiveness of modeling data streams by timely catching the changes of distributions. We validated the approach on an online clustering framework with a benchmark KDDcup 1999 intrusion detection data set as well as with a real-world grid data set. The validation results demonstrate its better performance on achieving higher accuracy and lower percentage of outliers comparing to the other change detection approaches. [less ▲]

Detailed reference viewed: 64 (0 UL)
Full Text
See detailK-AP: Generating Specified K Clusters by Efficient Affinity Propagation
Zhang, Xiangliang; Wang, Wei UL; Nørvåg, Kjetil et al

in ICDM '10 Proceedings of the 2010 IEEE International Conference on Data Mining (2010)

The Affinity Propagation (AP) clustering algorithm proposed by Frey and Dueck (2007) provides an understandable, nearly optimal summary of a data set. However, it suffers two major shortcomings: i) the ... [more ▼]

The Affinity Propagation (AP) clustering algorithm proposed by Frey and Dueck (2007) provides an understandable, nearly optimal summary of a data set. However, it suffers two major shortcomings: i) the number of clusters is vague with the user-defined parameter called self-confidence, and ii) the quadratic computational complexity. When aiming at a given number of clusters due to prior knowledge, AP has to be launched many times until an appropriate setting of self-confidence is found. The re-launched AP increases the computational cost by one order of magnitude. In this paper, we propose an algorithm, called K-AP, to exploit the immediate results of K clusters by introducing a constraint in the process of message passing. Through theoretical analysis and experimental validation, K-AP was shown to be able to directly generate K clusters as user defined, with a negligible increase of computational cost compared to AP. In the meanwhile, KAP preserves the clustering quality as AP in terms of the distortion. K-AP is more effective than k-medoids w.r.t. the distortion minimization and higher clustering purity. [less ▲]

Detailed reference viewed: 43 (1 UL)
Full Text
See detailAbstracting Audit Data for Lightweight Intrusion Detection
Wang, Wei UL; Zhang, Xiangliang; Pitsilis, Georgios UL

in Information Systems Security (2010)

High speed of processing massive audit data is crucial for an anomaly Intrusion Detection System (IDS) to achieve real-time performance during the detection. Abstracting audit data is a potential solution ... [more ▼]

High speed of processing massive audit data is crucial for an anomaly Intrusion Detection System (IDS) to achieve real-time performance during the detection. Abstracting audit data is a potential solution to improve the efficiency of data processing. In this work, we propose two strategies of data abstraction in order to build a lightweight detection model. The first strategy is exemplar extraction and the second is attribute abstraction. Two clustering algorithms, Affinity Propagation (AP) as well as traditional k-means, are employed to extract the exemplars, and Principal Component Analysis (PCA) is employed to abstract important attributes (a.k.a. features) from the audit data. Real HTTP traffic data collected in our institute as well as KDD 1999 data are used to validate the two strategies of data abstraction. The extensive test results show that the process of exemplar extraction significantly improves the detection efficiency and has a better detection performance than PCA in data abstraction. [less ▲]

Detailed reference viewed: 44 (0 UL)