[en] Classification is an important field within data mining and pattern
recognition, and it is needed in many real-world processes. Many methods have
been proposed with the goal of making classifiers increasingly effective.
However, most of them assume that every object in the training set is correctly
classified, without taking into account that the set may contain incorrectly
classified data. The process of removing incorrectly classified objects is
called noise cleaning, and it considerably influences the classification of new
samples. In this work, we present a neighborhood-based algorithm for noise
cleaning on data streams for classification. The algorithm also accounts for
changes in the data distribution that may occur over time. Several experiments,
using databases from the UCI repository and two synthetic ones, measured the
effect of the method on the automatic building of training sets. The results
show the efficacy of the proposed noise cleaning strategy and its influence on
the correct classification of new samples.
Disciplines :
Computer science
Author, co-author :
Toro Pozo, Jorge Luis ; University of Luxembourg > Faculty of Science, Technology and Communication (FSTC) > Computer Science and Communications Research Unit (CSC)
Pascual González, Damaris; Universidad de Oriente > Faculty of Economics
Vázquez Mesa, Fernando; Universidad de Oriente > Faculty of Economics
External co-authors :
yes
Language :
Spanish
Title :
Limpieza de ruido para clasificación basado en vecindad y cambios de concepto en el tiempo
Alternative titles :
[en] Noise cleaning for classification based on neighborhood and concept changes over time
Publication date :
April 2016
Journal title :
Revista Cubana de Ciencias Informáticas
ISSN :
2227-1899
Publisher :
Universidad de las Ciencias Informáticas, Havana, Cuba