Reference : On the Reduction of Biases in Big Data Sets for the Detection of Irregular Power Usage
Scientific congresses, symposiums and conference proceedings : Paper published in a book
Engineering, computing & technology : Computer science
Computational Sciences
http://hdl.handle.net/10993/35427
On the Reduction of Biases in Big Data Sets for the Detection of Irregular Power Usage
English
Glauner, Patrick [University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT)]
State, Radu [University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT)]
Valtchev, Petko [University of Quebec in Montreal]
Duarte, Diogo [CHOICE Technologies Holding Sàrl]
2018
Proceedings 13th International FLINS Conference on Data Science and Knowledge Engineering for Sensing Decision Support (FLINS 2018)
Yes
International
13th International FLINS Conference on Data Science and Knowledge Engineering for Sensing Decision Support (FLINS 2018)
from 21-08-2018 to 24-08-2018
[en] Bias ; Class imbalance ; Covariate shift ; Non-technical losses
[en] In machine learning, a bias occurs whenever training sets are not representative of the test data, which results in unreliable models. The most common biases in data are arguably class imbalance and covariate shift. In this work, we aim to shed light on this topic and to draw more attention to it in the field of machine learning. We propose a novel, scalable framework for reducing multiple biases in high-dimensional data sets in order to train more reliable predictors. We apply our methodology to the detection of irregular power usage from real, noisy industrial data. In emerging markets, irregular power usage, and electricity theft in particular, may account for up to 40% of the total electricity distributed. Biased data sets are a particular issue in this domain. We show that reducing these biases increases the accuracy of the trained predictors. Our models have the potential to generate significant economic value in a real-world application, as they are being deployed in commercial software for the detection of irregular power usage.

File(s) associated to this reference

Fulltext file(s):

File: On the Reduction of Biases in Big Data Sets for the Detection of Irregular Power Usage.pdf
Version: Author postprint
Size: 275.17 kB
Access: Open access


All documents in ORBilu are protected by a user license.