Doctoral thesis (Dissertations and theses)
ML-based data-entry automation and data anomaly detection to support data quality assurance
BELGACEM, Hichem
2023
 

Files


Full Text
Thesis_HB.pdf
Author postprint (1.77 MB)

Details



Keywords :
Form filling, Data entry forms, Machine Learning, Software data quality, User interfaces
Abstract :
[en] Data plays a central role in modern software systems, which are very often powered by machine learning (ML) and used in critical domains of our daily lives, such as finance, health, and transportation. However, the effectiveness of ML-intensive software applications highly depends on the quality of the data. Data quality is affected by data anomalies; data entry errors are one of the main sources of anomalies. The goal of this thesis is to develop approaches to ensure data quality by preventing data entry errors during the form-filling process and by checking the offline data saved in databases.
The main contributions of this thesis are:
1. LAFF, an approach to automatically suggest possible values of categorical fields in data entry forms.
2. LACQUER, an approach to automatically relax the completeness requirement of data entry forms by deciding when a field should be optional based on the filled fields and historical input instances.
3. LAFF-AD, an approach to automatically detect data anomalies in categorical columns in offline datasets.
LAFF and LACQUER focus mainly on preventing data entry errors during the form-filling process. Both approaches can be integrated into data entry applications as efficient and effective strategies to assist the user during the form-filling process. LAFF-AD can be used offline on existing suspicious data to effectively detect anomalies in categorical data.
In addition, we performed an extensive evaluation of the three approaches, assessing their effectiveness and efficiency, using real-world datasets.
Disciplines :
Computer science
Author, co-author :
BELGACEM, Hichem ;  University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > SVV
Language :
English
Title :
ML-based data-entry automation and data anomaly detection to support data quality assurance
Defense date :
15 September 2023
Institution :
Unilu - University of Luxembourg [The Faculty of Sciences, Technology and Medicine], Luxembourg, Luxembourg
Degree :
Docteur en Informatique (DIP_DOC_0006_B)
Promotor :
BIANCULLI, Domenico  ;  University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > SVV
President :
BRIAND, Lionel ;  University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > SVV
Jury member :
Boytsov, Andrey
SHIN, Seung Yeob  ;  University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > SVV
Baresi, Luciano ;  Politecnico di Milano
Available on ORBilu :
since 20 November 2023

