Unpublished conference/Abstract (Scientific congresses, symposiums and conference proceedings)
Small-Scale Testing on Generative AI and Post-OCR Correction in Historical Datasets
ARMASELU, Florentina
2024, Digital Humanities Benelux 2024 Conference
Peer reviewed
 

Files


Full Text
DHB2024_GenAI_Post-OCR_Abstract.pdf
Author postprint (571.66 kB) Creative Commons License - Attribution
Details



Keywords :
generative artificial intelligence; post-OCR correction; historical datasets
Abstract :
[en] This article proposes a small-scale investigation of the use of generative AI agents for post-OCR correction in historical datasets. Three chatbots (ChatGPT-4, Google Bard and YouChat) and excerpts from 18th-century French texts were used. The evaluation combined qualitative and quantitative methods. Character and word error rates (CER, WER) were computed both by the agents themselves and independently, using a specialised Python library and gold-standard excerpts from the ICDAR 2017 competition on post-OCR text correction.
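As an illustration of the two metrics named in the abstract, the following is a minimal sketch of how CER and WER are commonly defined: the Levenshtein edit distance between a gold-standard reference and a candidate text, divided by the reference length in characters or words. The abstract does not name the Python library used, so this sketch relies on a plain-Python edit distance instead; the function names and the sample strings are hypothetical and are not taken from the ICDAR 2017 data.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists of tokens)."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting the first i reference items
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character error rate: character-level edits per reference character."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


def wer(reference, hypothesis):
    """Word error rate: word-level edits per reference word (whitespace tokens)."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


# Hypothetical example: a long-s confusion of the kind found in 18th-century print.
gold = "la maison du roi"
ocr = "la maifon du roi"
print(cer(gold, ocr))  # 1 edit over 16 characters
print(wer(gold, ocr))  # 1 edit over 4 words
```

Applying the same functions to the OCR output and to a chatbot's corrected output, each against the gold standard, gives a before-and-after comparison of the kind the abstract describes. Dedicated libraries may normalise case, punctuation or whitespace differently, so their scores can deviate from this sketch.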
Disciplines :
Arts & humanities: Multidisciplinary, general & others
Author, co-author :
ARMASELU, Florentina  ;  University of Luxembourg > Luxembourg Centre for Contemporary and Digital History (C2DH) > Digital History and Historiography
External co-authors :
no
Language :
English
Title :
Small-Scale Testing on Generative AI and Post-OCR Correction in Historical Datasets
Publication date :
31 May 2024
Event name :
Digital Humanities Benelux 2024 Conference
Event place :
Leuven, Belgium
Event date :
from 5 to 7 June, 2024
Audience :
International
Peer reviewed :
Peer reviewed
Available on ORBilu :
since 20 June 2024

