Scientific presentation in universities or research centers
From Text to Bits: Making the Treaties of a Monk from the 16th Century "Understandable" to the Computer
Dubuisson, Bastien
2019
 

Files


Full Text
DUBUISSONBastien_Presentation_Greenes_Institute.pdf
Author preprint (13.29 MB)
Presentation slides

Details



Keywords :
optical character recognition; stylometry; Trier
Abstract :
[en] Historians who aspire to explore texts with the help of the computer are more often than not confronted with the absence of machine-readable corpora suited to their needs. The early 16th-century Latin treatises about the city of Trier and its relics by the Benedictine monk Johannes Scheckmann are no exception. Despite their historical importance, among other things for the study of the religious phenomenon during this pivotal era, these texts remain in their original form. Optical Character Recognition software makes it possible to bridge this gap by translating printed or handwritten characters into encoded glyphs, but not at any cost. Creating a digital corpus requires careful reflection, particularly when it comes to small-scale projects. Moreover, since Latin is a highly inflected language, further post-processing steps such as tokenization and lemmatization are necessary before a corpus of texts can finally be processed with lines of code. During the digital transformation, each step leads to a profound but necessary modification of the original text and the creation of new data. The example of Scheckmann’s works illustrates what is lost when a text is turned into bits, as well as what can be discovered once the machine reads.
Disciplines :
History
Author, co-author :
Dubuisson, Bastien ;  University of Luxembourg > Faculty of Humanities, Education and Social Sciences (FHSE) > Department of Humanities (DHUM)
Language :
English
Title :
From Text to Bits: Making the Treaties of a Monk from the 16th Century "Understandable" to the Computer
Publication date :
19 October 2019
Event name :
Greene's Institute Conference Series "Found in Translation Interpreting Reworking and Reinventing Texts"
Event organizer :
Greene's Institute
Event place :
Oxford, United Kingdom
Event date :
19 October 2019
Audience :
International
Focus Area :
Computational Sciences
FnR Project :
FNR13505915 - Books, Saints, And Men: For A Revaluation Of Latin Hagiographic Culture In The Diocese Of Trier (13th-16th Centuries), 2019 (15/09/2019-14/09/2023) - Bastien Dubuisson
Available on ORBilu :
since 05 March 2020
