Reference : From Text to Bits: Making the Treaties of a Monk from the 16th Century "Understandable...
Scientific congresses, symposiums and conference proceedings : Unpublished conference
Arts & humanities : History
Computational Sciences
From Text to Bits: Making the Treaties of a Monk from the 16th Century "Understandable" to the Computer
Dubuisson, Bastian [University of Luxembourg > Faculty of Language and Literature, Humanities, Arts and Education (FLSHASE) > Identités, Politiques, Sociétés, Espaces (IPSE)]
Greene's Institute Conference Series "Found in Translation Interpreting Reworking and Reinventing Texts"
Greene's Institute
United Kingdom
[en] optical character recognition ; stylometry ; Trier
[en] Historians who aspire to explore texts with the help of the computer are more often than not confronted with the absence of machine-readable corpora that suit their needs. The early 16th century Latin treatises about the city of Trier and its relics by the Benedictine monk Johannes Scheckmann are no exception. Despite their historical importance, among other things for the study of the religious phenomenon during this pivotal era, these texts remain in their original form.
Optical Character Recognition software makes it possible to overcome this gap by translating printed or handwritten characters into encoded glyphs, but not without cost. Creating a digital corpus requires careful reflection, particularly when it comes to small-scale projects. Moreover, since Latin is a highly inflected language, further post-processing steps such as tokenization and lemmatization are necessary before a corpus of texts can finally be processed with lines of code. During the digital transformation, each step leads to a profound but necessary modification of the original text and to the creation of new data.
The example of Scheckmann’s works illustrates what is lost when a text is turned into bits, as well as what can be discovered once the machine reads.
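As a rough illustration of the tokenization and lemmatization steps mentioned above, the following Python sketch reduces a few inflected Latin word forms to a shared lemma. It is not the method used in the talk: the function names, the tiny lemma table, and the sample string are all hypothetical, and the sample is an invented list of word forms, not a quotation from Scheckmann. A real project would presumably rely on a dedicated Latin lemmatizer.

```python
import re

# Hypothetical toy lemma table (an assumption for illustration only).
# A real workflow would use a trained Latin lemmatizer, not a hand-written dict.
LEMMAS = {
    "reliquiae": "reliquiae",
    "reliquiarum": "reliquiae",
    "reliquias": "reliquiae",
    "urbis": "urbs",
    "urbem": "urbs",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    # [^\W\d_]+ matches runs of letters only (no digits, no underscore).
    return re.findall(r"[^\W\d_]+", text.lower())


def lemmatize(tokens, table=LEMMAS):
    """Map each token to its lemma; unknown tokens are kept unchanged."""
    return [table.get(token, token) for token in tokens]


# Invented sample of inflected forms, not taken from the source texts.
sample = "Reliquias urbis, reliquiarum urbem."
tokens = tokenize(sample)
lemmas = lemmatize(tokens)
```

Here four distinct surface forms collapse into two lemmas, which is the kind of modification of the original text, and creation of new data, that the abstract describes.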

There is no file associated with this reference.
