Reference : Machine Learning to Geographically Enrich Understudied Sources: A Conceptual Approach
Scientific congresses, symposiums and conference proceedings : Paper published in a book
Engineering, computing & technology : Multidisciplinary, general & others
Computational Sciences
http://hdl.handle.net/10993/42937
Machine Learning to Geographically Enrich Understudied Sources: A Conceptual Approach
English
Viola, Lorella [University of Luxembourg > Luxembourg Center for Contemporary and Digital History (C2DH)]
Verheul, Jaap [Universiteit Utrecht > History and Art History]
2020
Proceedings of the 12th International Conference on Agents and Artificial Intelligence - Volume 1: ARTIDIGH
Rocha, Ana
Steels, Luc
van den Herik, Jaap
SCITEPRESS
NLPinAI
469-475
Yes
International
978-989-758-395-7
12th International Conference on Agents and Artificial Intelligence
from 22-02-2020 to 24-02-2020
Valletta
Malta
[en] Machine Learning ; Sequence Tagging ; Spatial Humanities
[en] This paper discusses the added value of applying machine learning (ML) to contextually enrich digital collections. In this study, we employed ML as a method to geographically enrich historical datasets. Specifically, we used a sequence tagging tool (Riedl and Padó 2018) which implements TensorFlow to perform NER on a corpus of historical immigrant newspapers. Afterwards, the entities were extracted and geocoded. The aim was to prepare large quantities of unstructured data for a conceptual historical analysis of geographical references. The intention was to develop a method that would assist researchers working in spatial humanities, a recently emerged interdisciplinary field focused on geographic and conceptual space. Here we describe the ML methodology and the geocoding phase of the project, focussing on the advantages and challenges of this approach, particularly for humanities scholars. We also argue that, by choosing to use largely neglected sources such as immigrant newspapers (also known as ethnic newspapers), this study contributes to the debate about diversity representation and archival biases in digital practices.
Researchers ; Professionals ; Students
10.5220/0009094204690475
https://www.scitepress.org/PublicationsDetail.aspx?ID=R0/n7aoZn9Q=&t=1
The original publication is available at scitepress.org

File(s) associated to this reference

Fulltext file(s):

File : ARTIDIGH_2020_1.pdf
Version : Publisher postprint
Size : 575.11 kB
Access : Limited access (request a copy)
