Viola, Lorella, in Digital Humanities Quarterly (2022), 16(2)

Linking large digitized newspaper corpora in different languages that have become available in national and state libraries opens up new possibilities for the computational analysis of patterns of information flow across national and linguistic boundaries. The significant contribution this article presents is to demonstrate how word vector models can be used to explore the way concepts have shifted in meaning over time, as they migrated across space, by comparing newspapers from different countries published between 1840 and 1914. We define a concept, rather pragmatically, as a key term or core idea that has been used in historical discourse: an abstraction or mental representation that has served as a building block for thoughts and beliefs. We use historical newspapers in English, Finnish, German and Swedish from collections in the UK, US, Germany, and Finland, as well as the Europeana collection. As use cases, we analyze how the different conceptual constructs of “nation” and “illness” emerged and changed between 1840 and 1920. Conceptual change over time is simulated by creating a series of overlapping word vector models, each spanning ten years. Historical vocabularies are retrieved on the basis of vector space proximity. Conceptual change across space is simulated by comparing the historical change of vocabularies in newspaper collections from different nations in several languages. This computational approach to conceptual history opens up new ways to identify patterns in public discourse over longer periods of time and across borders.
Viola, Lorella, in Frontiers in Artificial Intelligence (2020), 3(64)

This study proposes an experimental method to trace the historical evolution of media discourse as a means to investigate the construction of collective meaning. Based on distributional semantics theory (Harris, 1954; Firth, 1957) and critical discourse theory (Wodak and Fairclough, 1997), it explores the value of merging two techniques widely employed to investigate language and meaning in two separate fields: neural word embeddings (computational linguistics) and the discourse-historical approach (DHA; Reisigl and Wodak, 2001) (applied linguistics). As a use case, we investigate the historical changes in the semantic space of public discourse of migration in the United Kingdom, and we use the Times Digital Archive (TDA) from 1900 to 2000 as dataset. For the computational part, we use the publicly available TDA word2vec models (Kenter et al., 2015; Martinez-Ortiz et al., 2016); these models have been trained according to sliding time windows with the specific intention to map conceptual change. We then use DHA to triangulate the results generated by the word vector models with social and historical data to identify plausible explanations for the changes in the public debate. By bringing the focus of the analysis to the level of discourse, with this method, we aim to go beyond mapping different senses expressed by single words and to add the currently missing sociohistorical and sociolinguistic depth to the computational results. The study rests on the foundation that social changes will be reflected in changes in public discourse (Couldry, 2008).
Although correlation does not prove direct causation, we argue that historical events, language, and meaning should be considered as a mutually reinforcing cycle in which the language used to describe events shapes explicit meanings, which in turn trigger other events, which again will be reflected in the public discourse.

Viola, Lorella, in Rocha, Ana; Steels, Luc; van den Herik, Jaap (Eds.), Proceedings of the 12th International Conference on Agents and Artificial Intelligence - Volume 1: ARTIDIGH (2020)

This paper discusses the added value of applying machine learning (ML) to contextually enrich digital collections. In this study, we employed ML as a method to geographically enrich historical datasets. Specifically, we used a sequence tagging tool (Riedl and Padó 2018) which implements TensorFlow to perform NER on a corpus of historical immigrant newspapers. Afterwards, the entities were extracted and geocoded. The aim was to prepare large quantities of unstructured data for a conceptual historical analysis of geographical references. The intention was to develop a method that would assist researchers working in spatial humanities, a recently emerged interdisciplinary field focused on geographic and conceptual space. Here we describe the ML methodology and the geocoding phase of the project, focussing on the advantages and challenges of this approach, particularly for humanities scholars. We also argue that, by choosing to use largely neglected sources such as immigrant newspapers (also known as ethnic newspapers), this study contributes to the debate about diversity representation and archival biases in digital practices.
Viola, Lorella, Scientific Conference (2020)

Viola, Lorella, in Digital Scholarship in the Humanities (2019)

This article aims to offer a methodological contribution to digital humanities by exploring the value of a mixed-method approach to uncover and understand historical patterns in large quantities of textual data. It refines the distant reading technique of topic modelling (TM) by using the discourse-historical approach (DHA; Wodak, 2001) in order to analyse the mechanisms underlying discursive practices in historical newspapers. Specifically, we investigate public discourses produced by Italian minorities and test the methodology on a corpus of digitized Italian ethnic newspapers published in the USA between 1898 and 1920 (ChroniclItaly; Viola, 2018). This combined methodology, which we propose to label ‘discourse-driven topic modelling’ (DDTM), enabled us to triangulate linguistic, social, and historical data and to examine how the changing experience of migration, identity construction, and assimilation was reflected over time in the accounts of the minorities themselves. The results proved DDTM to be effective in obtaining a categorization of the topics discussed in the immigrant press. The changing distribution of topics over time revealed how the Italian immigrant community negotiated their sense of connectedness with both the host country and the homeland. At the same time, without jeopardizing the analytical depth of the findings, the method proved its value in minimizing the risk of bias when identifying the topics, which stemmed from the results rather than from preconceived assumptions.
Viola, Lorella, Software (2019)

Viola, Lorella, Software (2019)

Viola, Lorella, in Identity (2019)

This article explores a novel way to understand the process of diasporic identity formation by comparing the discursive structure of Italian diasporic newspapers published in the United States with the baseline of public discourse in Italy. It uses as its evidence Italian-language newspapers published in the United States from 1898 to 1920 (ChroniclItaly) and the Italian newspaper La Stampa published in Italy between 1867 and 1900. Applying a mixed-method approach of close and distant reading, the study examines how the ideological concept of Italian identity, linguistically represented by the anchor word italianità, “Italianness”, was constructed in these printed media at the turn of the twentieth century. The overarching aim is to explore how differences in the two identity constructions can be explained from their specific historical contexts: the process of ethnic integration and redefinition in the United States as opposed to the need to consolidate national unity in the face of emerging nations and nationalism within Europe.