Exploring Corpus Linguistics Approaches in Linguistic Landscape Research with Automatic Text Recognition Software

GILLES, Peter; Ziegler, Evelyn

doi:10.3726/b17795

Contribution to collective works (Parts of books)

2021 • In Ziegler, Evelyn; Marten, Heiko F. (Eds.) Linguistic Landscapes im deutschsprachigen Kontext

Peer reviewed

Permalink
https://hdl.handle.net/10993/47176

Files

Full Text

Gilles_Ziegler_GAL_Beitrag_final.pdf

Author preprint (5.53 MB)

Details

Keywords :

text recognition; corpus linguistics

Abstract :

Taking a more quantitative approach to linguistic landscape research, we explore recent techniques for the automatic extraction of information from images. The recently released Cloud Vision API by Google offers new perspectives on the software-assisted processing and classification of pictures. A software interface makes it possible to extract various kinds of information from pictures automatically, among them the written text, labels describing the picture (e.g. road sign, shop sign, prohibition sign) and the colours used in the picture. Applying this new technique to large-scale image collections will not only enhance analysis but may also reveal hitherto unrecognized structures. The data comes from a large-scale investigation of the Ruhr Metropolis in Germany, in which 25,504 photos were taken to document the linguistic landscape of selected neighbourhoods in four cities (Ziegler et al. 2018). This data was annotated manually in various categories to analyze the occurrence, form and function of visual multilingualism. The pictures are then processed automatically by the Cloud Vision API, and the results are compared with the manual annotation. We show that the quality of the image recognition depends greatly on the quality of the picture. The textual information extracted from the pictures is stored in a database. Rather than presenting results on the linguistic landscape, this chapter is predominantly concerned with practical tools to facilitate large-scale linguistic landscape research.
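
The workflow outlined in the abstract (request text, labels and colours for each photo, then store the extracted information in a database) can be illustrated with a minimal sketch. It is not the authors' pipeline. It assumes the JSON layout of the Cloud Vision REST method `images:annotate` (`textAnnotations`, `labelAnnotations`, `imagePropertiesAnnotation`) and uses SQLite as a stand-in database. The function names, the table layout, the photo identifier and the sample response are all hypothetical and invented for illustration; no real API call is made.

```python
import base64
import sqlite3

# Hypothetical sample shaped like a Cloud Vision `images:annotate` reply.
# All values are invented for illustration; they are not data from the chapter.
SAMPLE_RESPONSE = {
    "responses": [{
        "textAnnotations": [
            {"description": "Bäckerei\nBakery"},  # first entry holds the full text
            {"description": "Bäckerei"},
            {"description": "Bakery"},
        ],
        "labelAnnotations": [
            {"description": "Signage", "score": 0.93},
            {"description": "Font", "score": 0.81},
        ],
        "imagePropertiesAnnotation": {"dominantColors": {"colors": [
            {"color": {"red": 200, "green": 30, "blue": 30}, "score": 0.6},
        ]}},
    }]
}


def build_request(image_bytes: bytes) -> dict:
    """Build a JSON body asking for text, labels and image properties."""
    return {"requests": [{
        "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
        "features": [
            {"type": "TEXT_DETECTION"},
            {"type": "LABEL_DETECTION", "maxResults": 10},
            {"type": "IMAGE_PROPERTIES"},
        ],
    }]}


def parse_response(response: dict) -> dict:
    """Reduce one annotate response to text, (label, score) pairs and hex colours."""
    result = response["responses"][0]
    texts = result.get("textAnnotations", [])
    labels = [(a["description"], a["score"])
              for a in result.get("labelAnnotations", [])]
    props = result.get("imagePropertiesAnnotation", {})
    colours = []
    for entry in props.get("dominantColors", {}).get("colors", []):
        rgb = entry["color"]
        colours.append("#{:02x}{:02x}{:02x}".format(
            int(rgb.get("red", 0)), int(rgb.get("green", 0)), int(rgb.get("blue", 0))))
    return {
        "text": texts[0]["description"] if texts else "",
        "labels": labels,
        "colours": colours,
    }


def create_tables(conn: sqlite3.Connection) -> None:
    """Create the two illustrative tables (hypothetical schema)."""
    conn.execute("CREATE TABLE photos (photo_id TEXT PRIMARY KEY, text TEXT, colours TEXT)")
    conn.execute("CREATE TABLE labels (photo_id TEXT, label TEXT, score REAL)")


def store(conn: sqlite3.Connection, photo_id: str, parsed: dict) -> None:
    """Insert one photo's extracted text, colours and labels."""
    conn.execute("INSERT INTO photos VALUES (?, ?, ?)",
                 (photo_id, parsed["text"], ",".join(parsed["colours"])))
    conn.executemany("INSERT INTO labels VALUES (?, ?, ?)",
                     [(photo_id, name, score) for name, score in parsed["labels"]])
    conn.commit()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_tables(conn)
    store(conn, "photo_0001", parse_response(SAMPLE_RESPONSE))
    print(conn.execute("SELECT * FROM photos").fetchall())
```

Keeping the parsing step separate from the storage step means the extracted text and labels could later be joined to a table of manual annotations by photo identifier, which is one possible way to set up the comparison the abstract mentions.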

Disciplines :

Languages & linguistics

Author, co-author :

GILLES, Peter ; University of Luxembourg > Faculty of Humanities, Education and Social Sciences (FHSE) > Department of Humanities (DHUM)

Ziegler, Evelyn

External co-authors :

yes

Language :

English

Title :

Exploring Corpus Linguistics Approaches in Linguistic Landscape Research with Automatic Text Recognition Software

Publication date :

2021

Main work title :

Linguistic Landscapes im deutschsprachigen Kontext

Editor :

Ziegler, Evelyn

Marten, Heiko F.

Publisher :

Peter Lang D, Frankfurt

ISBN/EAN :

978-3-631-84069-6 978-3-631-84068-9 978-3-631-84070-2 978-3-631-79110-3

Pages :

65-86

Peer reviewed :

Peer reviewed

Focus Area :

Computational Sciences

Additional URL :

https://www.peterlang.com/view/title/70936

Available on ORBilu :

since 21 May 2021

Statistics

Number of views

489 (1 by Unilu)

Number of downloads

723 (3 by Unilu)
