This chapter considers the presence, retrievability, and analysis of content relating to women, gender, and COVID-19 in web archives, based on research in the international “novel coronavirus IIPC collection”. It focuses on challenges raised by the huge IIPC collection regarding multilingualism, “big data”, access and searchability, silence and noise, duplicates and loss of information, and the use of the Archives Research Compute Hub (ARCH) interface, developed by the Archives Unleashed Team. It shares the debates and technical challenges we faced, specifically in selecting computational tools and algorithms for conducting text mining and topic modeling. It also investigates issues relating to transnational studies, digital methods, collaborative work, scalable reading, and gender studies, while reflecting on asymmetries, invisibility, and inclusiveness in web archives.
Research center :
Luxembourg Centre for Contemporary and Digital History (C2DH) > Contemporary European History (EHI)
Disciplines :
Arts & humanities: Multidisciplinary, general & others
Author, co-author :
SCHAFER, Valerie ; University of Luxembourg > Luxembourg Centre for Contemporary and Digital History (C2DH) > Contemporary European History
CLAVERT, Frédéric ; University of Luxembourg > Luxembourg Centre for Contemporary and Digital History (C2DH) > Contemporary European History
Aasman, Susan; University of Groningen
de Wild, Karin; Leiden University
Sirajzade, Joshgun
External co-authors :
yes
Language :
English
Title :
The challenges of searching for women in the COVID-19 web archive collections: Promises, achievements, and pitfalls
Publication date :
April 2025
Main work title :
The Routledge Companion to Transnational Web Archive Studies