Paper published in a book (Colloquia, congresses, scientific conferences and proceedings)
"¿Te vienes? Sure!" Joint Fine-tuning of Language Detection and Transcription Improves Automatic Recognition of Code-Switching Speech
HILLAH, Léopold Edem Ayité; DUBIEL, Mateusz; LEIVA, Luis A.
2024. In Proceedings of the 6th ACM Conference on Conversational User Interfaces
Peer reviewed
 

Documents


Full text
Joint_Fine_tuning_of_Language_Detection_and_ASR_for_Code_Switching_Speech.pdf
Author postprint (694.09 kB)

All documents in ORBilu are protected by a user license.

Details



Keywords:
Code Switching; Multilingual Conversations; Language Identification; Automatic Speech Recognition; Whisper; Speech
Abstract:
[en] Human communication in multilingual communities often leads to code-switching, where individuals seamlessly alternate between two or more languages in their daily interactions. While this phenomenon has become increasingly prevalent thanks to linguistic globalization, it presents challenges for Automatic Speech Recognition (ASR) systems, since they are designed on the assumption that they transcribe a single language at a time. In this work, we propose a simple yet unexplored approach to tackle this challenge by fine-tuning the Whisper pre-trained model jointly on language identification (LID) and transcription tasks through the introduction of an auxiliary LID loss term. Our results show significant reductions in transcription errors, ranging from 14 to 36 percentage points. Ultimately, our work opens a new direction for research on code-switching speech, offering an opportunity to enhance current capabilities of conversational agents.
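The abstract describes joint fine-tuning through an auxiliary LID loss term added to the transcription objective. The sketch below illustrates that general idea in plain Python and is not the authors' implementation. The function names, the weight `lid_weight`, its default value, and the choice of a single utterance-level language label are all assumptions made for illustration; the abstract does not specify them.

```python
import math
from typing import Sequence


def cross_entropy(logits: Sequence[float], target: int) -> float:
    """Negative log-likelihood of `target` under softmax(logits)."""
    peak = max(logits)  # subtract the max for numerical stability
    log_norm = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_norm - logits[target]


def joint_loss(
    token_logits: Sequence[Sequence[float]],
    token_targets: Sequence[int],
    lid_logits: Sequence[float],
    lid_target: int,
    lid_weight: float = 0.5,  # hypothetical weight, not taken from the paper
) -> float:
    """Transcription loss plus a weighted auxiliary LID loss (illustrative)."""
    # Mean cross-entropy over the transcript tokens.
    transcription = sum(
        cross_entropy(logits, target)
        for logits, target in zip(token_logits, token_targets)
    ) / len(token_targets)
    # Assumption: one language label per utterance.
    lid = cross_entropy(lid_logits, lid_target)
    return transcription + lid_weight * lid
```

With `lid_weight=0.0` the function reduces to the transcription loss alone, which gives a convenient baseline when comparing against the joint objective.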
Disciplines:
Computer science
Author, co-author:
HILLAH, Léopold Edem Ayité  ;  University of Luxembourg
DUBIEL, Mateusz  ;  University of Luxembourg > Faculty of Science, Technology and Medicine (FSTM) > Department of Computer Science (DCS)
LEIVA, Luis A.  ;  University of Luxembourg > Faculty of Science, Technology and Medicine (FSTM) > Department of Computer Science (DCS)
External co-authors:
No
Document language:
English
Title:
"¿Te vienes? Sure!" Joint Fine-tuning of Language Detection and Transcription Improves Automatic Recognition of Code-Switching Speech
Publication date:
8 July 2024
Event name:
CUI '24: 6th ACM Conference on Conversational User Interfaces
Event organizer:
Association for Computing Machinery (ACM)
Event place:
Luxembourg City, Luxembourg
Event date:
from 8 to 10 July 2024
Event scope:
International
Main work title:
Proceedings of the 6th ACM Conference on Conversational User Interfaces
Publisher:
Association for Computing Machinery, New York, NY, United States
Peer reviewed:
Peer reviewed
Focus Area:
Computational Sciences
European project:
HE - 101071147 - SYMBIOTIK - Context-aware adaptive visualizations for critical decision making
FnR project:
FNR15722813 - Brainsourcing For Affective Attention Estimation, 2021 (01/02/2022-31/01/2025) - Luis Leiva
Funding body:
European Union
Funding (details):
This work is supported by the Horizon 2020 FET program of the European Union through the ERA-NET Cofund funding (BANANA, grant CHIST-ERA-20-BCI-001) and Horizon Europe's European Innovation Council through the Pathfinder program (SYMBIOTIK, grant 101071147).
Available on ORBilu:
since 12 July 2024

Statistics

Number of views:
136 (including 12 from Unilu)
Number of downloads:
245 (including 10 from Unilu)

Scopus® citations:
1
Scopus® citations without self-citations:
1
OpenCitations:
0
OpenAlex citations:
1
