Speech BCI; Electroencephalography; Hilbert Envelope; Transfer Learning; Signal Processing
Abstract:
[en] Brain-Computer Interfaces (BCIs) can decode imagined speech from neural activity. However, these systems typically require extensive training sessions in which participants mentally repeat words, leading to mental fatigue and to difficulties in identifying word onsets, especially when sequences of words are imagined. This paper addresses these challenges by transferring a classifier trained on overt speech data to covert speech classification. We extracted electroencephalogram (EEG) features derived from the Hilbert envelope and temporal fine structure and used them to train a bidirectional long short-term memory (BiLSTM) model for classification. Our method reduces the burden of extensive training and achieves state-of-the-art classification accuracy: 86.44% for overt speech and 79.82% for covert speech using the overt speech classifier.
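The following is a minimal sketch (in Python, not the authors' released code) of the kind of pipeline the abstract describes: band-pass filtering each EEG channel, splitting the Hilbert analytic signal into an amplitude envelope and temporal fine structure, and classifying the resulting feature sequence with a BiLSTM. Channel count, sampling rate, band edges, epoch length, and network sizes are illustrative assumptions.

```python
# Assumed pipeline sketch: Hilbert envelope + temporal fine structure (TFS)
# features from multichannel EEG, classified by a bidirectional LSTM.
import numpy as np
import torch
import torch.nn as nn
from scipy.signal import butter, filtfilt, hilbert


def envelope_and_tfs(eeg, fs=256.0, band=(1.0, 45.0)):
    """Band-pass each channel, then split the analytic signal into the
    amplitude (Hilbert) envelope and the temporal fine structure.
    eeg: (n_channels, n_samples) array. Band edges are illustrative."""
    b, a = butter(4, [band[0] / (fs / 2), band[1] / (fs / 2)], btype="band")
    filtered = filtfilt(b, a, eeg, axis=-1)
    analytic = hilbert(filtered, axis=-1)
    envelope = np.abs(analytic)          # slowly varying amplitude envelope
    tfs = np.cos(np.angle(analytic))     # fine structure: cosine of instantaneous phase
    return envelope, tfs


class BiLSTMClassifier(nn.Module):
    """Bidirectional LSTM over time, followed by a linear read-out layer."""

    def __init__(self, n_features, n_classes, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True, bidirectional=True)
        self.fc = nn.Linear(2 * hidden, n_classes)

    def forward(self, x):                # x: (batch, time, features)
        out, _ = self.lstm(x)
        return self.fc(out[:, -1, :])    # classify from the final time step


# Toy usage: one 8-channel, 2-second epoch at 256 Hz, 5 hypothetical word classes.
eeg_epoch = np.random.randn(8, 512)
env, tfs = envelope_and_tfs(eeg_epoch)
features = np.concatenate([env, tfs], axis=0).T              # (time, 2 * n_channels)
x = torch.tensor(features[None, ...], dtype=torch.float32)   # add batch dimension
model = BiLSTMClassifier(n_features=x.shape[-1], n_classes=5)
logits = model(x)                                             # (1, 5) class scores
```

In the transfer setting described in the abstract, such a model would be trained on overt-speech epochs and then reused (possibly with fine-tuning) to classify covert-speech epochs.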
Disciplines:
Computer science
Author, co-author:
DURAISAMY, Saravanakumar ; University of Luxembourg > Faculty of Science, Technology and Medicine (FSTM) > Department of Computer Science (DCS)
DUBIEL, Mateusz ; University of Luxembourg > Faculty of Science, Technology and Medicine (FSTM) > Department of Computer Science (DCS)
REKRUT, Maurice ; German Research Center for Artificial Intelligence (DFKI)
LEIVA, Luis A. ; University of Luxembourg > Faculty of Science, Technology and Medicine (FSTM) > Department of Computer Science (DCS)
External co-authors:
Yes
Document language:
English
Title:
Transfer Learning for Covert Speech Classification Using EEG Hilbert Envelope and Temporal Fine Structure
Publication date:
6 April 2025
Event name:
ICASSP 2025: 2025 IEEE International Conference on Acoustics, Speech, and Signal Processing
Event location:
Hyderabad, India
Event dates:
6 to 11 April 2025
Event scope:
International
Main work title:
Proceedings of ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Publisher:
Institute of Electrical and Electronics Engineers (IEEE)
Peer reviewed:
Peer reviewed
Focus Area:
Computational Sciences
European project:
HE - 101071147 - SYMBIOTIK - Context-aware adaptive visualizations for critical decision making
FNR project:
FNR15722813 - Brainsourcing For Affective Attention Estimation, 2021 (01/02/2022-31/01/2025) - Luis Leiva
Funding body:
EU - European Union
Funding (details):
Research supported by the Horizon 2020 FET program of the European Union through the ERA-NET Cofund funding (grant CHIST-ERA-20-BCI-001) and the Pathfinder program of the European Innovation Council (SYMBIOTIK project, grant 101071147). Rekrut’s work is supported by the German Federal Ministry of Education and Research (grants 01IS12050 and 01IS23073).