Semantic Analysis of Spoken Input Using Markov Logic Networks

DESPOTOVIC, Vladimir; Walter, Oliver; Haeb-Umbach, Reinhold

Paper published in a book (Scientific congresses, symposiums and conference proceedings)

2015 • In Proceedings of the 16th Annual Conference of the International Speech Communication Association (INTERSPEECH 2015)

Peer reviewed

Permalink
https://hdl.handle.net/10993/40877

Files

Full Text

INTERSPEECH 2015.pdf

Publisher postprint (176.16 kB)


Details

Keywords :

Unsupervised learning; Acoustic units; Speech; Markov Logic Networks; Semantic frame

Abstract :

[en] We present a semantic analysis technique for spoken input using Markov Logic Networks (MLNs). MLNs combine graphical models with first-order logic. They are particularly suitable for inference in the presence of inconsistent and incomplete data, which are typical of the output of an automatic speech recognizer (ASR) operating on degraded speech. The target application is a speech interface to a home automation system operated by people with speech impairments, for whom the ASR output is particularly noisy. To cater for dysarthric speech with non-canonical phoneme realizations, acoustic representations of the input speech are learned in an unsupervised fashion. While acoustic model training does not require transcripts of the training data, MLN training does require supervision, though only at a rather loose and abstract level. Results on two databases, one of them containing dysarthric speech, show that MLN-based semantic analysis clearly outperforms baseline approaches employing non-negative matrix factorization, multinomial naive Bayes models, or support vector machines.
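
To illustrate how an MLN attaches weights to logical formulas and still yields a usable answer when the evidence is incomplete, the following is a minimal toy sketch. It is not the model from the paper: the atom names, the two formulas, and the weights are all hypothetical, and inference is done by brute-force enumeration of worlds rather than with a dedicated MLN toolkit. A world's unnormalized score is the exponential of the summed weights of the ground formulas it satisfies.

```python
import itertools
import math

# Hypothetical ground atoms for a toy home-automation command (assumption).
ATOMS = ("heard_light", "heard_on", "action_switch_on_light")

# Weighted ground formulas as (weight, truth function). Weights are made up.
FORMULAS = [
    # heard_light AND heard_on  =>  action_switch_on_light
    (1.5, lambda w: not (w["heard_light"] and w["heard_on"])
                    or w["action_switch_on_light"]),
    # action_switch_on_light  =>  heard_light
    (0.5, lambda w: not w["action_switch_on_light"] or w["heard_light"]),
]


def score(world):
    """Unnormalized score: exp of the summed weights of satisfied formulas."""
    return math.exp(sum(weight for weight, holds in FORMULAS if holds(world)))


def conditional(query, evidence):
    """P(query is true | evidence), enumerating all worlds over unobserved atoms."""
    free = [atom for atom in ATOMS if atom not in evidence]
    numerator = denominator = 0.0
    for values in itertools.product([False, True], repeat=len(free)):
        world = {**evidence, **dict(zip(free, values))}
        s = score(world)
        denominator += s
        if world[query]:
            numerator += s
    return numerator / denominator


# Complete evidence: both words were recognized.
p_full = conditional("action_switch_on_light",
                     {"heard_light": True, "heard_on": True})

# Incomplete evidence: the recognizer output says nothing about "on".
p_partial = conditional("action_switch_on_light", {"heard_light": True})
```

Because the formulas are soft constraints, a world that violates one of them only becomes less probable rather than impossible, which is why the query still has a well-defined probability when `heard_on` is unobserved.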

Disciplines :

Computer science

Author, co-author :

DESPOTOVIC, Vladimir; University of Belgrade > Technical Faculty in Bor

Walter, Oliver; University of Paderborn > Department of Communications Engineering

Haeb-Umbach, Reinhold; University of Paderborn > Department of Communications Engineering

External co-authors :

yes

Language :

English

Title :

Semantic Analysis of Spoken Input Using Markov Logic Networks

Publication date :

September 2015

Event name :

16th Annual Conference of the International Speech Communication Association (INTERSPEECH 2015)

Event place :

Dresden, Germany

Event date :

from 06-09-2015 to 10-09-2015

Audience :

International

Main work title :

Proceedings of the 16th Annual Conference of the International Speech Communication Association (INTERSPEECH 2015)

Pages :

1859-1863

Peer reviewed :

Peer reviewed

Additional URL :

https://www.isca-speech.org/archive/interspeech_2015/i15_1859.html

Funders :

DFG - Deutsche Forschungsgemeinschaft
CE - Commission Européenne

Available on ORBilu :

since 05 November 2019

