Volterra filters; Speech prediction; Pitch; Nonlinear signal processing
Abstract:
Speech prediction relies largely on linear models. However, speech signals also contain components generated by nonlinear effects, which linear techniques neglect. This paper presents a long-term nonlinear predictor based on second-order Volterra filters, which is shown to outperform the linear long-term predictor with only a minimal increase in complexity and in the number of coefficients. It can be connected in cascade with a short-term linear predictor. A frame/subframe structure is proposed in which each frame is divided into four subframes, and second-order Volterra long-term prediction is applied to each subframe separately.
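To make the frame/subframe idea concrete, the following is a minimal Python sketch of a second-order Volterra long-term predictor fitted separately on each of four subframes. It is an illustration under stated assumptions, not the paper's implementation. The three-tap layout around the pitch lag, the ridge-regularised least-squares fit, the fixed (not searched) lag, and all function names (`volterra_features`, `fit_subframe`, `predict_frame`, `synthetic_signal`) are hypothetical choices made here. Setting `order=1` gives a plain linear long-term predictor for comparison.

```python
import math


def volterra_features(x, n, lag, order=2, taps=(-1, 0, 1)):
    """Linear terms x[n-lag+k], plus all pairwise products when order == 2.

    The tap offsets are an assumption of this sketch; lag must exceed max(taps)
    so that only past samples are used.
    """
    past = [x[n - lag + k] for k in taps]
    if order == 1:
        return past
    quad = [past[i] * past[j]
            for i in range(len(past)) for j in range(i, len(past))]
    return past + quad


def solve(a, b):
    """Gaussian elimination with partial pivoting for a small square system."""
    m = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, m):
            f = aug[r][col] / aug[col][col]
            for c in range(col, m + 1):
                aug[r][c] -= f * aug[col][c]
    w = [0.0] * m
    for r in range(m - 1, -1, -1):
        acc = aug[r][m] - sum(aug[r][c] * w[c] for c in range(r + 1, m))
        w[r] = acc / aug[r][r]
    return w


def fit_subframe(x, start, length, lag, order=2, ridge=1e-8):
    """Least-squares coefficients and squared prediction error for one subframe.

    A small ridge term (an assumption of this sketch) keeps the normal
    equations well conditioned.
    """
    rows = [volterra_features(x, n, lag, order)
            for n in range(start, start + length)]
    targets = x[start:start + length]
    m = len(rows[0])
    gram = [[sum(r[i] * r[j] for r in rows) + (ridge if i == j else 0.0)
             for j in range(m)] for i in range(m)]
    rhs = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(m)]
    w = solve(gram, rhs)
    pred = [sum(wi * fi for wi, fi in zip(w, r)) for r in rows]
    err = sum((t - p) ** 2 for t, p in zip(targets, pred))
    return w, err


def predict_frame(x, frame_start, frame_len, lag, order=2, n_sub=4):
    """Fit the predictor on each of n_sub equal subframes of one frame."""
    sub_len = frame_len // n_sub
    return [fit_subframe(x, frame_start + s * sub_len, sub_len, lag, order)
            for s in range(n_sub)]


def synthetic_signal(length=240, lag=40):
    """Toy quasi-periodic signal with a quadratic dependence on x[n - lag].

    This is synthetic test data, not speech.
    """
    x = [math.sin(0.37 * n) + 0.5 * math.sin(1.3 * n) for n in range(lag)]
    for n in range(lag, length):
        x.append(0.7 * x[n - lag] + 0.1 * x[n - lag] ** 2)
    return x


if __name__ == "__main__":
    x = synthetic_signal()
    for order in (1, 2):
        errors = [err for _, err in predict_frame(x, 80, 160, 40, order=order)]
        print("order", order, "subframe errors", errors)
```

With three taps, the second-order predictor in this sketch uses nine coefficients per subframe against three for the linear one. In a coder, the input would typically be the residual of the short-term linear predictor, in line with the cascade arrangement described above, and the lag would be estimated rather than fixed.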