References of "Despotovic, Vladimir 50036151"
Full Text
Peer Reviewed
Two-dimensional fractional linear prediction
Skovranek, Tomas; Despotovic, Vladimir UL; Peric, Zoran

in Computers and Electrical Engineering (2019), 77

Linear prediction (LP) has been applied with great success in coding of one-dimensional, time-varying signals, such as speech or biomedical signals. In case of two-dimensional signal representation (e.g. images) the model can be extended by applying one-dimensional LP along two space directions (2D LP). Fractional linear prediction (FLP) is a generalisation of standard LP using the derivatives of non-integer (arbitrary real) order. While FLP was successfully applied to one-dimensional signals, there are no reported implementations in multidimensional space. In this paper two variants of two-dimensional FLP (2D FLP) are proposed and optimal predictor coefficients are derived. The experiments using various grayscale images confirm that the proposed 2D FLP models achieve performance comparable to 2D LP using the same support region of the predictor, but with one predictor coefficient less, enabling potential compression.

Full Text
Peer Reviewed
Audio signal processing using fractional linear prediction
Skovranek, Tomas; Despotovic, Vladimir UL

in Mathematics (2019), 7(7)

Fractional linear prediction (FLP), as a generalization of conventional linear prediction (LP), was recently successfully applied in different fields of research and engineering, such as biomedical signal processing, speech modeling and image processing. The FLP model has a design similar to the conventional LP model, i.e., it uses a linear combination of “fractional terms” with different orders of fractional derivative. Assuming only one “fractional term” and using a limited number of previous samples for prediction, an FLP model with “restricted memory” is presented in this paper and the closed-form expressions for calculation of FLP coefficients are derived. This FLP model is fully comparable with the widely used low-order LP, as it uses the same number of previous samples, but fewer predictor coefficients, making it more efficient. Two different datasets, MIDI Aligned Piano Sounds (MAPS) and Orchset, were used for the experiments. Triads representing the chords composed of three randomly chosen notes and usual Western musical chords (both of them from MAPS dataset) served as the test signals, while the piano recordings from MAPS dataset and orchestra recordings from the Orchset dataset served as the musical signal. The results show enhancement of FLP over LP in terms of model complexity, whereas the performance is comparable.

Full Text
Peer Reviewed
Novel Two-Bit Adaptive Delta Modulation Algorithms
Peric, Zoran; Denic, Bojan; Despotovic, Vladimir UL

in Informatica (2019), 30(1), 117-134

This paper introduces two novel algorithms for the 2-bit adaptive delta modulation, namely 2-bit hybrid adaptive delta modulation and 2-bit optimal adaptive delta modulation. In 2-bit hybrid adaptive delta modulation, the adaptation is performed both at the frame level and the sample level, where the estimated variance is used to determine the initial quantization step size. In the latter algorithm, the estimated variance is used to scale the quantizer codebook optimally designed assuming a Laplace distribution of the input signal. The algorithms are tested using a speech signal and compared to constant factor delta modulation, continuously variable slope delta modulation and instantaneously adaptive 2-bit delta modulation, showing that the proposed algorithms offer higher performance and significantly wider dynamic range.

Full Text
Peer Reviewed
Optimal fractional linear prediction with restricted memory
Skovranek, Tomas; Despotovic, Vladimir UL; Peric, Zoran

in IEEE Signal Processing Letters (2019), 26(5), 760-764

Linear prediction is extensively used in modeling, compression, coding, and generation of speech signals. Various formulations of linear prediction are available, both in time and frequency domain, which start from different assumptions but result in the same solution. In this letter, we propose a novel, generalized formulation of the optimal low-order linear prediction using the fractional (non-integer) derivatives. The proposed fractional derivative formulation allows for the definition of a predictor with versatile behavior based on the order of fractional derivative. We derive the closed-form expressions of the optimal fractional linear predictor with restricted memory, and prove that the optimal first-order and the optimal second-order linear predictors are only its special cases. Furthermore, we show empirically that the optimal order of fractional derivative can be approximated by the inverse of the predictor memory, and thus, it is a priori known. Therefore, the complexity is reduced by optimizing and transferring only one predictor coefficient, i.e., one parameter less in comparison to the second-order linear predictor, at the same level of performance.

Full Text
Peer Reviewed
Signal prediction using fractional derivative models
Skovranek, Tomas; Despotovic, Vladimir UL

in Bǎleanu, Dumitru; Mendes Lopes, António (Eds.) Handbook of Fractional Calculus with Applications (2019)

In this chapter, linear prediction (LP) and its generalisation to fractional linear prediction (FLP) are described, with possible applications to one-dimensional (1D) and two-dimensional (2D) signals. Standard test signals, such as the sine wave, the square wave, and the sawtooth wave, as well as the real-data signals, such as speech, electrocardiogram and electroencephalogram are used for the numerical experiments for the 1D case, and grayscale images for the 2D case. The 1D FLP model is proposed to have a similar construction as the LP model, i.e. it uses a linear combination of fractional derivatives with different values of the fractional order. The 2D FLP model uses a linear combination of the fractional derivatives in two directions, horizontal and vertical. The scheme for the computation of the optimal predictor coefficients for both 1D and 2D FLP models is also provided. The performance of the proposed FLP models is compared to the performance of the LP models, confirming that the proposed FLP can be successfully applied in processing of 1D and 2D signals, giving comparable or better performance using the same or even smaller number of parameters.

Full Text
Peer Reviewed
An efficient two-digit adaptive delta modulation for Laplacian source coding
Peric, Zoran; Denic, Bojan; Despotovic, Vladimir UL

in International Journal of Electronics (2019), 106(7), 1085-1100

Delta Modulation (DM) is a simple waveform coding algorithm used mostly when timely data delivery is more important than the transmitted data quality. While the implementation of DM is fairly simple and inexpensive, it suffers from several limitations, such as slope overload and granular noise, which can be overcome using Adaptive Delta Modulation (ADM). This paper presents a novel 2-digit ADM with six-level quantization using variable-length coding, for encoding the time-varying signals modelled by Laplacian distribution. Two quantizer variants are employed: a distortion-constrained quantizer, which is optimally designed for minimal mean-squared error (MSE), and a rate-constrained quantizer, which is suboptimal in the minimal MSE sense, but enables minimal loss in SQNR for the target bit rate. Experimental results using a real speech signal are provided, indicating that the proposed configuration outperforms the baseline ADM algorithms, including Constant Factor Delta Modulation (CFDM), Continuously Variable Slope Delta Modulation (CVSDM), 2-digit and 2-bit ADM, and operates in a much wider dynamic range.
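
Editorial illustration (not taken from the paper): the two DM limitations named in the abstract, slope overload and granular noise, can be seen with a plain fixed-step delta modulator. The sketch below is a textbook one-bit DM, not the proposed 2-digit ADM; the test signal, the three step sizes and all function names are assumptions chosen only for illustration.

```python
import math

def delta_modulate(samples, step):
    """Fixed-step one-bit delta modulation.

    Returns the bit stream and the staircase approximation that a
    decoder would rebuild from those bits.
    """
    bits, staircase = [], []
    estimate = 0.0  # assumed initial state of the accumulator
    for sample in samples:
        bit = 1 if sample >= estimate else 0   # one bit per sample
        estimate += step if bit else -step     # move one step up or down
        bits.append(bit)
        staircase.append(estimate)
    return bits, staircase

def mean_squared_error(a, b):
    return sum((u - v) ** 2 for u, v in zip(a, b)) / len(a)

# Assumed test signal: unit sine, 50 samples per period, so the largest
# change between neighbouring samples is about 0.126.
signal = [math.sin(2 * math.pi * 0.02 * n) for n in range(500)]

mse_by_step = {}
for step in (0.02, 0.15, 0.6):
    _, staircase = delta_modulate(signal, step)
    mse_by_step[step] = mean_squared_error(signal, staircase)

for step, mse in mse_by_step.items():
    print(f"step={step}: MSE={mse:.4f}")
```

With the smallest step the staircase cannot follow the signal slope (slope overload); with the largest step it tracks the signal but oscillates widely around it (granular noise). Adaptive schemes change the step size over time to balance the two effects.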

Full Text
Peer Reviewed
One-parameter fractional linear prediction
Despotovic, Vladimir UL; Skovranek, Tomas; Peric, Zoran

in Computers and Electrical Engineering (2018), 69

The one-parameter fractional linear prediction (FLP) is presented and the closed-form expressions for the evaluation of FLP coefficients are derived. Contrary to the classical first-order linear prediction (LP) that uses one previous sample and one predictor coefficient, the one-parameter FLP model is derived using the memory of two, three or four samples, while not increasing the number of predictor coefficients. The first-order LP is only a special case of the proposed one-parameter FLP when the order of fractional derivative tends to zero. Based on the numerical experiments using test signals (sine test waves), and real-data signals (speech and electrocardiogram), a hypothesis for estimating the fractional derivative order used in the model is given. The one-parameter FLP outperforms the classical first-order LP in terms of the prediction gain, having comparable performance with the second-order LP, although using one predictor coefficient less.
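
Editorial illustration (not taken from the paper): the abstract measures FLP against classical first-order and second-order LP in terms of prediction gain. The sketch below covers only those two LP baselines, not the FLP model itself. It solves the standard autocorrelation normal equations for orders 1 and 2 and reports the prediction gain in dB. The noisy sine test signal, the noise level and all function names are assumptions made for illustration.

```python
import math
import random

def autocorr(x, lag):
    """Unnormalised autocorrelation of x at the given lag."""
    return sum(x[n] * x[n - lag] for n in range(lag, len(x)))

def lp_coefficients(x, order):
    """Optimal LP coefficients (autocorrelation method), order 1 or 2."""
    if order not in (1, 2):
        raise ValueError("this sketch only handles order 1 or 2")
    r0, r1 = autocorr(x, 0), autocorr(x, 1)
    if order == 1:
        return [r1 / r0]
    r2 = autocorr(x, 2)
    # Solve [[r0, r1], [r1, r0]] @ [a1, a2] = [r1, r2] by Cramer's rule.
    det = r0 * r0 - r1 * r1
    return [r1 * (r0 - r2) / det, (r0 * r2 - r1 * r1) / det]

def prediction_gain_db(x, coeffs):
    """10*log10(signal power / prediction error power)."""
    p = len(coeffs)
    errors = [
        x[n] - sum(a * x[n - k - 1] for k, a in enumerate(coeffs))
        for n in range(p, len(x))
    ]
    signal_power = sum(v * v for v in x[p:]) / len(errors)
    error_power = sum(e * e for e in errors) / len(errors)
    return 10 * math.log10(signal_power / error_power)

# Assumed test signal: a sine wave with a little Gaussian noise.
rng = random.Random(0)
signal = [math.sin(2 * math.pi * 0.05 * n) + rng.gauss(0.0, 0.01)
          for n in range(400)]

gain_first = prediction_gain_db(signal, lp_coefficients(signal, 1))
gain_second = prediction_gain_db(signal, lp_coefficients(signal, 2))
print(f"first-order LP gain:  {gain_first:.1f} dB")
print(f"second-order LP gain: {gain_second:.1f} dB")
```

On this signal the second-order predictor reaches a clearly higher gain than the first-order one, at the cost of a second coefficient. That gap is what the one-parameter FLP aims to close while keeping a single coefficient.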

Full Text
Peer Reviewed
Machine learning techniques for semantic analysis of dysarthric speech: An experimental study
Despotovic, Vladimir UL; Walter, Oliver; Haeb-Umbach, Reinhold

in Speech Communication (2018), 99

We present an experimental comparison of seven state-of-the-art machine learning algorithms for the task of semantic analysis of spoken input, with a special emphasis on applications for dysarthric speech. Dysarthria is a motor speech disorder, which is characterized by poor articulation of phonemes. In order to cater for these non-canonical phoneme realizations, we employed an unsupervised learning approach to estimate the acoustic models for speech recognition, which does not require a literal transcription of the training data. Even for the subsequent task of semantic analysis, only weak supervision is employed, whereby the training utterance is accompanied by a semantic label only, rather than a literal transcription. Results on two databases, one of them containing dysarthric speech, are presented showing that Markov logic networks and conditional random fields substantially outperform other machine learning approaches. Markov logic networks have proved to be especially robust to recognition errors, which are caused by imprecise articulation in dysarthric speech.

Full Text
Peer Reviewed
Forward Adaptive Laplacian Source Coding Based on Restricted Quantization
Denic, Bojan; Peric, Zoran; Despotovic, Vladimir UL et al

in Information Technology and Control (2018), 47(2), 209-219

A novel solution for Laplacian source coding based on three-level quantization is proposed in this paper. The restricted three-level quantizer is designed by assuming the restricted Laplacian distribution of the input signal. The quantizer and the Huffman encoder are jointly designed. A forward adaptive scheme is employed, where the adaptation to the signal variance (power) is performed on a frame-by-frame basis. We employ a switched model that consists of two restricted quantizers having unequal support regions. The simulation results (measured as SQNR) of the proposed scheme with a switched restricted three-level quantizer are compared to the cases when it involves a three-level unrestricted quantizer and the Lloyd-Max quantizers having N=2 and N=4 levels. It is shown that the proposed solution offers performance comparable to that of the N=4 level Lloyd-Max baseline with large savings in bit rate, while outperforming the two other baselines.

Full Text
Peer Reviewed
Dual-mode quasi-logarithmic quantizer with embedded G.711 codec
Denic, Bojan; Peric, Zoran; Despotovic, Vladimir UL et al

in Journal of Electrical Engineering (2018), 69(1), 46-51

The G.711 codec has been accepted as a standard for high quality coding in many applications. A dual-mode quantizer, which combines the nonlinear logarithmic quantizer for restricted input signals and the G.711 quantizer for unrestricted input signals, is proposed in this paper. The parameters of the proposed quantizer are optimized, where the minimal distortion is used as the criterion. It is shown that the optimized version of the proposed quantizer provides 5.4 dB higher SQNR (Signal to Quantization Noise Ratio) compared to the G.711 quantizer, or equivalently it saves approximately 0.9 bit/sample in bit rate for the same signal quality. Although the complexity is slightly increased, we believe that due to the superior performance it can be successfully implemented for high-quality quantization.

Linear prediction of speech: The fractional derivative formula
Despotovic, Vladimir UL; Skovranek, Tomas

in Book of Abstracts, 2017 International Workshop on Fractional Calculus and Its Applications (2017, May)

Full Text
Peer Reviewed
Sentiment Analysis of Microblogs Using Multilayer Feed-forward Artificial Neural Networks
Despotovic, Vladimir UL; Tanikic, Dejan

in Computing and Informatics (2017), 36(5), 1127-1142

Sentiment analysis aims to extract public opinion on a particular topic, and microblogs, especially Twitter as the most influential platform, represent a significant source of information. The application to microblogs has to cope with difficulties such as informal language with abbreviations, internet jargon, emoticons and hashtags, which do not appear in conventional text documents. A sentiment analysis technique for microblogs based on a feed-forward artificial neural network (ANN) with sigmoid activation function is proposed in this paper and compared to machine learning approaches, i.e. Multinomial Naive Bayes, Support Vector Machines and Maximum Entropy. Experiments were performed on the Stanford Twitter Sentiment corpus, a balanced dataset which contains noisy training labels weakly annotated using emoticons as sentiment indicators, and the SemEval-2014 Task 9 corpus, an unbalanced dataset which contains manually annotated training examples. The obtained results show that ANN produces superior or at least comparable results to state-of-the-art machine learning techniques.

Full Text
Peer Reviewed
Fractional-order speech prediction
Despotovic, Vladimir UL; Skovranek, Tomas

in International Conference on Fractional Differentiation and its Applications (ICFDA ’16) (2016, July)

Full Text
Peer Reviewed
Semantic Analysis of Spoken Input Using Markov Logic Networks
Despotovic, Vladimir UL; Walter, Oliver; Haeb-Umbach, Reinhold

in Proceedings of the 16th Annual Conference of the International Speech Communication Association (INTERSPEECH 2015) (2015, September)

We present a semantic analysis technique for spoken input using Markov Logic Networks (MLNs). MLNs combine graphical models with first-order logic. They are particularly suitable for providing inference in the presence of inconsistent and incomplete data, which are typical of an automatic speech recognizer’s (ASR) output in the presence of degraded speech. The target application is a speech interface to a home automation system to be operated by people with speech impairments, where the ASR output is particularly noisy. In order to cater for dysarthric speech with non-canonical phoneme realizations, acoustic representations of the input speech are learned in an unsupervised fashion. While training data transcripts are not required for the acoustic model training, the MLN training requires supervision, however, at a rather loose and abstract level. Results on two databases, one of them for dysarthric speech, show that MLN-based semantic analysis clearly outperforms baseline approaches employing non-negative matrix factorization, multinomial naive Bayes models, or support vector machines.

Full Text
Peer Reviewed
Artificial Intelligence Techniques for Modelling of Temperature in the Metal Cutting Process
Tanikic, Dejan; Despotovic, Vladimir UL

in Metallurgy – Advances in Materials and Processes (2014)

Full Text
Peer Reviewed
An Evaluation of Unsupervised Acoustic Model Training for a Dysarthric Speech Interface
Walter, Oliver; Despotovic, Vladimir UL; Haeb-Umbach, Reinhold et al

in Proceedings of the 15th Annual Conference of the International Speech Communication Association (INTERSPEECH 2014) (2014, September)

In this paper, we investigate unsupervised acoustic model training approaches for dysarthric-speech recognition. These models are, first, frame-based Gaussian posteriorgrams obtained from Vector Quantization (VQ); second, so-called Acoustic Unit Descriptors (AUDs), which are hidden Markov models of phone-like units that are trained in an unsupervised fashion; and, third, posteriorgrams computed on the AUDs. Experiments were carried out on a database collected from a home automation task and containing nine speakers, of which seven are considered to utter dysarthric speech. All unsupervised modeling approaches delivered significantly better recognition rates than a speaker-independent phoneme recognition baseline, showing the suitability of unsupervised acoustic model training for dysarthric speech. While the AUD models led to the most compact representation of an utterance for the subsequent semantic inference stage, posteriorgram-based representations resulted in higher recognition rates, with the Gaussian posteriorgram achieving the highest slot filling F-score of 97.02%.

Full Text
Peer Reviewed
The Artificial Neural Network Based System for Validation of Thermocouples Used in Biomedicine
Tanikic, Dejan; Despotovic, Vladimir UL; Djenadic, Dalibor et al

in Proceedings of the 13th International Conference on Environment and Electrical Engineering (EEEIC) (2013, November)

Machining operations are widely used in orthopedic surgery. The temperature which occurs in the cutting zone during the machining of bones may have many negative consequences in the postoperative period. Therefore, measuring and modeling this parameter is a very important task. In this paper, thermocouples are presented as a potential tool for temperature measurement. The paper also deals with the system for validation of the thermocouples. An artificial neural network is used for modeling the relationship between the electromotive force (as the thermocouple output) and the corresponding temperature. It is shown that the results of the modeling are in good correlation with the measured data.

Full Text
Peer Reviewed
Design of nonlinear predictors for adaptive predictive coding of speech signals
Despotovic, Vladimir UL; Peric, Zoran

in Proceedings of the 21st Telecommunications Forum Telfor (TELFOR) (2013, November)

Linear predictive coding is probably the most frequently used technique in speech signal processing. Its main advantage comes from the analogy of the simplified vocal tract model with the speech production system. However, this neglects nonlinearities in the speech production process. The paper deals with nonlinear prediction of speech based on a truncated Volterra series. A long-term one-tap Volterra predictor is designed in order to decrease computational complexity. Further improvements are obtained using a frame/subframe structure and fractional delay.

Full Text
Peer Reviewed
Low-Order Volterra Long-Term Predictors
Despotovic, Vladimir UL; Goertz, Norbert; Peric, Zoran

in Proceedings of the 10. ITG Symposium on Speech Communication (2012, September)

Models based on linear prediction have been used for several decades in different areas of speech signal processing. While the linear approach has led to great advances in the last 40 years, it neglects nonlinearities present in the speech production mechanism. This paper compares the results of long-term nonlinear prediction based on second-order and third-order Volterra filters. Additional improvement can be obtained using fractional-delay long-term prediction. Experimental results reveal that the proposed method outperforms linear long-term prediction techniques in terms of prediction gain.

Full Text
Peer Reviewed
Improved Non-Linear Long-Term Predictors based on Volterra Filters
Despotovic, Vladimir UL; Goertz, Norbert; Peric, Zoran

in Proceedings ELMAR-2012 (2012, September)

Speech prediction is extensively based on linear models. However, speech signals also contain components generated by nonlinear effects, which linear techniques neglect. This paper presents a long-term nonlinear predictor based on second-order Volterra filters that is shown to be superior to a linear long-term predictor with only a minimal increase in complexity and the number of coefficients. It can be connected in cascade with a short-term linear predictor. A frame/subframe structure is proposed, where each frame is divided into four subframes. Second-order Volterra long-term prediction is applied to each subframe separately.
