Temporal Pattern Classification using Kernel Methods for Speech
There are two paradigms for modelling the varying length temporal data namely, modelling the sequences of feature vectors as in the hidden Markov model-based approaches for speech recognition and modelling the sets of feature vectors as in the Gaussian mixture model (GMM)-based approaches for speech emotion recognition. In this paper, the methods using discrete hidden Markov models (DHMMs) in the kernel feature space and string kernel-based SVM classifier for classification of discretised representation of sequence of feature vectors obtained by clustering and vector quantisation in the kernel feature space are presented. The authors then present continuous density hidden Markov models (CDHMMs) in the explicit kernel feature space that use the continuous valued representation of features extracted from the temporal data. The methods for temporal pattern classification by mapping a varying length sequential pattern to a fixed-length sequential pattern and then using an SVM-based classifier for classification are also presented. The task of recognition of spoken letters in E-set, it is possible to build models that use a discretised representation and string kernel SVM based classification and obtain a classification performance better than that of models using the continuous valued representation is demonstrated. For modelling sets of vectors-based representation of temporal data, two approaches in a hybrid framework namely, the score vector-based approach and the segment modelling based approach are presented. In both approaches, a generative model-based method is used to obtain a fixed length pattern representation for a varying length temporal data and then a discriminative model is used for classification. These two approaches are studied for speech emotion recognition task. The segment modelling based approach gives a better performance than the score vector-based approach and the GMM-based classifiers for speech emotion recognition.
Defence Science Journal, 2010, 60(4), pp.348-363, DOI:http://dx.doi.org/10.14429/dsj.60.492
Where otherwise noted, the Articles on this site are licensed under Creative Commons License: CC Attribution-Noncommercial-No Derivative Works 2.5 India