Temporal Pattern Classification using Kernel Methods for Speech

  • Chandra Sekhar, Indian Institute of Technology Madras, Chennai
  • S. Chandrakala, Indian Institute of Technology Madras, Chennai
Keywords: Hidden Markov model, Support vector machine, string kernel, Gaussian mixture model, Score vector

Abstract

There are two paradigms for modelling varying-length temporal data: modelling sequences of feature vectors, as in hidden Markov model (HMM)-based approaches to speech recognition, and modelling sets of feature vectors, as in Gaussian mixture model (GMM)-based approaches to speech emotion recognition. This paper first presents methods that use discrete hidden Markov models (DHMMs) in the kernel feature space and a string kernel-based support vector machine (SVM) classifier to classify a discretised representation of a sequence of feature vectors, obtained by clustering and vector quantisation in the kernel feature space. The authors then present continuous density hidden Markov models (CDHMMs) in the explicit kernel feature space that use the continuous-valued representation of features extracted from the temporal data. Methods that map a varying-length sequential pattern to a fixed-length sequential pattern and then classify it with an SVM-based classifier are also presented. For the task of recognising spoken letters in the E-set, it is demonstrated that models using a discretised representation with string kernel SVM-based classification can achieve better classification performance than models using the continuous-valued representation. For the sets-of-vectors representation of temporal data, two approaches in a hybrid framework are presented: the score vector-based approach and the segment modelling-based approach. In both approaches, a generative model-based method is used to obtain a fixed-length pattern representation of varying-length temporal data, and a discriminative model is then used for classification. These two approaches are studied for the speech emotion recognition task. The segment modelling-based approach gives better performance than the score vector-based approach and the GMM-based classifiers for speech emotion recognition.
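
The idea of classifying a discretised sequence with a string kernel can be illustrated with a minimal sketch. The abstract does not say which string kernel the authors use, so the k-spectrum kernel below is an assumption chosen for brevity. The quantiser is also a simplification: it assigns each feature vector to its nearest codeword by Euclidean distance in the input space, whereas the paper performs clustering and vector quantisation in the kernel feature space. The names `quantise`, `spectrum_kernel`, the codebook and the toy sequences are all hypothetical.

```python
from collections import Counter


def quantise(sequence, codebook):
    """Map each feature vector to the index of its nearest codeword.

    Assumption: plain Euclidean distance in the input space, not the
    kernel-feature-space quantisation described in the abstract.
    """
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return [
        min(range(len(codebook)), key=lambda j: sq_dist(vec, codebook[j]))
        for vec in sequence
    ]


def spectrum_kernel(s, t, k=2):
    """k-spectrum kernel: inner product of the k-gram count vectors of s and t."""
    grams_s = Counter(tuple(s[i:i + k]) for i in range(len(s) - k + 1))
    grams_t = Counter(tuple(t[i:i + k]) for i in range(len(t) - k + 1))
    return sum(count * grams_t[gram] for gram, count in grams_s.items())


# Toy two-codeword codebook and two sequences of different lengths.
codebook = [(0.0, 0.0), (1.0, 1.0)]
seq_a = [(0.1, 0.0), (0.9, 1.1), (0.2, 0.1), (1.0, 0.8)]
seq_b = [(0.0, 0.2), (1.1, 1.0), (0.9, 0.9)]

sym_a = quantise(seq_a, codebook)  # symbol sequence of length 4
sym_b = quantise(seq_b, codebook)  # symbol sequence of length 3

# The kernel compares the two sequences even though their lengths differ.
k_ab = spectrum_kernel(sym_a, sym_b, k=2)
```

Evaluating such a kernel on every pair of training sequences gives a Gram matrix, which any SVM implementation that accepts precomputed kernels could then use for classification.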

Defence Science Journal, 2010, 60(4), pp. 348-363, DOI: http://dx.doi.org/10.14429/dsj.60.492

Author Biographies

Chandra Sekhar, Indian Institute of Technology Madras, Chennai

Associate Professor
Department of Computer Science and Engineering
Indian Institute of Technology Madras

S. Chandrakala, Indian Institute of Technology Madras, Chennai

Received her MTech (Comp Sci and Engg) from SASTRA University, Thanjavur, in 2002. Presently, she is pursuing a PhD in the Department of Computer Science and Engineering, Indian Institute of Technology Madras, Chennai.

Published
2010-07-09
How to Cite
Sekhar, C., & Chandrakala, S. (2010). Temporal Pattern Classification using Kernel Methods for Speech. Defence Science Journal, 60(4), 348-363. https://doi.org/10.14429/dsj.60.492