Spoken language identification (LID) in telephone speech signals is an important and difficult classification task. Language identification modules can be used as front-end signal routers for multi-language speech recognition or transcription devices. Gaussian Mixture Models (GMMs) can be used to effectively model the distribution of feature vectors present in speech signals for classification. Common feature vectors used for speech processing include Linear Prediction (LP-CC), Mel-Frequency (MF-CC), and Perceptual Linear Prediction derived Cepstral coefficients (PLP-CC). This thesis examines and compares a recently proposed type of feature vector, the Shifted Delta Cepstral (SDC) coefficients, whose use has been shown to improve language identification performance. It explores different types of SDC feature vectors for spoken language identification of telephone speech, using a simple GMM-based classifier on a 3-language task. The OGI Multi-language Telephone Speech Corpus is used to evaluate the system.
Library of Congress Subject Headings
Speech processing systems; Automatic speech recognition; Speech--Data processing; Computational linguistics; Sound--Classification; Gaussian processes
Department, Program, or Center
Computer Science (GCCIS)
Lareau, Jonathan, "Application of shifted delta cepstral features for GMM language identification" (2006). Thesis. Rochester Institute of Technology. Accessed from
RIT – Main Campus