LANGUAGE IDENTIFICATION SYSTEM USING MFCC AND SDC FEATURE
Abstract
Speech recognition is a technology that recognizes spoken words and phrases and converts them to a machine-readable format. A speech signal carries information about the spoken words, the speaker, and the language, and speech recognition is accordingly divided into text recognition, speaker recognition, and language identification. The system described here identifies the language of a speech sample. Although a speech signal is primarily intended to carry a linguistic message, it also contains language-specific information. This work therefore undertakes the study and implementation of a language identification system using Gaussian Mixture Model (GMM) classifiers. The system is based on two feature extraction techniques: Mel-frequency Cepstral Coefficients (MFCC) and Shifted Delta Cepstral (SDC) features. MFCC captures information about the shape of the human vocal tract, and SDC captures information about phonemes. The two feature sets are combined for better results, and a GMM is used as the classifier to increase the accuracy of language identification. The system works with 17 languages: Eastern Arabic, Bengali, German, Hindi, Hungarian, Japanese, Kannada, Malayalam, Kashmiri, Portuguese, Urdu, Russian, Spanish, Marathi, Tamil, Punjabi, and English.
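As an illustration of the pipeline the abstract outlines, the following minimal sketch shows how SDC features can be stacked onto per-frame cepstral vectors and how a set of per-language GMMs can pick the most likely language. It is not the authors' implementation. The abstract gives no SDC configuration, so the defaults `d=1`, `P=3`, `k=7` are an assumption borrowed from a setting commonly reported in the language identification literature. The function names, the edge-clamping rule, and the diagonal-covariance GMM form are also assumptions made for this example, and MFCC extraction and GMM training are taken as already done.

```python
import math


def shifted_delta_cepstra(cepstra, d=1, P=3, k=7):
    """Stack k delta vectors, spaced P frames apart, for every frame.

    cepstra: list of frames, each a list of N cepstral coefficients.
    Frame indices outside the utterance are clamped to the nearest edge
    (an assumption of this sketch, not taken from the source).
    """
    T = len(cepstra)

    def frame(t):
        return cepstra[min(max(t, 0), T - 1)]

    sdc = []
    for t in range(T):
        stacked = []
        for i in range(k):
            ahead = frame(t + i * P + d)
            behind = frame(t + i * P - d)
            stacked.extend(a - b for a, b in zip(ahead, behind))
        sdc.append(stacked)
    return sdc


def mfcc_plus_sdc(cepstra, d=1, P=3, k=7):
    """Concatenate each cepstral frame with its SDC vector (N + N*k values)."""
    sdc = shifted_delta_cepstra(cepstra, d, P, k)
    return [list(c) + s for c, s in zip(cepstra, sdc)]


def gmm_log_likelihood(frames, weights, means, variances):
    """Average per-frame log-likelihood under a diagonal-covariance GMM."""
    total = 0.0
    for x in frames:
        comp = []
        for w, mu, var in zip(weights, means, variances):
            ll = math.log(w)
            for xi, m, v in zip(x, mu, var):
                ll += -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
            comp.append(ll)
        # Log-sum-exp over mixture components for numerical stability.
        peak = max(comp)
        total += peak + math.log(sum(math.exp(c - peak) for c in comp))
    return total / len(frames)


def identify_language(frames, models):
    """Return the language whose GMM scores the frames highest.

    models: dict mapping a language name to (weights, means, variances).
    """
    return max(models, key=lambda lang: gmm_log_likelihood(frames, *models[lang]))
```

With 13 cepstral coefficients per frame and the assumed `k=7`, `mfcc_plus_sdc` would yield 13 + 13 × 7 = 104 values per frame. In a full system, one GMM would be trained per language on such vectors, and `identify_language` would be called on the features of an unseen utterance.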
Article Details
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.