Automatic Recognition of Pathological Phoneme Pronunciation Using HMM and DTW Methods

Robert Wielgat¹, Daniel Król¹, Tomasz P. Zielinski², Tomasz Wozniak³, Stanislaw Grabias³

Abstract. Proper diagnosis and therapy of pathological phoneme pronunciation play an important role in modern logopedics. To make such diagnosis and therapy more effective, automatic recognition of pathological phoneme pronunciation is used. To address this problem, the recently proposed Human Factor Cepstral Coefficients (HFCC) were implemented. The speech recordings come from speech-impaired Polish children. The efficiency of the HFCC approach is compared with that of the standard Mel-Frequency Cepstral Coefficients (MFCC) used as the feature vector. The optimal HFCC-based method has been found. Both dynamic time warping (DTW), working on whole words or embedded phoneme patterns, and hidden Markov models (HMM) are used as classifiers in the presented research. The HMM classifier was based on whole-word models as well as phoneme models. The results demonstrate the superiority of combining HFCC features with a modified phoneme-based DTW classifier. Some considerations on speech signal acquisition and preprocessing are also presented.

¹ Department of Technology, Higher State Vocational School in Tarnów, Tarnów, Poland
² Department of Telecommunications, AGH University of Science and Technology, Krakow, Poland
³ Division of Logopedics and Applied Linguistics, Maria Curie-Sklodowska University, Lublin, Poland

Robert Wielgat: rwielgat@poczta.onet.pl