An ensemble-based framework for mispronunciation detection of Arabic phonemes

Calık, Sükrü; KÜÇÜKMANİSA, AYHAN; KİLİMCİ, ZEYNEP

doi:10.1016/j.apacoust.2023.109593

Calık S. S., KÜÇÜKMANİSA A., KİLİMCİ Z. H.

Applied Acoustics, vol. 212, 2023 (SCI-Expanded)

Publication Type: Article / Full Article
Volume: 212
Publication Date: 2023
DOI: 10.1016/j.apacoust.2023.109593
Journal Name: Applied Acoustics
Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, PASCAL, Communication & Mass Media Index, Compendex, ICONDA Bibliographic, INSPEC, DIALNET
Keywords: Arabic pronunciation detection, Computer aided language learning, Ensemble learning, Machine learning, Voting classifier
Open Archive Collection: AVESİS Open Access Collection
Affiliated with Kocaeli University: Yes

Abstract

Computer-assisted language learning (CALL) systems detect mispronunciations and provide feedback to users. In this work, we introduce an ensemble model that detects the mispronunciation of Arabic phonemes and thereby assists the learning of Arabic. To the best of our knowledge, this is the first attempt to detect mispronunciations of Arabic phonemes using both ensemble learning techniques and conventional machine learning models in a comprehensive comparison. To observe the effect of the feature extraction technique, mel-frequency cepstral coefficients (MFCC) and Mel-spectrogram features are each combined with every learning algorithm. To evaluate the proposed model, the 29 letters of the Arabic alphabet were voiced by a total of 11 different speakers, 8 of whom are hafiz. The data set was enlarged by augmentation with added noise, time shifting, time stretching, and pitch shifting. Extensive experimental results demonstrate that a voting classifier used as the ensemble algorithm, together with Mel-spectrogram feature extraction, achieves the best classification result, with an accuracy of 95.9%.
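
As a rough illustration of two ideas named in the abstract, the sketch below shows noise and time-shift augmentation on a toy waveform, followed by a majority (hard) vote over the labels of several classifiers. It is not the authors' implementation: the function names, the noise level, the circular form of the shift, the tie-breaking rule, and the use of hard rather than soft voting are all assumptions made for the example. Time stretching and pitch shifting are left out because they usually depend on a signal-processing library.

```python
import random
from collections import Counter


def add_noise(signal, noise_level=0.005, seed=0):
    """Return a copy of the signal with zero-mean Gaussian noise added.

    noise_level (the standard deviation) is an illustrative value, not one
    taken from the paper.
    """
    rng = random.Random(seed)
    return [x + rng.gauss(0.0, noise_level) for x in signal]


def time_shift(signal, shift):
    """Circularly shift the signal by `shift` samples (positive = delay).

    A circular shift is an assumption; padding with silence is another option.
    """
    if not signal:
        return []
    k = shift % len(signal)
    return signal[-k:] + signal[:-k] if k else list(signal)


def hard_vote(predictions):
    """Majority vote over per-classifier labels.

    Ties are resolved in favour of the label that appears first in the list.
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label


# Toy waveform standing in for one recorded phoneme.
signal = [0.0, 0.5, 1.0, 0.5, 0.0, -0.5]

# Each augmentation yields one extra training example from the same recording.
augmented = [add_noise(signal), time_shift(signal, 2)]

# Hypothetical outputs of three base classifiers for one utterance.
votes = ["correct", "mispronounced", "correct"]
decision = hard_vote(votes)
```

In a fuller pipeline, MFCC or Mel-spectrogram features would be extracted from each original and augmented waveform before the base classifiers are trained, and the vote would be taken over their predictions on those features.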