The Evaluation of Word Embedding Models and Deep Learning Algorithms for Turkish Text Classification


Kilimci Z. H., Akyokus S.

4th International Conference on Computer Science and Engineering, UBMK 2019, Samsun, Türkiye, 11-15 September 2019, pp. 548-553

  • Publication Type: Conference Paper / Full-Text Paper
  • Volume Number:
  • DOI Number: 10.1109/ubmk.2019.8907027
  • City of Publication: Samsun
  • Country of Publication: Türkiye
  • Pages: pp. 548-553
  • Keywords: Word2Vec, GloVe, FastText, recurrent neural networks, long short term memory, convolutional neural networks, text categorization
  • Affiliated with Kocaeli University: No

Abstract

The use of word embedding models and deep learning algorithms is currently the most common and popular trend for enhancing the overall performance of a text classification/categorization system. Word embedding models map words to vectors learned from a corpus, so that words with similar meanings have similar representations. Deep learning algorithms produce more successful results than conventional machine learning algorithms in many application areas. In this study, three different word embedding models, Word2Vec, GloVe, and FastText, are employed for word representation. Instead of conventional classification algorithms, three different deep learning architectures, Recurrent Neural Networks (RNN), Long Short-Term Memory Networks (LSTM), and Convolutional Neural Networks (CNN), are used for the classification task, with experiments performed on collections of different Turkish documents. Experimental results show that using deep learning algorithms together with word embedding models improves the performance of text classification systems.
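
To make the idea of "similar meaning, similar representation" concrete, the following is a minimal sketch and not the authors' implementation. The three-dimensional vectors in `TOY_EMBEDDINGS` are hand-picked, hypothetical values (a trained Word2Vec, GloVe, or FastText model would supply learned vectors of much higher dimension), and `document_vector` uses simple averaging as an illustrative stand-in; the study itself feeds embeddings into RNN, LSTM, and CNN classifiers.

```python
import math

# Hypothetical toy vectors chosen by hand for illustration only;
# a trained embedding model would learn these from a corpus.
TOY_EMBEDDINGS = {
    "kitap": [0.9, 0.1, 0.0],   # "book"
    "roman": [0.8, 0.2, 0.1],   # "novel"
    "futbol": [0.0, 0.1, 0.9],  # "football"
}


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def document_vector(tokens, embeddings):
    """Average the vectors of known tokens; unknown tokens are skipped.

    Averaging is an assumption made for this sketch, not the
    sequence models (RNN, LSTM, CNN) evaluated in the paper.
    """
    known = [embeddings[t] for t in tokens if t in embeddings]
    if not known:
        raise ValueError("no token has an embedding")
    dim = len(known[0])
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


# Related words should score higher than unrelated ones.
similar = cosine_similarity(TOY_EMBEDDINGS["kitap"], TOY_EMBEDDINGS["roman"])
different = cosine_similarity(TOY_EMBEDDINGS["kitap"], TOY_EMBEDDINGS["futbol"])
doc_vec = document_vector(["kitap", "roman", "bilinmeyen"], TOY_EMBEDDINGS)
```

With these toy values, `similar` is larger than `different`, which is the property a classifier exploits: documents about related topics end up with nearby vectors even when they share few exact words.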