Comparative Analysis of Transformer Based Vector Models for Hate Speech in Turkish Tweets
(Original Turkish title: Turkce Tweetlerde Nefret Soyleminin Transformer Tabanli Vektor Modelleri ile Karsilastirmali Analizi)


Gumus E., Sariyar R., GÖZ F.

8th International Artificial Intelligence and Data Processing Symposium, IDAP 2024, Malatya, Türkiye, 21-22 September 2024 (Full-Text Paper)

  • Publication Type: Conference Paper / Full-Text Paper
  • DOI: 10.1109/idap64064.2024.10710825
  • City of Publication: Malatya
  • Country of Publication: Türkiye
  • Keywords: machine learning, sentence transformers, Turkish hate speech detection
  • Affiliated with Kocaeli Üniversitesi: Yes

Abstract

Hate speech spreads rapidly on social media platforms and poses a significant problem. This study examines how different sentence transformer models affect the performance of machine learning algorithms for Turkish hate speech detection. Four sentence transformer models with multilingual support were selected: Multilingual-MiniLM, DistilUSE, LaBSE, and DistilBERT. The study was conducted on a Turkish Twitter dataset created for hate speech detection. The tweets were converted into vectors using these sentence transformer models, and the resulting vectors were tested with six machine learning algorithms: Naive Bayes (NB), Logistic Regression (LR), Decision Tree (DT), Random Forest (RF), Support Vector Machine (SVM), and K-Nearest Neighbors (KNN). The evaluation was performed using the precision, recall, and F1 score metrics.
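The abstract names precision, recall, and F1 score as the evaluation metrics. As an illustration only, the sketch below shows how these three values are commonly computed for a binary task from true and predicted labels. The function name `precision_recall_f1`, the label encoding (1 = hate speech, 0 = not hate speech), and the toy labels are assumptions made for this example; they are not taken from the paper, which does not state its label scheme or averaging method here.

```python
from typing import Sequence, Tuple


def precision_recall_f1(
    y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1
) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) for the class given by `positive`."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count true positives, false positives, and false negatives.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)

    # Fall back to 0.0 when a denominator is zero (no predictions or no positives).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return precision, recall, f1


# Hypothetical labels for eight tweets (assumed encoding: 1 = hate speech).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

In a setup like the one described, the predictions would come from one of the six classifiers trained on the sentence transformer vectors, and the same function would be applied to each model and classifier pair so that the results can be compared.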