7th International Congress on Human-Computer Interaction, Optimization and Robotic Applications, ICHORA 2025, Ankara, Türkiye, 23-24 May 2025 (Full-Text Paper)
In this study, we investigate multilingual, multi-class sentiment classification on datasets in Turkish, English, and Italian. The proposed approach consists of three main stages: sentence representation extraction, classification, and performance evaluation. First, sentence representations were extracted from each dataset using the distiluse-base-multilingual-cased-v1, sentence-transformers/LaBSE, and Alibaba-NLP/gte-multilingual-base models. These representations were then used as input to four classifiers: Logistic Regression (LR), Support Vector Machines (SVM), Random Forest (RF), and Naive Bayes (NB). In addition, a fine-tuned BERT-base-multilingual-cased model and GPT-4o-mini were employed directly as end-to-end models. Classification performance was evaluated using accuracy, precision, recall, and F1-score, and a confusion matrix analysis was conducted for each dataset to examine classification performance in detail. The experimental results show how the choice of embedding model and learning algorithm influences multilingual multi-class sentiment classification.
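The evaluation stage described above can be illustrated with a minimal Python sketch that builds a confusion matrix and derives accuracy, precision, recall, and F1-score from it. The abstract does not state how per-class scores were averaged, so macro-averaging is an assumption here; the label names, the toy predictions, and the function names `confusion_matrix` and `macro_scores` are hypothetical and are not taken from the study.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Count predictions: rows are true classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for true_label, pred_label in zip(y_true, y_pred):
        matrix[index[true_label]][index[pred_label]] += 1
    return matrix


def macro_scores(y_true, y_pred, labels):
    """Accuracy plus macro-averaged precision, recall, and F1.

    Macro-averaging (unweighted mean over classes) is an assumption;
    the source does not specify the averaging scheme.
    """
    matrix = confusion_matrix(y_true, y_pred, labels)
    n = len(labels)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(n))

    precisions, recalls, f1s = [], [], []
    for i in range(n):
        tp = matrix[i][i]
        predicted = sum(matrix[r][i] for r in range(n))  # column sum
        actual = sum(matrix[i])                          # row sum
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    return {
        "accuracy": correct / total if total else 0.0,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Hypothetical three-class example (not data from the study).
labels = ["negative", "neutral", "positive"]
y_true = ["positive", "negative", "neutral", "positive", "negative", "neutral"]
y_pred = ["positive", "negative", "positive", "positive", "neutral", "neutral"]

matrix = confusion_matrix(y_true, y_pred, labels)
scores = macro_scores(y_true, y_pred, labels)
print(matrix)
print(scores)
```

Reading the matrix row by row shows which true class is confused with which predicted class, which is the kind of detail the aggregate scores alone do not reveal.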