2025 Innovations in Intelligent Systems and Applications Conference, ASYU 2025, Bursa, Türkiye, 10-12 September 2025 (Full-Text Paper)
This study addresses text classification on a large-scale dataset of academic abstracts. The dataset, compiled specifically for this research, comprises 121,000 academic abstracts from various disciplines retrieved through the Web of Science (WoS) portal, and constitutes a unique resource. To predict the academic field of each abstract, both traditional and deep learning-based classifiers were applied, including Naive Bayes, Support Vector Machines (SVM), Random Forest, and BERT. Vectorization methods, hyperparameters, and model architectures were varied across experiments to enable a comprehensive comparison of the approaches. After extensive experimentation, the configurations that yielded the highest accuracy, precision, recall, and F1 scores were selected for the final evaluation. This strategy improved classification performance and strengthened the originality and contribution of the research. The results show that BERT outperforms the other approaches in classification accuracy, although it requires greater computational resources and longer processing times. Overall, the study provides a comparative evaluation of classification methods on large, balanced datasets and highlights the practical potential of deep learning-based models.
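The four metrics named in the abstract can be illustrated with a minimal sketch in plain Python. The abstract does not state which averaging scheme was used, so macro averaging (an unweighted mean over classes, a common choice for balanced datasets) is an assumption here. The function name `macro_scores` and the field labels in the example are hypothetical and are not taken from the study.

```python
from collections import Counter


def macro_scores(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F1.

    Assumption: macro averaging; the study may have used another scheme.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class gains a false positive
            fn[truth] += 1   # true class gains a false negative

    precisions, recalls, f1s = [], [], []
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        pr_sum = precision + recall
        f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n_classes = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n_classes,
        "recall": sum(recalls) / n_classes,
        "f1": sum(f1s) / n_classes,
    }


# Hypothetical toy labels, not data from the study.
y_true = ["physics", "physics", "biology", "biology", "law", "law"]
y_pred = ["physics", "biology", "biology", "biology", "law", "physics"]
scores = macro_scores(y_true, y_pred)
print({name: round(value, 3) for name, value in scores.items()})
# → {'accuracy': 0.667, 'precision': 0.722, 'recall': 0.667, 'f1': 0.656}
```

Computing the same four scores for every candidate model on one held-out split keeps a comparison such as Naive Bayes versus BERT on equal footing. On a balanced dataset, macro and weighted averages are expected to be close.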