International Conference on Emerging Trends and Applications in Artificial Intelligence, ICETAI 2023, İstanbul, Türkiye, 8-9 September 2023, vol. 960, pp. 318-332
This study compares popular extractive automatic text summarization algorithms that are commonly used for other languages and evaluates them on Turkish. Because of the suffix-based structure of Turkish, these algorithms may not be as effective on it as on English and Chinese, which have been studied extensively. Accordingly, the most commonly used extractive text summarization approaches were investigated, and some of them were tested and compared on Turkish texts. Five summaries were generated, one with each of the TextRank, LexRank, and Luhn algorithms and two with word-frequency-based summarization algorithms that we developed, from a dataset of 130 news texts summarized by three individuals. Similarity scores were calculated with the ROUGE metric by comparing the generated summaries with the reference summaries. The comparison showed that the algorithm we developed, which selects sentences based on the frequency of word stems, achieved the highest similarity score. The study aims to identify the most suitable automatic summarization algorithm for Turkish. In this context, conclusions can be drawn about the applicability of the various methods, the approaches that could yield better results, and the aspects that need strengthening. The overall goal is to support high-quality Turkish-specific summarization.
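The abstract does not give implementation details, so the following is only a minimal sketch of what sentence selection by word-stem frequency and a ROUGE-1 style comparison could look like. It is not the authors' algorithm. The function names are illustrative, `crude_stem` is a hypothetical stand-in that truncates words to a fixed prefix (a real system would use a Turkish stemmer or morphological analyzer), and unigram recall is assumed here because the abstract does not say which ROUGE variant was used.

```python
import re
from collections import Counter


def split_sentences(text):
    """Split text on sentence-ending punctuation followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase the sentence and extract its word tokens."""
    return re.findall(r"\w+", sentence.lower())


def crude_stem(word, length=5):
    """Hypothetical stemmer (assumption): keep only the first `length` characters."""
    return word[:length]


def summarize(text, num_sentences=2):
    """Return the highest-scoring sentences, kept in their original order."""
    sentences = split_sentences(text)
    stems = [[crude_stem(w) for w in tokenize(s)] for s in sentences]
    # Count how often each stem occurs in the whole document.
    freq = Counter(stem for sent in stems for stem in sent)
    # Score each sentence by its average stem frequency, so that
    # long sentences are not favoured just for being long.
    scores = [
        sum(freq[stem] for stem in sent) / len(sent) if sent else 0.0
        for sent in stems
    ]
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    return [sentences[i] for i in sorted(top[:num_sentences])]


def rouge1_recall(candidate, reference):
    """Fraction of reference unigrams that also appear in the candidate."""
    cand = Counter(tokenize(candidate))
    ref = Counter(tokenize(reference))
    if not ref:
        return 0.0
    overlap = sum(min(count, cand[token]) for token, count in ref.items())
    return overlap / sum(ref.values())
```

In an evaluation like the one described, `summarize` would be run on each news text and `rouge1_recall` computed against each human-written reference summary, with the scores averaged per algorithm.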