IFIT: an unsupervised discretization method based on the Ramer-Douglas-Peucker algorithm

Mutlu, Alev; Göz, Furkan; Akbulut, Orhan

doi:10.3906/elk-1806-192

Mutlu A., Göz F., Akbulut O.

TURKISH JOURNAL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCES, vol.27, no.3, pp.2344-2360, 2019 (SCI-Expanded)

Publication Type: Article / Full Article
Volume: 27 Issue: 3
Publication Date: 2019
DOI: 10.3906/elk-1806-192
Journal Name: TURKISH JOURNAL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCES
Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, TR DİZİN (ULAKBİM)
Pages: pp.2344-2360
Keywords: Unsupervised discretization, the Ramer-Douglas-Peucker algorithm, polyline simplification, the standard error of the estimate
Affiliated with Kocaeli University: Yes

Abstract

Discretization is the process of converting continuous values into discrete values. It is a preprocessing step of several machine learning and data mining algorithms, and the quality of discretization may drastically affect the performance of these algorithms. In this study, we propose a discretization algorithm, namely line fitting-based discretization (IFIT), based on the Ramer-Douglas-Peucker algorithm. It is a static, univariate, unsupervised, splitting-based, global, and incremental discretization method in which intervals are determined by the Ramer-Douglas-Peucker algorithm and the quality of partitioning is assessed by the standard error of the estimate. To evaluate the performance of the proposed method, a set of experiments is conducted on ten benchmark datasets, and the results are compared to those obtained by eight state-of-the-art discretization methods. Experimental results show that IFIT achieves higher predictive accuracy and produces fewer inconsistencies, while it generates a larger number of intervals. The results are also validated through Friedman's test and Holm's post hoc test, which reveal that IFIT produces discretization schemes that statistically comply with both supervised and unsupervised discretization methods.
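
The abstract names two building blocks, the Ramer-Douglas-Peucker algorithm and the standard error of the estimate, without describing how they are combined. The Python sketch below is one plausible reading and not the authors' implementation. It assumes that the sorted values of a single attribute are treated as a polyline of (rank, value) points, that the interior points kept by the simplification serve as cut points, and that the fit is scored as sqrt(SSE / (n - 2)). The function names, the `epsilon` tolerance, and the (rank, value) representation are all hypothetical.

```python
import bisect
import math


def perpendicular_distance(p, a, b):
    """Distance from point p to the straight line through a and b."""
    (x, y), (x1, y1), (x2, y2) = p, a, b
    dx, dy = x2 - x1, y2 - y1
    norm = math.hypot(dx, dy)
    if norm == 0.0:
        return math.hypot(x - x1, y - y1)
    return abs(dy * (x - x1) - dx * (y - y1)) / norm


def rdp(points, epsilon):
    """Ramer-Douglas-Peucker simplification; returns indices of kept points."""
    keep = {0, len(points) - 1}
    stack = [(0, len(points) - 1)]
    while stack:
        lo, hi = stack.pop()
        best_d, best_i = 0.0, None
        for i in range(lo + 1, hi):
            d = perpendicular_distance(points[i], points[lo], points[hi])
            if d > best_d:
                best_d, best_i = d, i
        # Split at the farthest point only if it exceeds the tolerance.
        if best_i is not None and best_d > epsilon:
            keep.add(best_i)
            stack.append((lo, best_i))
            stack.append((best_i, hi))
    return sorted(keep)


def standard_error_of_estimate(points, kept):
    """sqrt(SSE / (n - 2)) for the piecewise-linear fit through kept points."""
    n = len(points)
    if n <= 2:
        return 0.0
    sse = 0.0
    for lo, hi in zip(kept, kept[1:]):
        (x1, y1), (x2, y2) = points[lo], points[hi]
        slope = (y2 - y1) / (x2 - x1)  # x is the rank, so x2 != x1
        # The right endpoint lies on the segment, so skipping it loses nothing.
        for x, y in points[lo:hi]:
            sse += (y - (y1 + slope * (x - x1))) ** 2
    return math.sqrt(sse / (n - 2))


def cut_points(values, epsilon):
    """Assumed scheme: cut points are the interior points kept by RDP."""
    ordered = sorted(values)
    points = [(float(i), v) for i, v in enumerate(ordered)]
    kept = rdp(points, epsilon)
    cuts = sorted({ordered[i] for i in kept[1:-1]})
    return cuts, standard_error_of_estimate(points, kept)


def discretize(value, cuts):
    """Map a continuous value to the index of its interval."""
    return bisect.bisect_right(cuts, value)
```

A smaller `epsilon` keeps more points and therefore yields more intervals with a lower standard error, which is the trade-off a splitting-based method would have to balance. Because perpendicular distance depends on the relative scale of the two axes, a practical version would likely normalize ranks and values first; that step is omitted here.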