Effectiveness evaluation of different feature extraction methods for classification of covid-19 from computed tomography images: A high accuracy classification study

Al-Areqi, Farid; KONYAR, MEHMET

doi:10.1016/j.bspc.2022.103662

Al-Areqi F., KONYAR M. Z.

BIOMEDICAL SIGNAL PROCESSING AND CONTROL, vol.76, 2022 (SCI-Expanded, Scopus)

Publication Type: Article / Full Article
Volume: 76
Publication Date: 2022
DOI: 10.1016/j.bspc.2022.103662
Journal Name: BIOMEDICAL SIGNAL PROCESSING AND CONTROL
Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Compendex, EMBASE, INSPEC
Keywords: Covid-19, CT images, Diagnosis, Features, Machine learning
Kocaeli University Affiliated: Yes

Abstract

Rapid diagnosis of the Covid-19 disease is the best way to prevent infection. This paper proposes using machine learning methods to aid the rapid diagnosis of Covid-19 and focuses on the effect of several feature groups on classification accuracy. The proposed method uses 746 axial computed tomography (CT) images of the lung: 349 Covid-19 (positive) and 397 non-Covid-19 (negative). Gray-level texture, shape and first order statistical features were extracted from the images. The feature vector for model training is constructed from a single feature group or from a combination of more than one group. The images were then classified with Support Vector Machine, Random Forest, k-nearest neighbor and XGBoost classifier models. The hyperparameters of the models were set through tuning tests. Experimental results were obtained with 10-fold cross-validation and were verified with an additional independent test. The best overall accuracy was 98.65%, obtained with first order statistics features classified with XGBoost. Among the gray level features, the best individual result was given by GLSZM at 81.25%, and the best combination result was obtained with GLDM, GLRLM and GLSZM features at 85.52%. An important finding of this paper is that, for Covid-19 classification, the shape and first order statistics features are more valuable than gray level features. The proposed results were compared with literature studies that used the same Covid-19 dataset in terms of accuracy, precision, sensitivity and F1-score metrics. Literature studies that used different Covid-19 datasets were also compared with the proposed study. Our results are superior to those of the literature studies.
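
To make the notion of first order statistical features concrete, the following is a minimal sketch of how a few such descriptors might be computed from the gray levels of an image region. The abstract does not list the exact features or their definitions, so the function name `first_order_features`, the selected statistics (mean, variance, skewness, kurtosis, energy, entropy) and the toy 4x4 region are all assumptions made for illustration, not the authors' implementation.

```python
import math
from collections import Counter


def first_order_features(pixels):
    """Return a few first order statistics for a flat list of gray levels.

    Illustrative only: the feature set and definitions are assumptions,
    not the ones used in the paper.
    """
    n = len(pixels)
    if n == 0:
        raise ValueError("pixels must not be empty")

    mean = sum(pixels) / n
    # Population variance (divide by n, not n - 1).
    variance = sum((p - mean) ** 2 for p in pixels) / n
    std = math.sqrt(variance)

    # Skewness and kurtosis are undefined for a constant region; use 0.0 there.
    if std > 0:
        skewness = sum((p - mean) ** 3 for p in pixels) / (n * std ** 3)
        kurtosis = sum((p - mean) ** 4 for p in pixels) / (n * std ** 4)
    else:
        skewness = 0.0
        kurtosis = 0.0

    # Energy: sum of squared gray levels.
    energy = sum(p * p for p in pixels)

    # Shannon entropy (in bits) of the gray-level histogram.
    counts = Counter(pixels)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())

    return {
        "mean": mean,
        "variance": variance,
        "skewness": skewness,
        "kurtosis": kurtosis,
        "energy": energy,
        "entropy": entropy,
    }


# Hypothetical 4x4 region with four gray levels, flattened row by row.
roi = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [2, 2, 3, 3],
    [2, 2, 3, 3],
]
flat = [p for row in roi for p in row]
features = first_order_features(flat)
print(features)
```

In a pipeline of the kind the abstract describes, a dictionary like this would be computed per image and its values concatenated, alone or together with other feature groups, into the feature vector passed to a classifier.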