Random Forest in Splice Site Prediction of Human Genome


Pashaei E., Ozen M., AYDIN N.

14th Mediterranean Conference on Medical and Biological Engineering and Computing (MEDICON), Paphos, Cyprus, 31 March - 02 April 2016, vol.57, pp.512-517

  • Publication Type: Conference Paper / Full Text Paper
  • Volume: 57
  • DOI: 10.1007/978-3-319-32703-7_99
  • City of Publication: Paphos
  • Country of Publication: Cyprus
  • Pages: pp.512-517
  • Keywords: Random Forest, Feature ranking, Splice site prediction
  • Affiliated with Kocaeli University: No

Abstract

With the rapid growth in the volume of available DNA sequence data, gene identification has become an important task in bioinformatics. Detecting genes requires accurate prediction of splice sites, i.e. exon-intron boundaries. Moreover, in biology, where structures such as splice sites are described by a large number of features, feature selection is an important step toward the classification task. It provides useful biological knowledge and allows for faster and better classification. Feature selection techniques can be divided into two groups: feature ranking and feature-subset selection. This paper investigates the performance of combining a support vector machine (SVM) with two different feature-ranking methods, namely F-score and Random Forest feature ranking, and compares them in splice site detection in the human genome. A new classification method based on Random Forest for splice site prediction is also presented.
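
As a rough illustration of what feature ranking means here, the sketch below scores each feature with a Fisher-type F-score and sorts the features by that score. It assumes that "F-score" refers to the commonly used definition (squared distances of the class means from the overall mean, divided by the sum of the within-class sample variances); the abstract does not state the exact formula used in the paper. The function names `f_score` and `rank_features`, the binary feature encoding, and the toy data are hypothetical and are not taken from the paper.

```python
from statistics import mean, variance


def f_score(pos, neg):
    """Fisher-type F-score of one feature (assumed definition, see lead-in).

    pos, neg: values of the feature in the positive and negative class.
    """
    overall = mean(pos + neg)
    # Between-class term: how far each class mean lies from the overall mean.
    between = (mean(pos) - overall) ** 2 + (mean(neg) - overall) ** 2
    # Within-class term: sum of the sample variances (n - 1 denominator).
    within = variance(pos) + variance(neg)
    if within == 0:
        # Constant within both classes: perfectly separating or uninformative.
        return float("inf") if between > 0 else 0.0
    return between / within


def rank_features(X, y):
    """Return (feature indices sorted by descending F-score, list of scores)."""
    n_features = len(X[0])
    scores = []
    for j in range(n_features):
        pos = [row[j] for row, label in zip(X, y) if label == 1]
        neg = [row[j] for row, label in zip(X, y) if label == 0]
        scores.append(f_score(pos, neg))
    order = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return order, scores


# Hypothetical toy data: each column is a binary feature, for example
# "nucleotide G at some position near the candidate site" (illustrative only).
X = [
    [1, 1, 1], [1, 0, 1], [1, 1, 0], [0, 0, 0],  # true splice sites (label 1)
    [0, 1, 0], [0, 0, 1], [0, 1, 0], [1, 0, 0],  # decoy sites (label 0)
]
y = [1, 1, 1, 1, 0, 0, 0, 0]

order, scores = rank_features(X, y)
print(order)
print(scores)
```

In a ranking-based pipeline of the kind the abstract describes, only the top-ranked features would then be passed to the classifier, which is what makes classification faster and can make it more accurate.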