A novel multimodal AI framework for early diagnosis of idiopathic Parkinson’s disease


Creative Commons License

Taşyürek E. Y., Altun Ş. M., Uncu A. E., Tunca S., İLHAN OMURCA S., KURT PEHLİVANOĞLU M., et al.

Medical and Biological Engineering and Computing, 2026 (SCI-Expanded, Scopus)

  • Publication Type: Article / Full Article
  • Publication Date: 2026
  • DOI: 10.1007/s11517-026-03547-7
  • Journal Name: Medical and Biological Engineering and Computing
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, ABI/INFORM, BIOSIS, CINAHL, Compendex, EMBASE, INSPEC, MEDLINE
  • Keywords: Artificial intelligence, Decision support system, Early diagnosis, Multimodal classification, Neurodegenerative disorder detection, Parkinson’s disease
  • Kocaeli University Affiliated: Yes

Abstract

Parkinson’s disease (PD) is a progressive neurodegenerative disorder marked by motor symptoms, but early diagnosis is challenging due to symptom overlap with other conditions and the lack of definitive biomarkers beyond clinical assessments. In this study, we propose a novel multimodal artificial intelligence (AI)-based decision support system aimed at the early diagnosis of idiopathic PD. To the best of our knowledge, this is the first framework to enable the synchronous analysis of four distinct modalities — walking, facial expression, voice, and posture — whereas prior studies have typically focused on unimodal or partially multimodal approaches. We also constructed a new dataset by establishing a controlled clinical testing environment equipped with an L-shaped walking track and an integrated audiovisual recording system to capture natural walking, turning, facial, vocal, and postural characteristics. For each modality, specialized AI models were developed and evaluated. For the walking modality, the proposed Bidirectional GRU model achieved the best performance in terms of both score (92.74%) and area under the curve (AUC) (97.86%), demonstrating superior gait-based classification. Similarly, in the face modality, the ensemble model integrating eXtreme Gradient Boosting (XGBoost), Random Forest (RF), and Categorical Boosting (CatBoost) yielded the highest score (92.31%) while also achieving the best AUC (97.96%). For the voice and posture modalities, although the highest scores were not obtained, the RF-based models achieved the highest AUC values (99.85% and 97.56%, respectively) among the compared models for their modalities, reflecting strong class separability and discriminative capability.
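For readers unfamiliar with ensembling, the face-modality result combines predictions from XGBoost, Random Forest, and CatBoost. The abstract does not state the combination rule, so the sketch below assumes a simple equal-weight soft vote (averaging each model's predicted class probabilities); the function and the per-model probability values are illustrative stand-ins, not the authors' implementation.

```python
def soft_vote(prob_lists):
    """Average per-class probabilities from several classifiers.

    prob_lists: one [p_control, p_pd] probability pair per model.
    Returns the averaged probabilities and the winning class index.
    """
    n_models = len(prob_lists)
    n_classes = len(prob_lists[0])
    avg = [sum(p[c] for p in prob_lists) / n_models for c in range(n_classes)]
    return avg, max(range(n_classes), key=lambda c: avg[c])

# Hypothetical per-model outputs for one face sample: [control, PD]
xgb_p = [0.30, 0.70]   # stand-in for an XGBoost prediction
rf_p  = [0.45, 0.55]   # stand-in for a Random Forest prediction
cat_p = [0.20, 0.80]   # stand-in for a CatBoost prediction

avg, label = soft_vote([xgb_p, rf_p, cat_p])
print(avg, label)  # label 1 here means the PD class wins the vote
```

Soft voting is one common choice because averaging probabilities lets a confident model outweigh two lukewarm ones; hard (majority) voting over predicted labels is the other standard option.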