Explainable reinforcement learning for glucose monitoring based on shapley value analysis

Adjevi, Arsene; Abdirashid, Abdiwahab; Aktaş, Faruk; Uçar, Mustafa; Solak, Serdar

doi:10.1016/j.cmpb.2026.109266

Adjevi A., Abdirashid A. M., Aktaş F., Uçar M. H. B., Solak S.

Computer Methods and Programs in Biomedicine, vol. 278, 2026 (SCI-Expanded, Scopus)

Publication Type: Article / Full Article
Volume: 278
Publication Date: 2026
DOI: 10.1016/j.cmpb.2026.109266
Journal Name: Computer Methods and Programs in Biomedicine
Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, BIOSIS, Compendex, EMBASE, INSPEC, MEDLINE
Keywords: Deep reinforcement learning, Explainable AI, Glucose-level prediction, SHAP values, Time series prediction
Affiliated with Kocaeli University: Yes

Abstract

Background and Objective: Effective diabetes management requires continuous regulation of blood glucose in response to complex factors such as diet, activity, stress, and medication. Advances in continuous glucose monitoring and machine learning have improved short-term glucose prediction. However, preprocessing of signals such as insulin, carbohydrate intake, heart rate, and activity to better capture metabolic dynamics remains underexplored. Similarly, the integration of predictive models with preventive strategies for guiding interventions is still limited.
Methods: We propose a research-only decision-support framework combining signal preprocessing, CNN-based glucose prediction, Shapley Additive Explanations (SHAP) value attribution, and an Actor–Critic Reinforcement Learning (RL) agent. Exponential decay models preprocess the inputs, a compact CNN forecasts short-term glucose levels, and SHAP values highlight the most influential input features; however, these attributions reflect associative patterns in the data and do not establish or map to causal clinical mechanisms. These SHAP-derived attributions guide the RL agent, which issues bounded one-step behavioral adjustments. Because SHAP-guided RL remains stochastic and uncertain, the proposed system is exploratory and not clinically safe, serving solely as a simulation framework.
Results: Using the OhioT1DM dataset, the model achieved state-of-the-art RMSE across prediction horizons with a compact size of ~74 KB per patient and training under one minute for 1000 epochs. Over 98% of predictions fell within Clarke Error Grid Zones A and B, confirming safe 5–20 min forecasts. The preventive component corrected hyper- and hypoglycemia in ~25% of cases within 10 min when predictions were near 80–120 mg/dL (±10 mg/dL). When deviations exceeded ±10 mg/dL, the RL agent was unable to fully restore blood glucose to the target range within 10 min but could bring it as close as possible to the defined interval.
Conclusions: This study presents a significant innovation by bridging predictive accuracy, adaptability, and transparency in diabetes management. Integrating a predictive model with Reinforcement Learning (RL) guided by SHAP values, which are typically used only for interpretability but are employed here in the learning process, delivers a powerful decision-support framework. This approach advances the field toward next-generation, personalized digital health tools.
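
Illustrative note: the abstract states that exponential decay models preprocess inputs such as insulin and carbohydrate intake, but it does not give their form. The following is a minimal sketch of one common way to do this, not the authors' implementation: sparse events are spread into a continuous signal that halves every fixed interval. The function name `decayed_signal`, the 10-minute half-life, and the 5-minute sampling step are all assumptions made for illustration.

```python
import math


def decayed_signal(events, n_steps, half_life_min, step_min=5.0):
    """Turn sparse events into an exponentially decaying signal.

    events: dict mapping sample index -> event magnitude
            (e.g. grams of carbohydrate); hypothetical input format.
    half_life_min: assumed time for an event's effect to halve.
    step_min: assumed sampling interval in minutes.
    """
    # Fraction of the signal retained after one sampling step.
    keep = math.exp(-math.log(2.0) * step_min / half_life_min)
    signal = []
    level = 0.0
    for t in range(n_steps):
        # Decay what is left from earlier events, then add any new event.
        level = level * keep + events.get(t, 0.0)
        signal.append(level)
    return signal


# A single 40 g carbohydrate event at sample 2, with an assumed 10 min half-life.
carbs = decayed_signal({2: 40.0}, n_steps=8, half_life_min=10.0)
```

In this sketch the event's contribution is at full strength at the sample where it occurs and falls to half of that two samples (10 minutes) later, which gives a forecasting model a smooth input in place of an isolated spike.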