A generalized hybrid machine learning framework for predicting biohydrogen production via dark fermentation of organic wastes


Mougari N. E., Ghersi D. E., Iachachene F., Largeau J. F., ARICI M.

Bioprocess and Biosystems Engineering, 2025 (SCI-Expanded, Scopus)

  • Publication Type: Article / Full Article
  • Publication Date: 2025
  • DOI: 10.1007/s00449-025-03255-w
  • Journal Name: Bioprocess and Biosystems Engineering
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, BIOSIS, Chemical Abstracts Core, Compendex, EMBASE, MEDLINE
  • Keywords: Artificial neural network, Bayesian optimization, Biohydrogen production, Dark fermentation, Kinetic modeling
  • Kocaeli University Affiliated: Yes

Abstract

The rising global demand for sustainable energy has directed significant attention towards biohydrogen production via dark fermentation of organic wastes. Accurate yield prediction is crucial for optimizing process conditions and improving overall process efficiency. This study aims to develop a robust and interpretable predictive framework that integrates kinetic modeling with a hybrid Bayesian Optimization–Artificial Neural Network (BO–ANN) approach for precise biohydrogen yield prediction. The core novelty lies in representing each substrate not as a simple category but by its quantitative kinetic parameters from the Modified Gompertz equation, which provides a biologically meaningful input. A comprehensive database compiled from the literature combines key process variables (temperature, pH, residence time, and substrate concentration) with these substrate-specific kinetic parameters. The BO algorithm was employed to optimize the ANN architecture, and 5-fold cross-validation was used to evaluate the model's generalization ability. The proposed hybrid model achieved high predictive performance (R² = 0.9980, RMSE = 0.0117, MAE = 0.0062), supporting its accuracy and robustness. Furthermore, SHAP analysis and correlation metrics provided interpretable insights into feature contributions, particularly the relevance of the kinetic descriptors. Overall, the proposed BO–ANN framework offers a scalable, interpretable, and biologically grounded tool to improve predictive accuracy and support the design of more efficient and sustainable biohydrogen production systems.
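To make the substrate representation concrete, the following is a minimal sketch and not the authors' implementation. It assumes the commonly used form of the Modified Gompertz equation, H(t) = P · exp{−exp[(Rm · e / P)(λ − t) + 1]}, which the abstract names but does not write out. The parameter names (`p_max`, `r_max`, `lag`), the numeric values, and the layout of `feature_row` are hypothetical and chosen only for illustration. The helper functions compute the three error metrics reported in the abstract.

```python
import math


def modified_gompertz(t, p_max, r_max, lag):
    """Cumulative hydrogen at time t (assumed standard Modified Gompertz form).

    p_max: hydrogen production potential, r_max: maximum production rate,
    lag: lag-phase duration. Names and units are illustrative assumptions.
    """
    return p_max * math.exp(-math.exp(r_max * math.e / p_max * (lag - t) + 1.0))


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    ss_tot = sum((a - mean_true) ** 2 for a in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical kinetic descriptors for one substrate (made-up values).
kinetics = {"p_max": 100.0, "r_max": 5.0, "lag": 2.0}

# Hypothetical model input: temperature, pH, residence time, substrate
# concentration, followed by the three kinetic descriptors.
feature_row = [35.0, 5.5, 24.0, 10.0,
               kinetics["p_max"], kinetics["r_max"], kinetics["lag"]]

# Cumulative production curve implied by the descriptors, sampled every 10 time units.
curve = [modified_gompertz(t, **kinetics) for t in range(0, 101, 10)]
```

In this sketch the substrate enters the model as three continuous numbers rather than a category label, which is the idea the abstract describes. How the authors actually scaled or ordered the inputs is not stated in this record.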