BUILDINGS, vol. 15, no. 19, 2025 (SCI-Expanded, Scopus)
Accurate prediction of earthquake-induced building damage is essential for timely disaster response and effective risk mitigation. This study explores a machine learning (ML)-based classification approach using data from the 2015 Gorkha, Nepal earthquake, with a specific focus on reinforced concrete (RC) structures. The original dataset contained 762,094 building entries across 127 variables describing structural, functional, and contextual characteristics. Three ensemble ML models were trained and tested on both the full dataset and a filtered RC-only subset: Gradient Boosting Machine (GBM), Extreme Gradient Boosting (XGBoost), and Light Gradient Boosting Machine (LightGBM). Two target variables were considered: a three-class variable (damage_class) and the original five-level damage grade (damage_grade). To address class imbalance, oversampling and undersampling techniques were applied, and model performance was evaluated using accuracy and F1 scores. LightGBM consistently outperformed the other models, especially when oversampling was applied. For the RC dataset, LightGBM achieved up to 98% accuracy for damage_class and 93% accuracy for damage_grade, with per-class F1 scores between 0.84 and 1.00. Feature importance analysis revealed that structural characteristics such as building area, age, and height were the most influential predictors of damage. These findings highlight the value of building-type-specific modeling combined with class balancing techniques for improving the reliability and generalizability of ML-based earthquake damage prediction.
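The class balancing and per-class F1 evaluation mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which oversampling method was used, so plain random duplication of minority-class rows is assumed here, and the function names `random_oversample` and `per_class_f1` and the toy labels are hypothetical. In practice, oversampling would normally be applied to the training split only, so that duplicated rows do not leak into the test set.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class rows at random until every class
    has as many rows as the largest class (assumed method)."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        # Draw the missing rows with replacement from the same class.
        extra = [rng.choice(members) for _ in range(target - len(members))]
        for row in members + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


def per_class_f1(y_true, y_pred):
    """Return {class: F1}, with F1 = 2*TP / (2*TP + FP + FN)."""
    scores = {}
    for cls in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
        fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
        fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores[cls] = 2 * tp / denom if denom else 0.0
    return scores


# Toy, made-up data: ten buildings with an imbalanced three-class label.
toy_rows = [[i] for i in range(10)]
toy_labels = ["low"] * 6 + ["medium"] * 3 + ["high"] * 1
balanced_rows, balanced_labels = random_oversample(toy_rows, toy_labels)
class_counts = Counter(balanced_labels)  # every class now has 6 rows
```

Reporting F1 per class rather than a single accuracy figure matters for imbalanced targets, because a model can reach high accuracy while rarely predicting the rare damage levels.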