Attention-based deep learning for tire defect detection: Fusing local and global features in an industrial case study


SALEH R. A. A., ERTUNÇ H. M.

Expert Systems with Applications, vol. 269, 2025 (SCI-Expanded, Scopus)

  • Publication Type: Article / Full Article
  • Volume Number: 269
  • Publication Date: 2025
  • DOI Number: 10.1016/j.eswa.2025.126473
  • Journal Name: Expert Systems with Applications
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus
  • Keywords: Attention modules, Convolutional neural network (CNN), Lightweight attention based inception module, Tire defect detection, Vision transformer (ViT), X-ray images
  • Kocaeli University Affiliated: Yes

Abstract

In tire manufacturing, quality inspection is paramount because defective tires can fail explosively, especially during high-speed driving such as racing. Addressing this concern requires rigorous post-production visual inspection. However, the diverse textures and structures of tires make defect detection a challenging task. To meet this challenge, this paper introduces a hybrid ViT-CNN model designed specifically for tire defect detection. First, we propose a lightweight attention-based inception module, which serves as a primary component of the proposed attention-based CNN model. This attention-based CNN extracts local feature maps from the entire image, while the ViT captures global features from image patches. A dataset comprising 83,985 X-ray images of tires without defects and 38,710 X-ray images of tires with defects, representing 15 different types of defects across 50 design patterns, is used to train and test the model. The results show the superiority of the ViT-CNN model, which achieves a recall of 95.48%, a precision of 96.1%, an F1 score of 95.79%, and an overall accuracy of 97.33%. Statistical tests, including the Friedman and Wilcoxon signed-rank tests, were employed to assess whether the proposed ViT-CNN model differs significantly from the individual models, and their results indicate significant differences between the models. Furthermore, the ViT-CNN model handles diverse tire defects and complex textures, making it effective in real-world scenarios. This research advances automated tire defect identification, addressing the industry's need for precise and dependable inspection while improving human safety by offering a high-performing model.
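
As a quick arithmetic check on the figures reported in the abstract, the minimal Python sketch below recomputes the F1 score from the stated precision and recall, and derives the dataset size and the share of defective images from the stated counts. The function and variable names are illustrative assumptions and do not come from the paper; only the numeric inputs are taken from the abstract.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given as fractions in [0, 1])."""
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract, converted from percentages to fractions.
reported_precision = 0.961
reported_recall = 0.9548

# Image counts reported in the abstract.
n_without_defects = 83985
n_with_defects = 38710

f1 = f1_score(reported_precision, reported_recall)
total_images = n_without_defects + n_with_defects
defect_share = n_with_defects / total_images

print(f"F1 score: {f1 * 100:.2f}%")                 # 95.79%, matching the abstract
print(f"Total images: {total_images}")              # 122695
print(f"Defective share: {defect_share * 100:.2f}%")  # 31.55%
```

The recomputed F1 score agrees with the reported 95.79%, and the counts show that roughly one image in three contains a defect, which is why recall and precision are reported alongside overall accuracy.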