Road damage detection #1085

Open
Adhavan1801 wants to merge 18 commits into abhisheks008:main from Adhavan1801:road-damage-detection

@Adhavan1801

Pull Request for DL-Simplified 💡

Issue Title : Add Multi-Modal Deep Learning Framework for Road Damage Detection

  • Info about the related issue (Aim of the project) : Build, train, and evaluate multiple deep learning architectures (EfficientNet-B0, ResNet50, YOLOv8n, Vision Transformer ViT-B/16) on the RDD-2022 dataset to detect and classify 4 types of road damage (Longitudinal Crack, Transverse Crack, Alligator Crack, Pothole) with model comparison and Grad-CAM visualizations.
  • Name: Adhavan
  • GitHub ID: Adhavan1801
  • Email ID: adhavan1801@gmail.com
  • Identify yourself: GSSoC 2026 Participant

Closes: #1074

Describe the add-ons or changes you've made 📃

Added a complete Multi-Modal Road Damage Detection project under Multi-Modal Road Damage Detection/ with the following:

  • Jupyter Notebook (Model/Road_Damage_Detection.ipynb) implementing 4 deep learning models:

    • EfficientNet-B0: 4.01M params, 71.31% test accuracy
    • ResNet50: 25M params, 75.86% test accuracy (best classifier)
    • YOLOv8n: object detection with bounding-box localization (lowest latency, 7.52 ms/img)
    • Vision Transformer (ViT-B/16): 86M params, 73.42% test accuracy
  • Dataset: RDD-2022 (Road Damage Dataset) with 26,869 train / 5,758 val / 5,758 test images across 4 damage classes (D00, D10, D20, D40) from multiple countries

  • Training features: Early stopping (patience=5), Cosine Annealing LR scheduler, GPU-accelerated training (RTX 5060, 8.5GB VRAM)

  • Grad-CAM visualizations for all 3 classification models to explain predictions

  • Model results CSV (Model/model_results_summary.csv) with accuracy, precision, recall, F1-score, and inference latency for all models

  • Comprehensive README.md with dataset details, model architectures, training pipeline, EDA plots, results table, and setup instructions
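
For reviewers, here is a rough, framework-agnostic sketch of how the early stopping (patience=5) and cosine annealing schedule listed above typically interact. It is not the notebook's code: the helper names, the choice of validation loss as the monitored quantity, and the `total_epochs` and `base_lr` values are assumptions made for illustration only.

```python
import math


def cosine_annealing_lr(epoch, total_epochs, base_lr, min_lr=0.0):
    """Learning rate at `epoch` under a cosine schedule without restarts."""
    progress = epoch / total_epochs
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


class EarlyStopping:
    """Signal a stop after `patience` consecutive epochs without a lower loss."""

    def __init__(self, patience=5):
        self.patience = patience
        self.best_loss = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch's validation loss; return True when training should stop."""
        if val_loss < self.best_loss:
            self.best_loss = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


def run_schedule(val_losses, total_epochs=30, base_lr=1e-3):
    """Replay recorded validation losses; return (epochs_run, lr_history).

    total_epochs and base_lr are placeholder values, not taken from the PR.
    """
    stopper = EarlyStopping(patience=5)
    lr_history = []
    for epoch, val_loss in enumerate(val_losses):
        lr_history.append(cosine_annealing_lr(epoch, total_epochs, base_lr))
        if stopper.step(val_loss):
            return epoch + 1, lr_history
    return len(val_losses), lr_history
```

Calling `run_schedule` with a list of per-epoch validation losses shows at which epoch training would halt and what learning rate each epoch would have used.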

Type of change ☑️

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Code style update (formatting, local variables)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested? ⚙️

  • All 4 models were fully trained and evaluated on the RDD-2022 test set (5,758 images)
  • Classification metrics (accuracy, precision, recall, F1-score) computed using scikit-learn
  • Inference latency measured per image on NVIDIA RTX 5060 GPU
  • Grad-CAM visualizations generated and verified for all 3 classification models (EfficientNet-B0, ResNet50, ViT-B/16)
  • Notebook executed end-to-end without errors in a local virtual environment with CUDA 12.8
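
As an illustration of the per-image latency measurement mentioned above, the following is a minimal timing sketch. It is an assumption about how such a measurement can be done, not the notebook's actual procedure; `measure_latency_ms`, `predict`, and `sync` are hypothetical names. On a GPU, `sync` would be a synchronization call such as `torch.cuda.synchronize`, because CUDA kernels launch asynchronously and an unsynchronized timer can under-report.

```python
import time
from statistics import mean


def measure_latency_ms(predict, inputs, warmup=3, sync=None):
    """Return the mean per-call latency of `predict` in milliseconds.

    `sync` is an optional no-argument callable invoked around each timed call
    (hypothetical hook; on CUDA this could be torch.cuda.synchronize).
    """
    if not inputs:
        raise ValueError("inputs must not be empty")
    for x in inputs[:warmup]:  # warm-up calls are excluded from the timings
        predict(x)
    timings = []
    for x in inputs:
        if sync is not None:
            sync()  # make sure earlier work has finished before starting the clock
        start = time.perf_counter()
        predict(x)
        if sync is not None:
            sync()  # wait for this call's work to finish before stopping the clock
        timings.append((time.perf_counter() - start) * 1000.0)
    return mean(timings)
```

Timing one image per call, as sketched here, matches the "ms/img" unit used in the results; batching would change the numbers.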

Test Results:

| Model | Accuracy | F1-Score | Latency (per image) |
|---|---|---|---|
| EfficientNet-B0 | 71.31% | 0.7131 | 8.48 ms |
| ResNet50 | 75.86% | 0.7573 | 8.64 ms |
| YOLOv8n | 56.59% | 0.5707 | 7.52 ms |
| ViT-B/16 | 73.42% | 0.7275 | 12.95 ms |
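
For anyone checking the accuracy and F1 figures above, this is a small plain-Python sketch of how those quantities are defined. The notebook computes them with scikit-learn, so this is only a reference for the definitions; the function name is hypothetical and macro averaging is an assumption, since the PR does not state which averaging mode was used.

```python
from collections import Counter


def classification_report_macro(y_true, y_pred, labels):
    """Return accuracy plus macro-averaged precision, recall and F1.

    Macro averaging (unweighted mean over classes) is assumed here.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    def safe_div(num, den):
        # Undefined ratios are reported as 0.0
        return num / den if den else 0.0

    precisions, recalls, f1s = [], [], []
    for label in labels:
        precision = safe_div(tp[label], tp[label] + fp[label])
        recall = safe_div(tp[label], tp[label] + fn[label])
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(safe_div(2 * precision * recall, precision + recall))
    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }
```

With the four RDD-2022 class codes (D00, D10, D20, D40) passed as `labels`, the returned dictionary has the same kinds of columns as the results summary.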

Checklist: ☑️

  • My code follows the guidelines of this project.
  • I have performed a self-review of my own code.
  • I have commented my code, particularly wherever it was hard to understand.
  • I have made corresponding changes to the documentation.
  • My changes generate no new warnings.
  • I have added things that prove my fix is effective or that my feature works.
  • Any dependent changes have been merged and published in downstream modules.

@github-actions

Our team will soon review your PR. Thanks @Adhavan1801 :)

Development

Successfully merging this pull request may close these issues.

Multi-Modal Deep Learning Framework for AI-Based Road Damage Detection and Severity Assessment
