Abstract for presentation (Poster or Podium) with a Paper in the Conference Proceedings
Highway Pavements
Proma Dutta
PhD Student
Kennesaw State University
Marietta, GA, United States
Kanchon Kanti Podder
PhD Student
Kennesaw State University
Marietta, GA, United States
Jian Zhang
Assistant Professor
Kennesaw State University
Marietta, Georgia, United States
Christian Hecht
Intern
Rowan Creates
Medford, New Jersey, United States
Surya Teja Swarna
Postdoctoral Research Associate
Rowan University/CREATES
Glassboro, New Jersey, United States
Parth Bhavsar
Assistant Professor
Kennesaw State University
Marietta, Georgia, United States
Proma Dutta
Kennesaw State University
Marietta, Georgia, United States
Local and regional governments often struggle to secure federal or state funding for projects that repair or replace local roadways. Each project requires pavement distress data to justify the need, and current data collection methods are either expensive or time-consuming. This research presents an approach to automatic pothole and crack recognition on roads that uses a deep learning framework to reduce cost and time. We developed a Masked Autoencoder (MAE) with a Vision Transformer (ViT) based architecture as the encoder backbone. We first pre-trained the MAE on the ImageNet-1K (IN1K) training set. We then fine-tuned the pre-trained MAE to capture road features using a custom image dataset from the Center for Research and Education in Advanced Transportation Engineering Systems (CREATES). This dataset represents the roadway network in southern New Jersey. The primary purpose of fine-tuning the MAE on our custom dataset was to generate meaningful low-dimensional embeddings of road images showing potholes, cracks, or normal pavement. After this training, we retained only the encoder and integrated it into a linear probing scheme to perform classification. We added classification layers to the end of the ViT-based encoder to address two tasks: (1) Pothole vs. No Pothole and (2) No Cracks vs. Moderate Cracks vs. Severe Cracks. The ViT architecture simplified the extraction of complex spatial features, while the classification layers enabled precise categorization. The combined model was then tuned to better align the encoder's representations with the anomaly classification task. We compared the proposed model with pre-trained and fine-tuned versions of ResNet18 and MobileNetV2. The proposed model outperformed these Convolutional Neural Network (CNN) based approaches in classifying road surface defects.
The model's performance was rigorously evaluated using a variety of real-world road images, with promising results in terms of efficiency and accuracy.
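
The masking step at the core of MAE training can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the 224x224 input size, the 16x16 patch size, and the 0.75 mask ratio are assumptions borrowed from common MAE/ViT defaults, and the helper names `patchify` and `random_masking` are hypothetical. The sketch splits an image into patches and keeps a random subset as the visible input that an encoder would see.

```python
import numpy as np


def patchify(image, patch_size=16):
    """Split an (H, W, C) image into flattened, non-overlapping patches."""
    h, w, c = image.shape
    assert h % patch_size == 0 and w % patch_size == 0
    gh, gw = h // patch_size, w // patch_size
    # (gh, p, gw, p, c) -> (gh, gw, p, p, c) -> (gh * gw, p * p * c)
    patches = image.reshape(gh, patch_size, gw, patch_size, c)
    patches = patches.transpose(0, 2, 1, 3, 4)
    return patches.reshape(gh * gw, patch_size * patch_size * c)


def random_masking(patches, mask_ratio=0.75, rng=None):
    """Keep a random subset of patches; return them with a boolean mask.

    mask[i] is True where patch i is hidden from the encoder.
    The 0.75 default is an assumed value, not taken from the abstract.
    """
    rng = np.random.default_rng() if rng is None else rng
    num_patches = patches.shape[0]
    num_keep = int(num_patches * (1 - mask_ratio))
    shuffled = rng.permutation(num_patches)
    keep_idx = np.sort(shuffled[:num_keep])
    mask = np.ones(num_patches, dtype=bool)
    mask[keep_idx] = False
    return patches[keep_idx], mask, keep_idx


# Stand-in for a road image (random pixels, assumed 224x224 RGB).
rng = np.random.default_rng(0)
image = rng.random((224, 224, 3))

patches = patchify(image)
visible, mask, keep_idx = random_masking(patches, mask_ratio=0.75, rng=rng)
print(patches.shape, visible.shape, int(mask.sum()))
```

Under these assumptions the encoder would process only 49 of the 196 patches, and a decoder would be trained to reconstruct the 147 hidden ones. After pre-training and fine-tuning, the decoder is discarded and only the encoder is kept for classification, as described above.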