Enhanced Feature Representation in Multi-Modal Learning for Driving Safety Assessment

dc.contributor.author: Shi, Liang
dc.contributor.committeechair: Guo, Feng
dc.contributor.committeemember: Xing, Xin
dc.contributor.committeemember: Deng, Xinwei
dc.contributor.committeemember: Leman, Scott C.
dc.contributor.department: Statistics
dc.date.accessioned: 2024-12-04T09:00:12Z
dc.date.available: 2024-12-04T09:00:12Z
dc.date.issued: 2024-12-03
dc.description.abstract: This dissertation explores innovative approaches to driving safety through the development of multi-modal learning frameworks that leverage high-frequency, high-resolution driving data and videos to detect safety-critical events (SCEs). The research unfolds across four methodologies, each advancing the field. The introductory chapter sets the stage by outlining the motivations and challenges in driving safety research, highlighting the need for advanced data-driven approaches to improve SCE prediction and detection. The second chapter presents a framework that combines Convolutional Neural Networks (CNNs) and Gated Recurrent Units (GRUs) with XGBoost. This approach reduces dependency on domain expertise and effectively manages imbalanced crash data, enhancing the accuracy and reliability of SCE detection. The third chapter introduces a two-stream network architecture that integrates optical flow with TimeSFormer through a multi-head attention mechanism. This combination achieves exceptional detection accuracy, demonstrating its potential for driving safety applications. The fourth chapter focuses on the Dual Swin Transformer framework, which enables concurrent analysis of video and time-series data; this methodology proves effective in processing front-view driving videos for improved SCE detection. The fifth chapter explores the integration of labels' semantic meaning into a classification model and introduces ScVLM, a hybrid approach that merges supervised learning with contrastive learning techniques to enhance understanding of driving videos and improve the rationality of event descriptions generated by Vision-Language Models (VLMs). This chapter addresses limitations of existing models by providing a more comprehensive analysis of driving scenarios. This dissertation addresses the challenges of analyzing multimodal data and paves the way for future advancements in autonomous driving and traffic safety management. It underscores the potential of integrating diverse data sources to enhance driving safety.
dc.description.abstractgeneral: This dissertation explores new approaches to enhancing driving safety by using advanced learning frameworks that combine video data with high-frequency, high-resolution driving information, introducing innovative techniques to predict and detect critical driving events. The introductory chapter outlines the current challenges in driving safety and emphasizes the potential of data-driven methods to improve predictions and prevent accidents. The second chapter describes a method that uses machine learning models to analyze crash data, reducing the need for expert input and effectively handling data imbalances. This approach improves the accuracy of predicting safety-critical events. The third chapter introduces a two-stream network that processes both sensor data and video frames, achieving high accuracy in detecting safety-related driving incidents. The fourth chapter presents a framework that simultaneously analyzes video and time-series data, validated on a comprehensive driving study dataset. This technique enhances the detection of complex driving scenarios. The fifth chapter introduces a hybrid learning approach that improves understanding of driving videos and event descriptions. By combining different learning techniques, this method addresses limitations in existing models. This work tackles challenges in analyzing multimodal data and sets the stage for future advancements in autonomous driving and traffic safety management. It highlights the potential of integrating diverse data types to create safer driving environments.
dc.description.degree: Doctor of Philosophy
dc.format.medium: ETD
dc.identifier.other: vt_gsexam:41742
dc.identifier.uri: https://hdl.handle.net/10919/123730
dc.language.iso: en
dc.publisher: Virginia Tech
dc.rights: In Copyright
dc.rights.uri: http://rightsstatements.org/vocab/InC/1.0/
dc.subject: Multi-Modal Learning
dc.subject: Traffic Safety-Critical Event Detection
dc.subject: Deep Learning in Traffic Analysis
dc.subject: Autonomous Driving Safety
dc.subject: Driving Data Analytics
dc.title: Enhanced Feature Representation in Multi-Modal Learning for Driving Safety Assessment
dc.type: Dissertation
thesis.degree.discipline: Statistics
thesis.degree.grantor: Virginia Polytechnic Institute and State University
thesis.degree.level: doctoral
thesis.degree.name: Doctor of Philosophy

Files

Original bundle
Name: Shi_L_D_2024.pdf
Size: 29.91 MB
Format: Adobe Portable Document Format