Enhanced Feature Representation in Multi-Modal Learning for Driving Safety Assessment
dc.contributor.author | Shi, Liang | en |
dc.contributor.committeechair | Guo, Feng | en |
dc.contributor.committeemember | Xing, Xin | en |
dc.contributor.committeemember | Deng, Xinwei | en |
dc.contributor.committeemember | Leman, Scott C. | en |
dc.contributor.department | Statistics | en |
dc.date.accessioned | 2024-12-04T09:00:12Z | en |
dc.date.available | 2024-12-04T09:00:12Z | en |
dc.date.issued | 2024-12-03 | en |
dc.description.abstract | This dissertation explores innovative approaches to driving safety through the development of multi-modal learning frameworks that leverage high-frequency, high-resolution driving data and videos to detect safety-critical events (SCEs). The research unfolds across four methodologies, each advancing the field. The introductory chapter sets the stage by outlining the motivations and challenges in driving safety research, highlighting the need for advanced data-driven approaches to improve SCE prediction and detection. The second chapter presents a framework that combines Convolutional Neural Networks (CNN) and Gated Recurrent Units (GRU) with XGBoost. This approach reduces dependency on domain expertise and effectively manages imbalanced crash data, enhancing the accuracy and reliability of SCE detection. In the third chapter, a two-stream network architecture is introduced, integrating optical flow and TimeSformer through a multi-head attention mechanism. This combination achieves exceptional detection accuracy, demonstrating its potential for applications in driving safety. The fourth chapter focuses on the Dual Swin Transformer framework, which enables concurrent analysis of video and time-series data; this methodology proves effective in processing front-view driving videos for improved SCE detection. The fifth chapter explores the integration of labels' semantic meaning into a classification model and introduces ScVLM, a hybrid approach that merges supervised learning with contrastive learning techniques to enhance understanding of driving videos and improve event description rationality for Vision-Language Models (VLMs). This chapter addresses existing model limitations by providing a more comprehensive analysis of driving scenarios. This dissertation addresses the challenges of analyzing multimodal data and paves the way for future advancements in autonomous driving and traffic safety management. It underscores the potential of integrating diverse data sources to enhance driving safety. | en |
dc.description.abstractgeneral | This dissertation explores new approaches to enhance driving safety by using advanced learning frameworks that combine video data with high-frequency, high-resolution driving information, introducing innovative techniques to predict and detect critical driving events. The introductory chapter outlines the current challenges in driving safety and emphasizes the potential of data-driven methods to improve predictions and prevent accidents. The second chapter describes a method that uses machine learning models to analyze crash data, reducing the need for expert input and effectively handling data imbalances. This approach improves the accuracy of predicting safety-critical events. The third chapter introduces a two-stream network that processes both sensor data and video frames, achieving high accuracy in detecting safety-related driving incidents. The fourth chapter presents a framework that simultaneously analyzes video and time-series data, validated using a comprehensive driving study dataset. This technique enhances the detection of complex driving scenarios. The fifth chapter introduces a hybrid learning approach that improves understanding of driving videos and event descriptions. By combining different learning techniques, this method addresses limitations in existing models. This work tackles challenges in analyzing multimodal data and sets the stage for future advancements in autonomous driving and traffic safety management. It highlights the potential of integrating diverse data types to create safer driving environments. | en |
dc.description.degree | Doctor of Philosophy | en |
dc.format.medium | ETD | en |
dc.identifier.other | vt_gsexam:41742 | en |
dc.identifier.uri | https://hdl.handle.net/10919/123730 | en |
dc.language.iso | en | en |
dc.publisher | Virginia Tech | en |
dc.rights | In Copyright | en |
dc.rights.uri | http://rightsstatements.org/vocab/InC/1.0/ | en |
dc.subject | Multi-Modal Learning | en |
dc.subject | Traffic Safety-Critical Event Detection | en |
dc.subject | Deep Learning in Traffic Analysis | en |
dc.subject | Autonomous Driving Safety | en |
dc.subject | Driving Data Analytics | en |
dc.title | Enhanced Feature Representation in Multi-Modal Learning for Driving Safety Assessment | en |
dc.type | Dissertation | en |
thesis.degree.discipline | Statistics | en |
thesis.degree.grantor | Virginia Polytechnic Institute and State University | en |
thesis.degree.level | doctoral | en |
thesis.degree.name | Doctor of Philosophy | en |