Big Data Visualization and Spatiotemporal Modeling of Risky Driving

Date
2020-07
Publisher
SAFE-D: Safety Through Disruption National University Transportation Center
Abstract

Statistical evidence identifies risky driving as a contributing factor in roadway collisions, which makes detecting such behavior important. New technologies allow vehicle kinematic data to be collected at high frequency, enabling driver behavior monitoring. This project mines a large volume of driving data to identify risky driving behavior. Relational and non-relational database management systems (DBMSs) were adopted to process the data and compare query performance. The relational DBMS PostgreSQL, together with its spatial extension PostGIS, outperformed the non-relational DBMS MongoDB on both nonspatial and spatial queries. Both unsupervised and supervised learning methods were used to classify risky driving. Cluster analysis, an unsupervised approach, was used to label risky driving during short monitoring periods. The labeled driving data, including kinematic information, were then used to develop random forest models for predicting risky driving. These models showed high prediction performance. Open-source and enterprise visualization tools were also developed to illustrate risky driving moments in space and time. Researchers and practitioners can use these tools to explore where and when risky driving events occur and to prioritize countermeasures for the locations most in need of improvement.
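
The labeling step described in the abstract can be illustrated with a minimal sketch. The abstract does not state which clustering algorithm or kinematic features the project used, so everything here is an assumption made for illustration: a plain two-cluster k-means, two hypothetical per-window features (peak longitudinal and peak lateral acceleration magnitudes, in g), and synthetic values. The function names `kmeans` and `label_risky` are likewise hypothetical. The cluster whose centroid has the larger magnitude is treated as "risky"; labels produced this way could then serve as targets for a supervised model such as a random forest.

```python
import math


def kmeans(points, k=2, iters=50):
    """Plain Lloyd's k-means on a list of equal-length tuples (k >= 2)."""
    # Deterministic start: pick initial centroids spread across the points
    # ordered by vector magnitude (smallest ... largest).
    ordered = sorted(points, key=lambda p: math.hypot(*p))
    centroids = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        assign = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                mean = tuple(sum(col) / len(members) for col in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    return assign, centroids


def label_risky(windows):
    """Return one boolean per monitoring window: True if labeled risky.

    Assumption: the cluster with the larger centroid magnitude is 'risky'.
    """
    assign, centroids = kmeans(windows, k=2)
    risky_cluster = max(range(2), key=lambda c: math.hypot(*centroids[c]))
    return [a == risky_cluster for a in assign]


# Synthetic, hypothetical windows: (peak |longitudinal accel|, peak |lateral accel|) in g.
calm = [(0.05 + 0.01 * (i % 5), 0.04 + 0.01 * (i % 4)) for i in range(40)]
harsh = [(0.40 + 0.02 * (i % 3), 0.35 + 0.02 * (i % 2)) for i in range(10)]
windows = calm + harsh

labels = label_risky(windows)
print(sum(labels), "of", len(windows), "windows labeled risky")
```

On real data the clusters would rarely be this cleanly separated, so the choice of features, the number of clusters, and the rule for naming a cluster "risky" would each need validation.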

Keywords
transportation safety, risky driving, data mining, machine learning