Comparative Analysis of Parametric and Non-Parametric Data-Driven Models to Predict Road Crash Severity among Elderly Drivers Using Synthetic Resampling Techniques

dc.contributor.authorAlrumaidhi, Mubaraken
dc.contributor.authorFarag, Mohamed M. G.en
dc.contributor.authorRakha, Hesham A.en
dc.date.accessioned2023-07-13T17:07:18Zen
dc.date.available2023-07-13T17:07:18Zen
dc.date.issued2023-06-21en
dc.date.updated2023-07-13T14:06:56Zen
dc.description.abstractAs the global elderly population continues to rise, the risk of severe crashes among elderly drivers has become a pressing concern. This study presents a comprehensive examination of crash severity among this demographic, employing machine learning models and data gathered from Virginia, United States of America, between 2014 and 2021. The analysis integrates parametric models, namely logistic regression and linear discriminant analysis (LDA), as well as non-parametric models like random forest (RF) and extreme gradient boosting (XGBoost). Central to this study is the application of resampling techniques, specifically, random over-sampling examples (ROSE) and the synthetic minority over-sampling technique (SMOTE), to address the dataset’s inherent imbalance and enhance the models’ predictive performance. Our findings reveal that the inclusion of these resampling techniques significantly improves the predictive power of parametric models, notably increasing the true positive rate for severe crash prediction from 6% to 60% and boosting the geometric mean from 25% to 69% in logistic regression. Likewise, employing SMOTE resulted in a notable improvement in the non-parametric models’ performance, leading to a true positive rate increase from 8% to 36% in XGBoost. Moreover, the study established the superiority of parametric models over non-parametric counterparts when balanced resampling techniques are utilized. Beyond predictive modeling, the study delves into the effects of various contributing factors on crash severity, enhancing the understanding of how these factors influence elderly road safety. Ultimately, these findings underscore the immense potential of machine learning models in analyzing complex crash data, pinpointing factors that heighten crash severity, and informing targeted interventions to mitigate the risks of elderly driving.en
dc.description.versionPublished versionen
dc.format.mimetypeapplication/pdfen
dc.identifier.citationAlrumaidhi, M.; Farag, M.M.G.; Rakha, H.A. Comparative Analysis of Parametric and Non-Parametric Data-Driven Models to Predict Road Crash Severity among Elderly Drivers Using Synthetic Resampling Techniques. Sustainability 2023, 15, 9878.en
dc.identifier.doihttps://doi.org/10.3390/su15139878en
dc.identifier.urihttp://hdl.handle.net/10919/115764en
dc.language.isoenen
dc.publisherMDPIen
dc.rightsCreative Commons Attribution 4.0 Internationalen
dc.rights.urihttp://creativecommons.org/licenses/by/4.0/en
dc.subjectcrash severityen
dc.subjectmachine learningen
dc.subjectresampling techniquesen
dc.subjectimbalance dataen
dc.subjectroad safetyen
dc.subjectelderly driversen
dc.subjecttransportation safetyen
dc.titleComparative Analysis of Parametric and Non-Parametric Data-Driven Models to Predict Road Crash Severity among Elderly Drivers Using Synthetic Resampling Techniquesen
dc.title.serialSustainabilityen
dc.typeArticle - Refereeden
dc.type.dcmitypeTexten

Files

Original bundle
Now showing 1 - 1 of 1
Loading...
Thumbnail Image
Name:
sustainability-15-09878-v2.pdf
Size:
3.67 MB
Format:
Adobe Portable Document Format
Description:
Published version
License bundle
Now showing 1 - 1 of 1
Name:
license.txt
Size:
0 B
Format:
Item-specific license agreed upon to submission
Description: