Browsing by Author "Xu, Jingbin"
- Does Eyeglance Affect Lane Change Safety: Analysis of Eyeglance Pattern Prior to Lane Change
  Guo, Feng; Han, Shu; Xu, Jingbin (National Surface Transportation Safety Center for Excellence, 2022-09-23)
  A driver's eyeglance pattern prior to a lane change can have a major impact on crash risk. This study focuses on the areas of interest (AOIs) in eyeglances related to lane changes, including the rearview mirror, left/right window, left/right mirror, windshield, and over-the-shoulder (OTS) checks in the corresponding lane-change direction. Key AOI characteristics such as type, percentage, duration, timing, and time-varying properties were examined thoroughly. We also evaluated driver attention to the driving task and how it changed over time by event type, using the AttenD algorithm to reconstruct eyeglance data into a continuous variable (a simplified illustrative sketch appears after these listings). The AttenD score incorporates the glance history in the profile to reflect how effectively a driver may be allocating attention and storing information about the roadway and other vehicles; a higher AttenD score indicates more attention on primary driving tasks. Drivers in baseline events had significantly higher attention scores and lower variance than those in near-crashes and crashes, indicating that drivers who conducted a safe lane change tended to look away from the road less often and were more consistent in allocating eyeglances forward and to the surrounding environment.
- Statistical Learning for Sequential Unstructured Data
  Xu, Jingbin (Virginia Tech, 2024-07-30)
  Unstructured data, which cannot be organized into predefined structures, such as texts, human behavior status, and system logs, are often presented in a sequential format with inherent dependencies. Probabilistic models are commonly used to capture these dependencies in the data generation process through latent parameters and can naturally extend into hierarchical forms. However, these models rely on the correct specification of assumptions about the sequential data generation process, which often limits their ability to learn at scale. The emergence of neural network tools has enabled scalable learning for high-dimensional sequential data. From an algorithmic perspective, efforts are directed toward reducing dimensionality and representing unstructured data units as dense vectors in low-dimensional spaces, learned from unlabeled data, a practice often referred to as numerical embedding. While these representations offer measures of similarity, automated generalization, and semantic understanding, they frequently lack the statistical foundations required for explicit inference. This dissertation aims to develop statistical inference techniques tailored to the analysis of unstructured sequential data, with applications in the field of transportation safety.

  The first part of the dissertation presents a two-stage method. It adopts numerical embedding to map large-scale unannotated data into numerical vectors. Subsequently, a kernel test using maximum mean discrepancy is employed to detect abnormal segments within a given time period (an illustrative sketch appears after these listings). Theoretical results show that learning from the numerical vectors is equivalent to learning directly from the raw data. A real-world example illustrates how mismatched driver visual behavior occurred during a lane change.

  The second part of the dissertation introduces a two-sample test for comparing text generation similarity. The hypothesis tested is whether the probabilistic mapping measures that generate textual data are identical for two groups of documents. The proposed test compares the likelihood of text documents, estimated through neural network-based language models under the autoregressive setup. The test statistic is derived from an estimation-and-inference framework that first approximates the data likelihood on an estimation set before performing inference on the remaining part. The theoretical result indicates that the test statistic's asymptotic behavior approximates a normal distribution under mild conditions. Additionally, a multiple data-splitting strategy combines p-values into a unified decision to enhance the test's power (the split-and-combine idea is sketched after these listings).

  The third part of the dissertation develops a method to measure differences in text generation between a benchmark dataset and a comparison dataset, focusing on word-level generation variations. This method uses the sliced-Wasserstein distance to compute a contextual discrepancy score, and a resampling method establishes a threshold to screen the scores (an illustrative sketch appears after these listings). Crash report narratives are analyzed to compare crashes involving vehicles equipped with level 2 advanced driver assistance systems and those involving human drivers.
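
Relating to the first item above: the abstract describes using the AttenD algorithm to turn discrete eyeglances into a continuous attention score. The sketch below is a simplified, illustrative buffer of that general kind, not the exact algorithm or parameters used in the study; the AOI labels, time step, buffer size, and latency value are assumptions chosen only for the example.

```python
import numpy as np

# Hypothetical AOI labels; the study's actual coding scheme may differ.
FORWARD = "windshield"
DRIVING_RELATED = {"rearview_mirror", "left_mirror", "right_mirror",
                   "left_window", "right_window", "over_the_shoulder"}

def attention_buffer(glances, dt=0.1, buffer_max=2.0, latency=1.0):
    """Simplified AttenD-style attention buffer.

    glances : sequence of AOI labels sampled every `dt` seconds.
    Returns an array of buffer values in [0, buffer_max]; higher values
    mean more recent attention on the forward roadway.
    """
    buf = buffer_max                     # start with a full buffer
    related_time = 0.0                   # time spent on driving-related AOIs
    out = np.empty(len(glances))
    for i, aoi in enumerate(glances):
        if aoi == FORWARD:
            buf = min(buffer_max, buf + dt)      # refill while looking forward
            related_time = 0.0
        elif aoi in DRIVING_RELATED:
            related_time += dt
            if related_time > latency:           # grace period for mirror/OTS checks
                buf = max(0.0, buf - dt)
        else:                                    # off-road or secondary-task glance
            buf = max(0.0, buf - dt)
            related_time = 0.0
        out[i] = buf
    return out

# Example: forward glances, a mirror check, and an over-the-shoulder check.
sequence = (["windshield"] * 10 + ["left_mirror"] * 8 +
            ["windshield"] * 5 + ["over_the_shoulder"] * 12 +
            ["windshield"] * 15)
scores = attention_buffer(sequence)
print(scores.min(), scores[-1])
```

Averaging or tracking the variance of such a score over an event window is one way to summarize how consistently a driver keeps attention forward, in the spirit of the baseline-versus-near-crash comparison described in the abstract.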
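Relating to the first part of the dissertation: a kernel two-sample test based on maximum mean discrepancy (MMD) can flag segments whose embedded representation differs from a baseline. The sketch below is a generic unbiased-MMD estimator with a permutation null, assuming an RBF kernel and toy Gaussian vectors in place of real embeddings; the dissertation's embedding model, kernel choice, and theoretical calibration are not reproduced here.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of X and Y."""
    sq = (np.sum(X**2, axis=1)[:, None] + np.sum(Y**2, axis=1)[None, :]
          - 2.0 * X @ Y.T)
    return np.exp(-gamma * sq)

def mmd2_unbiased(X, Y, gamma=1.0):
    """Unbiased estimate of the squared MMD between samples X and Y."""
    m, n = len(X), len(Y)
    Kxx = rbf_kernel(X, X, gamma); np.fill_diagonal(Kxx, 0.0)
    Kyy = rbf_kernel(Y, Y, gamma); np.fill_diagonal(Kyy, 0.0)
    Kxy = rbf_kernel(X, Y, gamma)
    return (Kxx.sum() / (m * (m - 1))
            + Kyy.sum() / (n * (n - 1))
            - 2.0 * Kxy.mean())

def permutation_pvalue(X, Y, gamma=1.0, n_perm=500, seed=0):
    """Permutation null distribution for the MMD statistic."""
    rng = np.random.default_rng(seed)
    obs = mmd2_unbiased(X, Y, gamma)
    Z = np.vstack([X, Y])
    exceed = 0
    for _ in range(n_perm):
        idx = rng.permutation(len(Z))
        exceed += mmd2_unbiased(Z[idx[:len(X)]], Z[idx[len(X):]], gamma) >= obs
    return (exceed + 1) / (n_perm + 1)

# Toy usage: rows stand in for embedding vectors of behavior segments.
rng = np.random.default_rng(1)
baseline = rng.normal(0.0, 1.0, size=(100, 16))    # reference segments
candidate = rng.normal(0.4, 1.0, size=(100, 16))   # possibly abnormal segment
print(permutation_pvalue(baseline, candidate))
```

A small p-value would mark the candidate segment as distributionally different from the baseline under this toy setup.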
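Relating to the second part: only the split-and-combine idea is illustrated here. The per-split statistic below is a generic Welch t-test on hypothetical held-out per-document log-likelihoods, standing in for the dissertation's own asymptotically normal statistic, and the p-values from multiple data splits are merged with a Cauchy combination rule, one common way to combine dependent p-values.

```python
import numpy as np
from scipy import stats

def split_test_pvalue(ll_a, ll_b):
    """One data split: compare held-out per-document log-likelihoods of the
    two document groups. A Welch t-test is a generic stand-in here, not the
    dissertation's statistic."""
    res = stats.ttest_ind(ll_a, ll_b, equal_var=False)
    return res.pvalue

def cauchy_combine(pvals):
    """Cauchy combination rule: merges p-values from multiple data splits
    into one decision and tolerates dependence between the splits."""
    pvals = np.clip(np.asarray(pvals, dtype=float), 1e-15, 1 - 1e-15)
    stat = np.mean(np.tan((0.5 - pvals) * np.pi))
    return float(1 - stats.cauchy.cdf(stat))

# Toy usage with hypothetical held-out per-document log-likelihoods
# produced by a language model for two groups of documents, over K splits.
rng = np.random.default_rng(3)
pvalues = []
for _ in range(10):                                  # K = 10 data splits
    ll_group_a = rng.normal(-45.0, 5.0, size=80)
    ll_group_b = rng.normal(-47.0, 5.0, size=80)
    pvalues.append(split_test_pvalue(ll_group_a, ll_group_b))
print("combined p-value:", cauchy_combine(pvalues))
```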
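Relating to the third part: a Monte Carlo sliced-Wasserstein distance between two clouds of contextual word embeddings, with a resampling threshold obtained by splitting the benchmark sample against itself. The function names, number of projections, and thresholding scheme are illustrative assumptions rather than the dissertation's exact procedure.

```python
import numpy as np

def sliced_wasserstein(X, Y, n_proj=100, n_grid=200, seed=0):
    """Monte Carlo sliced Wasserstein-1 distance between two point clouds
    (rows are contextual embedding vectors of a word in each corpus)."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    q = np.linspace(0.0, 1.0, n_grid)
    total = 0.0
    for _ in range(n_proj):
        theta = rng.normal(size=d)
        theta /= np.linalg.norm(theta)        # random direction on the sphere
        fx = np.quantile(X @ theta, q)        # quantile functions of the
        fy = np.quantile(Y @ theta, q)        # projected 1-D samples
        total += np.mean(np.abs(fx - fy))     # 1-D Wasserstein-1 distance
    return total / n_proj

def resampling_threshold(X, alpha=0.05, n_boot=200, seed=0):
    """Null threshold: repeatedly split the benchmark cloud against itself
    and take the (1 - alpha) quantile of the resulting distances."""
    rng = np.random.default_rng(seed)
    half = len(X) // 2
    null_stats = []
    for _ in range(n_boot):
        idx = rng.permutation(len(X))
        null_stats.append(sliced_wasserstein(X[idx[:half]], X[idx[half:]],
                                             n_proj=20,
                                             seed=int(rng.integers(1_000_000))))
    return np.quantile(null_stats, 1.0 - alpha)

# Toy usage: embeddings of the same word in a benchmark corpus and a
# comparison corpus; a score above the threshold flags a word-level shift.
rng = np.random.default_rng(2)
benchmark = rng.normal(0.0, 1.0, size=(300, 32))
comparison = rng.normal(0.3, 1.0, size=(300, 32))
score = sliced_wasserstein(benchmark, comparison)
print(score, score > resampling_threshold(benchmark))
```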