A Dataset of Vehicle and Pedestrian Trajectories from Normal Driving and Crash Events in One Year of Virginia Traffic Camera Data



Virginia Tech


Traffic cameras are cameras operated for the purpose of observing traffic, often streaming real-time video to traffic management centers. These video streams allow transportation authorities to respond to traffic events and maintain situational awareness. However, traffic cameras also have the potential to directly capture crashes and conflicts, providing enough information to perform reconstruction and gain insights regarding causation and remediation. Beyond crash events, traffic camera video also offers an opportunity to study normal driving. Normal driver behavior is important to traffic planners and vehicle designers, and, in the form of numerical driver models, is vital to the development of automated vehicles. Traffic cameras installed by state departments of transportation have already been placed in locations relevant to their interests. A wide range of driver behavior can be studied from these locations by observing vehicles at all times and under all weather conditions. Current systems that analyze traffic camera video focus on detecting when traffic events occur and provide very little information about the specifics of those events. Prior studies of traffic event detection or reconstruction used one to seven cameras placed by the researchers and collected dozens of hours of video. Crashes and other events of interest are rare and cannot be sufficiently characterized by camera installations of that size. The objective of this dissertation was to explore the utility of traffic camera data for transportation research by modeling and characterizing crash and non-crash behavior of pedestrians and drivers using a captured dataset of traffic camera video from the Commonwealth of Virginia, named the VT-CAST (Virginia Traffic Cameras for Advanced Safety Technologies) 2020 dataset. A total of 6,779,726 hours of traffic camera video was captured from live internet streams between 4:00 PM on December 17, 2019 and 11:59 PM on December 31, 2020.
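As a rough consistency check on these figures (an illustrative calculation, not from the dissertation itself), dividing the total video hours by the duration of the capture window gives the average number of concurrently recorded camera streams:

```python
from datetime import datetime

# Capture window and total video hours as stated in the abstract
start = datetime(2019, 12, 17, 16, 0)   # 4:00 PM, December 17, 2019
end = datetime(2020, 12, 31, 23, 59)    # 11:59 PM, December 31, 2020
total_video_hours = 6_779_726

window_hours = (end - start).total_seconds() / 3600
avg_concurrent_streams = total_video_hours / window_hours
print(round(avg_concurrent_streams))  # roughly 740 concurrent camera streams
```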
Video was analyzed by a custom keypoint detector, based on a region-based convolutional neural network (R-CNN), to identify the locations of vehicles on the ground. The OpenPifPaf model was used to identify the locations of pedestrians on the ground. The location, pan, tilt, zoom, and altitude of each traffic camera were reconstructed to develop a mapping between the on-screen locations of vehicles and pedestrians and their physical locations on the surface of the Earth. These physical detections were tracked across time to determine the trajectory on the surface of the Earth of each visible vehicle and pedestrian in a random sample of the captured video. Traffic camera video offers a unique opportunity to study in depth crashes that are not police-reported. Crashes in the traffic camera video were identified, analyzed, and compared to nationally representative datasets. Potential crashes during the study interval were identified by inspecting Virginia 511 traffic alerts for events that occurred near traffic cameras and impacted the flow of traffic. The video from these cameras was manually reviewed to determine whether a crash was visible. Pedestrian crashes, which did not significantly impact traffic, were identified from police accident reports (PARs) in a separate analysis. A total of 292 crashes were identified from traffic alerts, and six pedestrian crashes were identified from PARs. Road departure and rear-end crashes occurred in proportions similar to national databases, but intersection crashes were underrepresented and severe and rollover cases were overrepresented. Among these crashes, 32% of single-vehicle crashes and 50% of multi-vehicle crashes did not appear in the Virginia crash database. This finding shows promise for traffic cameras as a future data source for crash reconstruction and indicates that they are a capable tool for studying unreported crashes.
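The screen-to-ground mapping can be illustrated with a standard pinhole-camera ground-plane intersection. This is a minimal sketch, not the dissertation's actual calibration procedure (which reconstructed each camera's location, pan, tilt, zoom, and altitude from the video itself); it assumes the camera intrinsics `K` and the world-to-camera pose `R`, `t` are already known, and the demo values are hypothetical:

```python
import numpy as np

def pixel_to_ground(K, R, t, uv):
    """Intersect the back-projected ray through pixel (u, v) with the ground
    plane z = 0, returning the world point. K is the 3x3 intrinsic matrix;
    R, t give the world-to-camera transform x_cam = R @ x_world + t."""
    C = -R.T @ t                                                 # camera center in world frame
    d = R.T @ np.linalg.inv(K) @ np.array([uv[0], uv[1], 1.0])   # ray direction in world frame
    s = -C[2] / d[2]                                             # scale where the ray hits z = 0
    return C + s * d

# Demo: a camera 10 m above the origin looking straight down (hypothetical values)
K = np.array([[100.0, 0.0, 50.0],
              [0.0, 100.0, 50.0],
              [0.0, 0.0, 1.0]])
R = np.diag([1.0, -1.0, -1.0])    # 180-degree rotation about x: camera looks along -z
t = np.array([0.0, 0.0, 10.0])
print(pixel_to_ground(K, R, t, (60, 50)))  # a point one metre from the origin
```

In the actual system a geodetic step would follow, converting these local ground-plane coordinates to latitude and longitude using the reconstructed camera position.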
The safe operation of autonomous vehicles requires perception systems that make accurate short-term predictions of driver and pedestrian behavior. While road user behavior can be observed by the autonomous vehicles themselves, traffic camera video offers another potential information source for algorithm development. As a fixed roadside data source, these cameras capture a very large number of traffic interactions at a single location, allowing detailed analyses of important roadway configurations across a wide range of drivers. To evaluate the efficacy of this approach, a total of 58 intersections in the VT-CAST 2020 dataset were sampled for driver trajectories at intersection entry, yielding 58,180 intersection entry trajectories. K-means clustering was used to group these trajectories into a family of 45 trajectory clusters. Distinct groups of accelerating, constant-speed, and decelerating trajectories were present, likely as a function of signal phase. Accelerating and decelerating trajectories each occurred more frequently than constant-speed trajectories. The results indicate that roadside data may be useful for understanding broad trends in typical intersection approaches for application to automated vehicle systems or other investigations; however, the utility of the data would be enhanced by detailed signal phase information. A similar analysis was conducted of the interactions between drivers and pedestrians. A total of 35 crosswalks with sufficient trajectory information were identified in the VT-CAST 2020 dataset, yielding 1,488 trajectories of drivers interacting with pedestrians. K-means clustering was used to group these trajectories into a family of 16 trajectory clusters. Distinct groups of accelerating, constant-speed, and decelerating trajectories were present, including clusters that described vehicles slowing down around pedestrians.
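The clustering step can be sketched as follows. This is an illustrative reconstruction, not the dissertation's code: synthetic accelerating, constant-speed, and decelerating speed profiles stand in for the real trajectories, and a small NumPy implementation of k-means groups them (the study used k = 45 for intersections and k = 16 for crosswalks):

```python
import numpy as np

rng = np.random.default_rng(0)

def make_profiles(kind, n, length=20):
    """Synthetic speed profiles (m/s) sampled over an intersection approach."""
    t = np.linspace(0.0, 1.0, length)
    base = {"accel": 5 + 10 * t, "const": 10 + 0 * t, "decel": 15 - 10 * t}[kind]
    return base + rng.normal(0.0, 0.3, size=(n, length))

# 50 examples of each behavior, stacked as a (150, 20) feature matrix
X = np.vstack([make_profiles(k, 50) for k in ("accel", "const", "decel")])

def kmeans(X, init_idx, iters=20):
    """Plain Lloyd's algorithm with fixed initial centers."""
    centers = X[init_idx].copy()
    for _ in range(iters):
        # Assign each profile to its nearest center (squared Euclidean distance)
        labels = np.argmin(((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1), axis=1)
        # Recompute each center as the mean of its assigned profiles
        centers = np.stack([X[labels == j].mean(axis=0) for j in range(len(centers))])
    return labels

labels = kmeans(X, init_idx=[0, 50, 100])  # one seed per behavior group
```

With profiles this well separated, the three clusters recover the accelerating, constant-speed, and decelerating groups exactly; on real trajectories a larger k captures finer variations such as turn direction and stop-and-go behavior.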
Constant-speed trajectories occurred most often, followed by accelerating and then decelerating trajectories. As with the prior investigation, this finding suggests that roadside data may be used in the development of driver-pedestrian interaction models for automated vehicles and for other use cases involving a combination of pedestrians and vehicles. Overall, this dissertation demonstrates the utility of standard traffic camera data for traffic safety research. As evidence, three studies (beyond this dissertation) are already using the video data and trajectories from the VT-CAST 2020 dataset. Potential future studies include analyzing the mobile phone use of pedestrians, analyzing mid-block pedestrian crossings, automatically performing roadway safety assessments, considering the behavior of drivers following congested driving, evaluating the effectiveness of work zone hazard countermeasures, and understanding roadway encroachments.



traffic camera, deep learning, convolutional neural networks (CNNs), driver behavior, pedestrian behavior, computer vision