Segmentation and Recognition of Highway Assets using Image-based 3D Point Clouds and Semantic Texton Forests
Abstract
Efficient data collection for high-quantity, low-cost highway assets such as road signs, traffic signals, light poles, and guardrails is critical to the operation, maintenance, and preservation of transportation infrastructure systems. Despite its importance, the current practice of highway asset data collection is time-consuming, subjective, and potentially unsafe. The high volume of data that needs to be collected can also negatively impact the quality of the analysis. To address these limitations, this paper proposes a new set of algorithms for semantic segmentation and recognition of highway assets using video frames collected from a car-mounted camera. The proposed algorithms (1) reconstruct a 3D point cloud model of the highway and surrounding assets from the captured frames using a pipeline of Structure from Motion and Multi-View Stereo; (2) segment each geo-registered 2D video frame at the pixel level with a Semantic Texton Forest classifier, based on the shape, texture, and color of the highway assets; and (3) assign each reconstructed 3D point in the cloud to one asset category, based on the 2D segmentation results and a new voting scheme, and color-code it accordingly. The resulting augmented reality environment, which integrates the color-coded point clouds with the geo-registered video frames, enables a user to conduct visual walk-throughs and query different categories of assets. Experiments were performed on a challenging video dataset containing sequences filmed from a moving car on a 2.2-mile-long, two-lane highway research facility. Experimental results, with average accuracies of 76.50% and 86.75% in segmentation and pixel-level recognition, respectively, of 12 asset categories, show the promise of this approach for segmentation and recognition of highway assets from image-based 3D point clouds. The approach also enables future algorithmic developments for 3D localization of traffic signs and other assets that are detected using state-of-the-art vision-based methods.
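
Step (3) of the abstract, transferring 2D pixel labels to 3D points through a vote, can be illustrated with a small sketch. The abstract does not specify the voting scheme, so the code below assumes a plain majority vote over the frames in which a point is visible. The function name `vote_point_labels`, the `observations` structure (point id mapped to the pixels where the point was observed), and the `label_maps` structure (frame id mapped to a per-pixel label grid) are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def vote_point_labels(observations, label_maps):
    """Assign one label per 3D point by majority vote (illustrative assumption).

    observations: dict mapping point_id -> list of (frame_id, row, col),
        the pixels at which that 3D point was observed.
    label_maps: dict mapping frame_id -> 2D list of per-pixel class labels,
        standing in for the output of a 2D segmentation step.
    Returns a dict mapping point_id -> winning label, or None when the
    point has no valid observation.
    """
    point_labels = {}
    for point_id, views in observations.items():
        votes = Counter()
        for frame_id, row, col in views:
            frame = label_maps.get(frame_id)
            if frame is None:
                continue  # no segmentation available for this frame
            if 0 <= row < len(frame) and 0 <= col < len(frame[row]):
                votes[frame[row][col]] += 1
        # Take the most frequent label; points without votes stay unlabeled.
        point_labels[point_id] = votes.most_common(1)[0][0] if votes else None
    return point_labels


# Toy data: three 2x2 segmented frames and three reconstructed points.
label_maps = {
    0: [["road", "sign"], ["road", "road"]],
    1: [["sign", "sign"], ["road", "guardrail"]],
    2: [["sign", "road"], ["road", "road"]],
}
observations = {
    "p1": [(0, 0, 1), (1, 0, 0), (2, 0, 1)],  # sign, sign, road
    "p2": [(0, 1, 0), (1, 1, 0)],             # road, road
    "p3": [(5, 0, 0)],                        # frame 5 is not segmented
}
result = vote_point_labels(observations, label_maps)
```

In this toy case `p1` receives the label seen in two of its three views, which shows how a vote can absorb an occasional per-frame misclassification. A weighted vote (for example, by classifier confidence or viewing distance) would be a natural variation, but nothing in the abstract indicates which scheme the authors use.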