Unstructured to Actionable: Extracting wind event impact data for enhanced infrastructure resilience


Virginia Tech


The United States experiences more extreme wind events than any other country, owing to its extensive coastlines, tornado-prone central regions, and varied climate, which together produce a wide array of wind phenomena. Despite advanced meteorological forecasts, these events continue to have significant impacts on infrastructure because of the knowledge gap between hazard prediction and tangible impact. Consequently, disaster managers are increasingly interested in understanding the impacts of past wind events to help formulate strategies that enhance community resilience. However, this data is often unstructured and embedded in various agency documents, which makes it difficult to access and use effectively. It is therefore important to investigate approaches that can distinguish impact data from non-essential information and extract it. This research explores methods that can identify, extract, and summarize sentences containing impact data. The significance of the study lies in addressing the scarcity of historical impact data on structural and community damage, given that such information is dispersed across multiple briefings and damage reports. The research has two main objectives. The first is to extract sentences that provide information on infrastructure or community damage. This task uses zero-shot text classification with the large version of the Bidirectional and Auto-Regressive Transformers model (BART-large) pre-trained on the Multi-Genre Natural Language Inference (MNLI) dataset. The model identifies impact sentences by evaluating the entailment probability of each sentence against user-defined impact keywords. This method addresses the absence of manually labeled data and establishes a framework applicable to various reports. The second objective is to transform the extracted data into easily digestible summaries. This is achieved by using a BART-large model pre-trained on the Cable News Network (CNN)/Daily Mail dataset to generate abstractive summaries, making it easier to understand the key points of the extracted impact data. Because it depends only on user-defined keywords, the approach is versatile and can be adapted to different disasters, including tornadoes, hurricanes, earthquakes, and floods. A case study will demonstrate this methodology, specifically examining the Hurricane Ian impact data found in the Structural Extreme Events Reconnaissance (StEER) damage report.
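The two-step workflow described above (keyword-based zero-shot filtering, then abstractive summarization) could be sketched roughly as follows. This is a minimal illustration, not the author's implementation: the checkpoint names `facebook/bart-large-mnli` and `facebook/bart-large-cnn`, the 0.5 threshold, the example keywords, and all function names are assumptions introduced here. The scorer is passed in as a parameter so the filtering logic can be exercised without downloading a model.

```python
from typing import Callable, Dict, List, Sequence

# A scorer maps (sentence, keywords) to one score in [0, 1] per keyword.
Scorer = Callable[[str, Sequence[str]], Dict[str, float]]


def make_bart_mnli_scorer(model_name: str = "facebook/bart-large-mnli") -> Scorer:
    """Build a scorer backed by the Hugging Face zero-shot pipeline.

    Assumption: the checkpoint name is illustrative; the abstract only
    specifies BART-large trained on MNLI.
    """
    from transformers import pipeline  # lazy import; downloads the model on first use

    classifier = pipeline("zero-shot-classification", model=model_name)

    def score(sentence: str, keywords: Sequence[str]) -> Dict[str, float]:
        # multi_label=True scores each keyword independently, as the
        # probability of entailment versus contradiction.
        result = classifier(sentence, candidate_labels=list(keywords), multi_label=True)
        return dict(zip(result["labels"], result["scores"]))

    return score


def extract_impact_sentences(
    sentences: Sequence[str],
    keywords: Sequence[str],
    scorer: Scorer,
    threshold: float = 0.5,  # hypothetical cut-off, not taken from the source
) -> List[str]:
    """Keep sentences whose best keyword score reaches the threshold."""
    kept = []
    for sentence in sentences:
        scores = scorer(sentence, keywords)
        if scores and max(scores.values()) >= threshold:
            kept.append(sentence)
    return kept


def summarize(text: str, model_name: str = "facebook/bart-large-cnn") -> str:
    """Generate an abstractive summary (assumed checkpoint name)."""
    from transformers import pipeline  # lazy import, as above

    summarizer = pipeline("summarization", model=model_name)
    output = summarizer(text, max_length=130, min_length=30, do_sample=False)
    return output[0]["summary_text"]


if __name__ == "__main__":
    # Hypothetical report sentences and impact keywords, for illustration only.
    report = [
        "Several homes lost roof covering during the storm.",
        "The survey team assembled on the following Monday.",
    ]
    impact_keywords = ["roof damage", "flooding", "power outage"]
    impacts = extract_impact_sentences(report, impact_keywords, make_bart_mnli_scorer())
    if impacts:
        print(summarize(" ".join(impacts)))
```

Because the only disaster-specific input is the keyword list, the same sketch could in principle be pointed at flood or earthquake reports by changing `impact_keywords`; the threshold would likely need tuning for each report type.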



Wind disaster and resilience, Community damage, Text mining, Zero-shot text classification, Impact-based forecasting