Integrated Predictive Modeling and Analytics for Crisis Management

TR Number

Date

2024-05-15

Journal Title

Journal ISSN

Volume Title

Publisher

Virginia Tech

Abstract

The surge in the application of big data and predictive analytics in fields of crisis management, such as pandemics and epidemics, highlights the vital need for advanced research in these areas, particularly in the wake of the COVID-19 pandemic. Traditional methods, which typically rely on historical data to forecast future trends, fall short in addressing the complex and ever-changing nature of challenges like pandemics and public health crises. This inadequacy is further underscored by the pandemic's significant impact on various sectors, notably healthcare, government, and the hotel industry. Current models often overlook key factors such as static spatial elements, socioeconomic conditions, and the wealth of data available from social media, which are crucial for a comprehensive understanding and effective response to these multifaceted crises.

This thesis employs spatial forecasting and predictive analytics to address crisis management in several distinct but interrelated contexts: the COVID-19 pandemic, the opioid crisis, and the impact of the pandemic on the hotel industry. The first part of the study focuses on using big data analytics to explore the relationship between socioeconomic factors and the spread of COVID-19 at the zip code level, aiming to predict high-risk areas for infection. The second part delves into the opioid crisis, utilizing semi-supervised deep learning techniques to monitor and categorize drug-related discussions on Reddit. The third part concentrates on developing spatial forecasting and providing explanations of the rising epidemic of drug overdose fatalities. The fourth part of the study extends to the realm of the hotel industry, aiming to optimize customer experience by analyzing online reviews and employing a localized Large Language Model to generate future customer trends and scenarios. Across these studies, the thesis aims to provide actionable insights and comprehensive solutions for effectively managing these major crises.

For the first work, the majority of current research in pandemic modeling primarily relies on historical data to predict dynamic trends such as COVID-19. This work makes the following contributions in spatial COVID-19 pandemic forecasting: 1) the development of a unique model solely employing a wide range of socioeconomic indicators to forecast areas most susceptible to COVID-19, using detailed static spatial analysis, 2) identification of the most and least influential socioeconomic variables affecting COVID-19 transmission within communities, 3) construction of a comprehensive dataset that merges state-level COVID-19 statistics with corresponding socioeconomic attributes, organized by zip code.

For the second work, we make the following contributions in detecting drug Abuse crisis via social media: 1) enhancing the Dynamic Query Expansion (DQE) algorithm to dynamically detect and extract evolving drug names in Reddit comments, utilizing a list curated from government and healthcare agencies, 2) constructing a textual Graph Convolutional Network combined with word embeddings to achieve fine-grained drug abuse classification in Reddit comments, identifying seven specific drug classes for the first time, 3) conducting extensive experiments to validate the framework, outperforming six baseline models in drug abuse classification and demonstrating effectiveness across multiple types of embeddings.

The third study focuses on developing spatial forecasting and providing explanations of the escalating epidemic of drug overdose fatalities. Current research in this field has shown a deficiency in comprehensive explanations of the crisis, spatial analyses, and predictions of high-risk zones for drug overdoses. Addressing these gaps, this study contributes in several key areas: 1) Establishing a framework for spatially forecasting drug overdose fatalities predominantly affecting U.S. counties, 2) Proposing solutions for dealing with scarce and heterogeneous data sets, 3) Developing an algorithm that offers clear and actionable insights into the crisis, and 4) Conducting extensive experiments to validate the effectiveness of our proposed framework.

In the fourth study, we address the profound impact of the pandemic on the hotel industry, focusing on the optimization of customer experience. Traditional methodologies in this realm have predominantly relied on survey data and limited segments of social media analytics. Those methods are informative but fall short of providing a full picture due to their inability to include diverse perspectives and broader customer feedback. Our study aims to make the following contributions: 1) the development of an integrated platform that distinguishes and extracts positive and negative Memorable Experiences (MEs) from online customer reviews within the hotel industry, 2) The incorporation of an advanced analytical module that performs temporal trend analysis of MEs, utilizing sophisticated data mining algorithms to dissect customer feedback on a monthly and yearly scale, 3) the implementation of an advanced tool that generates prospective and unexplored Memorable Experiences (MEs) by utilizing a localized Large Language Model (LLM) with keywords extracted from authentic customer experiences to aid hotel management in preparing for future customer trends and scenarios.

Building on the integrated predictive modeling approaches developed in the earlier parts of this dissertation, this final section explores the significant impacts of the COVID-19 pandemic on the airline industry. The pandemic has precipitated substantial financial losses and operational disruptions, necessitating innovative crisis management strategies within this sector. This study introduces a novel analytical framework, EAGLE (Enhancing Airline Groundtruth Labels and Review rating prediction), which utilizes Large Language Models (LLMs) to improve the accuracy and objectivity of customer sentiment analysis in strategic airline route planning. EAGLE leverages LLMs for zero-shot pseudo-labeling and zero-shot text classification, to enhance the processing of customer reviews without the biases of manual labeling. This approach streamlines data analysis, and refines decision-making processes which allows airlines to align route expansions with nuanced customer preferences and sentiments effectively. The comprehensive application of LLMs in this context underscores the potential of predictive analytics to transform traditional crisis management strategies by providing deeper, more actionable insights.

Description

Keywords

Spatial forecasting, Big data analytics, social-media data mining, pandemic forecasting, crisis mitigation

Citation