Crisis Events Text Summarization

Abstract

From mass shootings to public health emergencies, crisis events have unfortunately become a prevalent part of today’s world. This project contributes to crisis response by providing an accessible tool for extracting key insights from diverse sources of information. Users create collections that aggregate articles and data relevant to a specific crisis event. They can upload files in several formats: a text file of links, a zip file of articles in text format, or a zip file of HTML content. The program extracts and organizes the information from these sources and stores it in a SQLite database for later retrieval and analysis. A key feature of the system is its flexibility in text summarization. The summarizers currently available are BERT, T5, and NLTK, and new summarizers would be relatively easy to add later. The NLTK and T5 summarizers run relatively quickly, but the BERT summarizer takes minutes to finish, because it is the most powerful of the three and uses a larger model that requires more processing. The front end of the application is written in JavaScript using React.js. The back end consists of the database, the scraper, and the summarizers, all written in Python. The Flask framework handles back-end operations and connects the front end to the database. The summarizer code uses the NLTK, Transformers, PyTorch, and Summarizer libraries, and the web scraper uses the BeautifulSoup4 library to parse HTML. Overall, the project aims to give users a crisis information management tool that efficiently aggregates, extracts, and summarizes data to aid crisis response and decision-making.
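The pluggable-summarizer design described in the abstract might look roughly like the sketch below. It is an illustration under stated assumptions, not the project's actual code: the `articles` table schema, the `SUMMARIZERS` registry, and every function name are hypothetical, and a simple word-frequency summarizer written with the standard library stands in for the NLTK, T5, and BERT summarizers so that the example runs without extra dependencies.

```python
import re
import sqlite3
from collections import Counter
from typing import Callable, Dict

# Hypothetical registry: summarizer name -> function(text, max_sentences) -> summary.
# A new summarizer is added by registering one more function here.
SUMMARIZERS: Dict[str, Callable[[str, int], str]] = {}


def register(name: str):
    """Decorator that adds a summarizer function to the registry."""
    def decorator(func):
        SUMMARIZERS[name] = func
        return func
    return decorator


@register("frequency")
def frequency_summarizer(text: str, max_sentences: int = 2) -> str:
    """Stand-in extractive summarizer: keep the sentences whose words occur most often."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    words = re.findall(r"[a-z']+", text.lower())
    freq = Counter(w for w in words if len(w) > 3)  # crude stop-word filter (assumption)
    ranked = sorted(
        range(len(sentences)),
        key=lambda i: sum(freq[w] for w in re.findall(r"[a-z']+", sentences[i].lower())),
        reverse=True,
    )
    keep = sorted(ranked[:max_sentences])  # restore original sentence order
    return " ".join(sentences[i] for i in keep)


def make_db() -> sqlite3.Connection:
    """Create an in-memory SQLite database with an assumed articles table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE articles ("
        "id INTEGER PRIMARY KEY, collection TEXT NOT NULL, body TEXT NOT NULL)"
    )
    return conn


def add_article(conn: sqlite3.Connection, collection: str, body: str) -> None:
    """Store one extracted article under a named collection."""
    conn.execute("INSERT INTO articles (collection, body) VALUES (?, ?)", (collection, body))


def summarize_collection(conn, collection: str, method: str = "frequency",
                         max_sentences: int = 2) -> str:
    """Concatenate a collection's articles and run the chosen summarizer on them."""
    rows = conn.execute(
        "SELECT body FROM articles WHERE collection = ? ORDER BY id", (collection,)
    ).fetchall()
    text = " ".join(body for (body,) in rows)
    return SUMMARIZERS[method](text, max_sentences)
```

With this shape, supporting another model would mean writing one function with the same `(text, max_sentences)` signature and decorating it with `@register("name")`; the storage and retrieval code would not need to change.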
