Natural Language Processing: Generating a Summary of Flood Disasters

Abstract

When a natural disaster such as a flood occurs, news outlets rush to produce coverage for the general public. Readers may want a clear, concise summary of the event without having to read through hundreds of documents that describe it in different ways. This report describes how to use computational techniques from Natural Language Processing (NLP) to automatically generate a summary of a flood event from a collection of diverse text documents. The body of the report covers NLP topics and techniques that use the NLTK Python library and Apache Hadoop to analyze and summarize a corpus. The report describes how these tools are used but does not explain in depth how they work; it focuses instead on applying them to generate a summary of a flood event.
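
To make the idea of summarizing a collection of documents concrete, the following is a minimal sketch of frequency-based extractive summarization. It is not the method used in the report, which relies on NLTK and Apache Hadoop; it uses only the Python standard library, and the stopword list, the function names (`split_sentences`, `tokenize`, `summarize`), and the scoring rule are all assumptions chosen for illustration.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (illustration only).
STOPWORDS = {
    "the", "a", "an", "of", "in", "and", "to", "is", "was", "were",
    "on", "for", "by", "at", "it", "that", "this", "with", "as", "from",
}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, and drop stopwords."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]


def summarize(documents, num_sentences=2):
    """Return the highest-scoring sentences, in their original order.

    A sentence's score is the average corpus-wide frequency of its
    non-stopword tokens (an assumed scoring rule, not the report's).
    """
    sentences = [s for doc in documents for s in split_sentences(doc)]
    freq = Counter(w for s in sentences for w in tokenize(s))

    def score(sentence):
        words = tokenize(sentence)
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:num_sentences])  # restore document order
    return [sentences[i] for i in chosen]
```

Given a list of document strings, `summarize(documents, num_sentences=2)` returns the two sentences whose words recur most often across the collection. In a fuller pipeline, the naive splitter and tokenizer here could be replaced with NLTK's tokenizers, and the word counting could be distributed with Hadoop for a large corpus.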

Description

"Flood Presentation" in both PowerPoint and PDF formats is from the final in-class presentation. Floods_Group_H.pdf is the PDF version of the final report document. LatexDocument.zip has the original version of that document.

Keywords

Natural Language Processing, Flooding, Machine Learning, Named Entity Recognition, NER, Hadoop, Mahout, Big Data, NLTK

Citation

NSF DUE-1141209 and IIS-1319578