
Integrated Web App for Crisis Events Crawling

dc.contributor.author | Hong, Michelle | en
dc.contributor.author | Rathje, Sondra | en
dc.contributor.author | Angeley, Stephen | en
dc.contributor.author | Teaford, Jordan | en
dc.contributor.author | Braun, Kristian | en
dc.date.accessioned | 2024-09-03T18:25:08Z | en
dc.date.available | 2024-09-03T18:25:08Z | en
dc.date.issued | 2024-04 | en
dc.description | Two previous projects each built a web app for retrieving webpages about a crisis event. The first provided a web interface for building and using a one-class classifier to judge whether a webpage is related to a crisis event; the second provided a web interface for crawling the web for pages related to a crisis event. This project merges the two into a single app with an integrated web interface for preparing the one-class classifier and then using it to crawl the web. | en
dc.description.abstract | The integration of a web crawler and a text classifier into a unified web application is a practical advancement in digital tools for crisis event information retrieval and parsing. This project combines HTML text processing techniques and a priority-based web crawling algorithm into a system capable of gathering and classifying web content with high relevance to specific crisis events. Using the classifier project's model, trained with targeted data, the application enhances the crawler's ability to identify and prioritize the content most pertinent to the crisis at hand. The transition from Firebase to MongoDB for backend services provides a more flexible, accessible, and permanent database solution. The backend is further supported by a Flask API, which handles interaction between the frontend, the machine learning model, and the database. This setup streamlines data flow within the application and simplifies maintenance and scaling of the system. The integrated web app aims to serve as a valuable tool for stakeholders involved in crisis management, such as journalists, first responders, and policy makers, enabling them to access timely and relevant information swiftly. Development posed many challenges in repairing the two original projects: neither was functional out of the box when obtained from its repository, and both had incomplete documentation, leaving much for our team to work out on our own. The result of our team's work is a redesigned frontend, backend, and local MongoDB database, brought together into a cohesive, full application. | en
dc.description.sponsorship | Dr. Mohammad Farag | en
dc.identifier.uri | https://hdl.handle.net/10919/121055 | en
dc.language.iso | en_US | en
dc.publisher | Virginia Tech | en
dc.rights | CC0 1.0 Universal | en
dc.rights.uri | http://creativecommons.org/publicdomain/zero/1.0/ | en
dc.subject | text classifier | en
dc.subject | text classification | en
dc.subject | web crawler | en
dc.subject | information retrieval | en
dc.subject | web crawling | en
dc.subject | crisis event | en
dc.title | Integrated Web App for Crisis Events Crawling | en
dc.title.alternative | Building an Integrated Web App for crawling and classifying webpages about a crisis event. | en
dc.type | Report | en
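
The abstract describes a priority-based crawler whose ordering is guided by a relevance classifier. The record does not give the project's actual algorithm, so the following is only a minimal sketch of the general idea under stated assumptions: a simple keyword-overlap score stands in for the trained one-class classifier, a small in-memory dictionary stands in for the web, and every name here (`keyword_score`, `priority_crawl`, `toy_fetch`, `PAGES`, `KEYWORDS`) is hypothetical.

```python
import heapq
from typing import Callable, List, Set, Tuple

# Hypothetical toy "web": url -> (page text, outgoing links).
PAGES = {
    "a": ("flood warning issued for river town", ["b", "c"]),
    "b": ("flood rescue teams evacuate town residents", ["d"]),
    "c": ("local bakery wins pie contest", ["e"]),
    "d": ("river flood damage evacuate report", []),
    "e": ("pie recipes", []),
}
KEYWORDS = {"flood", "evacuate", "rescue", "river"}


def keyword_score(text: str, keywords: Set[str]) -> float:
    """Stand-in for a classifier: fraction of keywords found in the text."""
    words = set(text.lower().split())
    return len(words & keywords) / len(keywords) if keywords else 0.0


def toy_fetch(url: str) -> Tuple[str, List[str]]:
    """Stand-in for an HTTP fetch plus HTML link extraction."""
    return PAGES.get(url, ("", []))


def priority_crawl(
    seeds: List[str],
    fetch: Callable[[str], Tuple[str, List[str]]],
    score: Callable[[str], float],
    max_pages: int = 10,
    threshold: float = 0.5,
) -> List[Tuple[str, float]]:
    """Visit pages in order of the relevance of the page that linked to them."""
    # heapq is a min-heap, so priorities are stored negated.
    frontier = [(-1.0, url) for url in seeds]
    heapq.heapify(frontier)
    seen = set(seeds)
    relevant = []
    visited = 0
    while frontier and visited < max_pages:
        _, url = heapq.heappop(frontier)
        text, links = fetch(url)
        visited += 1
        page_score = score(text)
        if page_score >= threshold:
            relevant.append((url, page_score))
        # Links inherit the score of the page they were found on.
        for link in links:
            if link not in seen:
                seen.add(link)
                heapq.heappush(frontier, (-page_score, link))
    return relevant


result = priority_crawl(
    ["a"], toy_fetch, lambda t: keyword_score(t, KEYWORDS), max_pages=3
)
print(result)
```

With a budget of three pages, the crawl spends it on the seed and the two flood-related pages, and never fetches the unrelated branch. In a real system the scoring function would be replaced by the trained classifier and `toy_fetch` by an actual HTTP client and HTML parser.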

Files

Original bundle
Name:
IntegratedWebAppforCrisisEventsReport.pdf
Size:
3.78 MB
Format:
Adobe Portable Document Format
Name:
IntegratedWebAppforCrisisEventsReport.docx
Size:
4.61 MB
Format:
Microsoft Word XML
Name:
IntegratedWebAppforCrisisEventsPresentation.pptx
Size:
1.48 MB
Format:
Microsoft Powerpoint XML
Name:
IntegratedWebAppforCrisisEventsPresentation.pdf
Size:
957.46 KB
Format:
Adobe Portable Document Format
Name:
IntegratedWebAppforCrisisEventsRepository.zip
Size:
51.19 MB
Format:
License bundle
Name:
license.txt
Size:
1.5 KB
Format:
Item-specific license agreed upon to submission