Crisis Events One-Class Text Classification

dc.contributor.author: Jonnavithula, Prabhath [en]
dc.contributor.author: Sanghi, Nekunj [en]
dc.contributor.author: Holder, Gabriel [en]
dc.contributor.author: Srinivas, Anav [en]
dc.contributor.author: Yirdaw, Menase [en]
dc.date.accessioned: 2025-06-04T15:16:06Z [en]
dc.date.available: 2025-06-04T15:16:06Z [en]
dc.date.issued: 2025-05-07 [en]
dc.description.abstract: This project aims to design and develop a one-class text classification system that processes crisis-related web pages and extracts insights from them with high precision. Unlike traditional binary classifiers, our approach addresses the practical challenge of classifying documents when only examples of one class (i.e., the crisis event and related articles) are available and the negative class is undefined or highly variable. One-class classification (OCC) offers a more effective solution for this problem by treating non-crisis content as outliers or anomalies. The final deliverable will be an integrated web application that allows users to input URLs related to a crisis event. The backend will scrape, clean, and preprocess webpage content using tools such as requests and BeautifulSoup. The core machine learning engine, implemented using both traditional OCC algorithms (One-Class SVM) and advanced deep learning methods (specifically the DOCC method with PyTorch), will evaluate each page for relevance. Results will be presented through a React-based user interface, supported by a FastAPI backend and a SQLite database for persistent storage and retrieval. Our pipeline consists of data collection, preprocessing, model training, evaluation, and visualization, all integrated into a web app and verified through end-to-end testing. After finalizing the technology stack and dividing roles, we have implemented the first version of our front-end and ML model. This project not only serves a practical societal need by identifying and surfacing timely crisis information but also deepens our understanding of anomaly detection and full-stack application development in a real-world setting. [en]
dc.identifier.uri: https://hdl.handle.net/10919/135044 [en]
dc.language.iso: en_US [en]
dc.publisher: Virginia Tech [en]
dc.title: Crisis Events One-Class Text Classification [en]
dc.type: Report [en]
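
The abstract describes scoring pages for relevance with a One-Class SVM trained only on crisis-related text. The following is a minimal sketch of that idea, not the project's actual implementation: it assumes scikit-learn is available, uses TF-IDF features and a linear kernel as illustrative choices, and the example documents, the `train_occ` and `classify` helpers, and the `nu` value are all hypothetical.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import OneClassSVM

# Hypothetical positive-only training set: text about a single crisis event.
crisis_docs = [
    "Hurricane makes landfall, thousands evacuated from coastal towns",
    "Emergency shelters open as hurricane flooding cuts power to the region",
    "Rescue crews search flooded neighborhoods after the hurricane",
    "Officials order evacuation ahead of hurricane storm surge",
]


def train_occ(docs, nu=0.2):
    """Fit TF-IDF features and a One-Class SVM on positive examples only.

    nu is an upper bound on the fraction of training documents treated as
    outliers; 0.2 is an arbitrary illustrative value.
    """
    model = make_pipeline(
        TfidfVectorizer(stop_words="english"),
        OneClassSVM(kernel="linear", nu=nu),
    )
    model.fit(docs)  # no labels: every training document is the crisis class
    return model


def classify(model, pages):
    """Return +1 for pages that look like the crisis class, -1 for outliers."""
    return model.predict(pages)


model = train_occ(crisis_docs)

new_pages = [
    "Hurricane evacuation routes updated as flooding worsens",
    "Ten easy pasta recipes for a quick weeknight dinner",
]
labels = classify(model, new_pages)
# Signed distance to the learned boundary; higher means more crisis-like.
scores = model.decision_function(new_pages)

for page, label, score in zip(new_pages, labels, scores):
    print(f"{label:+d}  {score:+.3f}  {page}")
```

In a pipeline like the one described, the input strings would come from the scraped and cleaned page text rather than hand-written snippets, and the decision score could be stored alongside each URL so that borderline pages can be reviewed.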

Files

Original bundle
Name: Final Report.pdf
Size: 3.13 MB
Format: Adobe Portable Document Format

Name: Final Report.docx
Size: 3.6 MB
Format: Microsoft Word XML

Name: Final Presentation.pdf
Size: 586.73 KB
Format: Adobe Portable Document Format

Name: Final Presentation.pptx
Size: 2.23 MB
Format: Microsoft Powerpoint XML

Name: MultiMedia_Project-main.zip
Size: 59.16 KB
Format:

License bundle
Name: license.txt
Size: 1.5 KB
Format: Item-specific license agreed upon to submission
Description: