Browsing by Author "Xu, Chao"
- Abstractive Text Summarization of the Parkland Shooting Collection
  Kingery, Ryan; Yellapantula, Sudha Ravali; Xu, Chao; Huang, Li Jun; Ye, Jiacheng (Virginia Tech, 2018-12-12)
  We analyze various ways to perform abstractive text summarization on an entire collection of news articles. We specifically seek to summarize the collection of web-archived news articles relating to the 2018 shooting at Marjory Stoneman Douglas High School in Parkland, Florida. The original collection contains about 10,100 archived web pages that mostly relate to the shooting, which after pre-processing reduces to about 3,900 articles that directly relate to the shooting. We then explore several ways to generate abstractive summaries for the collection using deep learning methods. Since current deep learning methods for abstractive summarization can only summarize text at the single-article level or below, we summarize the collection by identifying a set of representative articles, summarizing each of those articles with our deep learning models, and concatenating those summaries to produce a summary for the entire collection. To identify the representative articles, we investigate various unsupervised methods to partition the space of articles into meaningful groups. We try choosing these articles by random sampling from the collection, by using topic modeling, and by sampling from clusters obtained by clustering Doc2Vec embeddings. To summarize each individual article, we explore several state-of-the-art deep learning methods for abstractive summarization: a sequence-to-sequence model, a pointer-generator network, and a reinforced extractor-abstractor network. To evaluate the quality of our summaries we employ two methods. The first is a subjective method, in which each person ranked the quality of each summary. The second is an objective method, which uses various ROUGE metrics to compare each summary to an independently generated gold-standard summary. We found that most ROUGE scores were low overall, with only the pointer-generator network on random articles achieving a ROUGE score above 0.15. This suggests that such deep learning techniques still have considerable room for improvement if they are to be viable for collection summarization.
- Defending Against GPS Spoofing by Analyzing Visual Cues
  Xu, Chao (Virginia Tech, 2020-05-21)
  GPS navigation services are used by billions of people in their daily lives, and GPS spoofing remains a challenging problem. Existing anti-GPS spoofing systems primarily rely on expensive equipment and complicated algorithms, which are neither practical nor deployable for most users. In this thesis, we explore the feasibility of a simple text-based system design for anti-GPS spoofing. The goal is to lower the cost while making the system more effective and robust for general spoofing attack detection. Our key idea is to use only textual information from the physical world to build a real-time system that detects GPS spoofing. To demonstrate the feasibility, we first design image processing modules to collect sufficient textual information from panoramic images. Then, we simulate real-world spoofing attacks in two cities to build our training and testing datasets. We use an LSTM to build a binary classifier, which is the key component of our anti-GPS spoofing system. Finally, we evaluate the system performance by simulating driving tests. We show that our system can achieve more than 98% detection accuracy when the ratio of attacked points in a driving route is more than 50%. Our system shows promising performance against general spoofing attack strategies and demonstrates the feasibility of using textual information for spoofing attack detection.
- Front-End Kibana (FEK) CS5604 Fall 2019
  Powell, Edward; Liu, Han; Huang, Rong; Sun, Yanshen; Xu, Chao (Virginia Tech, 2020-01-13)
  During the last two decades, web search engines have been driven to new quality levels by continuous efforts to optimize the effectiveness of information retrieval. More and more people are satisfied with their information retrieval processes, and web search has gradually replaced older methods, where people obtained information from each other or from libraries. Information retrieval systems are in constant interaction with users and help users interpret and analyze data. Currently, we are building the front end of a search engine, where users can explore information related to Tobacco Settlement documents from the University of California, San Francisco, as well as the Electronic Theses and Dissertations (ETDs) of Virginia Tech (and possibly other sites). This submission introduces the current work of the front-end team to build a functional user interface, which is one of the key components of a larger project to build a state-of-the-art search engine for two large datasets. We also seek to understand how users search for data, and accordingly provide users with more insight and utilities from the two datasets with the help of the visualization tool Kibana. A search website where users can explore the two datasets, the Tobacco Settlement dataset and the ETDs dataset, has already been created. Several functions of the search page have been implemented, including a login system, search, filter functions, a Q&A page, and a visualization page.