Browsing by Author "Fan, Jixiang"
- Big Data Text Summarization for the NeverAgain Movement
Arora, Anuj; Miller, Chreston; Fan, Jixiang; Liu, Shuai; Han, Yi (Virginia Tech, 2018-12-10)
When you are browsing social media websites such as Twitter and Facebook, have you ever seen hashtags like #NeverAgain and #EnoughIsEnough? Do you know what they mean? Never Again is an American student-led political movement for gun control to prevent gun violence. In the United States, gun control has long been debated. According to data from the Gun Violence Archive (http://www.shootingtracker.com/), the U.S. saw a total of 346 mass shootings in 2017. Supporters of gun control claim that the proliferation of firearms directly fuels social unrest such as robbery, sexual crimes, and theft, while others believe that gun culture is an integral part of their freedom. For the Never Again gun control movement, we set out to generate a human-readable summary using deep learning methods, so that one can study incidents of gun violence that shocked the world, such as the 2017 Las Vegas shooting, and assess the impact of gun proliferation. Our project includes three steps: pre-processing, topic modeling, and abstractive summarization using deep learning. We began with a large collection of news articles associated with the #NeverAgain movement. The raw news articles needed to be pre-processed in multiple ways. An ArchiveSpark script was used to convert the WARC and CDX files to readable, parseable JSON. However, we found that at least forty percent of the data was noise, so we applied a series of restrictive word filters to remove it. After noise removal, we identified the most frequent words to get a preliminary idea of whether we were filtering noise properly. We used the Natural Language Toolkit (NLTK) named entity chunker to generate named entities, which are phrases that form important nouns (people, places, organizations, etc.) in a sentence.
For topic modeling, we classified sentences into different buckets, or topics, which identified distinct themes in the collection. The latent Dirichlet allocation (LDA) algorithm used for topic modeling cannot take the normalized and tokenized corpus directly, so during dictionary creation and document vectorization each article in the collection had to be converted into a vector. We chose the bag-of-words (BOW) approach, a simplifying representation used in natural language processing and information retrieval. In this model, a text (such as a sentence or a document) is represented as the bag (multiset) of its words, disregarding grammar and even word order but keeping multiplicity. Topic modeling also requires choosing the number of topics in advance, which means one must guess how many topics are present in a collection. There is no foolproof way of replacing human judgment in weaving keywords into topics with semantic meaning. To address this, we tried the coherence score approach. The coherence score is an attempt to mimic the human readability of a topic: the higher the coherence score, the more "coherent" the topics are considered. The last step of topic modeling is LDA itself, a generative statistical model that allows sets of observations to be explained by unobserved groups, which in turn explain why some parts of the data are similar. Unlike some other algorithms, LDA is probabilistic, which makes it better at handling mixtures of topics within documents. In addition, LDA identifies topics coherently, whereas the topics from other algorithms are more disjoint. After we had our topics (three in total), we filtered the article collection based on them. The result was three distinct collections of articles to which we could apply an abstractive summarization algorithm to produce a coherent summary.
To produce the summaries, we chose a Pointer-Generator Network (PGN), a deep learning approach designed to create abstractive summaries. We created a summary for each identified topic and performed post-processing to connect the three related topics into a single summary that flowed. The result was a summary that reflected the main themes of the article collection and informed the reader of its contents in fewer than two pages.
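The bag-of-words representation described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names `bag_of_words` and `vectorize`, the whitespace tokenizer, and the two sample documents are assumptions made purely for illustration.

```python
from collections import Counter


def bag_of_words(text):
    """Return the multiset of lowercase, whitespace-separated tokens in text."""
    # Hypothetical tokenizer: the project's actual normalization is not specified here.
    return Counter(text.lower().split())


def vectorize(bag, vocabulary):
    """Map a bag of words to a count vector ordered by the given vocabulary."""
    # A Counter returns 0 for words it has never seen.
    return [bag[word] for word in vocabulary]


# Two toy documents with the same words in a different order.
docs = ["gun control gun debate", "debate control gun gun"]
bags = [bag_of_words(d) for d in docs]

# Shared vocabulary across the collection, sorted for a stable column order.
vocabulary = sorted(set().union(*bags))
vectors = [vectorize(b, vocabulary) for b in bags]
```

Because word order is discarded and multiplicity is kept, both toy documents map to the same count vector. A topic model such as LDA would consume one such vector per article.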
- Chatterbox Opener: A Game to Support Healthy Communication and Relationships
Wang, Wei-Lu; Haqq, Derek; Saaty, Morva; Cao, Yusheng; Fan, Jixiang; Patel, Jaitun V.; McCrickard, D. Scott (ACM, 2023-10-06)
Computer-mediated communication (CMC) applications are used to foster closer relationships between individuals. Various shared-experience design strategies have been widely applied to technologies in order to enhance communication and interaction in family relationships. However, more research is needed on how shared-experience approaches work across different family communication patterns. Drawing on a diary study of Chatterbox Opener, a game we developed for families and couples to enhance communication orientation, this paper presents insights into the effectiveness of three types of shared-experience approaches for different family communication patterns, along with design considerations for game design.
- Education in HCI Outdoors: A Diary Study Approach
Fan, Jixiang; Saaty, Morva; McCrickard, D. Scott (ACM, 2024-06-05)
To assist students and educators in more deeply grasping user technology needs in busy outdoor settings, we recommend using diary study assignments adapted from social science and human-computer interaction (HCI) research. This suggestion is based on insights that the field of HCI has expanded from computer use in controlled, indoor environments to technology application research in broader contexts, especially outdoor environments, where diary studies yield important insights. This can be seen in areas like social media, augmented reality, citizen science, and geolocation-based games, where it is difficult to understand the user experience through short-term, controlled exposure. Instead, educators must encourage students to step out of the classroom and into the real world to observe and experience interactions during multiple-use sessions over an extended time period, which offers students in-depth insights into real-world technology use, thereby setting the stage for them to design more human-focused technology applications and services that better meet user needs. This paper explores the utilization of the diary study methodology within the context of HCI education, examining its distinctive benefits and exposing tradeoffs in its challenges. Benefits discussed in the paper include adaptability to a wide array of user needs and circumstances, the capability to yield profound insights into the application of technology in real-world settings, and effectiveness in uncovering privacy concerns in daily life. Concurrently, we identify some practical challenges and introduce targeted strategies for addressing them, such as maintaining consistent student engagement, devising creative approaches for analyzing data, and encouraging deeper reflective practices among students.
In so doing, this manuscript seeks to provide actionable guidance for crafting more impactful and immersive HCI educational initiatives through diary study assignments.