Browsing by Author "Rho, Ha Rim"
- Analyzing and Navigating Electronic Theses and Dissertations
  Ahuja, Aman (Virginia Tech, 2023-07-21)
  Electronic Theses and Dissertations (ETDs) contain valuable scholarly information that can be of immense value to the scholarly community. Millions of ETDs are now publicly available online, often through one of many digital libraries. However, since a majority of these digital libraries are institutional repositories with the objective being content archiving, they often lack end-user services needed to make this valuable data useful for the scholarly community. To effectively utilize such data to address the information needs of users, digital libraries should support various end-user services such as document search and browsing, document recommendation, as well as services to make navigation of long PDF documents easier. In recent years, with advances in the field of machine learning for text data, several techniques have been proposed to support such end-user services. However, limited research has been conducted towards integrating such techniques with digital libraries. This research is aimed at building tools and techniques for discovering and accessing the knowledge buried in ETDs, as well as to support end-user services for digital libraries, such as document browsing and long document navigation. First, we review several machine learning models that can be used to support such services. Next, to support a comprehensive evaluation of different models, as well as to train models that are tailored to the ETD data, we introduce several new datasets from the ETD domain. To minimize the resources required to develop high quality training datasets required for supervised training, a novel AI-aided annotation method is also discussed. Finally, we propose techniques and frameworks to support the various digital library services such as search, browsing, and recommendation.
  The key contributions of this research are as follows:
  - A system to help with parsing long scholarly documents such as ETDs by means of object-detection methods trained to extract digital objects from long documents. The parsed documents can be used for further downstream tasks such as long document navigation, figure and/or table search, etc.
  - Datasets to support supervised training of object detection models on scholarly documents of multiple types, such as born-digital and scanned. In addition to manually annotated datasets, a framework (along with the resulting dataset) for AI-aided annotation also is proposed.
  - A web-based system for information extraction from long PDF theses and dissertations, into a structured format such as XML, aimed at making scholarly literature more accessible to users with disabilities.
  - A topic-modeling based framework to support exploration tasks such as searching and/or browsing documents (and document portions, e.g., chapters) by topic, document recommendation, topic recommendation, and describing temporal topic trends.
- Behind the Counter: Exploring the Motivations and Perceived Effectiveness of Online Counterspeech Writing and the Potential for AI-Mediated Assistance
  Kumar, Anisha (Virginia Tech, 2024-01-11)
  In today's digital age, social media platforms have become powerful tools for communication, enabling users to express their opinions while also exposing them to various forms of hateful speech and content. While prior research has often focused on the efficacy of online counterspeech, little is known about people's motivations for engaging in it. Based on a survey of 458 U.S. participants, we develop and validate a multi-item scale for understanding counterspeech motivations, revealing that differing motivations impact counterspeech engagement between those who do and do not find counterspeech to be an effective mechanism for counteracting online hate. Additionally, our analysis explores people's perceived effectiveness of their self-written counterspeech to hateful posts, influenced by individual motivations to engage in counterspeech and demographic factors. Finally, we examine people's willingness to employ AI assistance, such as ChatGPT, in their counterspeech writing efforts. Our research provides insight into the factors that influence people's online counterspeech activity and perceptions, including the potential role of AI assistance in countering online hate.
- Contextual Impact on SNS Users’ Privacy Decisions: A Cross Cultural Study
  Rho, Ha Rim; Li, Yao (2018-04-21)
  Social network users with different cultural backgrounds have different privacy attitudes and behaviors. This study explores the mechanisms behind the cultural differences in privacy decisions. The findings have implications for customizing privacy technologies in different cultures.
- The Design of Online Environments (Political Hashtags) and the Quality of Democratic Discourse At-Scale
  Rho, Ha Rim (University of California, Irvine, 2020-07-24)
  Facilitating democratic discourse, or people's ability to access factual information in service of thoughtful discussion of social issues, is critical for democracies to function properly. However, with the rise of online fake news, misinformation, and political extremism, it is becoming increasingly difficult to have civil conversations on the internet. As a first step to addressing this issue, scholars need to understand how the current design of online environments shapes people’s ability to respectfully engage across social and political differences. In this dissertation, I investigate how common social media design features, such as hashtags, directly impact the quality of democratic discourse at-scale. Using natural language processing, statistics, and experimental design, I empirically demonstrate how linguistic behavior and the presence of political hashtags in online social media news articles impact the quality of discussions surrounding race, gender, and equality. Through my findings, I provide a theoretical examination of functionality and intertextuality as critical aspects of online design. Online design considerations that consider functionality alone tend to promote a digital public sphere that predominantly favors hashtag (or content) producers over non-users and passive content consumers. The sole emphasis on the functionality of design features drives frequency-driven research practices that prioritize discourse conditions for hashtag producers through volume-based definitions of discussion quality. Collectively, the research studies in this thesis are motivated by a desire to understand how online spaces can be better designed to foster interaction and discourse that can bridge rather than sharpen social differences.
Results from this dissertation research strongly indicate that scholars, designers, and engineers need to rethink and evaluate how current methodological approaches that prioritize the functionality of online design choices are limiting the way we understand the quality of democratic discourse on the internet. As a step towards this direction, I evoke Kristeva’s notion of intertextuality to demonstrate how online design choices facilitate the power of language in which important social topics are discussed across networks.
- Differences in Online Privacy & Security Attitudes based on Economic Living Standards: A Global Study of 24 Countries
  Rho, Ha Rim; Kobsa, Alfred; Nguyen, M.-H. Carolyn (Association for Information Systems (AIS) ECIS Proceedings, 2018-06-27)
  This work explores online privacy and security attitudes from 24,143 individuals across 24 countries with diverse economic living standards. By using k-mode analysis, we identified three distinct profiles based on similarity in Internet security and privacy attitudes measured by 83 items. By comparing the aggregated dissimilarity measures between each respondent and the centroid values of the three profiles at the country level, we assigned each country to its best-fitting privacy profile. We found significant differences in GDP per capita between profiles 1 (highest GDP) to 3 (lowest). People in profiles with higher GDP per capita have significantly greater privacy concerns in relation to information being monitored or bought and sold. These individuals are also more reluctant towards government surveillance of online communication as well as less likely to agree that governments should work with other public and private entities to develop online security laws. As economic living standards improve, the proportion of individuals increases in profile 1, decreases in profile 2, and most rapidly drops in profile 3. To the best of our knowledge, this is the first study that systematically examines country-level privacy in relation to a national economic variable using GDP per capita.
- The Impact of Offspring Hashtags on Semantic Polarization in Online Social Movements: Evidence from the Indian Farmer's Protest
  Leekha, Rohan Singh (Virginia Tech, 2023-07-06)
  In this work, we investigate the role of offspring hashtags on the semantic polarization of online discourse between the protest and counter-protest communities over time through the lens of the 2021 farmers' protest in India. Offspring hashtags are those that first appear alongside their more widely known "parent" hashtag (e.g., #WhyIDidntReport and #YesAllWomen are offspring hashtags that first co-appeared alongside their more famous and mainstream parent hashtag, #MeToo). The prominence of parent hashtags and their visible role in facilitating modern day protests have dominated scholarly efforts in understanding the socio-technical influence of social movement hashtags. By contrast, scholarship on the impact of the lesser-known offspring hashtags is rare and, where it exists, typically conducted through the lens of the primary parent tag. Our work aims to address this gap. In this research, we examine how the protest and counter-protest communities use offspring hashtags in their tweets to discuss and frame farmers - the key social group at the center of the farmers' protest (RQ1). Our findings reveal that both protests and counter-protests use offspring hashtags in a manner that further polarizes rather than bridges perspectives on core issues - focusing on themes that malign the other side (RQ2). We then measure and track how the semantic polarization in the use of the term "farmer" by the protest vs. counter-protest communities who use offspring hashtags evolves over time in relation to key protest events (RQ3). Finally, to empirically test and demonstrate whether and how the volume of offspring hashtags throughout the protest period influences semantic polarization trends between the protest and counter-protest discussion of farmers, we create a series of time-series models for causal inference.
We use Granger-causality to test whether and how fluctuations in the volume of offspring hashtags significantly predict how the protest and counter-protest communities semantically diverge in how they discuss farmers over time (RQ4). By analyzing offspring hashtags, this work provides a detailed understanding of the nuanced themes and narratives that may be lost under parent hashtags, but significantly influence online discourse between the protest and counter-protest communities.
- Intelligent Agents in Everyday Settings: Leveraging a Multi-Methods Approach
  Jagannath, Krithika; Rho, Ha Rim (2018-04-21)
  Conversational Agents (CAs) or Intelligent Personal Assistants (IPAs) (e.g., Apple’s Siri, Microsoft’s Cortana, Amazon’s Alexa, and Google’s Google Assistant) are voice-based interfaces designed for tasks in everyday life including: retrieval of information (e.g., weather, traffic, news), streaming of music, online shopping, controlling of home appliances, and voice calls within the home and automobiles. Continuous enhancements of their natural language processing abilities, seamless set up of miniaturized hardware, and large-scale cloud-based infrastructures render CAs as unobtrusive, artificially intelligent voice sensors. With CAs rapidly making their way into the home market, the social implications remain unclear. Some product companies have released open-source software platforms that allow third-party developers and the general public to contribute software towards the growth of CAs. However, research around user interaction with CAs in social settings is still at a nascent stage. In this workshop paper, we unpack the methods used in our ongoing work on people’s social interactions with CAs in order to generate discussion around how the research community can leverage various methodologies using both qualitative and quantitative techniques.
- Investigating the Effects of Nudges for Facilitating the Use of Trigger Warnings and Content Warnings
  Altland, Emily Caroline (Virginia Tech, 2024-06-27)
  Social media can trigger past traumatic memories in viewers when posters post sensitive content. Strict content moderation and blocking/reporting features do not work when triggers are nuanced and the posts may not violate site guidelines. Viewer-side interventions exist to help filter and hide certain content but these put all the responsibility on the viewer and typically act as 'aftermath interventions'. Trigger and content warnings offer a unique solution giving viewers the agency to scroll past content they may want to avoid. However, there is a lack of education and awareness among posters about how to add a warning and what topics may require one. We conducted this study to determine if poster-side interventions such as a nudge algorithm to add warnings to sensitive posts would increase social media users' knowledge and understanding of how and when to add trigger and content warnings. To investigate the effectiveness of a nudge algorithm, we designed the TWIST (Trigger Warning Includer for Sensitive Topics) app. The TWIST app scans tweet content to determine whether a TW/CW is needed and if so, nudges the social media poster to add one with an example of what it may look like. We then conducted a 4-part mixed methods study with 88 participants. Our key findings from this study include (1) Nudging social media users to add TW/CW educates them on triggering topics and raises their awareness when posting in the future, (2) Social media users can learn how to add a trigger/content warning through using a nudge app, (3) Researchers grew in understanding of how a nudge algorithm like TWIST can change people's behavior and perceptions, and (4) We provide empirical evidence of the effectiveness of such interventions (even in short-time use).
- Linguistically Differentiating Acts and Recalls of Racial Microaggressions on Social Media
  Gunturi, Uma Sushmitha (Virginia Tech, 2023-07-11)
  Experiences of interpersonal racism persist as a prevalent reality for BIPOC (Black, Indigenous, People of Color) in the United States. One form of racism that often goes unnoticed is racial microaggressions. These are subtle acts of racism that leave victims questioning the intent of the aggressor. The line of offense is often unclear, as these acts are disguised through humor or seemingly harmless intentions. In this study, we analyze the language used in online racial microaggressions ("Acts") and compare it to personal narratives recounting experiences of such aggressions ("Recalls") by Black social media users. We curated a corpus of acts and recalls from social media discussions on platforms like Reddit and Tumblr. Additionally, we collaborated with Black participants in a workshop to hand-annotate and verify the corpus. Using natural language processing techniques and qualitative analysis, we examine the language underlying acts and recalls of racial microaggressions. Our goal is to understand the lexical patterns that differentiate the two in the context of racism in the U.S. Our findings indicate that neural language models can accurately classify acts and recalls, revealing contextual words that associate Blacks with objects that perpetuate negative stereotypes. We also observe overlapping linguistic signatures between acts and recalls, serving different purposes, which have implications for current challenges in social media content moderation systems.
- Narrative Characteristics in Refugee Discourse: An Analysis of American Public Opinion on Afghan Refugee Crisis After the Taliban Takeover
  Dogan, Hulya (Virginia Tech, 2023-06-22)
  The United States (U.S.) military withdrawal from Afghanistan in August 2021 was met with turmoil as the Taliban regained control of most of the country, including Kabul. These events have affected many and were widely discussed on social media, especially in the U.S. In this work, we focus on Twitter discourse regarding these events, especially potential opinion shifts over time and the effect social media posts by established U.S. legislators might have had on online public perception. To this end, we investigate two datasets on the war in Afghanistan, consisting of Twitter posts by self-identified U.S. accounts and conversation threads initiated by U.S. politicians. We find that Twitter users' discussions revolve around the Kabul airport event, President Biden's handling of the situation, and people affected by the U.S. withdrawal. Microframe analysis indicates that discourse centers the humanitarianism underlying these occurrences and politically leans liberal, focusing on care and fairness. Lastly, network analysis shows that Republicans are far more active on Twitter compared to Democrats and there is more positive sentiment than negative in their conversations.
- Privacy Norms in the Context of Connected & Self-Driving Cars
  Rho, Ha Rim (Springer, 2017-02-25)
  The upcoming transition to self-driving cars could lead to a seismic shift in society, one that affects industry practices, regulation landscapes, as well as personal decision-making and social norms around privacy. Major tech companies and traditional auto manufacturers have started working together to conceive an optimal regulatory environment for autonomous vehicles. However, the issue of privacy in the process of collecting, managing, and using data generated from self-driving and connected cars remains one of the biggest challenges yet to be solved. In this position paper, I highlight key privacy challenges and issues in the context of connected and self-driving cars.
- Real Memes In-The-Wild: Explainable Classification of Hateful vs. Non-Hateful Memes
  Rho, Ha Rim; Leekha, Rohan (2023-04-23)
  The virality of hateful or violent memes in recent years has encouraged deep learning research on hateful meme classification. These models, however, are typically trained to classify memes based on synthetically generated data. Synthetically generated meme data, such as the widely used Hateful Memes Challenge dataset from Meta AI, was created by interchanging random texts with random images. Such artificially generated memes often exclude neologisms, insider expressions, slang, and other linguistic nuances, which are prevalent across real memes that actually circulate online. As a result, current state-of-the-art classifiers are limited in accurately predicting hateful memes in-the-wild. Furthermore, prior research tends to focus on the prediction task rather than explaining the characteristics that make memes hateful. Addressing these challenges, we introduce "RealMemes," a manually curated dataset comprising 3,142 in-the-wild memes collected from various social platforms including Instagram and Reddit, as well as WhatsApp and Telegram groups. Furthermore, we propose an interpretable multimodal classification system designed to not only distinguish between hateful and non-hateful memes, but also elucidate the specific textual and visual elements that contribute to a meme's classification.
- Supporting and Transforming High-Stakes Investigations with Expert-Led Crowdsourcing
  Venkatagiri, Sukrit (Virginia Tech, 2022-12-20)
  Expert investigators leverage their advanced skills and deep experience to solve complex investigations, but they face limits on their time and attention. In contrast, crowds of novices can be highly scalable and parallelizable, but lack expertise and may engage in vigilante behavior. In this dissertation, I introduce and evaluate the framework of expert-led crowdsourcing through three studies across two domains, journalism and law enforcement. First, through an ethnographic study of two law enforcement murder investigations, I uncover tensions in a real-world crowdsourced investigation and introduce the expert-led crowdsourcing framework. Second, I instantiate expert-led crowdsourcing in two collaboration systems: GroundTruth and CuriOSINTy. GroundTruth is focused on one specific investigative task, image geolocation. CuriOSINTy expands the flexibility and scope of expert-led crowdsourcing to handle more complex and multiple investigative tasks: identifying and debunking misinformation. Third, I introduce a framework for understanding how expert-led crowdsourced investigations work and how to better support them. Finally, I conclude with a discussion of how expert-led crowdsourcing enables experts and crowds to do more than either could alone, as well as how it can be generalized to other domains.
- Understanding Social Media Users' Perceptions of Trigger and Content Warnings
  Gupta, Muskan (Virginia Tech, 2023-10-18)
  The prevalence of distressing content on social media raises concerns about users' mental well-being, prompting the use of trigger warnings (TW) and content warnings (CW). However, varying practices across platforms indicate a lack of clarity among users regarding these warnings. To gain insight into how users experience and use these warnings, we conducted interviews with 15 regular social media users. Our findings show that users generally have a positive view of warnings, but there are differences in how they understand and use them. Challenges related to using TW/CW on social media emerged, making it a complex decision when dealing with such content. These challenges include determining which topics require warnings, navigating logistical complexities related to usage norms, and considering the impact of warnings on social media engagement. We also found that external factors, such as how the warning and content are presented, and internal factors, such as the viewer's mindset, tolerance, and level of interest, play a significant role in the user's decision-making process when interacting with content that has TW/CW. Participants emphasized the need for better education on warnings and triggers in social media and offered suggestions for improving warning systems. They also recommended post-trigger support measures. The implications and future directions include promoting author accountability, introducing nudges and interventions, and improving post-trigger support to create a more trauma-informed social media environment.
- Understanding the Impact of Data Privacy Regulations on Software and Its Stakeholders
  Franke, Lucas James (Virginia Tech, 2023-07-06)
  The General Data Protection Regulation (GDPR) is a comprehensive data privacy law that limits how businesses can collect personal information about their consumers living in the European Union. For our research, we aimed to evaluate the impact that the GDPR has on the open-source community, an online community that encourages open collaboration between software developers. We conducted a quantitative analysis of GitHub pull requests in which we compared pull requests explicitly related to the GDPR to other non-GDPR pull requests from the same projects. We also conducted a qualitative pilot study in which we interviewed software developers with experience implementing GDPR requirements in industry or in open-source. From our research, we found that GDPR-related pull requests had significantly more activity than other pull requests, but that open-source developers did not perceive a significant impact on their software development processes when implementing GDPR compliance. Industry developers, on the other hand, had a more negative outlook on the GDPR, and found implementation to be difficult. Our results indicate a need to involve software developers in the lawmaking process in order to create direct and realistic expectations for developers when implementing privacy policies.
- Why Did You Post That GIF? Understanding Relationship between User Identity and Self Expression through GIFs on Social Media
  Wang, Boyuan (Virginia Tech, 2023-08-02)
  GIFs afford a great degree of personalization as they are often created from popular movie and video clips, with diverse and real characters, each expressing a nuanced affect state through a combination of characters' own unique bodily gesture and distinctive visual background. This highly personalized and embodied property gives us a unique window to explore how individuals represent and express themselves on social media, through the lens of GIFs they use. In this study, we explore how Twitter users express their gender and racial identities through that of characters in GIFs. We conducted a behavioral study (n=398) to simulate a series of tweeting and GIF-picking scenarios, and we found that gender and race identities have a significant impact on users' choice of GIFs and that source familiarity and perceived audience also have significant impacts on whether a user will choose race and gender matching GIFs.