Journal Articles, Association for Computing Machinery (ACM)

Recent Submissions

Now showing 1 - 20 of 293
  • Privacy-Preserving and Diversity-Aware Trust-based Team Formation in Online Social Networks
    Mahajan, Yash; Cho, Jin-Hee; Chen, Ing-Ray (ACM, 2024-07)
    As online social networks (OSNs) become more prevalent, a new paradigm for problem-solving through crowd-sourcing has emerged. By leveraging the OSN platforms, users can post a problem to be solved and then form a team to collaborate and solve the problem. A common concern in OSNs is how to form effective collaborative teams, as various tasks are completed through online collaborative networks. In developing team formation (TF) algorithms, a team's diversity in expertise has received considerable attention as a driver of high team performance. However, the effect of team diversity on performance under different types of tasks has not been extensively studied. Another important issue is how to balance the need to preserve individuals' privacy with the need to maximize performance through active collaboration, as these two goals may conflict with each other. This issue has not been actively studied in the literature. In this work, we develop a TF algorithm in the context of OSNs that can maximize team performance and preserve team members' privacy under different types of tasks. Our proposed PRivAcy-Diversity-Aware Team Formation framework, called PRADA-TF, is based on trust relationships between users in OSNs where trust is measured based on a user's expertise and privacy preference levels. The PRADA-TF algorithm considers the team members' domain expertise, privacy preferences, and the team's expertise diversity in the process of team formation. Our approach employs the game-theoretic principles of mechanism design to motivate self-interested individuals within a team formation context, positioning the mechanism designer as the pivotal team leader responsible for assembling the team. We use two real-world datasets (i.e., Netscience and IMDb) to generate different semi-synthetic datasets for constructing trust networks using a belief model (i.e., Subjective Logic) and identifying trustworthy users as candidate team members. We evaluate the effectiveness of our proposed PRADA-TF scheme in four variants against three baseline methods in the literature. Our analysis focuses on three performance metrics for studying OSNs: social welfare, privacy loss, and team diversity.
  • Exploiting Update Leakage in Searchable Symmetric Encryption
    Haltiwanger, Jacob; Hoang, Thang (ACM, 2024-06-19)
    Dynamic Searchable Symmetric Encryption (DSSE) provides efficient techniques for securely searching and updating an encrypted database. However, efficient DSSE schemes leak some sensitive information to the server. Recent works have implemented forward and backward privacy as security properties to reduce the amount of information leaked during update operations. Many attacks have shown that leakage from search operations can be abused to compromise the privacy of client queries. However, the attack literature has not rigorously investigated techniques to abuse update leakage. In this work, we investigate update leakage under DSSE schemes with forward and backward privacy from the perspective of a passive adversary. We propose two attacks based on a maximum likelihood estimation approach, the UFID Attack and the UF Attack, which target forward-private DSSE schemes with no backward privacy and Level II backward privacy, respectively. These are the first attacks to show that it is possible to leverage the frequency and contents of updates to recover client queries. We propose a variant of each attack which allows the update leakage to be combined with search pattern leakage to achieve higher accuracy. We evaluate our attacks against a real-world dataset and show that using update leakage can improve the accuracy of attacks against DSSE schemes, especially those without backward privacy.
  • SRAM Imprinting for System Protection and Differentiation
    Mahmod, Jubayer; Hicks, Matthew (ACM, 2024-07-01)
    The foundation of trusted computation depends on the ability to verify the authenticity of the underlying hardware. This need is further compounded by the presence of counterfeit components in the market, highlighting the necessity for pre-deployment and run-time chip identification techniques. Current solutions involve burning authentication information in physical fuses or creating a unique mask for each integrated circuit, which are either costly or susceptible to forgery. While many solutions have been proposed to prevent chip counterfeiting at design time, no accurate, reference-free, and cost-effective solutions exist for chip buyers to authenticate their purchases in the pre-deployment phase and enable software-level verification at runtime. The lack of industry-standard authentication methods forces chip buyers to either adopt expensive solutions, such as X-Ray imaging, or simply rely on blind faith. This paper presents SKU-RAM, a technique for chip identification that allows manufacturers to embed their signature into integrated circuits, provides per-device identification, and facilitates hardware-enforced time-limited licensing functionality. Our approach takes advantage of the aging-induced power-on state changes in SRAM to encode authentication data into an already fabricated device, without modifying the mask of the chips. This hardware-overhead-free augmentation to the chips eliminates numerous instances of chip counterfeiting and enables software-level authentication. We demonstrate the effectiveness of SKU-RAM as a comprehensive and scalable anti-counterfeiting solution for existing and future computing devices using commercial off-the-shelf microcontrollers and microprocessors and multi-year real-time experiments.
  • TriSAS: Toward Dependable Inter-SAS Coordination with Auditability
    Shi, Shanghao; Xiao, Yang; Du, Changlai; Shi, Yi; Wang, Chonggang; Gazda, Robert; Hou, Y. Thomas; Burger, Eric W.; Dasilva, Luiz; Lou, Wenjing (ACM, 2024-07-01)
    To facilitate dynamic spectrum sharing, the FCC has designated certified SAS administrators to implement their own spectrum access systems (SASs) that manage the shared spectrum usage in the novel CBRS band. As a premise, different SAS servers must conduct periodic inter-SAS coordination to synchronize service states and avoid allocation conflicts. However, SAS servers may inevitably stop service for regular upgrades, crash, or even behave maliciously in ways that deviate from normal routines, posing a fundamental operational security problem: the system must be robust against these faults to guarantee secure and efficient spectrum sharing service. Unfortunately, the incumbent inter-SAS coordination mechanism, CPAS, is prone to SAS failures and does not support real-time allocation. Recent proposals that rely on blockchain smart contracts or state machine replication mechanisms to realize fault-tolerant inter-SAS coordination require all SASs to follow a unified allocation algorithm. However, they face performance bottlenecks and cannot accommodate the fact that different SASs currently hold their own proprietary allocation algorithms. In this work, we propose TriSAS, a novel inter-SAS coordination mechanism to facilitate secure, efficient, and dependable spectrum allocation that is fully compatible with the existing SAS infrastructure. TriSAS decomposes the coordination process into two phases: input synchronization and decision finalization. The first phase ensures participants share a common input set, while the second fulfills a fair and verifiable spectrum allocation selection, which is generated efficiently via SAS proposers' proprietary allocation algorithms and evaluated by a custom-designed allocation evaluation algorithm (AEA), in the face of no more than one-third of malicious participants. We implemented a prototype of TriSAS on the AWS cloud computing platform and evaluated its throughput and latency performance. The results show that TriSAS achieves high transaction throughput and low latency under various practical settings.
  • Secure Data-Binding in FPGA-based Hardware Architectures utilizing PUFs
    Frank, Florian; Schmid, Martin; Klement, Felix; Palani, Purushothaman; Weber, Andreas; Kavun, Elif Bilge; Xiong, Wenjie; Arul, Tolga; Katzenbeisser, Stefan (ACM, 2024-07-01)
    In this work, a novel FPGA-based data-binding architecture incorporating PUFs and a user-specific encryption key to protect the confidentiality of data on external non-volatile memories is presented. By utilizing an intrinsic PUF derived from the same memory, the confidential data is additionally bound to the device. This feature proves valuable in cases where software is restricted to be executed exclusively on specific hardware or privacy-critical data is not allowed to be decrypted elsewhere. To improve the resistance against hardware attacks, a novel method to randomly select memory cells utilized for PUF measurements is presented. The FPGA-based design presented in this work allows for low latency as well as small area utilization, offers high adaptability to diverse hardware and software platforms, and is accessible from bare-metal programs to full Linux kernels. Moreover, a detailed performance and security evaluation is conducted on five boards. A single read or write operation can be executed in 0.58 μs when utilizing the lightweight PRINCE cipher on an AMD Zynq 7000 MPSoC. Furthermore, the entire architecture occupies only about 10% of the FPGA's available space on a resource-constrained AMD PYNQ-Z2. Ultimately, the implementation is demonstrated by storing confidential user data on new generations of network base stations equipped with FPGAs.
  • SHARP: Exploring Version Control Systems in Live Coding Music
    Manesh, Daniel; Bowman, Douglas A.; Lee, Sang Won (ACM, 2024-06-23)
    Version control systems, which have proven essential for software engineering, can also provide value to creative and artistic practices. In this paper, we explore version control in the creative domain of live coding music, a generative performance practice where programmers edit and run code live to generate audiovisual artifacts. To that end, we developed SHARP, a lightweight version control system that live coders can use during performances as well as in preparation or practice sessions. We conducted a user study where live coders used SHARP for several weeks, wrote diary entries reflecting on their sessions, recorded a performance using SHARP, and participated in exit interviews. We found that SHARP enabled participants to engage with musical form on the fly in novel ways. In addition, the study revealed multifaceted perspectives on how and when versioning can be useful in the context of live coding. Our results inform the design of versioning systems for live coding and more generally for performance and generative arts practices.
  • Understanding the Impact of Branch Edit Features for the Automatic Prediction of Merge Conflict Resolutions
    Aldndni, Waad; Servant, Francisco; Meng, Na (ACM, 2024-04-15)
    Developers regularly have to resolve merge conflicts, i.e., two conflicting sets of changes to the same files in different branches, which can be tedious and error-prone. To resolve conflicts, developers typically keep the local version (KL) or the remote version (KR) of the code. They also sometimes manually edit both versions into a single one (ME). However, most existing techniques only support merging the local and remote versions (the ME strategy). We recently proposed RPRedictoR, a machine learning-based approach to support developers in choosing how to resolve a conflict (by KL, KR, or ME), by predicting their resolution strategy. In its original design, RPRedictoR uses a set of Evolution History Features (EHFs) that capture the magnitude of the changes in conflict, their evolution, and the experience of the developers involved. In this paper, we proposed and evaluated a new set of Branch Edit Features (BEFs) that capture the fine-grained edits that were performed on each branch of the conflict. We learned multiple lessons. First, BEFs provided lower effectiveness (F-score) than the original EHFs. Second, combining BEFs with EHFs still did not improve the effectiveness of EHFs; it provided the same F-score. Third, the feature set that provided the highest effectiveness in our experiments was the combination of EHFs with a subset of BEFs that captures the number of insertions performed in the local branch, but this combination only improved on EHFs by 3 pp. in F-score. Finally, our experiments also share the lesson that some feature sets provided a higher C-score (i.e., the safety of the technique's mistakes) as a trade-off for lower F-scores. This may be valued by developers and we believe that it should be studied in the future.
  • Swap It Like Its Hot: Segmentation-based spoof attacks on eye-tracking images
    Narkar, Anish S.; David-John, Brendan (ACM, 2024-06-04)
    Video-based eye trackers capture the iris biometric and enable authentication to secure user identity. However, biometric authentication is susceptible to spoofing another user’s identity through physical or digital manipulation. The current standard to identify physical spoofing attacks on eye-tracking sensors uses liveness detection. Liveness detection classifies gaze data as real or fake, which is sufficient to detect physical presentation attacks. However, such defenses cannot detect a spoofing attack when real eye image inputs are digitally manipulated to swap the iris pattern of another person. We propose IrisSwap as a novel attack on gaze-based liveness detection. IrisSwap allows attackers to segment and digitally swap in a victim’s iris pattern to fool iris authentication. Both offline and online attacks produce gaze data that deceives the current state-of-the-art defense models at rates up to 58% and motivates the need to develop more advanced authentication methods for eye trackers.
  • Education in HCI Outdoors: A Diary Study Approach
    Fan, Jixiang; Saaty, Morva; McCrickard, D. Scott (ACM, 2024-06-05)
    To assist students and educators in more deeply grasping user technology needs in busy outdoor settings, we recommend using diary study assignments adapted from social science and human-computer interaction (HCI) research. This suggestion is based on insights that the field of HCI has expanded from computer use in controlled, indoor environments to technology application research in broader contexts, especially outdoor environments, where diary studies yield important insights. This can be seen in areas like social media, augmented reality, citizen science, and geolocation-based games, where it is difficult to understand the user experience for these areas through short-term, controlled exposure. Instead, educators must encourage students to step out of the classroom and into the real world to observe and experience interactions during multiple-use sessions over an extended time period, which offers students in-depth insights into real-world technology use, thereby setting the stage for them to design more human-focused technology applications and services that better meet user needs. This paper explores the utilization of the diary study methodology within the context of HCI education, examining its distinctive benefits and exposing tradeoffs in its challenges. Benefits discussed in the paper include adaptability to a wide array of user needs and circumstances, the capability to yield profound insights into the application of technology in real-world settings, and effectiveness in uncovering privacy concerns in daily life. Concurrently, we identify some practical challenges and introduce targeted strategies for addressing them, such as maintaining consistent student engagement, devising creative approaches for analyzing data, and encouraging deeper reflective practices among students. In so doing, this manuscript seeks to provide actionable guidance for crafting more impactful and immersive HCI educational initiatives through diary study assignments.
  • A Family of Fast and Memory Efficient Lock- and Wait-Free Reclamation
    Nikolaev, Ruslan; Ravindran, Binoy (ACM, 2024-06-20)
    Historically, memory management based on lock-free reference counting was very inefficient, especially for read-dominated workloads. Thus, approaches such as epoch-based reclamation (EBR), hazard pointers (HP), or a combination thereof have received significant attention. EBR exhibits excellent performance but is blocking due to potentially unbounded memory usage. In contrast, HP are non-blocking and achieve good memory efficiency but are much slower. Moreover, HP are only lock-free in the general case. Recently, several new memory reclamation approaches such as WFE and Hyaline have been proposed. WFE achieves wait-freedom, but is less memory efficient and performs suboptimally in oversubscribed scenarios; Hyaline achieves higher performance and memory efficiency, but lacks wait-freedom. We present a family of non-blocking memory reclamation schemes, called Crystalline, that simultaneously addresses the challenges of high performance, high memory efficiency, and wait-freedom. Crystalline can guarantee complete wait-freedom even when threads are dynamically recycled, asynchronously reclaims memory in the sense that any thread can reclaim memory retired by any other thread, and ensures an (almost) balanced reclamation workload across all threads. The latter two properties result in Crystalline's high performance and memory efficiency. Simultaneously ensuring all three properties requires overcoming unique challenges. Crystalline supports the ubiquitous x86-64 and ARM64 architectures, while achieving higher throughput than prior fast schemes such as EBR as the number of threads grows. We also emphasize that many recent approaches, unlike HP, lack strict non-blocking guarantees when used with multiple data structures. By providing full wait-freedom, Crystalline addresses this problem as well.
  • Securing Agile: Assessing the Impact of Security Activities on Agile Development
    Thool, Arpit; Brown, Chris (ACM, 2024-06-18)
    Software systems are expected to be secure and robust. To verify and ensure software security, it is vital to include security activities, or development practices to detect and prevent security vulnerabilities, into the software development process. Agile software development is a popular software engineering (SE) process used by many organizations and development teams. However, while Agile aims to be a lightweight and responsive process, security activities are typically more cumbersome and involve more documentation and tools, violating the core principles of Agile. This work investigates the impact of security activities on various aspects of Agile development. To understand how software engineers perceive incorporating security practices into Agile methodologies, we distributed an online survey to collect data from software practitioners with experience working in Agile teams. Our results from 34 survey participants show most software practitioners believe security activities are beneficial to development overall but lack confidence in their impact on the security of software systems. Our findings provide insight into how security activities affect Agile development and provide implications to help SE teams better incorporate security activities into implementing Agile development processes.
  • A Combinatorial Approach to Hyperparameter Optimization
    Khadka, Krishna; Chandrasekaran, Jaganmohan; Lei, Yu; Kacker, Raghu N.; Kuhn, D. Richard (ACM, 2024-04-14)
    In machine learning, hyperparameter optimization (HPO) is essential for effective model training and significantly impacts model performance. Hyperparameters are predefined model settings which fine-tune the model's behavior and are critical to modeling complex data patterns. Traditional HPO approaches such as Grid Search, Random Search, and Bayesian Optimization have been widely used in this field. However, as datasets grow and models increase in complexity, these approaches often require a significant amount of time and resources for HPO. This research introduces a novel approach that uses t-way testing for HPO. T-way testing is a combinatorial approach to software testing used for identifying faults with a test set that covers all t-way interactions; it substantially narrows the search space and effectively covers parameter interactions. Our experimental results show that our approach reduces the number of necessary model evaluations and significantly cuts computational expenses while still outperforming traditional HPO approaches for the models studied in our experiments.
  • Unpacking Task Management Tools, Values, and Worker Dynamics
    Hu, Donghan; Bhuiyan, Md Momen; Lim, Sol; Wiese, Jason; Lee, Sang Won (ACM, 2024-06-25)
    As the complexity of daily tasks grows, knowledge workers experience challenges in managing tasks and risk skipping over some. Fortunately, various task management tools have become available, ranging from traditional tools, such as sticky notes, to complex project management software. In this exploratory study, we aim to understand the landscape of task management tools that knowledge workers use and identify the value they seek from such tools. In addition, we investigate how such value relates to workers' personality traits and job characteristics. For this purpose, we conducted a series of formative studies and an online survey (N = 248) to evaluate the perceived importance of various attributes of task management tools, followed by an exploratory factor analysis to identify the latent structure within those attributes. This process revealed six underlying dimensions for task management tools: communicability, structure, portability, adaptability, physicality, and visualizability. Applying regression analysis, we found connections between latent dimensions and both personality traits and job characteristics. Our findings inform the design of future task management tools with guidance on choosing features and functionality that will meet the needs of their target populations.
  • Fastmove: A Comprehensive Study of On-Chip DMA and its Demonstration for Accelerating Data Movement in NVM-based Storage Systems
    Li, Jiahao; Su, Jingbo; Chen, Luofan; Li, Cheng; Zhang, Kai; Yang, Liang; Noh, Sam; Xu, Yinlong (ACM, 2024)
    Data-intensive applications executing on NVM-based storage systems experience serious bottlenecks when moving data between DRAM and NVM. We advocate for the use of the long-existing but recently neglected on-chip DMA to expedite data movement with three contributions. First, we explore new latency-oriented optimization directions, driven by a comprehensive DMA study, to design a high-performance DMA module, which significantly lowers the I/O size threshold to observe benefits. Second, we propose a new data movement engine, Fastmove, that coordinates the use of the DMA along with the CPU with judicious scheduling and load splitting such that the DMA's limitations are compensated, and the overall gains are maximized. Finally, with a general kernel-based design, simple APIs, and DAX file system integration, Fastmove allows applications to transparently exploit the DMA and its new features without code changes. We run three data-intensive applications (MySQL, GraphWalker, and Filebench) atop NOVA, ext4-DAX, and XFS-DAX, with standard benchmarks like TPC-C and popular graph algorithms like PageRank. Across single- and multi-socket settings, compared to the conventional CPU-only NVM accesses, Fastmove delivers 1.13-2.16× speedups in peak throughput for TPC-C with MySQL, reduces the average latency by 17.7-60.8%, and saves 37.1-68.9% of the CPU usage spent in data movement. It also shortens the execution time of graph algorithms with GraphWalker by 39.7-53.4%, and delivers 1.12-1.27× throughput speedups for Filebench.
  • Neural Methods for Data-to-text Generation
    Sharma, Mandar; Gogineni, Ajay; Ramakrishnan, Naren (ACM, 2024)
    The neural boom that has sparked natural language processing (NLP) research throughout the last decade has similarly led to significant innovations in data-to-text generation (DTG). This survey offers a consolidated view into the neural DTG paradigm with a structured examination of the approaches, benchmark datasets, and evaluation protocols. This survey draws boundaries separating DTG from the rest of the natural language generation (NLG) landscape, encompassing an up-to-date synthesis of the literature, and highlighting the stages of technological adoption from within and outside the greater NLG umbrella. With this holistic view, we highlight promising avenues for DTG research that not only focus on the design of linguistically capable systems but also systems that exhibit fairness and accountability.
  • CHUNAV: Analyzing Hindi Hate Speech and Targeted Groups in Indian Election Discourse
    Jafri, Farhan; Rauniyar, Kritesh; Thapa, Surendrabikram; Siddiqui, Mohammad; Khushi, Matloob; Naseem, Usman (ACM, 2024)
    In the ever-evolving landscape of online discourse and political dialogue, the rise of hate speech poses a significant challenge to maintaining a respectful and inclusive digital environment. The context becomes particularly complex when considering the Hindi language—a low-resource language with limited available data. To address this pressing concern, we introduce the CHUNAV dataset—a collection of 11,457 Hindi tweets gathered during assembly elections in various states. CHUNAV is purpose-built for hate speech categorization and the identification of target groups. The dataset is a valuable resource for exploring hate speech within the distinctive socio-political context of Indian elections. The tweets within CHUNAV have been meticulously categorized into "Hate" and "Non-Hate" labels, and further subdivided to pinpoint the specific targets of hate speech, including "Individual", "Organization", and "Community" labels (as shown in Figure 1). Furthermore, this paper presents multiple benchmark models for hate speech detection, along with an innovative ensemble and oversampling-based method. The paper also delves into the results of topic modeling, all aimed at effectively addressing hate speech and target identification in the Hindi language. This contribution seeks to advance the field of hate speech analysis and foster a safer and more inclusive online space within the distinctive realm of Indian Assembly Elections. The dataset is available at https://github.com/Farhan-jafri/Chunav
  • Multi-Label Zero-Shot Product Attribute-Value Extraction
    Gong, Jiaying; Eldardiry, Hoda (ACM, 2024-05-13)
    E-commerce platforms should provide detailed product descriptions (attribute values) for effective product search and recommendation. However, attribute value information is typically not available for new products. To predict unseen attribute values, large quantities of labeled training data are needed to train a traditional supervised learning model. Typically, it is difficult, time-consuming, and costly to manually label large quantities of new product profiles. In this paper, we propose a novel method to efficiently and effectively extract unseen attribute values from new products in the absence of labeled data (zero-shot setting). We propose HyperPAVE, a multi-label zero-shot attribute value extraction model that leverages inductive inference in heterogeneous hypergraphs. In particular, our proposed technique constructs heterogeneous hypergraphs to capture complex higher-order relations (i.e., user behavior information) to learn more accurate feature representations for graph nodes. Furthermore, our proposed HyperPAVE model uses an inductive link prediction mechanism to infer future connections between unseen nodes. This enables HyperPAVE to identify new attribute values without the need for labeled training data. We conduct extensive experiments with ablation studies on different categories of the MAVE dataset. The results demonstrate that our proposed HyperPAVE model significantly outperforms existing classification-based, generation-based large language models for attribute value extraction in the zero-shot setting.
  • Towards Understanding Family Privacy and Security Literacy Conversations at Home: Design Implications for Privacy Literacy Interfaces
    Alghythee, Kenan; Hrncic, Adel; Singh, Karthik; Kunisetty, Sumanth; Yao, Yaxing; Soni, Nikita (ACM, 2024-05-11)
    Policymakers and researchers have emphasized the crucial role of parent-child conversations in shaping children's digital privacy and security literacy. Despite this emphasis, little is known about the current nature of these parent-child conversations, including their content, structure, and children's engagement during these conversations. This paper presents the findings of an interview study involving 13 parents of children under age 13 reflecting on their privacy literacy practices at home. Through qualitative thematic analysis, we identify five categories of parent-child privacy and security conversations and examine parents' perceptions of their children's engagement during these discussions. Our findings show that although parents used different conversation approaches, rule-based conversations were one of the most common approaches taken by our participants, with example-based conversations perceived to be effective by parents. We propose important design implications for developing effective privacy educational technologies for families to support parent-child conversations.
  • Wrist-bound Guanxi, Jiazu, and Kuolie: Unpacking Chinese Adolescent Smartwatch-Mediated Socialization
    Liu, Lanjing; Zhang, Chao; Lu, Zhicong (ACM, 2024-05-11)
    Adolescent peer relationships, essential for their development, are increasingly mediated by digital technologies. As this trend continues, wearable devices, especially smartwatches tailored for adolescents, are reshaping their socialization. In China, smartwatches like XTC have gained wide popularity, introducing unique features such as "Bump-to-Connect" and exclusive social platforms. Nonetheless, how these devices influence adolescents' peer experience remains unknown. Addressing this, we interviewed 18 Chinese adolescents (ages 11-16), discovering a smartwatch-mediated social ecosystem. Our findings highlight the ice-breaking role of smartwatches in friendship initiation and their use for secret messaging with local peers. Within the online smartwatch community, peer status is determined by likes and visibility, leading to diverse pursuit activities (i.e., chu guanxi, jiazu, kuolie) and negative social dynamics. We discuss the core affordances of smartwatches and Chinese cultural factors that influence adolescent social behavior, and offer implications for designing future wearables that responsibly and safely support adolescent socialization.
  • Leveraging Prompt-Based Large Language Models: Predicting Pandemic Health Decisions and Outcomes Through Social Media Language
    Ding, Xiaohan; Carik, Buse; Gunturi, Uma Sushmitha; Reyna, Valerie; Rho, Eugenia (ACM, 2024-05-11)
    We introduce a multi-step reasoning framework using prompt-based LLMs to examine the relationship between social media language patterns and trends in national health outcomes. Grounded in fuzzy-trace theory, which emphasizes the importance of "gists" of causal coherence in effective health communication, we introduce Role-Based Incremental Coaching (RBIC), a prompt-based LLM framework, to identify gists at scale. Using RBIC, we systematically extract gists from subreddit discussions opposing COVID-19 health measures (Study 1). We then track how these gists evolve across key events (Study 2) and assess their influence on online engagement (Study 3). Finally, we investigate how the volume of gists is associated with national health trends like vaccine uptake and hospitalizations (Study 4). Our work is the first to empirically link social media linguistic patterns to real-world public health trends, highlighting the potential of prompt-based LLMs in identifying critical online discussion patterns that can form the basis of public health communication strategies.