Journal Articles, Association for Computing Machinery (ACM)

Recent Submissions

  • Examining Age-Bias and Stereotypes of Aging in LLMs
    Dewan, Sherwin; Shaikh, Ismail; Shaw, Connie; Sahoo, Abhilash; Jha, Akshita; Pradhan, Alisha (ACM, 2025-10-26)
    Large Language Models (LLMs) are increasingly being used across applications, ranging from content generation to decision-making, raising concerns about biases embedded in them. While biases related to gender, race, and culture have been studied extensively, age-bias and stereotypes of aging in LLMs remain underexplored. This study analyzes LLM-generated responses to prompts related to aging, revealing stereotypical biases about aging pertaining to technology proficiency, cognitive and physical decline, and job roles. We noted that even responses without explicit age bias mentioned ageist stereotypes. We discuss considerations for involving older adults’ perspectives through human-in-the-loop methodologies while exercising caution about aspects like internalized ageism.
  • AmblyOverlay: An Input-Transparent, Assistive Overlay for Binocular Visual Therapy and Dichoptic Filtering
    Thaker, Pooja (ACM, 2025-10-26)
    AmblyOverlay is presented here as a proof-of-concept prototype, demonstrating feasibility of dichoptic overlays for binocular therapy, with clinician-guided validation planned in future work. Amblyopia, strabismus (including exophoria), and double vision are vision disorders that disrupt binocular coordination and often require visual therapy. Traditional treatments involve patching or the use of specialized games that rely on red-green or red-blue goggles to stimulate cooperation between the eyes. However, these games are primarily designed for young children [20] and may not be engaging for older users [21] who prefer fast-paced, modern games. AmblyOverlay is a software-based assistive tool that enables patients with binocular vision issues to play any game of their choice, including action-heavy titles, while undergoing visual therapy. Developed using Python, OpenCV, NumPy, and PyQt5, the application applies a red-cyan dichoptic filter across the entire screen. The overlay is presented through a transparent, click-through window, allowing the user to interact with their screen normally. Unlike previous approaches that target only specific games or rely on fixed visual content, AmblyOverlay provides a system-wide solution compatible with dynamic and interactive content. The project faced and addressed key technical challenges, including maintaining real-time performance, preserving user interactivity through a transparent overlay, and ensuring stable screen capture. While AI-based methods for adaptive filtering were explored, they were not implemented in the final system due to complexity and resource constraints. Ongoing development focuses on connecting visual calibration to the filter and UI customization, with the long-term goal of making binocular therapy more engaging, accessible, and compatible with how patients use their devices.
  • Interpretive Caption: Real-Time Vocal Emotion Cues for DHH Users
    Ubur, Sunday; Adewale, Sikiru; Chandrashekar, Nikitha; Akli, Enoch; Gracanin, Denis (ACM, 2025-10-26)
    Deaf and Hard-of-Hearing (DHH) individuals increasingly rely on real-time captioning to access spoken content in educational and professional settings. However, traditional captions omit vocal emotional cues, such as intonation and affect, which can hinder comprehension and engagement. This work introduces Interpretive Caption, a machine-learning prototype that augments captions with emotion-aware annotations derived from vocal tone. Using letter-coded tags with hover-based tooltips, the system conveys emotional context on demand, balancing clarity with cognitive accessibility. We conducted a qualitative study with eight DHH participants who interacted with the prototype and shared feedback on usability, emotional clarity, and layout design. Findings highlight the value of hover-based emotional cues, customization features, and segmentation aligned with cognitive load principles. Participants appreciated the non-intrusive emotional insights, while also identifying areas for improvement, including accent-inclusive emotion recognition and better mobile accessibility. Our contributions include a real-time captioning prototype integrating speech emotion recognition, a user-controllable emotion display interface, and design insights for affective accessibility in educational contexts. This work offers a foundation for inclusive, expressive captioning and informs future multimodal caption systems that prioritize interpretability, cultural sensitivity, and user agency.
  • Scale, Engage, or Both?: Potential and Perils of Applying Large Language Models in Interview and Conversation-Based Research
    Hwang, Angel Hsing-Chi; Aubin Le Quéré, Marianne; Schroeder, Hope; Cuevas, Alejandro; Dow, Steven; Kapania, Shivani; Rho, Eugenia (ACM, 2025-10-18)
    An increasing number of studies apply tools powered by large language models (LLMs) to interview and conversation-based research, one of the most commonly used research methods in CSCW. This panel invites the CSCW community to critically debate the role of LLMs in reshaping interview-based methods. We aim to explore how these tools might (1) address persistent challenges in conversation-based research, such as limited scalability and participant engagement, (2) introduce novel methodological possibilities, and (3) surface additional practical, technical, and ethical concerns. The panel discussion will be grounded in the panelists’ prior experience applying LLMs to their own interview and conversation-based research. We ask whether LLMs offer unique advantages to enhance interview research, beyond automating certain aspects of the research process. Through this discussion, we encourage researchers to reflect on how applying LLM tools may require rethinking research design, conversational protocols, and ethical practices.
  • Structuring Collaborative Reflection: Integrating Diary Study and Focus Group Discussion
    Fan, Jixiang; Zhao, Jiacheng; Oh, Sunggyeol; Bolmer, Michael; Lee, Yoonje; Flammer, Nick; Chen, Yuhao; McCrickard, D. Scott (ACM, 2025-10-18)
    We present a structured reflection framework integrating diary study and focus group discussion to support collaborative meaning-making in HCI education. The framework follows a multi-phase design in which students progress from individual journaling to a two-stage group discussion sequence: first within shared application contexts, then across emergent experiential themes. To support this process, we extended DiaryQuest, a lightweight educational tool incorporating AI-assisted grouping, image-based prompts, and a Jigsaw-inspired workflow to scaffold participation. A preliminary classroom deployment with 11 undergraduate students suggests that the approach lowers the barrier to reflective dialogue, encourages cross-perspective engagement, and helps students surface design-relevant insights grounded in lived experience. These findings point to new opportunities for structuring reflection in sociotechnical learning environments.
  • Unraveling the Complexities of MTA-STS Deployment and Management in Securing Email
    Ashiq, Md. Ishtiaq; Fiebig, Tobias; Chung, Taejoong (ACM, 2025-10-28)
    Email has been a cornerstone of online communication for decades, but its lack of built-in confidentiality has left it vulnerable to various attacks. To address this issue, two key protocols are being used: MTA-STS (Mail Transfer Agent Strict Transport Security) and DANE (DNS-based Authentication of Named Entities). While DANE was introduced first, MTA-STS has been actively adopted by major email providers like Google and Microsoft, as it does not require the complex DNSSEC chain that poses a significant challenge in deploying and managing DANE. However, despite its significance, there has been limited research on how MTA-STS is deployed and managed in practice. In this study, we present a thorough, longitudinal investigation of the MTA-STS ecosystem. We base our analysis on a dataset capturing over 87 million domains from DNS scans collected across four TLDs over 31 months, along with 10 months of additional component scanning such as TLS certificates, thereby offering a broad perspective on MTA-STS adoption and its management. Our analysis uncovers a concerning trend of misconfigurations and inconsistencies in MTA-STS setups. In our most recent snapshot, out of 68K domains with an MTA-STS record, 29.6% of domains were incorrectly configured, while 3.2% of these should encounter email delivery failure from MTA-STS-supporting senders. To gain insights into the challenges faced by email administrators, we surveyed 117 operators. While awareness of MTA-STS was high (94.7%), many cited operational complexity (48.8%) and a preference for DANE (45.4%) as reasons for not deploying the protocol. Our study not only highlights the growing importance of MTA-STS but also reveals the significant challenges in its deployment and management.
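    As a rough illustration of the kind of misconfiguration the MTA-STS abstract above refers to, the following sketch checks the body of a policy file against the basic field rules of RFC 8461. It is not the authors' measurement pipeline; the function name `check_mta_sts_policy` and the error messages are hypothetical, and only a few of the possible checks are shown.

```python
# Minimal sketch: sanity-check an MTA-STS policy body (RFC 8461 field rules).
# Function name and messages are illustrative assumptions, not from the paper.

MAX_AGE_LIMIT = 31557600  # upper bound on max_age allowed by RFC 8461
VALID_MODES = {"enforce", "testing", "none"}


def check_mta_sts_policy(text):
    """Return a list of problems found in a policy body (empty if none)."""
    fields = {}
    mx_patterns = []
    problems = []

    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        key, sep, value = line.partition(":")
        if not sep:
            problems.append(f"malformed line: {line!r}")
            continue
        key, value = key.strip(), value.strip()
        if key == "mx":
            mx_patterns.append(value)  # mx may appear several times
        else:
            fields[key] = value

    if fields.get("version") != "STSv1":
        problems.append("version must be STSv1")

    mode = fields.get("mode")
    if mode not in VALID_MODES:
        problems.append("mode must be enforce, testing, or none")

    max_age = fields.get("max_age", "")
    if not (max_age.isascii() and max_age.isdigit()):
        problems.append("max_age must be a non-negative integer")
    elif int(max_age) > MAX_AGE_LIMIT:
        problems.append("max_age exceeds the RFC 8461 maximum")

    if mode in {"enforce", "testing"} and not mx_patterns:
        problems.append("enforce/testing mode requires at least one mx entry")

    return problems
```

    A policy that passes these checks can still be misconfigured in other ways (for example, a certificate that does not match the listed MX hosts), so this covers only the syntactic layer.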
  • Contextualizing Introductory Computer Science: Insights from African Faculty
    Tshukudu, Ethel; Sanusi, Ismaila; Ola, Oluwakemi; Hamouda, Sally; Marshall, Linda; Adelakun-Adeyemo, Oluwatoyin; Dodoo, Emma; Korsah, G.; Luvhengo, Sandani; Parkinson, Jack (ACM, 2025-10-21)
    Contextualizing computer science education has been recognized as a key factor in enhancing student engagement and learning outcomes. This study investigates the initial perceptions of university computer science faculty in Africa regarding the benefits, adoption challenges, and institutional support required for the successful integration of contextually relevant materials into introductory computer science (CS1) courses. Faculty then assessed a set of previously developed contextually tailored materials, grounded in Banks’ Additive Approach to curriculum reform and aligned to the CS curricula 2023. The research adopted qualitative methods, gathering data through open-ended surveys from 22 CS faculty across 9 African countries. Thematic analysis identified key patterns in the responses from faculty, who generally expressed positive perceptions of integrating contextualized materials. They agreed such materials could enhance engagement without distracting from core objectives, but emphasized the need for careful integration. Insights from faculty highlighted that successful implementation requires substantial institutional support, including curriculum reform, textbook development, and faculty training, with universities playing a critical role in adoption.
  • The Impostor Phenomenon in the Global Computing Graduate Student Population
    Pechenik, Caroline; Zavaleta Bernuy, Angela; Shah, Selina; de Wit, Shirley; Kolog, Emmanuel Awuni; Karnalim, Oscar; Farghally, Mohammed; Aníbal Suárez, Carlos; Parkinson, Jack; Porter, Leo; Duran, Rodrigo; Vrbik, Paul; Harrington, Brian; Zhang, Lisa; Liut, Michael; Petersen, Andrew (ACM, 2025-10-21)
    Several studies have confirmed that undergraduates in computing programs frequently experience the Impostor Phenomenon (IP). However, this work has largely focused on North America and Europe, and no work has evaluated graduate students in computing. This study evaluates the rate of IP experiences in graduate programs globally to determine whether rates of IP experiences are consistent and whether there are institutions or locations with lower rates of IP that might inform the development of support systems to reduce its prevalence. We perform a multi-institutional, multi-national survey-based study of 11 institutions, with at least one on every populated continent. The survey asks graduate students to complete the Clance IP scale (CIPS), which is the standard evaluation instrument for IP, as well as to answer a number of demographic questions that establish their experience level, gender, and ethnicity. We evaluate the overall level of IP experiences at each institution as well as across regions, and we explore the interaction between CIPS scores, region, and demographic factors.
  • Expanding Contextualized Computer Science Education in Africa: A Collaborative Initiative
    Hamouda, Sally; Tshukudu, Ethel; Marshall, Linda; Aruleba, Kehinde; Kombe, Cleverence; Bih Fofang, Janet Shufor; Bada, Joseph Kizito; Ekwam, Emmanuel; Timm, Nils; Mengistu, Tessema (ACM, 2025-10-21)
    Building on our ITiCSE-Working Group Report 2024 study, this proposal aims to expand and refine contextualized CS1 materials in partnership with African researchers and educators. Earlier findings showed that faculty see value in locally relevant materials for boosting engagement and problem-solving, though challenges remain in adapting them across contexts. This working group will co-design new materials using a structured framework and evaluate them through mixed-methods research with students centrally involved. The project promotes inclusive pedagogy, cross-institutional collaboration, and scalable strategies to better align global computing standards with local needs.
  • Examining and Mitigating Ability-bias in LLMs via Self-Reflection
    Iyer, Neel; Jha, Akshita; Pradhan, Alisha (ACM, 2025-04-28)
    Large language models (LLMs) (e.g., ChatGPT) are rapidly integrating into our daily lives, fundamentally shaping how we engage with and process information or make decisions. Despite their significant potential, LLMs can encode social biases (e.g., gender, culture) that amplify problematic and stereotypical representations of marginalized groups. Given the discriminatory impact that bias in LLMs can have on people with disabilities, in this work we examine ability bias in LLMs. We analyze LLM responses to a set of carefully crafted prompts across different abilities, and explore self-reflection through prompt chaining as a debiasing approach. Our findings surface linguistic associations encoded in LLMs with different disabilities. We note the types of justifications or rationalizations provided as explanations in LLM responses, which have implications for the trust associated with LLM responses. Our proposed approach of model self-reflection demonstrates improvement in LLM responses and thereby contributes to debiasing literature.
  • "Fewer Views If They Have TW.": Understanding Users' Perceptions of Trigger Warning and Content Warning on Social Media Platforms in the U.S.
    Zhang, Xinyi; Gupta, Muskan; Altland, Emily; Lee, Sang Won (ACM, 2025-10-16)
    The prevalence of distressing content on social media raises concerns about users' mental well-being, prompting the use of trigger warnings (TW) and content warnings (CW). However, inconsistent presentation of TW/CW across platforms and the lack of standardized practices confuse users regarding these warnings. To better understand how users experienced and utilized these warnings, we conducted a semi-structured interview study with 15 social media users. Our findings reveal challenges across three key stakeholders: viewers, who need to decide whether to engage with warning-labeled content; posters, who struggle with whether and how to apply TW/CW to the content; and platforms, whose design features shape the visibility and usability of warnings. While users generally expressed positive attitudes toward warnings, their understanding of TW/CW usage was limited. Based on these insights, we reflected on the TW/CW mechanisms from multiple stakeholders' perspectives. Lastly, we further reflected on our findings and discussed the opportunities for social media platforms to enhance users' TW/CW experiences, fostering a more trauma-informed social media environment.
  • WePilot: Integrating Younger Family Members and Chatbot to Support Older Adults Learning Smartphone Usage
    Zhang, Haonan; Zhang, Peng; Chen, Yan; Guo, Meitong; Gu, Hansu; Lu, Tun; Gu, Ning (ACM, 2025-10-16)
    Older adults (OAs) usually face various challenges when using smartphones due to their limited knowledge and declines in memory and information processing capabilities. Many studies in the HCI and CSCW communities have focused on supporting OAs in using smartphones independently. However, compared to independent exploration, support from younger family members (YFMs) has specific advantages in problem understanding, solution personalization, and security protection. Nevertheless, OAs and YFMs generally have gaps in time, knowledge, and experience, which affect the efficiency of support and their experience. To address this problem, we conduct a formative study to gather insights into OAs and YFMs’ perspectives and expectations in the supporting procedure. Then we introduce a chatbot to mediate the gaps between OAs and YFMs and build a system named WePilot to help them collaboratively solve smartphone usage problems. Evaluations with 12 pairs of participants (OA and corresponding YFM) suggest WePilot’s strengths in improving problem solving efficiency and OAs and YFMs’ experience. Based on these findings, we propose several insights into the future design of intergenerational technical support systems.
  • Writing Home From Afar: Connecting Distant Families through Sharing of Outdoor Experiences with Digital Diaries
    Wang, Wei-Lu; Andrus, Natalie; Hassan, Taha; Fan, Jixiang; Cao, Yusheng; Asante, Joelle; Saaty, Morva; Haqq, Derek; McCrickard, D. Scott (ACM, 2025-10-16)
    Maintaining emotional connections and fostering meaningful communication among distant family members has long been challenging. Existing communication technologies, such as instant messaging, video-sharing, and social media enable quick exchanges but often lack mechanisms to initiate appropriate conversation topics and support in-depth emotional interactions. This study explores the use of digital diary-sharing in addressing these limitations. We conduct thematic analyses on diaries from a three-week study (N=22) using DailyBean, a diary app, to examine frequent patterns in users' sharing of outdoor experiences with distant family members. We identify five key mechanisms to support connections between distant family members: topic initiation, memory recall, shared moments, joint activities, and future planning. We also highlight frequent conversation topics that facilitate emotional engagement and reflection for distant family members. We conclude our study with design recommendations for effective diary-based family communication.
  • Memory Tiering in Python Virtual Machine
    Li, Yuze; Yao, Shunyu; Mobin, Jaiaid; Zhan, Tianyu; Rafique, M. Mustafa; Nikolopoulos, Dimitrios; Sundararajah, Kirshanthan; Butt, Ali R. (ACM, 2025-10-09)
    Modern Python applications consume massive amounts of memory in data centers. Technologies such as CXL have emerged as a pivotal interconnect for memory expansion. Prior efforts in memory tiering that relied on OS page or hardware counter information incurred notable overhead and lacked awareness of fine-grained object access patterns. Moreover, these tiering configurations cannot be tailored to individual Python applications, limiting their applicability in QoS-sensitive environments. In this paper, we introduce Memory Tiering in Python VM (MTP), an extension module built atop the popular CPython interpreter to support memory tiering in Python applications. MTP leverages reference count changes from garbage collection to infer object temperatures and reduces unnecessary migration overhead through a software-defined page temperature table. To the best of our knowledge, MTP is the first framework to offer portability, easy deployment, and per-application tiering customization for Python workloads.
  • reInstruct: Toward OS-aware CPU microcode reprogramming
    Wang, Yubo; Nikolaev, Ruslan; Ravindran, Binoy (ACM, 2025-10-13)
    Historically, the microcode layer has been a proprietary technology that is tightly controlled by the CPU vendors. The microcode layer provides great flexibility for translating ISA-visible instructions into internal hardware micro-operations. In x86-64, many system-level instructions are microcoded, which creates a great untapped opportunity for OS developers who want to experiment with future ISA extensions. Recent research work has identified hidden CPU instructions, which are enabled via a firmware exploit, and also partially reverse-engineered and decrypted Intel Goldmont microcode. We go a step further and design an experimental framework for Linux, which allows existing microcoded instructions to be transparently modified directly from an OS at runtime. We show how microcode alterations can be used to defeat normal root-privilege isolation in Linux almost without any trace. We also show our new approach, which relies on ISA modification via microcode patching to improve the performance of commonly used lightweight Linux system calls. Our approach, effectively, adjusts the CPU ISA to better serve a specific OS kernel and applications, an idea which has been out of reach for commodity hardware previously.
  • An Algorithm for Computing Generalized Hamming Weights and the Sage Package GHWs
    San-José, Rodrigo (ACM, 2025-10)
    We generalize the Brouwer-Zimmermann algorithm, which is the most efficient general algorithm for computing the minimum distance of a random linear code, to the case of generalized Hamming weights. We also adapt this algorithm to compute the relative generalized Hamming weights of a nested pair of linear codes. In the package GHWs we provide an implementation of this algorithm in Sage, as well as several other utilities for working with generalized Hamming weights. With this implementation, we show that the proposed algorithm is faster than the naive approach of computing the generalized Hamming weights using the definition.
  • Towards Safe Agentic AI Performance Engineering
    Williams, Dan; Craun, Milo; Le, Michael V.; Stephen, Julian; Ahmed, Salman; Jamjoom, Hani (ACM, 2025-10-13)
    The emergence of agentic AI (reasoning AI agents that can connect to tools and take actions) offers enormous potential for performing tasks that currently require highly skilled humans. In this position paper, we discuss AI agents in one such role: performance engineer. A performance engineer is typically highly trained and highly trusted to run performance diagnostic tools, which more often than not require root or administrator privileges, on production machines to diagnose performance issues. Critically, performance engineers are trusted not to cause harm to the production systems they are investigating, including crashing or hanging the systems, extracting sensitive information from them, or negatively affecting their performance. In this paper, we argue that current AI agents have the training, but lack the trust to be performance engineers. We outline four components (prevention, detection/auditing, aborting/rollback, and retry/refocus) and highlight gaps where the approaches taken for human-based performance engineers fall short.
  • Top-Down Stochastic Block Partitioning: Turning Graph Clustering Upside Down
    Wanye, Frank; Gleyzer, Vitaliy; Kao, Edward; Feng, Wu-chun (ACM, 2025-07-20)
    Stochastic block partitioning (SBP) is a statistical inference-based algorithm for clustering vertices within a graph. It has been shown to be statistically robust and highly accurate even on graphs with a complex structure, but its poor scalability limits its usability to smaller-sized graphs. In this manuscript we argue that one reason for its poor scalability is the agglomerative, or bottom-up, nature of SBP’s algorithmic design; the agglomerative computations cause high memory usage and create a large search space that slows down statistical inference, particularly in the algorithm’s initial iterations. To address this bottleneck, we propose Top-Down SBP, a novel algorithm that replaces the agglomerative (bottom-up) block merges in SBP with a block-splitting operation. This enables the algorithm to start with all vertices in one cluster and subdivide them over time into smaller clusters. We show that Top-Down SBP is up to 7.7× faster than Bottom-Up SBP without sacrificing accuracy and can process larger graphs than Bottom-Up SBP on the same hardware due to an up to 4.1× decrease in memory usage. Additionally, we adapt existing methods for accelerating Bottom-Up SBP to the Top-Down approach, leading to up to 13.2× speedup over accelerated Bottom-Up SBP and up to 403× speedup over sequential Bottom-Up SBP on 64 compute nodes. Thus, Top-Down SBP represents substantial improvements to the scalability of SBP, enabling the analysis of larger datasets on the same hardware.
  • Can Large Language Models Predict Parallel Code Performance?
    Bolet, Gregory; Georgakoudis, Giorgis; Menon, Harshitha; Parasyris, Konstantinos; Hasabnis, Niranjan; Estes, Hayden; Cameron, Kirk; Oren, Gal (ACM, 2025-07-20)
    Accurate determination of the performance of parallel GPU code typically requires execution-time profiling on target hardware – an increasingly prohibitive step due to limited access to high-end GPUs. This paper explores whether Large Language Models (LLMs) can offer an alternative approach for GPU performance prediction without relying on hardware. We frame the problem as a roofline classification task: given the source code of a GPU kernel and the hardware specifications of a target GPU, can an LLM predict whether the GPU kernel is compute-bound or bandwidth-bound? For this study, we build a balanced dataset of 340 GPU kernels, obtained from the HeCBench benchmark and written in CUDA and OpenMP, along with their ground-truth labels obtained via empirical GPU profiling. We evaluate LLMs across four scenarios: (1) with access to profiling data of the kernel source, (2) zero-shot with source code only, (3) few-shot with code and label pairs, and (4) fine-tuned on a small custom dataset. Our results show that state-of-the-art LLMs have a strong understanding of the Roofline model, achieving 100% classification accuracy when provided with explicit profiling data. We also find that reasoning-capable LLMs significantly outperform standard LLMs in zero- and few-shot settings, achieving up to 64% classification accuracy of GPU source codes, without any profiling information. Lastly, we find that model accuracy does not benefit meaningfully from few-shot prompting compared to zero-shot, and that LLM fine-tuning will require much more data than what we currently have available. This work is among the first to use LLMs for source-level roofline performance prediction via classification, and illustrates their potential to guide optimization efforts when runtime profiling is infeasible. Our findings suggest that with better datasets and prompt strategies, LLMs could become practical tools for HPC performance analysis and performance portability.
    Code and datasets are publicly available at https://github.com/Scientific-Computing-Lab/ParallelCodeEstimation.
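    For readers unfamiliar with the roofline classification mentioned in the abstract above, the sketch below shows the textbook rule: a kernel whose arithmetic intensity (operations per byte moved) falls below the machine balance (peak compute divided by peak memory bandwidth) is bandwidth-bound, otherwise compute-bound. The function and parameter names are hypothetical, and the abstract does not state exactly how the authors derived their ground-truth labels from profiling data, so this is the general model only.

```python
# Minimal sketch of the standard roofline rule; names are illustrative
# assumptions and this is not the paper's labeling code.

def classify_roofline(flops, bytes_moved, peak_flops_per_s, peak_bytes_per_s):
    """Label a kernel as compute-bound or bandwidth-bound.

    flops            -- floating-point operations executed by the kernel
    bytes_moved      -- bytes transferred to and from memory by the kernel
    peak_flops_per_s -- peak compute throughput of the target GPU
    peak_bytes_per_s -- peak memory bandwidth of the target GPU
    """
    if bytes_moved <= 0 or peak_bytes_per_s <= 0:
        raise ValueError("bytes_moved and peak_bytes_per_s must be positive")

    # Arithmetic intensity of the kernel, in FLOPs per byte.
    intensity = flops / bytes_moved
    # Machine balance (the roofline "ridge point"), in FLOPs per byte.
    balance = peak_flops_per_s / peak_bytes_per_s

    return "compute-bound" if intensity >= balance else "bandwidth-bound"
```

    For example, on a hypothetical device with a peak of 10 TFLOP/s and 1 TB/s of bandwidth, the balance is 10 FLOPs per byte, so a kernel performing 1 FLOP per byte would be labeled bandwidth-bound.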
  • A Generalized Web3D API for Metaverse Bookmarks
    Narra, Nikhil; Marisetty, Anuj; Polys, Nicholas; Sandbrook, Ben (ACM, 2025-09-09)
    Sharing identical 3D scene states across different platforms and user sessions poses practical limitations. Existing mechanisms such as X3D anchors, glTF camera tags, and proprietary URL-based encodings typically support only static, author-defined viewpoints or are constrained to a specific viewer implementation. This paper introduces a schema and API for a Web3D Bookmarking system that enables interoperable sharing of precise, user-generated 3D scene contexts. The proposed grammar encodes the complete scene state, including camera parameters, navigation settings, scene composition, and contextual metadata, into a compact descriptor suitable for embedding in hyperlinks or other distribution methods. When accessed, the scene is re-instantiated with the same view and context as intended by the original sharer. We define the grammar using a formal JSON Schema and justify each parameter through a design rationale. Use cases include collaborative VR/AR environments, remote design reviews, and educational applications where reproducible scene context is critical. An initial implementation using X3D and X3DOM demonstrates feasibility. We also compare our approach with existing viewpoint sharing techniques and outline a structured evaluation plan to assess the utility and performance of the proposed system.
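    To make the idea of a compact, hyperlink-friendly scene descriptor in the Web3D bookmark abstract above concrete, here is a small sketch that serializes a scene state to JSON, compresses it, and encodes it as a URL-safe token that can be decoded back to the same state. The field names (`camera`, `navigation`, `scene`, `meta`) and the compression-plus-Base64 encoding are assumptions made for illustration; the paper's actual JSON Schema and encoding are not given in the abstract.

```python
# Minimal sketch: round-trip a scene state through a URL-safe token.
# Field names and the encoding scheme are hypothetical, not the paper's schema.
import base64
import json
import zlib


def encode_bookmark(state):
    """Serialize a scene-state dict into a compact URL-safe string."""
    raw = json.dumps(state, sort_keys=True, separators=(",", ":")).encode("utf-8")
    token = base64.urlsafe_b64encode(zlib.compress(raw)).decode("ascii")
    return token.rstrip("=")  # drop padding to keep the link short


def decode_bookmark(token):
    """Recover the scene-state dict from a token made by encode_bookmark."""
    padded = token + "=" * (-len(token) % 4)  # restore Base64 padding
    raw = zlib.decompress(base64.urlsafe_b64decode(padded))
    return json.loads(raw.decode("utf-8"))


example_state = {
    "camera": {"position": [0.0, 1.6, 5.0], "orientation": [0.0, 1.0, 0.0, 0.0]},
    "navigation": {"mode": "examine"},
    "scene": {"url": "model.x3d"},
    "meta": {"note": "design review"},
}
```

    A token produced this way could be appended to a viewer URL as a query parameter or fragment; a real system would also need schema validation and versioning of the descriptor.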