Journal Articles, Association for Computing Machinery (ACM)

Recent Submissions

  • CLOSUREX: Compiler Support for Correct Persistent Fuzzing
    Ranjan, Rishi; Paterson, Ian; Hicks, Matthew (ACM, 2025-02-03)
    Fuzzing is a widely adopted and pragmatic methodology for bug hunting as a means of software hardening. Research reveals that increasing fuzzing throughput directly increases bug discovery rate. The highest performance fuzzing strategy is persistent fuzzing, which reuses a single process for all test cases by looping back to the start upon completion, instead of exiting. This eliminates all process creation, initialization, and tear-down costs, which are on par with execution cost. Unfortunately, persistent fuzzing leads to semantically inconsistent program states because process state changes from one test case remain for subsequent test cases. This semantic inconsistency results in missed crashes, false crashes, and overall incorrectness that undermines fuzzer effectiveness. We observe that existing fuzzing execution mechanisms exist on a continuum, based on the amount of state that gets discarded and restored between test cases. We present ClosureX, a fuzzing execution mechanism that sits at a new spot on this state restoration continuum, where only testcase-execution-specific state is reset. This fine-grained state restoration provides near-persistent performance with the correctness of heavyweight state restoration. We construct ClosureX as a set of LLVM passes that integrate with AFL++. Our evaluation on ten popular open-source fuzzing targets shows that ClosureX maintains semantic correctness, while increasing test case execution rate by over 3.5x, on average, compared to AFL++. ClosureX also finds bugs more consistently and 1.9x faster than AFL++, with ClosureX discovering 15 0-day bugs (4 CVEs).
  • Systematic CXL Memory Characterization and Performance Analysis at Scale
    Liu, Jinshu; Hadian, Hamid; Wang, Yuyue; Berger, Daniel; Nguyen, Marie; Jian, Xun; Noh, Sam; Li, Huaicheng (ACM, 2025-03-30)
    Compute Express Link (CXL) has emerged as a pivotal interconnect for memory expansion. Despite its potential, the performance implications of CXL across devices, latency regimes, processors, and workloads remain underexplored. We present Melody, a framework for systematic characterization and analysis of CXL memory performance. Melody builds on an extensive evaluation spanning 265 workloads, 4 real CXL devices, 7 latency levels, and 5 CPU platforms. Melody yields many insights: workload sensitivity to sub-μs CXL latencies (140-410ns), the first disclosure of CXL tail latencies, CPU tolerance to CXL latencies, a novel approach (Spa) for pinpointing CXL bottlenecks, and CPU prefetcher inefficiencies under CXL.
  • PhasePrint: Exposing Cloud FPGA Fingerprints by Inducing Timing Faults at Runtime
    Mahmod, Jubayer; Hicks, Matthew (ACM, 2025-03-30)
    Cloud FPGAs, with their scalable and flexible nature, are rapidly gaining traction as go-to hardware acceleration platforms for compute-intensive workloads. However, their increasing adoption introduces unique security challenges. The hardware-level access that FPGAs provide leads to many vulnerabilities, including the leakage of sensitive information through data remanence and the creation of analog-domain covert channels among users. A foundational requirement in these scenarios is the ability to target an individual FPGA; knowing this, cloud vendors prevent FPGA localization by restricting access to low-level information of the underlying hardware. Beyond aiding adversaries, FPGA localization enables defenders to strategically rotate FPGA usage, preventing prolonged exposure that can lead to confidential data leakage due to long-term data remanence. This paper introduces PhasePrint, a cloud FPGA localization approach using dynamic timing faults in functionally valid circuits. PhasePrint induces timing faults in a specially crafted circuit at runtime and infers delay characteristics from the resulting error pattern—without incorporating information sources blocked by cloud vendors. PhasePrint utilizes an FPGA’s internal clock synthesizer to derive a clock pair with a strict phase relationship. By adjusting the phase relationship of these clocks, PhasePrint intentionally causes timing faults at runtime that reveal manufacturing variations among FPGA chips. We transform these fault locations into feature vectors to create device signatures and train a multiclass classifier on a dataset from 300 unique FPGAs across four AWS geographic regions. This entirely on-chip signature extraction method achieves >99% accuracy, operates 13× faster, and costs 92% less than the state-of-the-art.
  • Practical Federated Recommendation Model Learning Using ORAM with Controlled Privacy
    Liu, Jinyu; Xiong, Wenjie; Suh, G. Edward; Maeng, Kiwan (ACM, 2025-03-30)
    Training high-quality recommendation models requires collecting sensitive user data. The popular privacy-enhancing training method, federated learning (FL), cannot be used practically due to these models’ large embedding tables. This paper introduces FEDORA, a system for training recommendation models with FL. FEDORA allows each user to only download, train, and upload a small subset of the large tables based on their private data, while hiding the access pattern using oblivious memory (ORAM). FEDORA reduces the ORAM’s prohibitive latency and memory overheads by (1) introducing 𝜖-FDP, a formal way to balance the ORAM’s privacy with performance, and (2) placing the large ORAM in a power- and cost-efficient SSD with SSD-friendly optimizations. Additionally, FEDORA is carefully designed to support (3) modern operation modes of FL. FEDORA achieves high model accuracy by using private features during training while achieving, on average, 5× latency and 158× SSD lifetime improvement over the baseline.
  • Stramash: A Fused-kernel Operating System For Cache-Coherent, Heterogeneous-ISA Platforms
    Xing, Tong; Xiong, Cong; Wei, Tianrui; Sanchez, April; Ravindran, Binoy; Balkind, Jonathan; Barbalace, Antonio (ACM, 2025-03-30)
    We live in the world of heterogeneous computing. With specialised elements reaching all aspects of our computer systems and their prevalence only growing, we must act to rein in their inherent complexity. One area that has seen significantly less investment in terms of development is heterogeneous-ISA systems, specifically because of complexity. To date, heterogeneous-ISA processors have required significant software overheads, workarounds, and coordination layers, making the development of more advanced software hard, and motivating little further development of more advanced hardware. In this paper, we take a fused approach to heterogeneity, and introduce a new operating system (OS) design, the fused-kernel OS, which goes beyond the multiple-kernel OS design, exploiting cache-coherent shared memory among heterogeneous-ISA CPUs as a first principle – introducing a set of new OS kernel mechanisms. We built a prototype fused-kernel OS, Stramash-Linux, to demonstrate the applicability of our design to monolithic OS kernels. We profile Stramash OS components on real hardware but tested them on an architectural simulator – Stramash-QEMU, which we design and build. Our evaluation begins by validating the accuracy of our simulator, achieving an average of less than 4% errors. We then perform a direct comparison between our fused-kernel OS and state-of-the-art multiple-kernel OS designs. Results demonstrate speedups of up to 2.1× on NPB benchmarks. Further, we provide an in-depth analysis of the differences and trade-offs between fused-kernel and multiple-kernel OS designs.
  • JCDL 2024 Workshop: Generative AI for Resource Discovery in Libraries
    Chen, Yinlin; Yang, Le; Xie, Zhiwu (ACM, 2024-12-16)
    This workshop delves into the transformative role of Generative AI technologies in digital libraries, emphasizing advancements in resource discovery and user engagement. Participants will explore how cutting-edge large language models such as GPT-4 and Llama are leveraged to deliver highly personalized resource recommendations and improve the efficiency and precision of information retrieval processes. Through showcases of capstone projects developed as part of the AI Incubator Program, hands-on sessions, and collaborative discussions, attendees will gain practical insights into deploying AI-driven solutions that streamline library operations and elevate user experience.
  • Solid State Drive Targeted Memory-Efficient Indexing for Universal I/O Patterns and Fragmentation Degrees
    Im, Junsu; Kim, Jeonggyun; Oh, Seonggyun; Koo, Jinhyung; Park, Juhyung; Chwa, Hoon Sung; Noh, Sam H.; Lee, Sungjin (ACM, 2025-03-30)
    Thanks to the advance of device scaling technologies, the capacity of SSDs is rapidly increasing. Such increase, however, comes at the cost of a huge index table requiring large DRAM. To provide reasonable performance with less DRAM, various index structures exploiting locality and regularity of I/O references have been proposed. However, they provide deteriorated performance depending on I/O patterns and storage fragmentation. This paper proposes a novel approximate index structure, called AppL, which combines memory-efficient approximate indices and an LSM-tree that has an append-only and sorted nature. AppL reduces the index size to 6∼8 bits per entry, which is considerably smaller than the typical index structures requiring 32∼64 bits, and maintains such high memory efficiency irrespective of locality and fragmentation. By alleviating memory pressure, AppL achieves 33.6∼72.4% shorter read latency and 28.4∼83.4% higher I/O throughput than state-of-the-art techniques.
  • Enhancing Immersive Sensemaking with Gaze-Driven Recommendation Cues
    Tahmid, Ibrahim Asadullah; North, Chris; Davidson, Kylie; Whitley, Kirsten; Bowman, Doug (ACM, 2025-03-24)
    Sensemaking is a complex task that places a heavy cognitive demand on individuals. With the recent surge in data availability, making sense of vast amounts of information has become a significant challenge for many professionals, such as intelligence analysts. Immersive technologies such as mixed reality offer a potential solution by providing virtually unlimited space to organize data. However, the difficulty of processing, filtering relevant information, and synthesizing insights remains. We proposed using eye-tracking data from mixed reality head-worn displays to derive the analyst’s perceived interest in documents and words, and convey that part of the mental model to the analyst. The global interest of the documents is reflected in their color and their order on the list, while the local interest of the documents is used to generate focused recommendations for a document. To evaluate these recommendation cues, we conducted a user study with two conditions: a gaze-aware system, EyeST, and a “Freestyle” system without gaze-based visual cues. Our findings reveal that EyeST helped analysts stay on track by reading more essential information while avoiding distractions. However, this came at the cost of reduced focused attention and perceived system performance. The results of our study highlight the need for explainable AI in human-AI collaborative sensemaking to build user trust and encourage the integration of AI outputs into the immersive sensemaking process. Based on our findings, we offer a set of guidelines for designing gaze-driven recommendation cues in an immersive environment.
  • KHAIT: K-9 Handler Artificial Intelligence Teaming for Collaborative Sensemaking
    Wilchek, Matthew; Wang, Linhan; Dickinson, Sally; Feuerbacher, Erica N.; Luther, Kurt; Batarseh, Feras A. (ACM, 2025-03-24)
    In urban search and rescue (USAR) operations, communication between handlers and specially trained canines is crucial but often complicated by challenging environments and the specific behaviors canines are trained to exhibit when detecting a person. Since a USAR canine often works out of sight of the handler, the handler lacks awareness of the canine’s location and situation, known as the “sensemaking gap.” In this paper, we propose KHAIT, a novel approach to close the sensemaking gap and enhance USAR effectiveness by integrating object detection-based Artificial Intelligence (AI) and Augmented Reality (AR). Equipped with AI-powered cameras, edge computing, and AR headsets, KHAIT enables precise and rapid object detection from a canine’s perspective, improving survivor localization. We evaluate this approach in a real-world USAR environment, demonstrating an average survival allocation time decrease of 22%, enhancing the speed and accuracy of operations.
  • TSConnect: An Enhanced MOOC Platform for Bridging Communication Gaps Between Instructors and Students in Light of the Curse of Knowledge
    Liu, Qianyu; Li, Xinran; Du, Xiaocong; Li, Quan (ACM, 2025-03-24)
    Knowledge dissemination in educational settings is profoundly influenced by the curse of knowledge, a cognitive bias that causes experts to underestimate the challenges faced by learners due to their own in-depth understanding of the subject. This bias can hinder effective knowledge transfer and pedagogical effectiveness, and may be exacerbated by inadequate instructor-student communication. To encourage more effective feedback and promote empathy, we introduce TSConnect, a bias-aware, adaptable interactive MOOC (Massive Open Online Course) learning system, informed by a needfinding survey involving 129 students and 6 instructors. TSConnect integrates instructors, students, and Artificial Intelligence (AI) into a cohesive platform, facilitating diverse and targeted communication channels while addressing previously overlooked information needs. A notable feature is its dynamic knowledge graph, which enhances learning support and fosters a more interconnected educational experience. We conducted a between-subjects user study with 30 students comparing TSConnect to a baseline system. Results indicate that TSConnect significantly encourages students to provide more feedback to instructors. Additionally, interviews with 4 instructors reveal insights into how they interpret and respond to this feedback, potentially leading to improvements in teaching strategies and the development of broader pedagogical skills.
  • Mental Models of Generative AI Chatbot Ecosystems
    Wang, Xingyi; Wang, Xiaozheng; Park, Sunyup; Yao, Yaxing (ACM, 2025-03-24)
    The capability of GenAI-based chatbots, such as ChatGPT and Gemini, has expanded quickly in recent years, turning them into GenAI Chatbot Ecosystems. Yet, users’ understanding of how such ecosystems work remains unknown. In this paper, we investigate users’ mental models of how GenAI Chatbot Ecosystems work. This is an important question because users’ mental models guide their behaviors, including making decisions that impact their privacy. Through 21 semi-structured interviews, we uncovered users’ four mental models towards first-party (e.g., Google Gemini) and third-party (e.g., ChatGPT) GenAI Chatbot Ecosystems. These mental models centered around the role of the chatbot in the entire ecosystem. We further found that participants held a more consistent and simpler mental model towards third-party ecosystems than the first-party ones, resulting in higher trust and fewer concerns towards the third-party ecosystems. We discuss the design and policy implications based on our results.
  • Can LLMs Recommend More Responsible Prompts?
    Santana, Vagner; Berger, Sara; Machado, Tiago; de Macedo, Maysa Malfiza; Sanctos, Cassia; Williams, Lemara; Wu, Zhaoqing (ACM, 2025-03-24)
    Human-Computer Interaction practitioners have been proposing best practices in user interface design for decades. However, generative Artificial Intelligence (GenAI) brings additional design considerations and currently lacks sufficient user guidance regarding affordances, inputs, and outputs. In this context, we developed a recommender system to promote responsible AI (RAI) practices while people prompt GenAI systems, by recommending addition of sentences based on social values and removal of harmful sentences. We detail a lightweight recommender system designed to be used at prompting time and compare its recommendations to the ones provided by three base large language models (LLMs) and two LLMs fine-tuned for the task, i.e., recommending inclusion of sentences based on social values and removal of harmful sentences from a given prompt. Results indicate that our approach has the best F1-score balance in terms of recommendations for additions and removal of sentences to promote responsible prompts, while a fine-tuned model obtained the best F1-score for additions, and our approach obtained the best F1-score for removals of harmful sentences. In addition, fine-tuned models improved the objectiveness of responses by reducing the verbosity of generated content by 93% when compared to the content generated by base models. Presented findings contribute to RAI by showing the limits and bias of existing LLMs in terms of recommendations on how to create more responsible prompts and how open-source technologies can fill this gap at prompting time.
  • CLEAR: Towards Contextual LLM-Empowered Privacy Policy Analysis and Risk Generation for Large Language Model Applications
    Chen, Chaoran; Zhou, Daodao; Ye, Yanfang; Li, Toby; Yao, Yaxing (ACM, 2025-03-24)
    The rise of end-user applications powered by large language models (LLMs), including both conversational interfaces and add-ons to existing graphical user interfaces (GUIs), introduces new privacy challenges. However, many users remain unaware of the risks. This paper explores methods to increase user awareness of privacy risks associated with LLMs in end-user applications. We conducted five co-design workshops to uncover user privacy concerns and their demand for contextual privacy information within LLMs. Based on these insights, we developed CLEAR (Contextual LLM-Empowered Privacy Policy Analysis and Risk Generation), a just-in-time contextual assistant designed to help users identify sensitive information, summarize relevant privacy policies, and highlight potential risks when sharing information with LLMs. We evaluated the usability and usefulness of CLEAR across two example domains: ChatGPT and the Gemini plugin in Gmail. Our findings demonstrated that CLEAR is easy to use and improves users’ understanding of data practices and privacy risks. We also discussed LLMs’ duality in posing and mitigating privacy risks, offering design and policy implications.
  • 'Do I Have to Take This Class?': A Review of Ethics Requirements in Computer Science Curricula
    Weichert, James; Kim, Dayoung; Zhu, Qin; Eldardiry, Hoda (ACM, 2025-02-12)
    ABET criteria for accreditation of undergraduate computer science (CS) degrees require universities to cover within their curricula topics including “local and global impacts of computing solutions on individuals, organizations, and society,” and to prepare their students to “make informed judgments in computing practice, taking into account legal, ethical, diversity, equity, inclusion, and accessibility principles” [1]. A growing body of research similarly identifies the need for CS programs to integrate ethics into their degree requirements, both through standalone ethics-related courses and embedded modules or case studies on the ethical impacts in ‘technical’ courses. The calls for increased attention to CS ethics education have become more pressing with the emergence of sophisticated consumer-ready AI technologies, which pose new ethical challenges in the forms of bias, hallucination, and autonomous decision-making. Yet it remains unclear whether current university curricula are adequately preparing future graduates to confront these challenges. This paper presents a systematic review of the degree requirements of 250 computer science bachelor’s degree programs worldwide. We categorize each program according to whether a CS-related ethics course is offered and/or required by the department, finding that almost half of all universities we review do not offer any computing ethics courses, and only 33% of universities require students to take an ethics course to obtain their degree. We analyze differences among public US, private US, and non-US universities and discuss implications for curricular changes and the state of undergraduate computing ethics education.
  • Diary Study as an Educational Tool: An Experience Report from an HCI Course
    Fan, Jixiang; Haqq, Derek; Saaty, Morva; Wang, Wei-Lu; McCrickard, D. Scott (ACM, 2025-02-12)
    With the rapid advancement and widespread adoption of computer technology, it has become an indispensable component in the development of human society. Therefore, computer science education’s focus extends beyond merely teaching students to read and write code; it is crucial to assist them in gaining an accurate and deep understanding of the applications of technology in the real world, its conveniences, and potential risks. Furthermore, it involves exploring how to design, improve, and innovate computer technologies to meet practical demands. Consequently, Human-Computer Interaction (HCI) has grown increasingly significant in the curriculum of computer science. However, research indicates that computing students face numerous challenges in learning HCI. To enhance students’ ability to experience, discover, and understand user needs, the authors of this paper recommend incorporating diary studies in HCI education. In the field of HCI, diary studies are a method for collecting long-term data on user behavior and experiences in a natural environment. Participants are required to record their daily activities, product usage, encountered issues, and personal impressions over specific periods. This paper will detail the process and steps implemented in our diary studies and present student feedback and evaluations. Through this experience report, we hope to encourage more educators to adopt and refine the diary study methodology in their courses, thereby aiding computer science students in better understanding and embracing the concepts and knowledge of HCI.
  • The Impact of Group Discussion and Formation on Student Performance: An Experience Report in a Large CS1 Course
    Wu, Tong; Tang, Xiaohang; Wong, Sam; Chen, Xi; Shaffer, Clifford A.; Chen, Yan (ACM, 2025-02-12)
    Programming instructors often conduct collaborative learning activities, such as Peer Instruction (PI), to enhance student motivation, engagement, and learning gains. However, the impact of group discussion and formation mechanisms on student performance remains unclear. To investigate this, we conducted an 11-session experiment in a large, in-person CS1 course. We employed both random and expertise-balanced grouping methods to examine the efficacy of different group mechanisms and the impact of expert students’ presence on collaborative learning. Our observations revealed complex dynamics within the collaborative learning environment. Among 255 groups, 146 actively engaged in discussions, with 96 of these groups demonstrating improvement for poor-performing students. Interestingly, our analysis revealed that different grouping methods (expertise-balanced or random) did not significantly influence discussion engagement or poor-performing students’ improvement. In our deeper qualitative analysis, we found that struggling students often derived benefits from interactions with expert peers, but this positive effect was not consistent across all groups. We identified challenges that expert students face in peer instruction interactions, highlighting the complexity of leveraging expertise within group discussions.
  • Understanding the Effects of Integrating Music Programming and Web Development in a Summer Camp for High School Students
    Manesh, Daniel; Jelson, Andrew; Altland, Emily; Freeman, Jason; Lee, Sang Won (ACM, 2025-02-18)
    This poster presents the development and implementation of a 10-day remix-based summer camp curriculum designed to introduce high school students, particularly a multinational cohort of young women, to programming through creative coding. The curriculum integrates music composition using EarSketch and web development with HTML and CSS. The camp aims to inspire participants to gain self-efficacy in programming and motivate them to explore STEM/computing careers. Preliminary results from surveys and interviews indicate increased confidence in programming skills. This ongoing research explores the impact of remixing as a gateway for transitioning into more general-purpose computing domains such as web development.
  • RT-BarnesHut: Accelerating Barnes–Hut Using Ray-Tracing Hardware
    Nagarajan, Vani; Gangaraju, Rohan; Sundararajah, Kirshanthan; Pelenitsyn, Artem; Kulkarni, Milind (ACM, 2025-03)
    The 𝑛-body problem involves calculating the effect of bodies on each other. 𝑛-body simulations are ubiquitous in the fields of physics and astronomy and notoriously computationally expensive. The naïve algorithm for 𝑛-body simulations has a prohibitive 𝑂(𝑛²) time complexity. Reducing the time complexity to 𝑂(𝑛 · lg(𝑛)), the tree-based Barnes–Hut algorithm approximates the effect of bodies beyond a certain threshold distance. Other than algorithmic improvements, extensive research has gone into accelerating 𝑛-body simulations on GPUs and multi-core systems. However, Barnes–Hut is a tree-traversal algorithm, which makes it a poor target for acceleration using traditional GPU shader cores. In contrast, recent work shows that, for tree-based computations, GPU ray-tracing (RT) cores dominate shader cores. In this work, we reformulate the Barnes–Hut algorithm as a ray-tracing problem and implement it with NVIDIA OptiX. Our evaluation shows that the resulting system, RT-BarnesHut, outperforms current state-of-the-art GPU-based implementations.
  • Making Software Development More Diverse and Inclusive: Key Themes, Challenges, and Future Directions
    Hyrynsalmi, Sonja; Baltes, Sebastian; Brown, Chris; Prikladnicki, Rafael; Rodriguez-Perez, Gema; Serebrenik, Alexander; Simmonds, Jocelyn; Trinkenreich, Bianca; Wang, Yi; Liebel, Grischa (ACM, 2025)
    Introduction: Digital products increasingly reshape industries, influencing human behavior and decision-making. However, the software development teams developing these systems often lack diversity, which may lead to designs that overlook the needs, equal treatment or safety of diverse user groups. These risks highlight the need for fostering diversity and inclusion in software development to create safer, more equitable technology. Method: This research is based on insights from an academic meeting in June 2023 involving 23 software engineering researchers and practitioners. We used the collaborative discussion method 1-2-4-ALL as a systematic research approach and identified six themes around the topic ‘challenges and opportunities to improve Software Developer Diversity and Inclusion (SDDI)’. We identified benefits, harms, and future research directions for the four main themes. Then, we discuss the remaining two themes, Artificial Intelligence & SDDI and AI & Computer Science education, which have a cross-cutting effect on the other themes. Results: This research explores the key challenges and research opportunities for promoting SDDI, providing a roadmap to guide both researchers and practitioners. We underline that research around SDDI requires a constant focus on maximizing benefits while minimizing harms, especially to vulnerable groups. As a research community, we must strike this balance in a responsible way.
  • Test Case-Informed Knowledge Tracing for Open-ended Coding Tasks
    Duan, Zhangqi; Fernandez, Nigel; Hicks, Alexander; Lan, Andrew (ACM, 2025-03-03)
    Open-ended coding tasks, which ask students to construct programs according to certain specifications, are common in computer science education. Student modeling can be challenging since their open-ended nature means that student code can be diverse. Traditional knowledge tracing (KT) models that only analyze response correctness may not fully capture nuances in student knowledge from student code. In this paper, we introduce Test case-Informed Knowledge Tracing for Open-ended Coding (TIKTOC), a framework to simultaneously analyze and predict both open-ended student code and whether the code passes each test case. We augment the existing CodeWorkout dataset with the test cases used for a subset of the open-ended coding questions, and propose a multitask learning KT method to simultaneously analyze and predict 1) whether a student’s code submission passes each test case and 2) the student’s open-ended code, using a large language model as the backbone. We quantitatively show that these methods outperform existing KT methods for coding that only use the overall score a code submission receives. We also qualitatively demonstrate how test case information, combined with open-ended code, helps us gain fine-grained insights into student knowledge.
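
As a brief illustration of the naïve 𝑛-body baseline mentioned in the RT-BarnesHut entry above, the following is a minimal sketch, not code from that paper. It assumes point masses in two dimensions under Newtonian gravity; the function name `pairwise_accelerations` and the parameters `g` and `softening` are hypothetical and chosen only for this example. The nested loop over all pairs is what gives the quadratic cost that Barnes–Hut avoids by approximating distant bodies.

```python
import math


def pairwise_accelerations(positions, masses, g=1.0, softening=1e-9):
    """Naive all-pairs gravitational accelerations in 2D (illustrative only).

    positions: list of (x, y) tuples; masses: list of floats.
    g and softening are assumed parameters for this sketch; softening
    avoids division by zero when two bodies coincide.
    """
    n = len(positions)
    accelerations = []
    for i in range(n):          # outer loop: body being accelerated
        xi, yi = positions[i]
        ax = ay = 0.0
        for j in range(n):      # inner loop: every other body -> O(n^2) pairs
            if i == j:
                continue
            dx = positions[j][0] - xi
            dy = positions[j][1] - yi
            dist_sq = dx * dx + dy * dy + softening
            # a_i += g * m_j * r_ij / |r_ij|^3
            scale = g * masses[j] / (dist_sq * math.sqrt(dist_sq))
            ax += scale * dx
            ay += scale * dy
        accelerations.append((ax, ay))
    return accelerations
```

Doubling the number of bodies roughly quadruples the work in this sketch, which is why tree-based approximations become attractive for large simulations.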