Journal Articles, Association for Computing Machinery (ACM)

Recent Submissions

Now showing 1 - 20 of 351
  • Enforcing C/C++ Type and Scope at Runtime for Control-Flow and Data-Flow Integrity
    Ismail, Mohannad; Jelesnianski, Christopher; Jang, Yeongjin; Min, Changwoo; Xiong, Wenjie (ACM, 2024-04-27)
    Control-flow hijacking and data-oriented attacks are becoming more sophisticated. These attacks, especially data-oriented attacks, can result in critical security threats, such as leaking an SSL key. Data-oriented attacks are hard to defend against with acceptable performance due to the sheer number of data pointers present. The root cause of such attacks is using pointers in unintended ways; fundamentally, these attacks rely on abusing pointers to violate the original scope they were used in or the original types that they were declared as. This paper proposes Scope Type Integrity (STI), a new defense policy that requires all pointers (both code and data pointers) to conform to the original programmer’s intent, as well as Runtime Scope Type Integrity (RSTI) mechanisms to enforce STI at runtime leveraging ARM Pointer Authentication. STI gathers information about the scope, type, and permissions of pointers. This information is then leveraged by RSTI to ensure pointers are legitimately utilized at runtime. We implemented three defense mechanisms of RSTI, with varying levels of security and performance tradeoffs to showcase the versatility of RSTI. We employ these three variants on a variety of benchmarks and real-world applications for a full security and performance evaluation of these mechanisms. Our results show that they have overheads of 5.29%, 2.97%, and 11.12%, respectively.
  • Red is Sus: Automated Identification of Low-Quality Service Availability Claims in the US National Broadband Map
    Nabi, Syed Tauhidun; Wen, Zhuowei; Ritter, Brooke; Hasan, Shaddi (ACM, 2024-11-04)
    The FCC’s National Broadband Map aspires to provide an unprecedented view into broadband availability in the US. However, this map, which also determines eligibility for public grant funding, relies on self-reported data from service providers that in turn have incentives to strategically misrepresent their coverage. In this paper, we develop an approach for automatically identifying these low-quality service claims in the National Broadband Map. To do this, we develop a novel dataset of broadband availability consisting of 750k observations from more than 900 US ISPs, derived from a combination of regulatory data and crowdsourced speed tests. Using this dataset, we develop a model to classify the accuracy of service provider regulatory filings and achieve AUCs over 0.98 for unseen examples. Our approach provides an effective technique to enable policymakers, civil society, and the public to identify portions of the National Broadband Map that are likely to have integrity challenges.
  • Technology Use in the Black Church: Perspectives of Black Church Leaders Preliminary Findings
    Thompson, Gabriella; Otoo, Nissi; Fisher, Jaden; Sibi, Irene; Smith, Angela; Ogbonnaya-Ogburu, Ihudiya (ACM, 2024-11-11)
    Historically, the Black church has played a pivotal role in civic engagement and social justice, and continues to do so today. Yet, few researchers have explored how decisions around technology use are made in the church. To address this gap, we conducted semi-structured interviews with five Black church leaders to understand how church leaders interact with digital technologies, both in general and specifically with the communities that they serve. We found that while Black church leaders are eager to engage with technology, most of the engagement with outside communities is through in-person contact; opportunities to give online have a financial penalty in comparison to traditional methods of tithing and donating; lastly, technology use within outreach and ministries is highly dependent on ministry leaders, many of whom volunteer their time. We contribute to research that focuses on technology use in religious organizations and community engagement of community-based organizations.
  • Designing Technology to Support the Hospital Classroom: Preliminary Findings
    Rasberry, Nadra; Essandoh, Joshua; Do, Ethan; Ogbonnaya-Ogburu, Ihudiya (ACM, 2024-11-11)
    Hospital teachers are state-employed educators who provide K-12 instruction to children in the hospital. We conducted research to understand how technology is used in hospital classrooms, an area which has been relatively underexplored. We conducted semi-structured interviews with five hospital teachers to understand their experience of using technology in and outside the classroom. Our findings revealed that hospital teachers often rely on older curricula given the changing education atmosphere; learning is often assessed through in-classroom observations of mastery; and technology and internet use by students is often restricted, which may inhibit opportunities to use AI and other technical resources in the classroom. We contribute a deeper understanding of technology use in the hospital classroom.
  • Evaluation of Interactive Demonstration in Voice-assisted Counting for Young Children
    Karunaratna, Sulakna; Vargas-Diaz, Daniel; Kim, Jisun; Wang, Jenny; Choi, Koeun; Lee, Sang Won (ACM, 2024-11-11)
    In recent years, the number of AI voice agent applications designed to help young children learn math has increased. However, the impact of interactivity within these applications on children’s learning and engagement remains unexplored. While current apps may employ various levels of interactions, such as visual, haptic, sound, and animation, the efficacy of these interactions in facilitating children’s learning remains uncertain. This research investigates how varying levels of interactivity in touch-based interfaces, combined with an AI voice agent, affect the learning of counting skills in children aged 2 to 4 years. We examine three conditions: baseline (no demonstration), animated demonstration, and interactive demonstration. By examining how these different levels of interactivity influence children’s engagement with math apps, this study seeks to enhance our understanding of effective design strategies for educational technology targeting early childhood education. The findings of this research hold the potential to inform the development of interfaces for math games that leverage both touch-based interactions and AI voice assistants to support young children’s learning of foundational mathematical concepts.
  • Investigating Characteristics of Media Recommendation Solicitation in r/ifyoulikeblank
    Bhuiyan, Md Momen; Hu, Donghan; Jelson, Andrew; Mitra, Tanushree; Lee, Sang Won (ACM, 2024-11-08)
    Despite the existence of search-based recommender systems like Google, Netflix, and Spotify, online users sometimes turn to crowdsourced recommendations in places like the r/ifyoulikeblank subreddit. In this exploratory study, we probe why users go to r/ifyoulikeblank, how they look for recommendations, and how the subreddit users respond to recommendation requests. To answer these questions, we collected sample posts from r/ifyoulikeblank and analyzed them using a qualitative approach. Our analysis reveals that users come to this subreddit for various reasons, such as having exhausted popular search systems, not knowing what or how to search for an item, and believing that the crowd has better knowledge than search systems. Examining users' queries and their descriptions, we found novel information that users provide when seeking recommendations on r/ifyoulikeblank. For example, they sometimes ask for recommendations of artifacts based on the tools used to create them. In other cases, indicating a recommendation seeker's time constraints can help better suit recommendations to their needs. Finally, recommendation responses and interactions revealed patterns of how requesters and responders refine queries and recommendations. Our work informs future intelligent recommender systems design.
  • Simplify, Consolidate, Intervene: Facilitating Institutional Support with Mental Models of Learning Management System Use
    Hassan, Taha; Edmison, Bob; Williams, Daron; Cox II, Larry; Louvet, Matthew; Knijnenburg, Bart; McCrickard, D. (ACM, 2024-11-08)
    Measuring instructors' adoption of learning management system (LMS) tools is a critical first step in evaluating the efficacy of online teaching and learning at scale. Existing models for LMS adoption are often qualitative, learner-centered, and difficult to leverage towards institutional support. We propose depth-of-use (DOU): an intuitive measurement model for faculty's utilization of a university-wide LMS and their needs for institutional support. We hypothesis-test the relationship between DOU and course attributes like modality, participation, logistics, and outcomes. In a large-scale analysis of metadata from 30000+ courses offered at Virginia Tech over two years, we find that a pervasive need for scale, interoperability and ubiquitous access drives LMS adoption by university instructors. We then demonstrate how DOU can help faculty members identify the opportunity-cost of transition from legacy apps to LMS tools. We also describe how DOU can help instructional designers and IT organizational leadership evaluate the impact of their support allocation, faculty development and LMS evangelism initiatives.
  • ThreatKG: An AI-Powered System for Automated Open-Source Cyber Threat Intelligence Gathering and Management
    Gao, Peng; Liu, Xiaoyuan; Choi, Edward; Ma, Sibo; Yang, Xinyu; Song, Dawn (ACM, 2023-11-19)
    Open-source cyber threat intelligence (OSCTI) has become essential for keeping up with the rapidly changing threat landscape. However, current OSCTI gathering and management solutions mainly focus on structured Indicators of Compromise (IOC) feeds, which are low-level and isolated, providing only a narrow view of potential threats. Meanwhile, the extensive and interconnected knowledge found in the unstructured text of numerous OSCTI reports (e.g., security articles, threat reports) available publicly is still largely underexplored. To bridge the gap, we propose THREATKG, an automated system for OSCTI gathering and management. THREATKG efficiently collects a large number of OSCTI reports from multiple sources, leverages specialized AI-based techniques to extract high-quality knowledge about various threat entities and their relationships, and constructs and continuously updates a threat knowledge graph by integrating new OSCTI data. THREATKG features a modular and extensible design, allowing for the addition of components to accommodate diverse OSCTI report structures and knowledge types. Our extensive evaluations demonstrate THREATKG’s practical effectiveness in enhancing threat knowledge gathering and management.
  • Practical Fault Injection Attacks on Constant Time CSIDH and Mitigation Techniques
    Chiu, Tinghung; LeGrow, Jason; Xiong, Wenjie (ACM, 2024-11-19)
    Commutative Supersingular Isogeny Diffie-Hellman (CSIDH) is an isogeny-based key exchange protocol which is believed to be secure even when parties use long-lived secret keys. To secure CSIDH against side-channel attacks, constant-time implementations with additional dummy isogeny computations are employed. In this study, we demonstrate a fault injection attack on the constant-time real-then-dummy CSIDH to recover the full static secret key. We prototype the attack using voltage glitches on the victim STM32 microcontroller. The attack scheme, based on existing research that had not yet been practically implemented, obtains faulty outputs by injecting faults in a binary-search fashion. Our attack reveals many practical factors that were not considered in the previous theoretical fault injection attack analysis, e.g., the probability of a failed fault injection. We bring the practice to theory and develop a new complexity analysis of the attack. Further, to mitigate the possible binary search attack on real-then-dummy CSIDH, dynamic random vector CSIDH was proposed previously to randomize the order of real and dummy isogeny operations. We explore fault injection attacks on dynamic random vector CSIDH and evaluate the security level of the mitigation. Our analysis and experimental results demonstrate that it is infeasible to attack dynamic random vector CSIDH in a reasonable amount of time when the success rate of fault injection is not consistent over time.
  • Editorial: ACM Transactions on Computer Systems
    van Renesse, Robbert; Noh, Sam H. (ACM, 2024-11-22)
  • FedCaSe: Enhancing Federated Learning with Heterogeneity-aware Caching and Scheduling
    Khan, Redwan Ibne Seraj; Paul, Arnab K.; Jian, Xun (Steve); Cheng, Yue; Butt, Ali R. (ACM, 2024-11-20)
    Federated learning (FL) has emerged as a new paradigm of machine learning (ML) with the goal of collaborative learning on the vast pool of private data available across distributed edge devices. The focus of most existing works in FL systems has been on addressing the challenges of computation and communication heterogeneity inherent in training with edge devices. However, the crucial impact of I/O and the role of limited on-device storage have not been fully explored in the FL context. Without policies to exploit the on-device storage for placement of client data samples, and to schedule clients based on I/O benefits, FL training can lead to inefficiencies, such as increased training time and impacted accuracy convergence. In this paper, we propose FedCaSe, a framework for efficiently caching client samples in-situ on limited on-device storage and scheduling client participation. FedCaSe boosts the I/O performance by exploiting a unique characteristic: the experience, i.e., relative impact on overall performance, of data samples and clients. FedCaSe utilizes this information in adaptive caching policies for sample placement inside the limited memory of edge clients. The framework also exploits the experience information to orchestrate the future selection of clients. Our experiments with representative workloads and policies show that compared to the state of the art, FedCaSe improves the training time by 2.06× for accuracy convergence at the scale of thousands of clients.
  • A Survey of Prototyping Platforms for Intermittent Computing Research
    Williams, Harrison; Hicks, Matthew (ACM, 2024-11-04)
    Batteryless energy harvesting platforms are gaining popularity as a way to bring next-generation sensing and edge computing devices to deployments previously limited by their need for batteries. Energy harvesting enables perpetual, maintenance-free operation, but also introduces new challenges associated with unreliable environmental power as systems face common-case, yet unpredictable power failures. Software execution on these devices is an active area of research: intermittently executed software must correctly and efficiently handle arbitrary interruption, frequent state saving/restoration, and re-execution of certain code segments as part of normal operation. The wide application range for batteryless systems combined with strict limitations on size and performance means there is little overlap in batteryless system prototypes; platforms are chosen for familiarity or specific features in a given application. Unfortunately, the effectiveness of different intermittent computing approaches varies widely across devices. As a result, intermittent computing research is at best hard to generalize across platforms and at worst contradictory across studies. This work explores several of the device-level differences that substantially affect intermittent system performance across eight low-power prototyping platforms. We examine system-level assumptions made by the major approaches to intermittent computing today and determine how compatible each approach is with each platform. The goal of this paper is to serve as a guide for researchers and practitioners developing intermittent systems to both understand the landscape of devices suitable for batteryless operation and to highlight how interactions between devices and the intermittent software running on them can profoundly affect both performance and high-level conclusions in intermittent systems research. We open source our device bring-up code and instructions to facilitate multi-board experiments for future approaches.
  • Optimizing Effectiveness and Defense of Drone Surveillance Missions via Honey Drones
    Wan, Zelin; Cho, Jin-Hee; Zhu, Mu; Anwar, Ahmed; Kamhoua, Charles; Singh, Munindar (ACM, 2024)
    This work aims to develop a surveillance mission system using unmanned aerial vehicles (UAVs) or drones when Denial-of-Service (DoS) attacks are present to disrupt normal operations for mission systems. In particular, we introduce the concept of cyber deception using honey drones (HDs) to protect the mission system from DoS attacks. HDs exhibit fake vulnerabilities and employ stronger signal strengths to lure DoS attacks, unlike the legitimate drones called mission drones (MDs) deployed for mission execution. This research formulates an optimization problem to identify an optimal set of signal strengths of HDs and MDs to best prevent the system from DoS attacks while maximizing mission performance under the resource constraints of UAVs. To solve this optimization problem, we leverage deep reinforcement learning (DRL) to achieve these multiple objectives of the mission system concerning system security and performance. Particularly, for efficient and effective parallel processing in DRL, we utilize a DRL algorithm called the Asynchronous Advantage Actor-Critic (A3C) algorithm to model attack-defense interactions. We employ a physical engine-based simulation testbed to consider realistic scenarios and demonstrate valid findings from the realistic testbed. The extensive experiments proved that our HD-based approach could achieve up to a 32% increase in mission completion, a 20% reduction in energy consumption, and a 62% decrease in attack success rates compared to existing defense strategies.
  • XplainScreen: Unveiling the Black Box of Graph Neural Network Drug Screening Models with a Unified XAI Framework
    Ahn, Geonhee; Haque, Md Mahim Anjum; Hazarika, Subhashis; Kim, Soo Kyung (ACM, 2024-10-21)
    Despite the powerful capabilities of GNN-based drug screening models in predicting target drug properties, the black-box nature of these models poses a challenge for practical application, particularly in a field as critical as drug development where understanding and trust in AI-driven decisions are important. To address the interpretability issues associated with GNN-based virtual drug screening, we introduce XplainScreen: a unified explanation framework designed to evaluate various explanation methods for GNN-based models. XplainScreen offers a user-friendly, web-based interactive platform that allows for the selection of specific GNN-based drug screening models and multiple cutting-edge explainable AI methods. It supports both qualitative assessments (through visualization and generative text descriptions) and quantitative evaluations of these methods, utilizing drug molecules in SMILES format. This demonstration showcases the utility of XplainScreen through a user study with pharmacological researchers focused on virtual screening tasks based on toxicity, highlighting the framework’s potential to enhance the integrity and trustworthiness of AI-driven virtual drug screening. A video demo of XplainScreen is available at https://youtu.be/Q4yobrTLKec, and the source code can be accessed at https://github.com/GeonHeeAhn/XplainScreen.
  • Hermes: Boosting the Performance of Machine-Learning-Based Intrusion Detection System through Geometric Feature Learning
    Zhang, Chaoyu; Shi, Shanghao; Wang, Ning; Xu, Xiangxiang; Li, Shaoyu; Zheng, Lizhong; Marchany, Randy; Gardner, Mark; Hou, Y. Thomas; Lou, Wenjing (ACM, 2024-10-14)
    Anomaly-Based Intrusion Detection Systems (IDSs) have been extensively researched for their ability to detect zero-day attacks. These systems establish a baseline of normal behavior using benign traffic data and flag deviations from this norm as potential threats. They generally experience higher false alarm rates than signature-based IDSs. Unlike image data, where the observed features provide immediate utility, raw network traffic necessitates additional processing for effective detection. It is challenging to learn useful patterns directly from raw traffic data or simple traffic statistics (e.g., connection duration, packet inter-arrival time) as the complex relationships are difficult to distinguish. Therefore, some feature engineering becomes imperative to extract and transform raw data into new feature representations that can directly improve the detection capability and reduce the false positive rate. We propose a geometric feature learning method to optimize the feature extraction process. We employ contrastive feature learning to learn a feature space where normal traffic instances reside in a compact cluster. We further utilize H-Score feature learning to maximize the compactness of the cluster representing the normal behavior, enhancing the subsequent anomaly detection performance. Our evaluations using the NSL-KDD and N-BaIoT datasets demonstrate that the proposed IDS powered by feature learning can consistently outperform state-of-the-art anomaly-based IDS methods by significantly lowering the false positive rate. Furthermore, we deploy the proposed IDS on a Raspberry Pi 4 and demonstrate its applicability on resource-constrained Internet of Things (IoT) devices, highlighting its versatility for diverse application scenarios.
  • VizGroup: An AI-assisted Event-driven System for Collaborative Programming Learning Analytics
    Tang, Xiaohang; Wong, Sam; Pu, Kevin; Chen, Xi; Yang, Yalong; Chen, Yan (ACM, 2024-10-13)
    Programming instructors often conduct collaborative learning activities, like Peer Instruction, to foster a deeper understanding in students and enhance their engagement with learning. These activities, however, may not always yield productive outcomes due to the diversity of student mental models and their ineffective collaboration. In this work, we introduce VizGroup, an AI-assisted system that enables programming instructors to easily oversee students’ real-time collaborative learning behaviors during large programming courses. VizGroup leverages Large Language Models (LLMs) to recommend event specifications for instructors so that they can simultaneously track and receive alerts about key correlation patterns between various collaboration metrics and ongoing coding tasks. We evaluated VizGroup with 12 instructors in a comparison study using a dataset collected from a Peer Instruction activity that was conducted in a large programming lecture. The results showed that VizGroup helped instructors effectively overview, narrow down, and track nuances throughout students’ behaviors.
  • Examining Pair Dynamics in Shared, Co-located Augmented Reality Narratives
    Connor, Cherelle; Schoenborn, Eric; Hu, Sathaporn; Porcino, Thiago; Moore, Cameron; Reilly, Derek; Lages, Wallace (ACM, 2024-10-07)
    Augmented reality (AR) allows users to experience stories together in the same physical space. However, little is known about the experience of sharing AR narratives with others. Much of our current understanding is derived from multi-user VR applications, which can differ significantly in presence, social interaction, and spatial awareness from narratives and other entertainment content designed for AR head-worn displays. To understand the dynamics of multi-user, co-located, AR storytelling, we conducted an exploratory study involving three original AR narratives. Participants experienced each narrative alone or in pairs via the Microsoft HoloLens 2. We collected qualitative and quantitative data from 42 participants through questionnaires and post-experience semi-structured interviews. Results indicate participants enjoyed experiencing AR narratives together and revealed five themes relevant to the design of multi-user, co-located AR narratives. We discuss the implications of these themes and provide design recommendations for AR experience designers and storytellers regarding the impact of interaction, physical space, spatial coherence, and narrative timing. Our findings highlight the importance of exploring both user interactions and pair interactions as factors in AR storytelling research.
  • Evaluating Layout Dimensionalities in PC+VR Asymmetric Collaborative Decision Making
    Enriquez, Daniel; Tong, Wai; North, Christopher L.; Qu, Huamin; Yang, Yalong (ACM, 2024-10-20)
    With the commercialization of virtual/augmented reality (VR/AR) devices, there is an increasing interest in combining immersive and non-immersive devices (e.g., desktop computers) for asymmetric collaborations. While such asymmetric settings have been examined in social platforms, significant questions around layout dimensionality in data-driven decision-making remain underexplored. A crucial inquiry arises: although presenting a consistent 3D virtual world on both immersive and non-immersive platforms has been a common practice in social applications, does the same guideline apply to laying out data? Or should data placement be optimized locally according to each device's display capacity? This study aims to provide empirical insights into the user experience of asymmetric collaboration in data-driven decision-making. We tested practical dimensionality combinations between PC and VR, resulting in three conditions: PC2D+VR2D, PC2D+VR3D, and PC3D+VR3D. The results revealed a preference for PC2D+VR3D, while PC2D+VR2D led to the quickest task completion. Our investigation facilitates an in-depth discussion of the trade-offs associated with different layout dimensionalities in asymmetric collaborations.
  • Causality-Aware Spatiotemporal Graph Neural Networks for Spatiotemporal Time Series Imputation
    Jing, Baoyu; Zhou, Dawei; Ren, Kan; Yang, Carl (ACM, 2024-10-21)
    Spatiotemporal time series are usually collected via monitoring sensors placed at different locations, and they usually contain missing values due to various failures, such as mechanical damage and Internet outages. Imputing the missing values is crucial for analyzing time series. When recovering a specific data point, most existing methods consider all the information relevant to that point regardless of the cause-and-effect relationship. During data collection, it is inevitable that some unknown confounders are included, e.g., background noise in time series and non-causal shortcut edges in the constructed sensor network. These confounders could open backdoor paths and establish non-causal correlations between the input and output. Over-exploiting these non-causal correlations could cause overfitting. In this paper, we first revisit spatiotemporal time series imputation from a causal perspective and show how to block the confounders via the frontdoor adjustment. Based on the results of frontdoor adjustment, we introduce a novel Causality-Aware Spatiotemporal Graph Neural Network (Casper), which contains a novel Prompt Based Decoder (PBD) and a Spatiotemporal Causal Attention (SCA). PBD could reduce the impact of confounders and SCA could discover the sparse causal relationships among embeddings. Theoretical analysis reveals that SCA discovers causal relationships based on the values of gradients. We evaluate Casper on three real-world datasets, and the experimental results show that Casper could outperform the baselines and could effectively discover the causal relationships.
  • An Exploratory Mixed-methods Study on General Data Protection Regulation (GDPR) Compliance in Open-Source Software
    Franke, Lucas; Liang, Huayu; Farzanehpour, Sahar; Brantly, Aaron F.; Davis, James C.; Brown, Chris (ACM, 2024-10-24)
    Background: Governments worldwide are considering data privacy regulations. These laws, such as the European Union’s General Data Protection Regulation (GDPR), require software developers to meet privacy-related requirements when interacting with users’ data. Prior research describes the impact of such laws on software development, but only for commercial software. Although open-source software is commonly integrated into regulated software, and thus must be engineered or adapted for compliance, we do not know how such laws impact open-source software development. Aims: To understand how data privacy laws affect open-source software (OSS) development, we focus on the European Union’s GDPR, as it is the most prominent such law. We investigated how GDPR compliance activities influence OSS developer activity (RQ1), how OSS developers perceive fulfilling GDPR requirements (RQ2), the most challenging GDPR requirements to implement (RQ3), and how OSS developers assess GDPR compliance (RQ4). Method: We distributed an online survey to explore perceptions of GDPR implementations from open-source developers (N=56). To augment this analysis, we further conducted a repository mining study to analyze development metrics on pull requests (N=31,462) submitted to open-source GitHub repositories. Results: Our results suggest GDPR policies complicate OSS development and introduce challenges, primarily regarding the management of users’ data, implementation costs and time, and assessments of compliance. Moreover, we observed negative perceptions of the GDPR from OSS developers and significant increases in development activity, in particular metrics related to coding and reviewing, on GitHub pull requests related to GDPR compliance. Conclusions: Our findings provide future research directions and implications for improving data privacy policies, motivating the need for relevant resources and automated tools to support data privacy regulation implementation and compliance efforts in OSS.