Journal Articles, Association for Computing Machinery (ACM)
Recent Submissions
- Optimizing Effectiveness and Defense of Drone Surveillance Missions via Honey Drones
  Wan, Zelin; Cho, Jin-Hee; Zhu, Mu; Anwar, Ahmed; Kamhoua, Charles; Singh, Munindar (ACM, 2024)
  This work aims to develop a surveillance mission system using unmanned aerial vehicles (UAVs) or drones when Denial-of-Service (DoS) attacks are present to disrupt normal operations for mission systems. In particular, we introduce the concept of cyber deception using honey drones (HDs) to protect the mission system from DoS attacks. HDs exhibit fake vulnerabilities and employ stronger signal strengths to lure DoS attacks, unlike the legitimate drones called mission drones (MDs) deployed for mission execution. This research formulates an optimization problem to identify an optimal set of signal strengths of HDs and MDs to best prevent the system from DoS attacks while maximizing mission performance under the resource constraints of UAVs. To solve this optimization problem, we leverage deep reinforcement learning (DRL) to achieve these multiple objectives of the mission system concerning system security and performance. Particularly, for efficient and effective parallel processing in DRL, we utilize a DRL algorithm called the Asynchronous Advantage Actor-Critic (A3C) algorithm to model attack-defense interactions. We employ a physical engine-based simulation testbed to consider realistic scenarios and demonstrate valid findings from the realistic testbed. The extensive experiments proved that our HD-based approach could achieve up to a 32% increase in mission completion, a 20% reduction in energy consumption, and a 62% decrease in attack success rates compared to existing defense strategies.
- XplainScreen: Unveiling the Black Box of Graph Neural Network Drug Screening Models with a Unified XAI Framework
  Ahn, Geonhee; Haque, Md Mahim Anjum; Hazarika, Subhashis; Kim, Soo Kyung (ACM, 2024-10-21)
  Despite the powerful capabilities of GNN-based drug screening models in predicting target drug properties, the black-box nature of these models poses a challenge for practical application, particularly in a field as critical as drug development where understanding and trust in AI-driven decisions are important. To address the interpretability issues associated with GNN-based virtual drug screening, we introduce XplainScreen: a unified explanation framework designed to evaluate various explanation methods for GNN-based models. XplainScreen offers a user-friendly, web-based interactive platform that allows for the selection of specific GNN-based drug screening models and multiple cutting-edge explainable AI methods. It supports both qualitative assessments (through visualization and generative text descriptions) and quantitative evaluations of these methods, utilizing drug molecules in SMILES format. This demonstration showcases the utility of XplainScreen through a user study with pharmacological researchers focused on virtual screening tasks based on toxicity, highlighting the framework’s potential to enhance the integrity and trustworthiness of AI-driven virtual drug screening. A video demo of XplainScreen is available at https://youtu.be/Q4yobrTLKec, and the source code can be accessed at https://github.com/GeonHeeAhn/XplainScreen.
- Hermes: Boosting the Performance of Machine-Learning-Based Intrusion Detection System through Geometric Feature Learning
  Zhang, Chaoyu; Shi, Shanghao; Wang, Ning; Xu, Xiangxiang; Li, Shaoyu; Zheng, Lizhong; Marchany, Randy; Gardner, Mark; Hou, Y. Thomas; Lou, Wenjing (ACM, 2024-10-14)
  Anomaly-Based Intrusion Detection Systems (IDSs) have been extensively researched for their ability to detect zero-day attacks. These systems establish a baseline of normal behavior using benign traffic data and flag deviations from this norm as potential threats. They generally experience higher false alarm rates than signature-based IDSs. Unlike image data, where the observed features provide immediate utility, raw network traffic necessitates additional processing for effective detection. It is challenging to learn useful patterns directly from raw traffic data or simple traffic statistics (e.g., connection duration, packet inter-arrival time) as the complex relationships are difficult to distinguish. Therefore, some feature engineering becomes imperative to extract and transform raw data into new feature representations that can directly improve the detection capability and reduce the false positive rate. We propose a geometric feature learning method to optimize the feature extraction process. We employ contrastive feature learning to learn a feature space where normal traffic instances reside in a compact cluster. We further utilize H-Score feature learning to maximize the compactness of the cluster representing the normal behavior, enhancing the subsequent anomaly detection performance. Our evaluations using the NSL-KDD and N-BaIoT datasets demonstrate that the proposed IDS powered by feature learning can consistently outperform state-of-the-art anomaly-based IDS methods by significantly lowering the false positive rate. Furthermore, we deploy the proposed IDS on a Raspberry Pi 4 and demonstrate its applicability on resource-constrained Internet of Things (IoT) devices, highlighting its versatility for diverse application scenarios.
- VizGroup: An AI-assisted Event-driven System for Collaborative Programming Learning Analytics
  Tang, Xiaohang; Wong, Sam; Pu, Kevin; Chen, Xi; Yang, Yalong; Chen, Yan (ACM, 2024-10-13)
  Programming instructors often conduct collaborative learning activities, like Peer Instruction, to foster a deeper understanding in students and enhance their engagement with learning. These activities, however, may not always yield productive outcomes due to the diversity of student mental models and their ineffective collaboration. In this work, we introduce VizGroup, an AI-assisted system that enables programming instructors to easily oversee students’ real-time collaborative learning behaviors during large programming courses. VizGroup leverages Large Language Models (LLMs) to recommend event specifications for instructors so that they can simultaneously track and receive alerts about key correlation patterns between various collaboration metrics and ongoing coding tasks. We evaluated VizGroup with 12 instructors in a comparison study using a dataset collected from a Peer Instruction activity that was conducted in a large programming lecture. The results showed that VizGroup helped instructors effectively overview, narrow down, and track nuances throughout students’ behaviors.
- Examining Pair Dynamics in Shared, Co-located Augmented Reality Narratives
  Connor, Cherelle; Schoenborn, Eric; Hu, Sathaporn; Porcino, Thiago; Moore, Cameron; Reilly, Derek; Lages, Wallace (ACM, 2024-10-07)
  Augmented reality (AR) allows users to experience stories together in the same physical space. However, little is known about the experience of sharing AR narratives with others. Much of our current understanding is derived from multi-user VR applications, which can differ significantly in presence, social interaction, and spatial awareness from narratives and other entertainment content designed for AR head-worn displays. To understand the dynamics of multi-user, co-located, AR storytelling, we conducted an exploratory study involving three original AR narratives. Participants experienced each narrative alone or in pairs via the Microsoft HoloLens 2. We collected qualitative and quantitative data from 42 participants through questionnaires and post-experience semi-structured interviews. Results indicate participants enjoyed experiencing AR narratives together and revealed five themes relevant to the design of multi-user, co-located AR narratives. We discuss the implications of these themes and provide design recommendations for AR experience designers and storytellers regarding the impact of interaction, physical space, spatial coherence, and narrative timing. Our findings highlight the importance of exploring both user interactions and pair interactions as factors in AR storytelling research.
- Evaluating Layout Dimensionalities in PC+VR Asymmetric Collaborative Decision Making
  Enriquez, Daniel; Tong, Wai; North, Christopher L.; Qu, Huamin; Yang, Yalong (ACM, 2024-10-20)
  With the commercialization of virtual/augmented reality (VR/AR) devices, there is an increasing interest in combining immersive and non-immersive devices (e.g., desktop computers) for asymmetric collaborations. While such asymmetric settings have been examined in social platforms, significant questions around layout dimensionality in data-driven decision-making remain underexplored. A crucial inquiry arises: although presenting a consistent 3D virtual world on both immersive and non-immersive platforms has been a common practice in social applications, does the same guideline apply to lay out data? Or should data placement be optimized locally according to each device's display capacity? This study aims to provide empirical insights into the user experience of asymmetric collaboration in data-driven decision-making. We tested practical dimensionality combinations between PC and VR, resulting in three conditions: PC2D+VR2D, PC2D+VR3D, and PC3D+VR3D. The results revealed a preference for PC2D+VR3D, and PC2D+VR2D led to the quickest task completion. Our investigation facilitates an in-depth discussion of the trade-offs associated with different layout dimensionalities in asymmetric collaborations.
- Causality-Aware Spatiotemporal Graph Neural Networks for Spatiotemporal Time Series Imputation
  Jing, Baoyu; Zhou, Dawei; Ren, Kan; Yang, Carl (ACM, 2024-10-21)
  Spatiotemporal time series are usually collected via monitoring sensors placed at different locations, and they usually contain missing values due to various failures, such as mechanical damage and Internet outages. Imputing the missing values is crucial for analyzing time series. When recovering a specific data point, most existing methods consider all the information relevant to that point regardless of the cause-and-effect relationship. During data collection, it is inevitable that some unknown confounders are included, e.g., background noise in time series and non-causal shortcut edges in the constructed sensor network. These confounders could open backdoor paths and establish non-causal correlations between the input and output. Over-exploiting these non-causal correlations could cause overfitting. In this paper, we first revisit spatiotemporal time series imputation from a causal perspective and show how to block the confounders via the frontdoor adjustment. Based on the results of frontdoor adjustment, we introduce a novel Causality-Aware Spatiotemporal Graph Neural Network (Casper), which contains a novel Prompt Based Decoder (PBD) and a Spatiotemporal Causal Attention (SCA). PBD could reduce the impact of confounders and SCA could discover the sparse causal relationships among embeddings. Theoretical analysis reveals that SCA discovers causal relationships based on the values of gradients. We evaluate Casper on three real-world datasets, and the experimental results show that Casper could outperform the baselines and could effectively discover the causal relationships.
- An Exploratory Mixed-methods Study on General Data Protection Regulation (GDPR) Compliance in Open-Source Software
  Franke, Lucas; Liang, Huayu; Farzanehpour, Sahar; Brantly, Aaron F.; Davis, James C.; Brown, Chris (ACM, 2024-10-24)
  Background: Governments worldwide are considering data privacy regulations. These laws, such as the European Union’s General Data Protection Regulation (GDPR), require software developers to meet privacy-related requirements when interacting with users’ data. Prior research describes the impact of such laws on software development, but only for commercial software. Although open-source software is commonly integrated into regulated software, and thus must be engineered or adapted for compliance, we do not know how such laws impact open-source software development. Aims: To understand how data privacy laws affect open-source software (OSS) development, we focus on the European Union’s GDPR, as it is the most prominent such law. We investigated how GDPR compliance activities influence OSS developer activity (RQ1), how OSS developers perceive fulfilling GDPR requirements (RQ2), the most challenging GDPR requirements to implement (RQ3), and how OSS developers assess GDPR compliance (RQ4). Method: We distributed an online survey to explore perceptions of GDPR implementations from open-source developers (N=56). To augment this analysis, we further conducted a repository mining study to analyze development metrics on pull requests (N=31,462) submitted to open-source GitHub repositories. Results: Our results suggest GDPR policies complicate OSS development and introduce challenges, primarily regarding the management of users’ data, implementation costs and time, and assessments of compliance. Moreover, we observed negative perceptions of the GDPR from OSS developers and significant increases in development activity, in particular metrics related to coding and reviewing, on GitHub pull requests related to GDPR compliance. Conclusions: Our findings provide future research directions and implications for improving data privacy policies, motivating the need for relevant resources and automated tools to support data privacy regulation implementation and compliance efforts in OSS.
- Goldilocks Zoning: Evaluating a Gaze-Aware Approach to Task-Agnostic VR Notification Placement
  Ilo, Cory; DiVerdi, Stephen; Bowman, Douglas A. (ACM, 2024-10-07)
  While virtual reality (VR) offers immersive experiences, users need to remain aware of notifications from outside VR. However, inserting notifications into a VR experience can result in distraction or breaks in presence, since existing notification systems in VR use static placement and lack situational awareness. We address this challenge by introducing a novel notification placement technique, Goldilocks Zoning (GZ), which leverages a 360-degree heatmap generated using gaze data to place notifications near salient areas of the environment without obstructing the primary task. To investigate the effectiveness of this technique, we conducted a dual-task experiment comparing GZ to common notification placement techniques. We found that GZ had similar performance to state-of-the-art techniques in a variety of primary task scenarios. Our study reveals that no single technique is universally optimal in dynamic settings, underscoring the potential for adaptive approaches to notification management. As a step in this direction, we explored the potential to use machine learning to predict the task based on the gaze heatmap.
- TAGGAR: General-Purpose Task Guidance from Natural Language in Augmented Reality using Vision-Language Models
  Stover, Daniel; Bowman, Douglas A. (ACM, 2024-10-07)
  Augmented reality (AR) task guidance systems provide assistance for procedural tasks by rendering virtual guidance visuals within the real-world environment. Current AR task guidance systems are limited in that they require AR system experts to manually place visuals, require models of real-world objects, or only function for limited tasks or environments. We propose a general-purpose AR task guidance approach for tasks defined by natural language. Our approach allows an operator to take pictures of relevant objects and write task instructions for an end user, which are used by the system to determine where to place guidance visuals. Then, an end user can receive and follow guidance even if objects change locations or environments. Our approach utilizes current vision-language machine learning models for text and image semantic understanding and object localization. We built a proof-of-concept system called TAGGAR using our approach and tested its accuracy and usability in a user study. We found that all operators were able to generate clear guidance for tasks and end users were able to follow the guidance visuals to complete the expected action 85.7% of the time without any knowledge of the tasks.
- Breaking Privacy in Model-Heterogeneous Federated Learning
  Haldankar, Atharva; Riasi, Arman; Nguyen, Hoang-Dung; Phuong, Tran; Hoang, Thang (ACM, 2024-09-30)
  Federated learning (FL) allows multiple distrustful clients to collaboratively train a machine learning model. In FL, data never leaves client devices; instead, clients only share locally computed gradients with a central server. As individual gradients may leak information about a given client’s dataset, secure aggregation was proposed. With secure aggregation, the server only receives the aggregate gradient update from the set of all sampled clients without being able to access any individual gradient. One challenge in FL is the systems-level heterogeneity that is quite often present among client devices. Specifically, clients in the FL protocol may have varying levels of compute power, on-device memory, and communication bandwidth. These limitations are addressed by model-heterogeneous FL schemes, where clients are able to train on subsets of the global model. Despite the benefits of model-heterogeneous schemes in addressing systems-level challenges, the implications of these schemes on client privacy have not been thoroughly investigated. In this paper, we investigate whether the nature of model distribution and the computational heterogeneity among client devices in model-heterogeneous FL schemes may result in the server being able to recover sensitive data from target clients. To this end, we propose two attacks in the model-heterogeneous FL setting, even with secure aggregation in place. We call these attacks the Convergence Rate Attack and the Rolling Model Attack. The Convergence Rate Attack targets schemes where clients train on the same subset of the global model, while the Rolling Model Attack targets schemes where model parameters are dynamically updated each round. We show that a malicious adversary can compromise the model and data confidentiality of a target group of clients. We evaluate our attacks on the MNIST and CIFAR-10 datasets and show that using our techniques, an adversary can reconstruct data samples with near perfect accuracy for batch sizes of up to 20 samples.
- Vision-Language Models for Biomedical Applications
  Thapa, Surendrabikram; Naseem, Usman; Zhou, Luping; Kim, Jinman (ACM, 2024-10-28)
  Vision-language models (VLMs) are transforming the landscape of biomedical research and healthcare by enabling the seamless integration and interpretation of complex multimodal data, including medical images and clinical texts. Recognizing the growing impact of these models, the first international workshop on Vision-Language Models for Biomedicine (VLM4Bio) was held in conjunction with ACM Multimedia 2024. The workshop aimed to address the critical need for advanced techniques that can leverage VLMs in applications such as medical imaging, diagnostics, and personalized treatment. As healthcare data increasingly involves both visual and textual information, VLM4Bio provided a platform for interdisciplinary collaboration between experts in natural language processing, computer vision, biomedical engineering, and AI ethics. This paper provides an overview of the inaugural edition of the VLM4Bio workshop, summarizing the key discussions, contributions, and future directions for expanding the workshop’s scope and influence in subsequent editions.
- An empirical study to understand how students use ChatGPT for writing essays and how it affects their ownership
  Jelson, Andrew; Lee, Sang Won (ACM, 2024-05-11)
  As large language models (LLMs) become more powerful and ubiquitous, systems like ChatGPT are increasingly used by students to help them with writing tasks. To better understand how these tools are used, we investigate how students might use an LLM for essay writing, for example, to study the queries asked to ChatGPT and the responses that ChatGPT gives. To that end, we plan to conduct a user study that will record the user writing process and present them with the opportunity to use ChatGPT as an AI assistant. This study’s findings will help us understand how these tools are used and how practitioners, such as educators and essay readers, should consider writing education and evaluation based on essay writing.
- libLISA: Instruction Discovery and Analysis on x86-64
  Craaijo, Jos; Verbeek, Freek; Ravindran, Binoy (ACM, 2024-10-08)
  Even though heavily researched, a full formal model of the x86-64 instruction set is still not available. We present libLISA, a tool for automated discovery and analysis of the ISA of a CPU. This produces the most extensive formal x86-64 model to date, with over 118000 different instruction groups. The process requires as little human specification as possible: specifically, we do not rely on a human-written (dis)assembler to dictate which instructions are executable on a given CPU, or what their in- and outputs are. The generated model is CPU-specific: behavior that is "undefined" is synthesized for the current machine. Producing models for five different x86-64 machines, we mutually compare them, discover undocumented instructions, and generate instruction sequences that are CPU-specific. Experimental evaluation shows that we enumerate virtually all instructions within scope, that the instructions' semantics are correct w.r.t. existing work, and that we improve existing work by exposing bugs in their handwritten models.
- Semi-Supervised Code Translation Overcoming the Scarcity of Parallel Code Data
  Zhu, Ming; Karim, Mohimenul; Lourentzou, Ismini; Yao, Daphne (ACM, 2024-10-27)
  Neural code translation is the task of converting source code from one programming language to another. One of the main challenges is the scarcity of parallel code data, which hinders the ability of translation models to learn accurate cross-language alignments. In this paper, we introduce MIRACLE, a semi-supervised approach that improves code translation through synthesizing high-quality parallel code data and curriculum learning on code data with ascending alignment levels. MIRACLE leverages static analysis and compilation to generate synthetic parallel code datasets with enhanced quality and alignment to address the challenge of data scarcity. We evaluate the proposed method along with strong baselines including instruction-tuned Large Language Models (LLMs) for code. Our analysis reveals that LLMs pre-trained on open-source code data, regardless of their size, suffer from the “shallow translation” problem. This issue arises when translated code copies keywords, statements, and even code blocks from the source language, leading to compilation and runtime errors. Extensive experiments demonstrate that our method significantly mitigates this issue, enhancing code translation performance across multiple models in C++, Java, Python, and C. Remarkably, MIRACLE outperforms code LLMs that are ten times larger in size. MIRACLE also achieves up to a 43% improvement in C code translation with fewer than 150 annotated examples.
- An Empirical Study on Current Practices and Challenges of Core AR/VR Developers
  Bose, Dibyendu Brinto; Brown, Chris (ACM, 2024-10-27)
  Augmented reality (AR) and virtual reality (VR) applications are increasingly integral to modern society. Core AR/VR developers, pivotal in crafting these advanced technologies, face significant challenges throughout the software development lifecycle (SDLC). In this context, ‘core AR/VR developers’ refers to professionals who actively engage in developing AR/VR technologies, including researchers and developers. We surveyed such professionals to directly understand these challenges and received 48 responses. Our findings categorize the unique challenges that core AR/VR developers pointed out into three major stages of the SDLC: design, implementation, and testing. These challenges include creating immersive experiences, complexity in 3D interaction, cross-platform compatibility, and reproducing bugs. This study highlights significant AR/VR development obstacles and provides foundational insights for future research to improve development practices and tools in this rapidly evolving field.
- How Do Developers Reuse StackOverflow Answers in Their GitHub Projects?
  Chen, Juntong; Zhao, Yan; Meng, Na (ACM, 2024-10-27)
  StackOverflow (SO) is a widely used question-and-answer (Q&A) website for software developers and computer scientists. GitHub is an online development platform used for storing, tracking, and collaborating on software projects. Prior work relates the information mined from both platforms without carefully inspecting the answer-reuse practices. For this paper, we did an empirical study by mining the SO answers reused by Java projects available on GitHub. We created a hybrid approach of clone detection, keyword-based search, and manual inspection, to identify the answer(s) actually used by developers. Based on those answers, we studied topics of the discussion threads, answer characteristics (e.g., scores, ages, code lengths, and text lengths), and developers’ reuse practices. We observed that most reused answers offer programs to implement specific coding tasks. Among all analyzed SO discussion threads, the reused answers often have higher scores, older ages, longer code, and longer text than unused answers. In only 9% of scenarios (40/430), developers fully copied answer code for reuse. In the remaining scenarios, they reused partial code or created brand new code from scratch. Our study characterized 130 SO discussion threads referred to by Java developers in 357 GitHub projects. Our observations can guide SO answerers to provide better answers, and shed light on future human-centric research that creates better tools to help with code reuse.
- SparseAuto: An Auto-scheduler for Sparse Tensor Computations using Recursive Loop Nest Restructuring
  Dias, Adhitha; Anderson, Logan; Sundararajah, Kirshanthan; Pelenitsyn, Artem; Kulkarni, Milind (ACM, 2024-10-08)
  Automated code generation and performance enhancements for sparse tensor algebra have become essential in many real-world applications, such as quantum computing, physical simulations, computational chemistry, and machine learning. General sparse tensor algebra compilers are not always versatile enough to generate asymptotically optimal code for sparse tensor contractions. This paper shows how to generate asymptotically better schedules for complex sparse tensor expressions using kernel fission and fusion. We present generalized loop restructuring transformations to reduce asymptotic time complexity and memory footprint. Furthermore, we present an auto-scheduler that uses a partially ordered set (poset)-based cost model that uses both time and auxiliary memory complexities to prune the search space of schedules. In addition, we highlight the use of Satisfiability Modulo Theories (SMT) solvers in sparse auto-schedulers to approximate the Pareto frontier of better schedules to the smallest number of possible schedules, with user-defined constraints available at compile-time. Finally, we show that our auto-scheduler can select better-performing schedules and generate code for them. Our results show that the auto-scheduler-provided schedules achieve orders-of-magnitude speedup compared to the code generated by the Tensor Algebra Compiler (TACO) for several computations on different real-world tensors.
- Ajna: A Wearable Shared Perception System for Extreme Sensemaking
  Wilchek, Matthew; Luther, Kurt; Batarseh, Feras A. (ACM, 2024)
  This paper introduces the design and prototype of Ajna, a wearable shared perception system for supporting extreme sensemaking in emergency scenarios. Ajna addresses technical challenges in Augmented Reality (AR) devices, specifically the limitations of depth sensors and cameras. These limitations confine object detection to close proximity and hinder perception beyond immediate surroundings, through obstructions, or across different structural levels, impacting collaborative use. It harnesses the Inertial Measurement Unit (IMU) in AR devices to measure users' relative distances from a set physical point, enabling object detection sharing among multiple users across obstacles like walls and over distances. We tested Ajna's effectiveness in a controlled study with 15 participants simulating emergency situations in a multi-story building. We found that Ajna improved object detection, location awareness, and situational awareness, and reduced search times by 15%. Ajna's performance in simulated environments highlights the potential of artificial intelligence (AI) to enhance sensemaking in critical situations, offering insights for law enforcement, search and rescue, and infrastructure management.
- Eco-Friendly Route Planning Algorithms: Taxonomies, Literature Review and Future Directions
  Fahmin, Ahmed; Cheema, Muhammad Aamir; Eunus Ali, Mohammed; Nadjaran Toosi, Adel; Lu, Hua; Li, Huan; Taniar, David; Rakha, Hesham A.; Shen, Bojie (ACM, 2024)
  Eco-friendly navigation (aka eco-routing) finds a route from A to B in a road network that minimizes the greenhouse gas (GHG) emission or fuel/energy consumption of the traveling vehicle. As road transport is a major contributor to GHG emissions, eco-routing has received considerable research attention in the past decade, mainly on two research themes: 1) developing models to estimate emissions or fuel/energy consumption of vehicles; and 2) developing algorithms to find eco-friendly routes for a vehicle. There are some excellent literature reviews that cover the existing estimation models. However, there is no literature review on eco-friendly route planning algorithms. This paper fills this gap and provides a systematic literature review in this area. From mainstream online databases, we obtained 2,494 articles and shortlisted 76 articles using our exclusion criteria. Accordingly, we establish a holistic view of eco-routing systems and define five taxonomies of estimation models, eco-routing problems and algorithms, vehicle types, traffic, and road network characteristics. Concerning the taxonomies, we categorize and review the shortlisted articles. Finally, we highlight research challenges and outline future directions in this important area.