Journal Articles, Association for Computing Machinery (ACM)

Permanent URI for this collection


Recent Submissions

Now showing 1 - 20 of 236
  • Rebuttal How-to: Strategies, Tactics, and the Big Picture in Research
    Yao, Danfeng (Daphne) (ACM, 2023-12-21)
    Rebuttals are not published, thus, it is difficult for junior researchers to read successful rebuttals and improve. This article demystifies rebuttal writing by showing the arm-the-champion strategy and a few key tactics. More importantly, we also discuss the conformity nature of conference reviewing and why researchers should not be defeated by paper rejections.
  • StructCoder: Structure-Aware Transformer for Code Generation
    Tipirneni, Sindhu; Zhu, Ming; Reddy, Chandan (ACM, 2024)
    There has been a recent surge of interest in automating software engineering tasks using deep learning. This paper addresses the problem of code generation, where the goal is to generate target code given source code in a different language or a natural language description. Most state-of-the-art deep learning models for code generation use training strategies primarily designed for natural language. However, understanding and generating code requires a more rigorous comprehension of the code syntax and semantics. With this motivation, we develop an encoder-decoder Transformer model where both the encoder and decoder are explicitly trained to recognize the syntax and data flow in the source and target codes, respectively. We not only make the encoder structure-aware by leveraging the source code?s syntax tree and data flow graph, but we also support the decoder in preserving the syntax and data flow of the target code by introducing two novel auxiliary tasks: AST (Abstract Syntax Tree) paths prediction and data flow prediction. To the best of our knowledge, this is the first work to introduce a structure-aware Transformer decoder that models both syntax and data flow to enhance the quality of generated code. The proposed StructCoder model achieves state-of-the-art performance on code translation and text-to-code generation tasks in the CodeXGLUE benchmark, and improves over baselines of similar size on the APPS code generation benchmark. Our code is publicly available at
  • Graph Time-series Modeling in Deep Learning: A Survey
    Chen, Hongjie; Eldardiry, Hoda (ACM, 2024)
    Time-series and graphs have been extensively studied for their ubiquitous existence in numerous domains. Both topics have been separately explored in the field of deep learning. For time-series modeling, recurrent neural networks or convolutional neural networks model the relations between values across time steps, while for graph modeling, graph neural networks model the inter-relations between nodes. Recent research in deep learning requires simultaneous modeling for time-series and graphs when both representations are present. For example, both types of modeling are necessary for time-series classification, regression, and anomaly detection in graphs. This paper aims to provide a comprehensive summary of these models, which we call graph time-series models. To the best of our knowledge, this is the first survey paper that provides a picture of related models from the perspective of deep graph time-series modeling to address a range of time-series tasks, including regression, classification, and anomaly detection. Graph time-series models are split into two categories, a) graph recurrent/convolutional neural networks and b) graph attention neural networks. Under each category, we further categorize models based on their properties. Additionally, we compare representative models and discuss how distinctive model characteristics are utilized with respect to various model components and data challenges. Pointers to commonly used datasets and code are included to facilitate access for further research. In the end, we discuss potential directions for future research.
  • Whiteboarding: A Tool to Improve CS1 Student Self-Efficacy
    Chapin, John; Bowen, Bradley (ACM, 2023-12-05)
    Many students struggle in Introductory Computer Science (CS1) and fail or drop out of the class. A lack of CS self-efficacy - the belief that the individual can complete a task - is frequently the cause of this failure to succeed in CS1. Solutions have been proposed to improve student self-efficacy in CS1. Unfortunately, a lack of self-efficacy in CS1 classes is still a problem. This study examines a pedagogical tool, whiteboarding, and its effect on student perception of self-efficacy during the programming problem-solving process for novice programmers. Whiteboarding refers to students using whiteboards during the CS problem solving process. Focus group sessions, researcher notes, and memos were used to collect qualitative data. The whiteboarding intervention was conducted in two AP CS A classes during the first four weeks of the year. Seventeen 10th-grade students participated in the focus groups. Three focus groups of four students and one focus group of five students were conducted at the end of the intervention. These findings indicate that whiteboarding can be a vital tool that increases student self-efficacy by improving their success at programming activities, increasing collaboration and feedback, and providing an active, positive learning environment that holds students accountable for their work. The themes that emerged from the focus group sessions were: Engagement with the Problem, Engagement with Others, and Engagement with the Environment. Teaching success in the CS1 classroom requires student self-efficacy. This study highlights a teaching pedagogy that CS1 educators can implement to increase student self-efficacy.
  • Self-Correlation and Cross-Correlation Learning for Few-Shot Remote Sensing Image Semantic Segmentation
    Wang, Linhan; Lei, Shuo; He, Jianfeng; Wang, Shengkun; Zhang, Min; Lu, Chang-Tien (ACM, 2023-11-13)
    Remote sensing image semantic segmentation is an important problem for remote sensing image interpretation. Although remarkable progress has been achieved, existing deep neural network methods suffer from the reliance on massive training data. Few-shot remote sensing semantic segmentation aims at learning to segment target objects from a query image using only a few annotated support images of the target class. Most existing few-shot learning methods stem primarily from their sole focus on extracting information from support images, thereby failing to effectively address the large variance in appearance and scales of geographic objects. To tackle these challenges, we propose a Self-Correlation and Cross-Correlation Learning Network for the few-shot remote sensing image semantic segmentation. Our model enhances the generalization by considering both self-correlation and cross-correlation between support and query images to make segmentation predictions. To further explore the self-correlation with the query image, we propose to adopt a classical spectral method to produce a class-agnostic segmentation mask based on the basic visual information of the image. Extensive experiments on two remote sensing image datasets demonstrate the effectiveness and superiority of our model in few-shot remote sensing image semantic segmentation. The code is available at
  • Spatial Temporal Graph Neural Networks for Decentralized Control of Robot Swarms
    Chen, Siji; Sun, Yanshen; Li, Peihan; Zhou, Lifeng; Lu, Chang-Tien (ACM, 2023-11-13)
    Recent research has explored the use of graph neural networks (GNNs) for decentralized control in swarm robotics. However, it has been observed that relying solely on local states is insufficient to imitate a centralized control policy. To address this limitation, previous studies proposed incorporating 𝐾-hop delayed states into the computation. While this approach shows promise, it can lead to a lack of consensus among distant flock members and the formation of small localized groups, ultimately resulting in task failure. Our approach is to include the delayed states to build a spatiotemporal GNN model (ST-GNN) by two levels of expansion: spatial expansion and temporal expansion. The spatial expansion utilizes 𝐾-hop delayed states to broaden the network while temporal expansion, can effectively predict the trend of swarm behavior, making it more robust against local noise. To validate the effectiveness of our approach, we conducted simulations in two distinct scenarios: free flocking and flocking with a leader. In both scenarios, the simulation results demonstrated that our decentralized ST-GNN approach successfully overcomes the limitations of local controllers. We performed a comprehensive analysis on the effectiveness of spatial expansions and temporal expansions independently. The results clearly demonstrate that both significantly improve overall performance. Furthermore, when combined, they achieve the best performance compared to global solution and delayed states solutions. The performance of ST-GNN underscores its potential as an effective and reliable approach for achieving cohesive flocking behavior while ensuring safety and maintaining desired swarm characteristics.
  • Single-Image 3D Human Digitization with Shape-guided Diffusion
    Albahar, Badour; Saito, Shunsuke; Tseng, Hung-Yu; Kim, Changil; Kopf, Johannes; Huang, Jia-Bin (ACM, 2023-12-10)
    We present an approach to generate a 360-degree view of a person with a consistent, high-resolution appearance from a single input image. NeRF and its variants typically require videos or images from different viewpoints. Most existing approaches taking monocular input either rely on ground-truth 3D scans for supervision or lack 3D consistency. While recent 3D generative models show promise of 3D consistent human digitization, these approaches do not generalize well to diverse clothing appearances, and the results lack photorealism. Unlike existing work, we utilize high-capacity 2D diffusion models pretrained for general image synthesis tasks as an appearance prior of clothed humans. To achieve better 3D consistency while retaining the input identity, we progressively synthesize multiple views of the human in the input image by inpainting missing regions with shape-guided diffusion conditioned on silhouette and surface normal. We then fuse these synthesized multi-view images via inverse rendering to obtain a fully textured high-resolution 3D mesh of the given person. Experiments show that our approach outperforms prior methods and achieves photorealistic 360-degree synthesis of a wide range of clothed humans with complex textures from a single image.
  • Triggering Modes in Spectrum-Based Multi-location Fault Localization
    Dao, Tung; Meng, Na; Nguyen, ThanhVu (ACM, 2023-11-30)
    Spectrum-based fault localization (SBFL) techniques can aid in debugging, but their practicality in industrial settings has been limited due to the large number of tests needed to execute before applying SBFL. Previous research has explored different trigger modes for SBFL and found that applying it immediately after the first test failure is also effective. However, this study only considered single-location bugs, while multi-location bugs are prevalent in real-world scenarios and especially at our company Cvent, which is interested in integrating SBFL to its CI/CD workflow. In this work, we investigate the effectiveness of SBFL on multilocation bugs and propose a framework called Instant Fault Localization for Multi-location Bugs (IFLM). We compare and evaluate four trigger modes of IFLM using open-source (Defects4J) and close-source (Cvent) bug datasets. Our study showed that it is not necessary to execute all test cases before applying SBFL. However, we also found that that applying SBFL right after the first failed test is less effective than applying it after executing all tests for multi-location bugs, which is contrary to the single-location bug study. We also observe differences in performance between real and artificial bugs. Our contributions include the development of IFLM and CVent bug datasets, analysis of SBFL effectiveness for multi-location bugs, and practical implications for integrating SBFL in industrial environments.
  • Co-dependence Aware Fuzzing for Dataflow-Based Big Data Analytics
    Humayun, Ahmad; Kim, Miryung; Gulzar, Muhammad Ali (ACM, 2023-11-30)
    Data-intensive scalable computing has become popular due to the increasing demands of analyzing big data. For example, Apache Spark and Hadoop allow developers to write dataflow-based applications with user-defined functions to process data with custom logic. Testing such applications is difficult. (1) These applications often take multiple datasets as input. (2) Unlike in SQL, there is no explicit schema for these datasets and each unstructured (or semi-structured) dataset is segmented and parsed at runtime. (3) Dataflow operators (e.g., join) create implicit co-dependence constraints between the fields of multiple datasets. An efficient and effective testing technique must analyze co-dependence among different regions of multiple datasets at the level of rows and columns and orchestrate input mutations jointly on co-dependent regions. We propose DepFuzz to increase the effectiveness and efficiency of fuzz testing dataflow-based big data applications. The key insight behind DepFuzz is twofold. It keeps track of which code segments operate on which datasets, which rows, and which columns. By analyzing the use of dataflow operators (e.g., join and groupByKey) in tandem with the semantics of UDFs, DepFuzz generates test data that subsequently reach hard-to-reach regions of the application code. In real-world big data applications, DepFuzz finds 3.4× more faults, achieving 29% more statement coverage in half the time as Jazzer’s, a state-of-the-art commercial fuzzer for Java bytecode. It outperforms prior DISC testing by exposing deeper semantic faults beyond simpler input formatting errors, especially when multiple datasets have complex interactions through dataflow operators.
  • BigDataflow: A Distributed Interprocedural Dataflow Analysis Framework
    Sun, Zewen; Xu, Duanchen; Zhang, Yiyu; Qi, Yun; Wang, Yueyang; Zuo, Zhiqiang; Wang, Zhaokang; Li, Yue; Li, Xuandong; Lu, Qingda; Peng, Wenwen; Guo, Shengjian (ACM, 2023-11-30)
    Apart from forming the backbone of compiler optimization, static dataflow analysis has been widely applied in a vast variety of applications, such as bug detection, privacy analysis, program comprehension, etc. Despite its importance, performing interprocedural dataflow analysis on large-scale programs is well known to be challenging.In this paper, we propose a novel distributed analysis framework supporting the general interprocedural dataflow analysis.Inspired by large-scale graph processing, we devise a dedicated distributed worklist algorithm tailored for interprocedural dataflow analysis. We implement the algorithm and develop a distributed framework called BigDataflow running on a large-scale cluster.The experimental results validate the promising performance of BigDataflow – it can finish analyzing the program of millions lines of code in minutes. Compared with the state-of-the-art, BigDataflow achieves much more analysis efficiency.
  • FedDefender: Backdoor Attack Defense in Federated Learning
    Gill, Waris; Anwar, Ali; Gulzar, Muhammad Ali (ACM, 2023-12-04)
    Federated Learning (FL) is a privacy-preserving distributed machine learning technique that enables individual clients (e.g., user participants, edge devices, or organizations) to train a model on their local data in a secure environment and then share the trained model with an aggregator to build a global model collaboratively. In this work, we propose FedDefender, a defense mechanism against targeted poisoning attacks in FL by leveraging differential testing. FedDefender first applies differential testing on clients’ models using a synthetic input. Instead of comparing the output (predicted label), which is unavailable for synthetic input, FedDefender fingerprints the neuron activations of clients’ models to identify a potentially malicious client containing a backdoor. We evaluate FedDefender using MNIST and FashionMNIST datasets with 20 and 30 clients, and our results demonstrate that FedDefender effectively mitigates such attacks, reducing the attack success rate (ASR) to 10% without deteriorating the global model performance.
  • Arguments for and Approaches to Computing Education in Undergraduate Computer Science Programmes
    Cutts, Quintin; Kallia, Maria; Anderson, Ruth; Crick, Tom; Devlin, Marie; Farghally, Mohammed; Mirolo, Claudio; Runde, Ragnhild Kobro; Seppälä, Otto; Urquiza-Fuentes, Jaime; Vahrenhold, Jan (ACM, 2023-12-22)
    Computing education (CE), the scientific foundation of the teaching and learning of subject matter specific to computing, has matured into a field with its own research journals and conferences as well as graduate programmes. Yet, and unlike other mature subfields of computer science (CS), it is rarely taught as part of undergraduate CS programmes. In this report, we present a gap analysis resulting from semi-structured interviews with various types of stakeholders and derive a set of arguments for teaching CE courses in undergraduate CS programmes. This analysis and the arguments highlight a number of opportunities for the discipline of CS at large, in academia, in industry, and in school education, that would be opened up with undergraduate CE courses, as well as potential barriers to implementation that will need to be overcome. We also report on the results of a Delphi process performed to elicit topics for such a course with various audiences in mind. The Delphi process yielded 19 high-level categories that encompass the subject matter CE courses should incorporate, tailored to the specific needs of their intended student audiences. This outcome underscores the extensive range of content that can be integrated into a comprehensive CE programme. Based on these two stakeholder interactions as well as a systematic literature review aiming to explore the current practices in teaching CE to undergraduate students, we develop two prototypical outlines of such a course, keeping in mind that departments may have different preferences and affordances resulting in different kinds of CE offerings. Overall, input from external stakeholders underscores the clear significance of undergraduate CE courses. We anticipate leveraging this valuable feedback to actively promote these courses on a broader scale.
  • Modeling Women's Elective Choices in Computing
    Bradley, Steven; Parker, Miranda; Altin, Rukiye; Barker, Lecia; Hooshangi, Sara; Kunkeler, Thom; Lennon, Ruth; McNeill, Fiona; Minguillón, Julià; Parkinson, Jack; Peltsverger, Svetlana; Sibia, Naaz (ACM, 2023-12-22)
    Evidence-based strategies suggest ways to reduce the gender gap in computing. For example, elective classes are valuable in enabling students to choose in which directions to expand their computing knowledge in areas aligned with their interests. The availability of electives of interest may also make computing programs of study more meaningful to women. However, research on which elective computing topics are more appealing to women is often class or institution specific. In this study, we investigate differences in enrollment within undergraduate-level elective classes in computing to study differences between women and men. The study combined data from nine institutions from both Western Europe and North America and included 272 different classes with 49,710 student enrollments. These classes were encoded using ACM curriculum guidelines and combined with the enrollment data to build a hierarchical statistical model of factors affecting student choice. Our model shows which elective topics are less popular with all students (including fundamentals of programming languages and parallel and distributed computing), and which elective topics are more popular with women students (including mathematical and statistical foundations, human computer interaction and society, ethics, and professionalism). Understanding which classes appeal to different students can help departments gain insight of student choices and develop programs accordingly. Additionally, these choices can also help departments explore whether some students are less likely to choose certain classes than others, indicating potential barriers to participation in computing.
  • A First Look at Toxicity Injection Attacks on Open-domain Chatbots
    Weeks, Connor; Cheruvu, Aravind; Abdullah, Sifat Muhammad; Kanchi, Shravya; Yao, Daphne; Viswanath, Bimal (ACM, 2023-12-04)
    Chatbot systems have improved significantly because of the advances made in language modeling. These machine learning systems follow an end-to-end data-driven learning paradigm and are trained on large conversational datasets. Imperfections or harmful biases in the training datasets can cause the models to learn toxic behavior, and thereby expose their users to harmful responses. Prior work has focused on measuring the inherent toxicity of such chatbots, by devising queries that are more likely to produce toxic responses. In this work, we ask the question: How easy or hard is it to inject toxicity into a chatbot after deployment? We study this in a practical scenario known as Dialog-based Learning (DBL), where a chatbot is periodically trained on recent conversations with its users after deployment. A DBL setting can be exploited to poison the training dataset for each training cycle. Our attacks would allow an adversary to manipulate the degree of toxicity in a model and also enable control over what type of queries can trigger a toxic response. Our fully automated attacks only require LLM-based software agents masquerading as (malicious) users to inject high levels of toxicity. We systematically explore the vulnerability of popular chatbot pipelines to this threat. Lastly, we show that several existing toxicity mitigation strategies (designed for chatbots) can be significantly weakened by adaptive attackers.
  • Towards Energy-Efficient Spiking Neural Networks: A Robust Hybrid CMOS-Memristive Accelerator
    Nowshin, Fabiha; An, Hongyu; Yi, Yang (ACM, 2024)
    Spiking Neural Networks (SNNs) are energy-efficient artificial neural network models that can carry out data-intensive applications. Energy consumption, latency, and memory bottleneck are some of the major issues that arise in machine learning applications due to their data-demanding nature. Memristor-enabled Computing-In-Memory (CIM) architectures have been able to tackle the memory wall issue, eliminating the energy and time-consuming movement of data. In this work we develop a scalable CIM-based SNN architecture with our fabricated two-layer memristor crossbar array. In addition to having an enhanced heat dissipation capability, our memristor exhibits substantial enhancement of 10% to 66% in design area, power and latency compared to state-of-the-art memristors. This design incorporates an inter-spike interval (ISI) encoding scheme due to its high information density to convert the incoming input signals into spikes. Furthermore, we include a time-to-first-spike (TTFS) based output processing stage for its energy-efficiency to carry out the final classification. With the combination of ISI, CIM and TTFS, this network has a competitive inference speed of 2?s/image and can successfully classify handwritten digits with 2.9mW of power and 2.51pJ energy per spike. The proposed architecture with the ISI encoding scheme can achieve ~10% higher accuracy than those of other encoding schemes in the MNIST dataset.
  • Delegation of TLS Authentication to CDNs using Revocable Delegated Credentials
    Yoon, Daegeun; Chung, Taejoong; Kim, Yongdae (ACM, 2023-12-04)
    When using a Content Delivery Network (CDN), domain owners typically delegate Transport Layer Security (TLS) authentication to the CDN by sharing their TLS certificate’s private key. However, this practice not only delegates TLS authentication but also grants the CDN complete control over the certificate. To mitigate these concerns, Delegated Credential (DC) was proposed as a solution; DC, which contains both the CDN’s public key and the domain owner’s signature, allows the domain owners to delegate their own credentials for TLS authentication, thereby avoiding the need to share their private keys. However, the absence of a mechanism to distribute the revocation status of a DC renders it non-revocable, even when a compromise of a credential has been detected. DCs were thus designed to be short-lived, necessitating frequent renewal for continued use. To overcome this limitation, we designed Revocable Delegated Credential (RDC), which provides a revocation method for DCs. With RDCs, there is no need for frequent renewals as they can be revoked, allowing for a longer validity period. The revocation status of RDCs is distributed via DNS, an essential component of web communication. RDCs utilize the NSEC record, a type of DNSSEC record, as a means to store, validate, and easily manage their revocation status. When domain owners no longer trust their CDNs or detect compromise in their RDCs, they can distribute the RDC’s revocation status by simply creating a subdomain named with an RDC identifier. The browser then confirms the existence of this subdomain using the NSEC record to validate the revocation status. We implemented RDC in the go tls package and Firefox Nightly to demonstrate and evaluate its feasibility.
  • An End-to-End High-performance Deduplication Scheme for Docker Registries and Docker Container Storage Systems
    Zhao, Nannan; Lin, Muhui; Albahar, Hadeel; Paul, Arnab K.; Huan, Zhijie; Abraham, Subil; Chen, Keren; Tarasov, Vasily; Skourtis, Dimitrios; Anwar, Ali; Butt, Ali R. (ACM, 2024)
    The wide adoption of Docker containers for supporting agile and elastic enterprise applications has led to a broad proliferation of container images. The associated storage performance and capacity requirements place high pressure on the infrastructure of container registries that store and distribute images and container storage systems on the Docker client side that manage image layers and store ephemeral data generated at container runtime. The storage demand is worsened by the large amount of duplicate data in images. Moreover, container storage systems that use Copy-on-Write (CoW) file systems as storage drivers exacerbate the redundancy. Exploiting the high file redundancy in real-world images is a promising approach to drastically reduce the growing storage requirements of container registries and improve the space efficiency of container storage systems. However, existing deduplication techniques significantly degrade the performance of both registries and container storage systems because of data reconstruction overhead as well as the deduplication cost. We propose DupHunter, an end-to-end deduplication that deduplicates layers for both Docker registries and container storage systems while maintaining a high image distribution speed and container I/O performance. DupHunter is divided into 3 tiers: Docker registry tier, middle tier, and client tier. Specifically, we first build a high-performance deduplication engine at the Docker registry tier that not only natively deduplicates layers for space savings but also reduces layer restore overhead. Then, we use deduplication offloading at the middle tier that utilizes the deduplication engine to eliminate the redundant files from the client tier, which avoids introducing deduplication overhead to the Docker client side. To further reduce the data duplicates caused by CoW and improve the container I/O performance, we use a container-aware backing file system at the client tier that preallocates space for each container and ensures that files in a container and its modifications are placed and redirected closer on the disk to maintain locality. Under real workloads, DupHunter reduces storage space by up to 6.9× and reduces the GET layer latency by up to 2.8× compared to the state-of-the-art. Moreover, DupHunter can improve the container I/O performance by up to 93% for reads and 64% for writes.
  • Introduction to the Special Section on USENIX ATC 2023
    Williams, Dan; Lawall, Julia (ACM, 2024-02-19)
  • Citation Forecasting with Multi-Context Attention-Aided Dependency Modeling
    Ji, Taoran; Self, Nathan; Fu, Kaiqun; Chen, Zhiqian; Ramakrishnan, Naren; Lu, Chang-Tien (ACM, 2024)
    Forecasting citations of scientific patents and publications is a crucial task for understanding the evolution and development of technological domains and for foresight into emerging technologies. By construing citations as a time series, the task can be cast into the domain of temporal point processes. Most existing work on forecasting with temporal point processes, both conventional and neural network-based, only performs single-step forecasting. In citation forecasting, however, the more salient goal is n-step forecasting: predicting the arrival of the next n citations. In this paper, we propose Dynamic Multi-Context Attention Networks (DMA-Nets), a novel deep learning sequence-to-sequence (Seq2Seq) model with a novel hierarchical dynamic attention mechanism for long-term citation forecasting. Extensive experiments on two real-world datasets demonstrate that the proposed model learns better representations of conditional dependencies over historical sequences compared to state-of-the-art counterparts and thus achieves significant performance for citation predictions.
  • RedCapes: the Design and Evaluation of a Game Towards Improving Autistic Children's Privacy Awareness
    Yuan, Xiaowen; Ye, Hongni; Tang, Ziheng; Zhu, Xiangrong; Yao, Yaxing; Tong, Xin (ACM, 2023-11-13)
    Autistic children have differences in social communication, making them vulnerable to privacy risks in social contexts. Research on typical development (TD) children’s privacy learning often neglects autistic children’s unique needs. Therefore, our study aims to understand their challenges in learning privacy and design an effective privacy education game for them. We designed a serious game, RedCapes, and recruited 9 autistic children and 6 TD children to evaluate the game. Our findings suggested that RedCapes improved autistic children’s privacy awareness. Compared to TD children, autistic children have more difficulty identifying relevant privacy risk factors and understanding the full consequences of privacy violations. We propose three design implications for future privacy education games for autistic children. Our work contributes: insights into autistic children’s challenges in learning privacy, a serious game prototype for privacy education, and design recommendations for future privacy education games focused on autistic children.