Browsing by Author "Anwar, Ali"
- An End-to-End High-performance Deduplication Scheme for Docker Registries and Docker Container Storage Systems
  Zhao, Nannan; Lin, Muhui; Albahar, Hadeel; Paul, Arnab K.; Huan, Zhijie; Abraham, Subil; Chen, Keren; Tarasov, Vasily; Skourtis, Dimitrios; Anwar, Ali; Butt, Ali R. (ACM, 2024)
  The wide adoption of Docker containers for supporting agile and elastic enterprise applications has led to a broad proliferation of container images. The associated storage performance and capacity requirements place high pressure on the infrastructure of container registries that store and distribute images and container storage systems on the Docker client side that manage image layers and store ephemeral data generated at container runtime. The storage demand is worsened by the large amount of duplicate data in images. Moreover, container storage systems that use Copy-on-Write (CoW) file systems as storage drivers exacerbate the redundancy. Exploiting the high file redundancy in real-world images is a promising approach to drastically reduce the growing storage requirements of container registries and improve the space efficiency of container storage systems. However, existing deduplication techniques significantly degrade the performance of both registries and container storage systems because of data reconstruction overhead as well as the deduplication cost. We propose DupHunter, an end-to-end deduplication scheme that deduplicates layers for both Docker registries and container storage systems while maintaining a high image distribution speed and container I/O performance. DupHunter is divided into three tiers: the Docker registry tier, the middle tier, and the client tier. Specifically, we first build a high-performance deduplication engine at the Docker registry tier that not only natively deduplicates layers for space savings but also reduces layer restore overhead. Then, we use deduplication offloading at the middle tier that utilizes the deduplication engine to eliminate the redundant files from the client tier, which avoids introducing deduplication overhead to the Docker client side. To further reduce the data duplicates caused by CoW and improve the container I/O performance, we use a container-aware backing file system at the client tier that preallocates space for each container and ensures that files in a container and its modifications are placed and redirected close together on the disk to maintain locality. Under real workloads, DupHunter reduces storage space by up to 6.9× and reduces the GET layer latency by up to 2.8× compared to the state-of-the-art. Moreover, DupHunter can improve the container I/O performance by up to 93% for reads and 64% for writes.
- FedDefender: Backdoor Attack Defense in Federated Learning
  Gill, Waris; Anwar, Ali; Gulzar, Muhammad Ali (ACM, 2023-12-04)
  Federated Learning (FL) is a privacy-preserving distributed machine learning technique that enables individual clients (e.g., user participants, edge devices, or organizations) to train a model on their local data in a secure environment and then share the trained model with an aggregator to build a global model collaboratively. In this work, we propose FedDefender, a defense mechanism against targeted poisoning attacks in FL by leveraging differential testing. FedDefender first applies differential testing on clients’ models using a synthetic input. Instead of comparing the output (predicted label), which is unavailable for synthetic input, FedDefender fingerprints the neuron activations of clients’ models to identify a potentially malicious client containing a backdoor. We evaluate FedDefender using MNIST and FashionMNIST datasets with 20 and 30 clients, and our results demonstrate that FedDefender effectively mitigates such attacks, reducing the attack success rate (ASR) to 10% without deteriorating the global model performance.
- FLOAT: Federated Learning Optimizations with Automated Tuning
  Khan, Ahmad; Khan, Azal Ahmad; Abdelmoniem, Ahmed M.; Fountain, Samuel; Butt, Ali R.; Anwar, Ali (ACM, 2024-04-22)
  Federated Learning (FL) has emerged as a powerful approach that enables collaborative distributed model training without the need for data sharing. However, FL grapples with inherent heterogeneity challenges leading to issues such as stragglers, dropouts, and performance variations. Selection of clients to run an FL instance is crucial, but existing strategies introduce biases and participation issues and do not consider resource efficiency. Communication and training acceleration solutions proposed to increase client participation also fall short due to the dynamic nature of system resources. We address these challenges in this paper by designing FLOAT, a novel framework that boosts FL client resource awareness. FLOAT optimizes resource utilization dynamically for meeting training deadlines, and mitigates stragglers and dropouts through various optimization techniques, leading to enhanced model convergence and improved performance. FLOAT leverages multi-objective Reinforcement Learning with Human Feedback (RLHF) to automate the selection of the optimization techniques and their configurations, tailoring them to individual client resource conditions. Moreover, FLOAT seamlessly integrates into existing FL systems, maintaining non-intrusiveness and versatility for both asynchronous and synchronous FL settings. As per our evaluations, FLOAT increases accuracy by up to 53%, reduces client dropouts by up to 78×, and improves communication, computation, and memory utilization by up to 81×, 44×, and 20× respectively.
- MOANA: Modeling and Analyzing I/O Variability in Parallel System Experimental Design
  Cameron, Kirk W.; Anwar, Ali; Cheng, Yue; Xu, Li; Li, Bo; Ananth, Uday; Lux, Thomas; Hong, Yili; Watson, Layne T.; Butt, Ali R. (Department of Computer Science, Virginia Polytechnic Institute & State University, 2018-04-19)
  Exponential increases in complexity and scale make variability a growing threat to sustaining HPC performance at exascale. Performance variability in HPC I/O is common, acute, and formidable. We take the first step towards comprehensively studying linear and nonlinear approaches to modeling HPC I/O system variability. We create a modeling and analysis approach (MOANA) that predicts HPC I/O variability for thousands of software and hardware configurations on highly parallel shared-memory systems. Our findings indicate nonlinear approaches to I/O variability prediction are an order of magnitude more accurate than linear regression techniques. We demonstrate the use of MOANA to accurately predict the confidence intervals of unmeasured I/O system configurations for a given number of repeat runs – enabling users to quantitatively balance experiment duration with statistical confidence.
- Optimizing Systems for Deep Learning Applications
  Albahar, Hadeel Ahmad (Virginia Tech, 2023-03-01)
  Modern systems for Machine Learning (ML) workloads support heterogeneous workloads and resources. However, existing resource managers in these systems do not differentiate between heterogeneous GPU resources. Moreover, users are usually unaware of the sufficient and appropriate type and amount of GPU resources to request for their ML jobs. In this thesis, we analyze the performance of ML training and inference jobs and identify ML model and GPU characteristics that impact this performance. We then propose ML-based prediction models to accurately determine appropriate and sufficient resource requirements to ensure improved job latency and GPU utilization in the cluster.
- Perceived values and motivations influencing m-commerce use: A nine-country comparative study
  Ashraf, Abdul R.; Tek, Narongsak Thongpapanl; Anwar, Ali; Lapa, Luciano; Venkatesh, Viswanath (Elsevier, 2021-08-01)
  Mobile commerce (m-commerce) has become increasingly important for organizations attempting to grow revenue by expanding into international markets. However, for multinational mobile retailers (m-retailers), one of the greatest challenges lies in carefully managing their websites across multiple national markets. This work advances cross-national research on m-retailing by (1) examining how value dimensions shape m-shoppers’ motivations, (2) analyzing differential effects of hedonic and utilitarian motivations on intention and habit, and (3) examining the competing roles of conscious (intentional) and unconscious (habitual) m-commerce use drivers across developed and developing countries. This research also examines the moderating role of m-commerce readiness at the country level on the effect of motivation on intention and habit, along with their impact on m-commerce use. Based on data from 1,975 m-shoppers in nine countries (Australia, Bangladesh, Brazil, India, Pakistan, Singapore, the United Kingdom, the United States, and Vietnam) across four continents, the results demonstrate differential relationships: consumers at an advanced (early) readiness stage are more likely to be hedonism-motivated (utility-motivated) when using m-commerce and tend to use it intentionally/consciously (habitually/unconsciously). In addition to advancing knowledge about m-commerce from a scientific perspective, the findings can help multinational firms decide whether to standardize or adapt m-shopping experiences when internationalizing.
- Towards Efficient and Flexible Object Storage Using Resource and Functional Partitioning
  Anwar, Ali (Virginia Tech, 2018-06-08)
  Modern storage systems are designed to manage data without considering the dynamicity of user or resource requirements. This design approach does not consider the complexities of the dynamically changing runtime application behaviors as well as the unique features of underlying resources. To this end, this dissertation studies how resource and functional partitioning strategies can improve efficiency and flexibility of object stores. This dissertation presents a series of practical and efficient techniques, algorithms, and optimizations to realize efficient and flexible object stores. The experimental evaluation demonstrates the effectiveness of our design choices and strategies to make object stores flexible and resource-aware.