Browsing by Author "Hoang, Thang"
Now showing 1 - 11 of 11
- Breaking Privacy in Model-Heterogeneous Federated Learning
Haldankar, Atharva Amit (Virginia Tech, 2024-05-14)
Federated learning (FL) is a communication protocol that allows multiple distrustful clients to collaboratively train a machine learning model. In FL, data never leaves client devices; instead, clients only share locally computed gradients or model parameters with a central server. As individual gradients may leak information about a given client's dataset, secure aggregation was proposed. With secure aggregation, the server only receives the aggregate gradient update from the set of all sampled clients without being able to access any individual gradient. One challenge in FL is the systems-level heterogeneity that is quite often present among client devices. Specifically, clients in the FL protocol may have varying levels of compute power, on-device memory, and communication bandwidth. These limitations are addressed by model-heterogeneous FL schemes, where clients are able to train on subsets of the global model. Despite the benefits of model-heterogeneous schemes in addressing systems-level challenges, the implications of these schemes on client privacy have not been thoroughly investigated. In this thesis, we investigate whether the nature of model distribution and the computational heterogeneity among client devices in model-heterogeneous FL schemes may result in the server being able to recover sensitive information from target clients. To this end, we propose two novel attacks in the model-heterogeneous setting, even with secure aggregation in place. We call these attacks the Convergence Rate Attack and the Rolling Model Attack. The Convergence Rate Attack targets schemes where clients train on the same subset of the global model, while the Rolling Model Attack targets schemes where model parameters are dynamically updated each round.
We show that a malicious adversary is able to compromise the model and data confidentiality of a target group of clients. We evaluate our attacks on the MNIST dataset and show that using our techniques, an adversary can reconstruct data samples with high fidelity.
- CLOSUREX: Transforming Source Code for Correct Persistent Fuzzing
Ranjan, Rishi (Virginia Tech, 2024-05-29)
Fuzzing is a popular technique that has been adopted for automated vulnerability research and software hardening. Research reveals that increasing fuzzing throughput directly increases bug discovery rate. Given that fuzzing revolves around executing a large number of test cases, test case execution rate is the dominant component of overall fuzzing throughput. To increase test case execution rate, researchers provide techniques that reduce the amount of time spent performing work that is independent of specific test case data. The highest-performance approach is persistent fuzzing, which reuses a single process for all test cases by looping back to the start instead of exiting. This eliminates all process initialization and tear-down costs. Unfortunately, persistent fuzzing leads to semantically inconsistent program states because process state changes from one test case remain for subsequent test cases. This semantic inconsistency results in both missed crashes and false crashes, undermining fuzzing effectiveness. I observe that existing fuzzing execution mechanisms exist on a continuum, based on the amount of state that gets discarded and restored between test cases. I present a fuzzing execution mechanism that sits at a new spot on this state restoration continuum, where only test-case-execution-specific state is reset. This fine-grained state restoration provides near-persistent performance with the correctness of heavyweight state restoration. I construct CLOSUREX as a set of LLVM compiler passes that integrate with AFL++. Our evaluation on ten popular open-source fuzzing targets shows that CLOSUREX maintains semantic correctness while increasing test case execution rate by over 3.5x, on average, compared to AFL++. CLOSUREX also finds bugs more consistently and 1.9x faster than AFL++, with CLOSUREX discovering 15 0-day bugs (4 CVEs).
- Exploiting Update Leakage in Searchable Symmetric Encryption
Haltiwanger, Jacob Sayid (Virginia Tech, 2024-03-15)
Dynamic Searchable Symmetric Encryption (DSSE) provides efficient techniques for securely searching and updating an encrypted database. However, efficient DSSE schemes leak some sensitive information to the server. Recent works have implemented forward and backward privacy as security properties to reduce the amount of information leaked during update operations. Many attacks have shown that leakage from search operations can be abused to compromise the privacy of client queries. However, the attack literature has not rigorously investigated techniques to abuse update leakage. In this work, we investigate update leakage under DSSE schemes with forward and backward privacy from the perspective of a passive adversary. We propose two attacks based on a maximum likelihood estimation approach, the UFID Attack and the UF Attack, which target forward-private DSSE schemes with no backward privacy and Level 2 backward privacy, respectively. These are the first attacks to show that it is possible to leverage the frequency and contents of updates to recover client queries. We propose a variant of each attack which allows the update leakage to be combined with search pattern leakage to achieve higher accuracy. We evaluate our attacks against a real-world dataset and show that using update leakage can improve the accuracy of attacks against DSSE schemes, especially those without backward privacy.
- Exploiting Update Leakage in Searchable Symmetric Encryption
Haltiwanger, Jacob; Hoang, Thang (ACM, 2024-06-19)
Dynamic Searchable Symmetric Encryption (DSSE) provides efficient techniques for securely searching and updating an encrypted database. However, efficient DSSE schemes leak some sensitive information to the server. Recent works have implemented forward and backward privacy as security properties to reduce the amount of information leaked during update operations. Many attacks have shown that leakage from search operations can be abused to compromise the privacy of client queries. However, the attack literature has not rigorously investigated techniques to abuse update leakage. In this work, we investigate update leakage under DSSE schemes with forward and backward privacy from the perspective of a passive adversary. We propose two attacks based on a maximum likelihood estimation approach, the UFID Attack and the UF Attack, which target forward-private DSSE schemes with no backward privacy and Level II backward privacy, respectively. These are the first attacks to show that it is possible to leverage the frequency and contents of updates to recover client queries. We propose a variant of each attack which allows the update leakage to be combined with search pattern leakage to achieve higher accuracy. We evaluate our attacks against a real-world dataset and show that using update leakage can improve the accuracy of attacks against DSSE schemes, especially those without backward privacy.
- ezDPS: An Efficient and Zero-Knowledge Machine Learning Inference Pipeline
Wang, Haodi; Hoang, Thang (2023)
Machine Learning as a Service (MLaaS) permits resource-limited clients to access powerful data analytics services ubiquitously. Despite its merits, MLaaS poses significant concerns regarding the integrity of delegated computation and the privacy of the server’s model parameters. To address this issue, Zhang et al. (CCS’20) initiated the study of zero-knowledge Machine Learning (zkML). A few zkML schemes have been proposed since; however, each focuses on a single ML classification algorithm, which may not offer satisfactory accuracy or may require large-scale training data and model parameters, which may not be desirable for some applications. We propose ezDPS, a new efficient and zero-knowledge ML inference scheme. Unlike prior works, ezDPS is a zkML pipeline in which the data is processed in multiple stages for high accuracy. Each stage of ezDPS uses an established ML algorithm that has been shown to be effective in various applications, including Discrete Wavelet Transformation, Principal Components Analysis, and Support Vector Machine. We design new gadgets to prove ML operations effectively. We fully implemented ezDPS and assessed its performance on real datasets. Experimental results showed that ezDPS is one to three orders of magnitude more efficient than the generic circuit-based approach in all metrics while maintaining more desirable accuracy than single ML classification approaches.
- Harpocrates: Privacy-Preserving and Immutable Audit Log for Sensitive Data Operations
Thazhath, Mohit Bhasi (Virginia Tech, 2022-06-10)
The immutability, validity, and confidentiality of an audit log are crucial when operating over sensitive data to comply with standard data regulations (e.g., HIPAA). Despite this critical need, state-of-the-art privacy-preserving audit log schemes (e.g., Ghostor (NSDI '20), Calypso (VLDB '19)) do not simultaneously achieve a high level of privacy, integrity, and immutability: certain information (e.g., user identities) is still leaked in the log. In this work, we propose Harpocrates, a new privacy-preserving and immutable audit log scheme. Harpocrates permits data store, share, and access operations to be recorded in the audit log without leaking sensitive information (e.g., data identifier, user identity), while permitting the validity of data operations to be publicly verifiable. Harpocrates makes use of blockchain techniques to achieve immutability and avoid a single point of failure, while cryptographic zero-knowledge proofs are harnessed for confidentiality and public verifiability. We analyze the security of our proposed technique and prove that it achieves non-malleability and indistinguishability. We fully implemented Harpocrates and evaluated its performance on a real blockchain system (i.e., Hyperledger Fabric) deployed on a commodity platform (i.e., Amazon EC2). Experimental results demonstrated that Harpocrates is highly scalable and achieves practical performance.
- Interdependent Mission Impact Assessment of an IoT System with Hypergame-Theoretic Attack-Defense Behavior Modeling
Thukkaraju, Ashrith Reddy (Virginia Tech, 2023-11-17)
Mission impact assessment (MIA) research has been explored to evaluate the performance and effectiveness of a mission system, such as enterprise networks with organizational missions and military or tactical mission teams with assigned missions. The key components in such mission systems, including assets, services, tasks, vulnerabilities, attacks, and defenses, are interdependent, and their impacts are interwoven. However, current state-of-the-art MIA approaches have paid little attention to such interdependencies. In addition, they have not modeled strategic attack-defense interactions under partial observability. In this work, we propose a novel MIA framework that assesses measures of performance (MoP) or measures of effectiveness (MoE) based on the service requirements (e.g., correctness or timeliness) of a given mission system, through comprehensive modeling and simulation of the key system components and their interdependencies. In particular, we model intelligent attack-defense strategy selections based on hypergame theory, which allows considering uncertainty in estimating each player's hypergame expected utility (HEU) for its best strategy selection. As a case study, we consider an Internet-of-Things (IoT)-based mission system that aims to detect an object accurately and in a timely manner, given stringent accuracy and time constraints for successful mission completion. Via extensive simulation experiments, we validate the quality of the proposed MIA tool in its inference accuracy of the mission performance under a wide range of environmental settings that hinder the mission performance assessment and attack-defense interactions.
Our results show that the developed MIA framework achieves a sufficiently high inference accuracy (e.g., 80%) even with a small portion of the training dataset (e.g., 20-50%). We also found that the MIA can better assess the system's mission performance when attackers exhibit clearer patterns in taking more strategic actions using hypergame theory.
- Measuring and Understanding TTL Violations in DNS Resolvers
Bhowmick, Protick (Virginia Tech, 2024-01-02)
The Domain Name System (DNS) is a scalable, distributed caching architecture in which DNS records are cached across DNS servers distributed globally. DNS records include a time-to-live (TTL) value that dictates how long the record can be stored before it is evicted from the cache. TTL holds significant importance in aspects of DNS security, such as determining the caching period for DNSSEC-signed responses, as well as performance, such as the responsiveness of CDN-managed domains. At a high level, TTL is crucial for ensuring efficient caching, load distribution, and network security in the DNS. Setting appropriate TTL values is a key aspect of DNS administration to ensure the reliable and efficient functioning of the DNS. Therefore, it is crucial to measure how TTL violations occur in resolvers. However, assessing how DNS resolvers worldwide handle TTL is not easy and typically requires access to multiple nodes distributed globally. In this work, we introduce a novel methodology for measuring TTL violations in DNS resolvers leveraging a residential proxy service called Brightdata, enabling us to evaluate more than 27,000 resolvers across 9,500 Autonomous Systems (ASes). Among 8,524 resolvers that had at least five distinct exit nodes, we found that 8.74% arbitrarily extend the TTL. Additionally, we find that the DNSSEC standard is being disregarded by 44.1% of DNSSEC-validating resolvers, as they continue to provide DNSSEC-signed responses even after the RRSIGs have expired.
- No Linux, No Problem: Fast and Correct Windows Binary Fuzzing via Target-embedded Snapshotting
Stone, Leo Calvin (Virginia Tech, 2023-05-19)
Coverage-guided fuzzing remains today's most successful approach for exposing software security vulnerabilities. Speed is paramount in fuzzing, as maintaining a high test case throughput enables more expeditious exploration of programs, leading to faster vulnerability discovery. High-performance fuzzers exploit the Linux kernel's customizability to implement process snapshotting: fuzzing-oriented execution primitives that dramatically increase fuzzing throughput. Unfortunately, such speeds remain elusive on Windows. The closed-source nature of its kernel prevents current kernel-based snapshotting techniques from being ported, severely limiting fuzzing's effectiveness on Windows programs. Thus, accelerating vetting of the Windows software ecosystem demands a fast, correct, and kernel-agnostic fuzzing execution mechanism. We propose making state snapshotting an application-level concern as opposed to a kernel-level concern via target-embedded snapshotting. Target-embedded snapshotting combines binary- and library-level hooking to allow applications to snapshot themselves while leaving both their source code and the Windows kernel untouched. Our evaluation on 10 real-world Windows binaries shows that target-embedded snapshotting overcomes the speed, correctness, and compatibility challenges of previous Windows fuzzing execution mechanisms (i.e., process creation, forkserver-based cloning, and in-memory looping). The result is 7–182x increased performance.
- Oblivious RAM in Scalable SGX
Marathe, Akhilesh Parag (Virginia Tech, 2024-06-05)
The prevalence of cloud storage has yielded significant benefits to consumers. Trusted Execution Environments (TEEs) have been introduced to protect program execution and data in the cloud. However, an attacker targeting the cloud storage server through side-channel attacks can still learn some data in TEEs. This data retrieval is possible through the monitoring and analysis of the encrypted ciphertext as well as a program's memory access patterns. As the attacks grow in complexity and accuracy, innovative protection methods must be designed to secure data. This thesis proposes and implements an ORAM controller primitive in TEE and protects it from all potential side-channel attacks. This thesis presents two variations, each with two different encryption methods designed to mitigate attacks targeting both memory access patterns and ciphertext analysis. The latency incurred by enabling this protection is measured, and the implementation is shown to be 75.86% faster overall than the previous implementation on which this thesis is based.
- Privatizing the Volume and Timing of Blockchain Transactions
Miller, Trevor John (Virginia Tech, 2023-03-20)
With current state-of-the-art privacy-preserving blockchain solutions, users can submit transactions to a blockchain while maintaining full anonymity and not leaking the contents of the transaction through cryptographic techniques like zero-knowledge proofs and homomorphic encryption. However, the architecture of a blockchain consists of a decentralized network where every network participant maintains their own local copy of the blockchain and updates it upon every added transaction. As a result, the volume of blockchain transactions and the timestamp of each blockchain transaction for an application is publicly available. This is problematic for applications with time-sensitive or volume-sensitive outcomes because users may want this information to be privatized, such as not leaking the lateness of student examinations. However, this is not possible with existing blockchain research. In this thesis, we propose a blockchain system for multi-party applications that does not leak any useful information from the volume and timing metadata of the application's transactions, including maintaining the privacy of a time-sensitive or volume-sensitive outcome. We achieve this by adding sufficient noise using indistinguishable decoy transactions such that an adversary cannot deduce which transactions actually impacted the outcome of the application. This is facilitated in a manner where anyone can publicly verify the application's execution to be correct, fair, and honest. We demonstrate and evaluate our approach by implementing a Dutch auction that supports decoy bid transactions on a private Ethereum blockchain network.
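
As a rough illustration of the decoy-transaction idea in the last entry above, the sketch below pads every time slot of a public log to the same number of entries, so an observer who sees only volume and timing learns nothing about when real transactions occurred. This is not the thesis's construction: the names `pad_with_decoys` and `public_view`, the slot model, and the fixed `slot_capacity` are all assumptions made for the example, and the `is_real` flag stands in for information that a real system would hide cryptographically.

```python
import random
from collections import Counter


def pad_with_decoys(real_txs, num_slots, slot_capacity, rng):
    """Build a log in which every time slot holds exactly slot_capacity entries.

    real_txs is a list of (slot, payload) pairs. Each log entry is
    (slot, payload, is_real); is_real is kept only so the example can be
    checked. It is hypothetical bookkeeping, not something an observer sees.
    """
    per_slot = Counter(slot for slot, _ in real_txs)
    if any(slot not in range(num_slots) for slot in per_slot):
        raise ValueError("transaction slot out of range")
    if any(count > slot_capacity for count in per_slot.values()):
        raise ValueError("slot_capacity is too small for the real traffic")

    log = []
    for slot in range(num_slots):
        entries = [(slot, payload, True) for s, payload in real_txs if s == slot]
        # Fill the remainder of the slot with decoys (assumed indistinguishable).
        entries += [(slot, None, False)] * (slot_capacity - len(entries))
        rng.shuffle(entries)  # position within a slot should not reveal realness
        log.extend(entries)
    return log


def public_view(log):
    """What a volume/timing observer learns: the number of entries per slot."""
    return Counter(slot for slot, _, _ in log)


rng = random.Random(0)
real = [(0, "bid-a"), (2, "bid-b"), (2, "bid-c")]
log = pad_with_decoys(real, num_slots=4, slot_capacity=3, rng=rng)
view = public_view(log)
```

With these inputs every slot in `view` has the same count regardless of where the three real bids fall. The cost of this simple padding rule is that `slot_capacity` must be chosen at least as large as the busiest slot.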