Browsing by Author "Gao, Peng"
Now showing 1 - 5 of 5
- Defending Against Misuse of Synthetic Media: Characterizing Real-world Challenges and Building Robust Defenses
  Pu, Jiameng (Virginia Tech, 2022-10-07)
  Recent advances in deep generative models have enabled the generation of realistic synthetic media, or deepfakes, including synthetic images, videos, and text. However, synthetic media can be misused for malicious purposes and damage users' trust in online content. This dissertation addresses several key challenges in defending against the misuse of synthetic media. Its key contributions include the following: (1) Understanding challenges with the real-world applicability of existing synthetic media defenses. We curate synthetic videos and text from the wild, i.e., the Internet community, and assess the effectiveness of state-of-the-art defenses on synthetic content in the wild. In addition, we propose practical low-cost adversarial attacks and systematically measure the adversarial robustness of existing defenses. Our findings reveal that most defenses degrade significantly under real-world detection scenarios, which leads to the second thread of this work: (2) Building detection schemes with improved generalization performance and robustness for synthetic content. Most existing synthetic image detection schemes are highly content-specific, e.g., designed only for human faces, which limits their applicability. We propose NoiseScope, an unsupervised, content-agnostic detection scheme that does not require a priori access to synthetic images and is applicable to a wide variety of generative models (GANs). NoiseScope is also resilient against a range of countermeasures mounted by a knowledgeable attacker. For the text modality, our study reveals that state-of-the-art defenses that mine sequential patterns in text using Transformer models are vulnerable to simple evasion schemes. We therefore explore enhancing the robustness of synthetic text detection by leveraging semantic features.
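The core intuition behind unsupervised, fingerprint-based detection of GAN images can be illustrated with a toy sketch. This is not the actual NoiseScope pipeline; the 1-D signals and the moving-average "denoiser" below are illustrative assumptions. The idea: images produced by the same generative model tend to share a correlated noise residual (a model fingerprint), while natural images do not, so residual correlation can separate them without any labeled synthetic training data.

```python
import math

def box_denoise(signal):
    # Trivial 1-D moving-average "denoiser" (illustration only; real
    # pipelines use stronger denoisers such as wavelet filters).
    out = []
    for i in range(len(signal)):
        lo, hi = max(0, i - 1), min(len(signal), i + 2)
        out.append(sum(signal[lo:hi]) / (hi - lo))
    return out

def residual(signal):
    # Noise residual = signal minus its denoised version.
    return [p - d for p, d in zip(signal, box_denoise(signal))]

def corr(a, b):
    # Pearson correlation between two residuals.
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = math.sqrt(sum((x - ma) ** 2 for x in a))
    vb = math.sqrt(sum((y - mb) ** 2 for y in b))
    return cov / (va * vb) if va and vb else 0.0

# Two "images" with different content but the same embedded fingerprint
# produce highly correlated residuals; an unrelated signal does not.
fp = [3 if i % 2 else -3 for i in range(32)]       # shared model fingerprint
img1 = [i * 0.1 + f for i, f in enumerate(fp)]     # ramp content + fingerprint
img2 = [5 + f for f in fp]                         # flat content + fingerprint
```

Clustering such residual fingerprints is what lets a detector flag groups of same-model images without a priori access to synthetic examples.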
- How Do Java Developers Reuse StackOverflow Answers in Their GitHub Projects?
  Chen, Juntong (Virginia Tech, 2022-09-09)
  StackOverflow (SO) is a widely used question-and-answer (Q&A) website for software developers and computer scientists. GitHub is a code hosting platform for collaboration and version control, and many popular open-source software libraries are published in repositories on GitHub. A preliminary observation shows that developers cite SO questions in their GitHub repositories. This observation inspired us to explore the relationship between SO posts and GitHub repositories, and to help software developers better understand the characteristics of SO answers that are reused by GitHub projects. We conducted an empirical study of the SO answers reused by Java code from public GitHub projects. To ensure precise results, we used a hybrid approach: code clone detection, keyword-based search, and manual inspection. This approach helped us identify the answers that developers leveraged. Based on the identified answers, we further investigated the topics of the discussion threads, the answer characteristics (e.g., scores, ages, code lengths, and text lengths), and developers' reuse practices. We observed both reused and unused answers. Compared with unused answers, the reused answers mostly have higher scores, longer code, and longer plain-text explanations, and most were related to implementing specific coding tasks. In 9% (40/430) of the reuse scenarios, developers entirely copied code from one or multiple answers of an SO discussion thread; in the other 91% (390/430), developers only partially reused code or created brand-new code from scratch. In total, we investigated 130 SO discussion threads referred to by Java developers in 356 GitHub projects and arranged them into five categories.
  Our findings can help the SO community better distribute programming knowledge and skills, and can inspire future research related to SO and GitHub.
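The keyword-based search step of such a study could, for example, scan Java sources for cited StackOverflow URLs. The regex and helper below are an illustrative assumption, not the authors' implementation:

```python
import re

# Matches stackoverflow.com question/answer links and captures the numeric ID.
SO_LINK = re.compile(r"https?://stackoverflow\.com/(?:questions|q|a)/(\d+)")

def find_so_references(java_source: str) -> list:
    """Return the SO question/answer IDs cited anywhere in a source file."""
    return SO_LINK.findall(java_source)

snippet = """
// Adapted from https://stackoverflow.com/questions/1005073/initialization-of-an-arraylist
List<String> xs = new ArrayList<>(Arrays.asList("a", "b"));
"""
print(find_so_references(snippet))  # ['1005073']
```

Hits found this way would still need the paper's other two steps (clone detection and manual inspection) to confirm that the linked answer's code was actually reused.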
- Privatizing the Volume and Timing of Blockchain Transactions
  Miller, Trevor John (Virginia Tech, 2023-03-20)
  With current state-of-the-art privacy-preserving blockchain solutions, users can submit transactions to a blockchain while maintaining full anonymity and not leaking the contents of the transaction through cryptographic techniques like zero-knowledge proofs and homomorphic encryption. However, the architecture of a blockchain consists of a decentralized network where every network participant maintains their own local copy of the blockchain and updates it upon every added transaction. As a result, the volume of blockchain transactions and the timestamp of each blockchain transaction for an application is publicly available. This is problematic for applications with time-sensitive or volume-sensitive outcomes because users may want this information to be privatized, such as not leaking the lateness of student examinations. However, this is not possible with existing blockchain research. In this thesis, we propose a blockchain system for multi-party applications that does not leak any useful information from the volume and timing metadata of the application's transactions, including maintaining the privacy of a time-sensitive or volume-sensitive outcome. We achieve this by adding sufficient noise using indistinguishable decoy transactions such that an adversary cannot deduce which transactions actually impacted the outcome of the application. This is facilitated in a manner where anyone can publicly verify the application's execution to be correct, fair, and honest. We demonstrate and evaluate our approach by implementing a Dutch auction that supports decoy bid transactions on a private Ethereum blockchain network.
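The decoy-transaction idea can be sketched minimally. This is not the thesis's actual protocol (which involves on-chain verification); the hash commitment, payload padding, and shuffle below are illustrative assumptions showing why an observer of volume and timing learns nothing about which transaction is real:

```python
import hashlib
import os
import random

def commitment(is_real: bool, nonce: bytes) -> str:
    # Hash commitment: hides whether a transaction is real until the
    # nonce is revealed in a later verification phase.
    tag = b"real" if is_real else b"decoy"
    return hashlib.sha256(tag + nonce).hexdigest()

def submit_with_decoys(real_payload: bytes, n_decoys: int):
    """Mix one real transaction with n_decoys same-sized random decoys and
    shuffle them, so the on-chain volume and ordering leak nothing."""
    entries = [(True, real_payload)]
    entries += [(False, os.urandom(len(real_payload))) for _ in range(n_decoys)]
    txs = []
    for is_real, payload in entries:
        nonce = os.urandom(16)  # kept secret until the reveal phase
        txs.append({"payload": payload, "commit": commitment(is_real, nonce)})
    random.shuffle(txs)
    return txs
```

Because every decoy has the same size and an indistinguishable commitment, the public metadata (count and timestamps of transactions) carries added noise that masks the application's true activity.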
- PrivMon: A Stream-Based System for Real-Time Privacy Attack Detection for Machine Learning Models
  Ko, Myeongseob; Yang, Xinyu; Ji, Zhengjie; Just, Hoang Anh; Gao, Peng; Kumar, Anoop; Jia, Ruoxi (ACM, 2023-10-16)
  Machine learning (ML) models can expose the private information of their training data when confronted with privacy attacks. Specifically, a malicious user with black-box access to an ML-as-a-service platform can reconstruct the training data (model inversion attacks) or infer membership information (membership inference attacks) simply by querying the ML model. Despite the pressing need for effective defenses against such black-box privacy attacks, existing approaches have mostly focused on enhancing the robustness of the ML model by modifying the model training process or the model prediction process. These defenses can compromise model utility and require the cooperation of the underlying AI platform (i.e., they are platform-dependent), constraints that largely limit their real-world applicability. Moreover, none of the existing works continuously protect already deployed ML models from privacy attacks by detecting privacy leakage in real time, a defensive task that becomes increasingly important given the vast deployment of ML-as-a-service platforms today. To bridge the gap, we propose PrivMon, a new stream-based system for real-time privacy attack detection for ML models. To facilitate wide applicability and practicality, PrivMon defends black-box ML models against a wide range of privacy attacks in a platform-agnostic fashion: it only passively monitors model queries without requiring the cooperation of the model owner or the AI platform.
Specifically, PrivMon takes as input a stream of ML model queries and provides an efficient attack detection engine that continuously monitors the stream to detect the privacy attack in real-time, by identifying self-similar malicious queries. We show empirically and theoretically that PrivMon can detect a wide range of realistic privacy attacks within a practical time frame and successfully mitigate the attack success rate. Code is available at https://github.com/ruoxi-jia-group/privmon.
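The self-similarity idea can be sketched with a toy sliding-window monitor. PrivMon's real engine is a stream-processing system; the window size, similarity threshold, and raw-vector cosine similarity here are illustrative assumptions, not the paper's parameters:

```python
import math
from collections import deque

def cosine(a, b):
    # Cosine similarity between two query feature vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

class SelfSimilarityMonitor:
    """Flag a query if it is near-duplicate of a recent query: privacy
    attacks tend to issue many slightly-perturbed versions of one input."""

    def __init__(self, window=5, threshold=0.95):
        self.window = deque(maxlen=window)  # recent query vectors
        self.threshold = threshold

    def observe(self, query_vec) -> bool:
        suspicious = any(cosine(query_vec, q) >= self.threshold
                         for q in self.window)
        self.window.append(query_vec)
        return suspicious
```

A benign, diverse query stream rarely trips the threshold, while an attacker probing around one training example produces a burst of mutually similar queries that the monitor can flag without any cooperation from the model or platform.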
- Securing the Future of 5G Smart Dust: Optimizing Cryptographic Algorithms for Ultra-Low SWaP Energy-Harvesting Devices
  Ryu, Zeezoo (Virginia Tech, 2023-07-12)
  While 5G energy harvesting makes 5G smart dust possible, stretching computation across power cycles affects cryptographic algorithms. This effect may introduce new security issues that leave the system vulnerable to adversary attacks, so security measures are needed to protect data at rest and in transit across the network. In this paper, we identify the security requirements of existing 5G networks and the best-of-breed cryptographic algorithms for ultra-low SWaP (size, weight, and power) devices in an energy-harvesting context. To do this, we quantify the performance-versus-energy tradespace, investigate the device features that impact the tradespace the most, and assess the security impact when an attacker has access to intermediate results. Based on these results, we provide algorithm and device recommendations and release open-source, energy-harvesting-tolerant versions of the cryptographic algorithms optimized for ultra-low SWaP energy-harvesting devices.
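One general ingredient of energy-harvesting-tolerant computation, which the "stretching computation across power cycles" concern alludes to, is checkpointing intermediate state to nonvolatile memory so a power loss resumes rather than restarts the work. The sketch below is a toy illustration of that pattern only; the `nvm` dict, step granularity, and commit scheme are assumptions, not the paper's design (and, as the paper notes, persisted intermediate results are exactly what an attacker with device access may read):

```python
nvm = {}  # stand-in for nonvolatile memory that survives power cycles

def process_all(blocks, step):
    """Fold `step` over `blocks`, committing a checkpoint to nonvolatile
    memory after every step so a power cycle resumes from the last commit."""
    i, acc = nvm.get("ckpt", (0, 0))  # resume point, or a fresh start
    while i < len(blocks):
        acc = step(acc, blocks[i])
        i += 1
        nvm["ckpt"] = (i, acc)  # commit after each completed step
    return acc
```

An interrupted run leaves a valid checkpoint behind, so calling `process_all` again after the "power loss" produces the same result as an uninterrupted run.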