Defending Against Misuse of Synthetic Media: Characterizing Real-world Challenges and Building Robust Defenses

dc.contributor.author: Pu, Jiameng
dc.contributor.committeechair: Viswanath, Bimal
dc.contributor.committeemember: Chung, Taejoong Tijay
dc.contributor.committeemember: Yao, Danfeng
dc.contributor.committeemember: Wang, Gang
dc.contributor.committeemember: Gao, Peng
dc.contributor.department: Computer Science and Applications
dc.date.accessioned: 2022-10-08T08:00:12Z
dc.date.available: 2022-10-08T08:00:12Z
dc.date.issued: 2022-10-07
dc.description.abstract: Recent advances in deep generative models have enabled the generation of realistic synthetic media, or deepfakes, including synthetic images, videos, and text. However, synthetic media can be misused for malicious purposes and damage users' trust in online content. This dissertation addresses several key challenges in defending against the misuse of synthetic media. Its key contributions are the following: (1) Understanding challenges with the real-world applicability of existing synthetic media defenses. We curate synthetic videos and text from the wild, i.e., the Internet community, and assess the effectiveness of state-of-the-art defenses on synthetic content in the wild. In addition, we propose practical low-cost adversarial attacks and systematically measure the adversarial robustness of existing defenses. Our findings reveal that most defenses degrade significantly under real-world detection scenarios, which leads to the second thread of my work: (2) Building detection schemes with improved generalization performance and robustness for synthetic content. Most existing synthetic image detection schemes are highly content-specific, e.g., designed for only human faces, thus limiting their applicability. I propose an unsupervised, content-agnostic detection scheme called NoiseScope, which does not require a priori access to synthetic images and is applicable to a wide variety of GAN-based generative models. NoiseScope is also resilient against a range of countermeasures conducted by a knowledgeable attacker. For the text modality, our study reveals that state-of-the-art defenses that mine sequential patterns in text using Transformer models are vulnerable to simple evasion schemes. We further explore enhancing the robustness of synthetic text detection by leveraging semantic features.
dc.description.abstractgeneral: Recent advances in deep generative models have enabled the generation of realistic synthetic media, or deepfakes, including synthetic images, videos, and text. However, synthetic media can be misused for malicious purposes and damage users' trust in online content. This dissertation addresses several key challenges in defending against the misuse of synthetic media. Its key contributions are the following: (1) Understanding challenges with the real-world applicability of existing synthetic media defenses. We curate synthetic videos and text from the Internet community and assess the effectiveness of state-of-the-art defenses on the collected datasets. In addition, we systematically measure the robustness of existing defenses by designing practical low-cost attacks, such as changing the configuration of generative models. Our findings reveal that most defenses degrade significantly under real-world detection scenarios, which leads to the second thread of my work: (2) Building detection schemes with improved generalization performance and robustness for synthetic content. Many existing synthetic image detection schemes make decisions by looking for anomalous patterns in a specific type of high-level content, e.g., human faces, thus limiting their applicability. I propose a blind, content-agnostic detection scheme called NoiseScope, which does not require synthetic images for training and is applicable to a wide variety of generative models. For the text modality, our study reveals that state-of-the-art defenses that mine sequential patterns in text using Transformer models are not robust against simple attacks. We further explore enhancing the robustness of synthetic text detection by leveraging semantic features.
dc.description.degree: Doctor of Philosophy
dc.format.medium: ETD
dc.identifier.other: vt_gsexam:35688
dc.identifier.uri: http://hdl.handle.net/10919/112116
dc.language.iso: en
dc.publisher: Virginia Tech
dc.rights: In Copyright
dc.rights.uri: http://rightsstatements.org/vocab/InC/1.0/
dc.subject: Deepfake Datasets
dc.subject: Deepfake Detection
dc.subject: Synthetic Media
dc.subject: Generative Models
dc.title: Defending Against Misuse of Synthetic Media: Characterizing Real-world Challenges and Building Robust Defenses
dc.type: Dissertation
thesis.degree.discipline: Computer Science and Applications
thesis.degree.grantor: Virginia Polytechnic Institute and State University
thesis.degree.level: doctoral
thesis.degree.name: Doctor of Philosophy

Files

Original bundle (2 files)

Name: Pu_J_D_2022.pdf
Size: 9.94 MB
Format: Adobe Portable Document Format

Name: Pu_J_D_2022_support_1.pdf
Size: 47.81 KB
Format: Adobe Portable Document Format
Description: Supporting documents