Evolving Threats and Defenses in Machine Learning: Focus on Model Inversion and Beyond
dc.contributor.author | Chen, Si | en |
dc.contributor.committeechair | Jia, Ruoxi | en |
dc.contributor.committeemember | Ramakrishnan, Narendran | en |
dc.contributor.committeemember | Abbott, Amos L. | en |
dc.contributor.committeemember | Jin, Ming | en |
dc.contributor.committeemember | Wang, Xuan | en |
dc.contributor.department | Electrical and Computer Engineering | en |
dc.date.accessioned | 2025-05-24T08:00:32Z | en |
dc.date.available | 2025-05-24T08:00:32Z | en |
dc.date.issued | 2025-05-23 | en |
dc.description.abstract | Machine learning (ML) models are increasingly integrated into critical real-world applications, raising concerns about security, privacy, and trustworthiness. Among various emerging threats, model inversion (MI) attacks stand out due to their potential to compromise the confidentiality of training data. This dissertation investigates evolving threats in ML, centering on model inversion and its implications across image classification and natural language processing domains. Initially, we present an advanced model inversion attack algorithm leveraging knowledge-enriched distributional strategies under white-box conditions, effectively reconstructing private training data from image classifiers. To counteract such threats, we develop a novel data-centric defense approach, strategically utilizing augmentation techniques to reshape the model's loss landscape, thereby mitigating vulnerability to MI attacks. Recognizing the dual nature of threats and defenses, we further demonstrate how MI attacks, conventionally viewed as harmful, can be creatively repurposed to enhance model security. Specifically, we show MI can detect and neutralize backdoor attacks in image classification, enabling effective clean-data-free defense strategies. Broadening the scope beyond vision tasks, this dissertation introduces a proactive red-teaming framework designed for large language models (LLMs). By combining global strategy formation with local adaptive learning, our proposed red-teaming agent systematically identifies vulnerabilities, thus enhancing robustness against adaptive adversarial scenarios. Finally, addressing the critical issue of hallucination in language models, we propose FASTTRACK, a reliable fact-tracing framework. FASTTRACK uniquely integrates recursive clustering with large language model-driven validation, significantly surpassing existing methods in accuracy and computational efficiency. Collectively, these works illustrate a comprehensive narrative—from understanding foundational threats to innovating versatile, robust defenses—advancing the ongoing effort toward secure, privacy-preserving, and trustworthy machine learning systems. | en |
dc.description.abstractgeneral | Machine learning (ML) involves computer systems learning patterns from data to perform tasks without explicit programming. As ML becomes increasingly prevalent in our daily lives, from social media and healthcare to autonomous vehicles, it also faces growing security risks. This dissertation explores these risks, focusing especially on one type called model inversion, where an attacker tries to reconstruct sensitive training information by examining how an ML model makes decisions. Initially, the study shows how these attacks can reveal private information from image recognition systems, and then proposes methods to defend against them. Interestingly, it also highlights how the same methods attackers use can be repurposed positively, such as detecting hidden manipulations in models. Moving beyond images, the work tackles security threats against language-based ML models, which power applications like chatbots and virtual assistants. It develops techniques to find and fix weaknesses in these systems, ensuring they behave safely and reliably. Overall, this research contributes significantly to creating safer, more reliable machine learning technologies that protect our data and maintain trust in digital systems. | en |
dc.description.degree | Doctor of Philosophy | en |
dc.format.medium | ETD | en |
dc.identifier.other | vt_gsexam:43484 | en |
dc.identifier.uri | https://hdl.handle.net/10919/134204 | en |
dc.language.iso | en | en |
dc.publisher | Virginia Tech | en |
dc.rights | Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International | en |
dc.rights.uri | http://creativecommons.org/licenses/by-nc-sa/4.0/ | en |
dc.subject | AI Safety | en |
dc.title | Evolving Threats and Defenses in Machine Learning: Focus on Model Inversion and Beyond | en |
dc.type | Dissertation | en |
thesis.degree.discipline | Computer Engineering | en |
thesis.degree.grantor | Virginia Polytechnic Institute and State University | en |
thesis.degree.level | doctoral | en |
thesis.degree.name | Doctor of Philosophy | en |