Evolving Threats and Defenses in Machine Learning: Focus on Model Inversion and Beyond

dc.contributor.author: Chen, Si [en]
dc.contributor.committeechair: Jia, Ruoxi [en]
dc.contributor.committeemember: Ramakrishnan, Narendran [en]
dc.contributor.committeemember: Abbott, Amos L. [en]
dc.contributor.committeemember: Jin, Ming [en]
dc.contributor.committeemember: Wang, Xuan [en]
dc.contributor.department: Electrical and Computer Engineering [en]
dc.date.accessioned: 2025-05-24T08:00:32Z [en]
dc.date.available: 2025-05-24T08:00:32Z [en]
dc.date.issued: 2025-05-23 [en]
dc.description.abstract: Machine learning (ML) models are increasingly integrated into critical real-world applications, raising concerns about security, privacy, and trustworthiness. Among various emerging threats, model inversion (MI) attacks stand out due to their potential to compromise the confidentiality of training data. This dissertation investigates evolving threats in ML, centering on model inversion and its implications across image classification and natural language processing domains. Initially, we present an advanced model inversion attack algorithm leveraging knowledge-enriched distributional strategies under white-box conditions, effectively reconstructing private training data from image classifiers. To counteract such threats, we develop a novel data-centric defense approach, strategically utilizing augmentation techniques to reshape the model's loss landscape, thereby mitigating vulnerability to MI attacks. Recognizing the dual nature of threats and defenses, we further demonstrate how MI attacks, conventionally viewed as harmful, can be creatively repurposed to enhance model security. Specifically, we show that MI can detect and neutralize backdoor attacks in image classification, enabling effective clean-data-free defense strategies. Broadening the scope beyond vision tasks, this dissertation introduces a proactive red-teaming framework designed for large language models (LLMs). By combining global strategy formation with local adaptive learning, our proposed red-teaming agent systematically identifies vulnerabilities, thus enhancing robustness against adaptive adversarial scenarios. Finally, addressing the critical issue of hallucination in language models, we propose FASTTRACK, a reliable fact-tracing framework. FASTTRACK uniquely integrates recursive clustering with large language model-driven validation, significantly surpassing existing methods in accuracy and computational efficiency. Collectively, these works illustrate a comprehensive narrative, from understanding foundational threats to innovating versatile, robust defenses, advancing the ongoing effort toward secure, privacy-preserving, and trustworthy machine learning systems. [en]
dc.description.abstractgeneral: Machine learning (ML) involves computer systems learning patterns from data to perform tasks without explicit programming. As ML becomes increasingly prevalent in our daily lives, from social media and healthcare to autonomous vehicles, it also faces growing security risks. This dissertation explores these risks, focusing especially on one type called model inversion, where an attacker tries to reconstruct sensitive training information by examining how an ML model makes decisions. Initially, the study shows how these attacks can reveal private information from image recognition systems, and then proposes methods to defend against them. Interestingly, it also highlights how the same methods attackers use can be repurposed positively, such as detecting hidden manipulations in models. Moving beyond images, the work tackles security threats against language-based ML models, which power applications like chatbots and virtual assistants. It develops techniques to find and fix weaknesses in these systems, ensuring they behave safely and reliably. Overall, this research contributes significantly to creating safer, more reliable machine learning technologies that protect our data and maintain trust in digital systems. [en]
dc.description.degree: Doctor of Philosophy [en]
dc.format.medium: ETD [en]
dc.identifier.other: vt_gsexam:43484 [en]
dc.identifier.uri: https://hdl.handle.net/10919/134204 [en]
dc.language.iso: en [en]
dc.publisher: Virginia Tech [en]
dc.rights: Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International [en]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-sa/4.0/ [en]
dc.subject: AI Safety [en]
dc.title: Evolving Threats and Defenses in Machine Learning: Focus on Model Inversion and Beyond [en]
dc.type: Dissertation [en]
thesis.degree.discipline: Computer Engineering [en]
thesis.degree.grantor: Virginia Polytechnic Institute and State University [en]
thesis.degree.level: doctoral [en]
thesis.degree.name: Doctor of Philosophy [en]
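
The abstract above describes a white-box model inversion attack that reconstructs private training data from an image classifier. As a rough illustration of the general idea only, and not the dissertation's knowledge-enriched distributional method, the following minimal PyTorch sketch optimizes an input, starting from random noise, so that a classifier becomes confident in a chosen target class; the toy classifier, input shape, and hyperparameters are assumptions made purely for demonstration.

```python
# Minimal, generic sketch of white-box model inversion (illustration only):
# gradient-based optimization of an input to maximize the classifier's
# confidence for a target class, with a simple smoothness prior.

import torch
import torch.nn as nn
import torch.nn.functional as F

def invert_class(classifier: nn.Module,
                 target_class: int,
                 input_shape=(1, 1, 32, 32),
                 steps: int = 500,
                 lr: float = 0.1,
                 tv_weight: float = 1e-4) -> torch.Tensor:
    """Reconstruct a representative input for `target_class` by optimizing noise."""
    classifier.eval()
    x = torch.randn(input_shape, requires_grad=True)   # start from random noise
    optimizer = torch.optim.Adam([x], lr=lr)
    target = torch.tensor([target_class])

    for _ in range(steps):
        optimizer.zero_grad()
        logits = classifier(x)
        # Identity loss: push the classifier toward high target-class confidence.
        class_loss = F.cross_entropy(logits, target)
        # Total-variation prior: discourage noisy, unnatural reconstructions.
        tv = (x[..., 1:, :] - x[..., :-1, :]).abs().mean() + \
             (x[..., :, 1:] - x[..., :, :-1]).abs().mean()
        loss = class_loss + tv_weight * tv
        loss.backward()
        optimizer.step()

    return x.detach()

# Example with a hypothetical stand-in classifier (for illustration only).
toy_classifier = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32, 10))
reconstruction = invert_class(toy_classifier, target_class=3)
print(reconstruction.shape)  # torch.Size([1, 1, 32, 32])
```

In practice, attacks of this kind typically optimize the latent code of a generative prior rather than raw pixels, which is closer in spirit to the distributional strategy the abstract describes.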

Files

Original bundle
Name: Chen_S_D_2025.pdf
Size: 6.73 MB
Format: Adobe Portable Document Format