Browsing by Author "Jia, Ruoxi"
- Active Learning Under Limited Interaction with Data Labeler
  Chen, Si (Virginia Tech, 2021)
  Active learning (AL) aims at reducing labeling effort by identifying the most valuable unlabeled data points from a large pool. Traditional AL frameworks have two limitations: First, they perform data selection in a multi-round manner, which is time-consuming and impractical. Second, they usually assume that a small number of labeled data points is available in the same domain as the data in the unlabeled pool. In this thesis, we initiate the study of one-round active learning to solve the first issue. We propose DULO, a general framework for the one-round setting based on the notion of data utility functions, which map a set of data points to some performance measure of the model trained on the set. We formulate the one-round active learning problem as data utility function maximization. We then propose D²ULO on the basis of DULO as a solution that solves both issues. Specifically, D²ULO leverages the idea of domain adaptation (DA) to train a data utility model on source labeled data. The trained utility model can then be used to select high-utility data in the target domain and, at the same time, provide an estimate for the utility of the selected data. Our experiments show that the proposed frameworks achieve better performance than state-of-the-art baselines in the same setting. In particular, D²ULO is applicable to the scenario where the source and target labels have mismatches, which is not supported by existing work.
- Adversarial Unlearning of Backdoors via Implicit Hypergradient
  Zeng, Yi; Chen, Si; Park, Won; Mao, Morley; Jin, Ming; Jia, Ruoxi (2022)
  We propose a minimax formulation for removing backdoors from a given poisoned model based on a small set of clean data. This formulation encompasses much of prior work on backdoor removal. We propose the Implicit Backdoor Adversarial Unlearning (I-BAU) algorithm to solve the minimax. Unlike previous work, which breaks down the minimax into separate inner and outer problems, our algorithm utilizes the implicit hypergradient to account for the interdependence between inner and outer optimization. We theoretically analyze its convergence and the generalizability of the robustness gained by solving minimax on clean data to unseen test data. In our evaluation, we compare I-BAU with six state-of-the-art backdoor defenses on seven backdoor attacks over two datasets and various attack settings, including the common setting where the attacker targets one class as well as important but underexplored settings where multiple classes are targeted. I-BAU’s performance is comparable to and most often significantly better than the best baseline. In particular, its performance is more robust to variation in triggers, attack settings, poison ratio, and clean data size. Moreover, I-BAU requires less computation to take effect; in particular, it is more than 13X faster than the most efficient baseline in the single-target attack setting. Furthermore, it can remain effective in the extreme case where the defender can only access 100 clean samples, a setting where all the baselines fail to produce acceptable results.
- Anomaly Detection for Smart Infrastructure: An Unsupervised Approach for Time Series Comparison
  Gandra, Harshitha (Virginia Tech, 2022-01-25)
  Time series anomaly detection can be a very useful tool to inspect and maintain the health and quality of an infrastructure system. In tackling such a problem, the main concern lies in the imbalanced nature of the dataset. To mitigate this problem, this thesis proposes two unsupervised anomaly detection frameworks. The first is an architecture that leverages the matrix profile, a data structure containing the Euclidean distance scores between subsequences of two time series, obtained through a similarity join. The architecture combines a data fusion technique with matrix profile analysis under the constraint of varied sampling rates across time series. To this end, we propose a framework in which a time series being evaluated for anomalies is quantitatively compared with a benchmark (anomaly-free) time series using the proposed asynchronous time series comparison, which was inspired by the matrix profile approach to anomaly detection on time series. To evaluate the efficacy of this framework, it was tested on a case study comprising a Class I railroad dataset. The data collection system integrated into this railway system collects data through different data acquisition channels, which represent different transducers. The framework was applied to all the channels and the best performing channels were identified. The average Recall and Precision achieved on the single-channel evaluation through this framework were 93.5% and 55%, respectively, with an error threshold of 0.04 miles or 211 feet. A limitation of this framework is that it produces some false positive predictions.
  To overcome this problem, a second framework is proposed which incorporates the idea of extracting signature patterns in a time series, also known as motifs, which can be leveraged to identify anomalous patterns. This second framework is motif-based and operates under the same constraint of a varied sampling rate. Here, a feature extraction method and a clustering method were used in the training process of a One Class Support Vector Machine (OCSVM) coupled with a Kernel Density Estimation (KDE) technique. The average Recall and Precision achieved on the same case study through this framework were 74% and 57%. In comparison to the first, the second framework does not perform as well. Future efforts will focus on improving this classification-based anomaly detection method.
- Controllable Visual Synthesis
  AlBahar, Badour A. Sh A. (Virginia Tech, 2023-06-08)
  Computer graphics has become an integral part of various industries such as entertainment (e.g., films and content creation), fashion (e.g., virtual try-on), and video games. Computer graphics has evolved tremendously over the past years. It has shown remarkable image generation improvement from low-quality, pixelated images with limited details to highly realistic images with fine details that can often be mistaken for real images. However, the traditional pipeline of rendering an image in computer graphics is complex and time-consuming. The whole process of creating the geometry, material, and textures requires not only time but also significant expertise. In this work, we aim to replace this complex traditional computer graphics pipeline with a simple machine learning model. This machine learning model can synthesize realistic images without requiring expertise or significant time and effort. Specifically, we address the problem of controllable image synthesis. We propose several approaches that allow the user to synthesize realistic content and manipulate images to achieve their desired goals with ease and flexibility.
- Data Centric Defenses for Privacy Attacks
  Abhyankar, Nikhil Suhas (Virginia Tech, 2023-08-14)
  Recent research shows that machine learning algorithms are highly susceptible to attacks trying to extract sensitive information about the data used in model training. These attacks, called privacy attacks, exploit the model training process. Contemporary defense techniques make alterations to the training algorithm. Such defenses are computationally expensive, cause a noticeable privacy-utility tradeoff, and require control over the training process. This thesis presents a data-centric approach using data augmentations to mitigate privacy attacks. We present privacy-focused data augmentations to change the sensitive data submitted to the model trainer. Compared to traditional defenses, our method provides more control to the individual data owner to protect one's private data. The defense is model-agnostic and does not require the data owner to have any sort of control over the model training. Privacy-preserving augmentations are implemented for two attacks, namely membership inference and model inversion, using two distinct techniques. While the proposed augmentations offer a better privacy-utility tradeoff on CIFAR-10 for membership inference, they reduce the reconstruction rate to ≤ 1% while reducing the classification accuracy by only 2% against model inversion attacks. This is the first attempt to defend against model inversion and membership inference attacks using decentralized privacy protection.
- Data Sharing and Retrieval of Manufacturing Processes
  Seth, Avi (Virginia Tech, 2023-03-28)
  With the Industrial Internet, businesses can pool their resources to acquire large amounts of data that can then be used in machine learning tasks. Despite the potential to speed up training and deployment and improve decision-making through data-sharing, rising privacy concerns are slowing the spread of such technologies. As businesses are naturally protective of their data, this poses a barrier to interoperability. While previous research has focused on privacy-preserving methods, existing works typically consider data that is averaged or randomly sampled by all contributors rather than selecting data that are best suited for a specific downstream learning task. In response to the dearth of efficient data-sharing methods for diverse machine learning tasks in the Industrial Internet, this work presents an end-to-end working demonstration of a search engine prototype built on PriED, a task-driven data-sharing approach that enhances the performance of supervised learning by judiciously fusing shared and local participant data.
- Data-Efficient Learning in Image Synthesis and Instance Segmentation
  Robb, Esther Anne (Virginia Tech, 2021-08-18)
  Modern deep learning methods have achieved remarkable performance on a variety of computer vision tasks, but frequently require large, well-balanced training datasets to achieve high-quality results. Data-efficient performance is critical for downstream tasks such as automated driving or facial recognition. We propose two methods of data-efficient learning for the tasks of image synthesis and instance segmentation. We first propose a method of high-quality and diverse image generation by finetuning on only 5-100 images. Our method factors a pretrained model into a small but highly expressive weight space for finetuning, which discourages overfitting on a small training set. We validate our method in a challenging few-shot setting of 5-100 images in the target domain. We show that our method has significant visual quality gains compared with existing GAN adaptation methods. Next, we introduce a simple adaptive instance segmentation loss which achieves state-of-the-art results on the LVIS dataset. We demonstrate that the rare categories are heavily suppressed by correct background predictions, which reduce the probability for all foreground categories with equal weight. Due to the relative infrequency of rare categories, this leads to an imbalance that biases towards predicting more frequent categories. Based on this insight, we develop DropLoss -- a novel adaptive loss to compensate for this imbalance without a trade-off between rare and frequent categories.
- Deep Convolutional Neural Networks for Segmenting Unruptured Intracranial Aneurysms from 3D TOF-MRA Images
  Boonaneksap, Surasith (Virginia Tech, 2022-02-07)
  Despite facing technical issues (e.g., overfitting, vanishing and exploding gradients), deep neural networks have the potential to capture complex patterns in data. Understanding how depth impacts neural network performance is vital to the advancement of novel deep learning architectures. By varying hyperparameters on two sets of architectures with different depths, this thesis aims to examine whether there are any potential benefits from developing deep networks for segmenting intracranial aneurysms from 3D TOF-MRA scans in the ADAM dataset.
- Derivative-Free Meta-Blackbox Optimization on Manifold
  Sel, Bilgehan (Virginia Tech, 2024-06)
  Solving a sequence of high-dimensional, nonconvex, but potentially similar optimization problems poses a significant computational challenge in various engineering applications. This thesis presents the first meta-learning framework that leverages the shared structure among sequential tasks to improve the computational efficiency and sample complexity of derivative-free optimization. Based on the observation that most practical high-dimensional functions lie on a latent low-dimensional manifold, which can be further shared among problem instances, the proposed method jointly learns the meta-initialization of a search point and a meta-manifold. This novel approach enables the efficient adaptation of the optimization process to new tasks by exploiting the learned meta-knowledge. Theoretically, the benefit of meta-learning in this challenging setting is established by proving that the proposed method achieves improved convergence rates and reduced sample complexity compared to traditional derivative-free optimization techniques. Empirically, the effectiveness of the proposed algorithm is demonstrated in two high-dimensional reinforcement learning tasks, showcasing its ability to accelerate learning and improve performance across multiple domains. Furthermore, the robustness and generalization capabilities of the meta-learning framework are explored through extensive ablation studies and sensitivity analyses. The thesis highlights the potential of meta-learning in tackling complex optimization problems and opens up new avenues for future research in this area.
- Domain Adaptation with a Classifier Trained by Robust Pseudo-Labels
  Zhou, Yunke (Virginia Tech, 2022-01-07)
  With the rapid growth of computing power, approaches based on deep learning algorithms have achieved remarkable results in solving computer vision classification problems. These performance improvements are achieved by assuming the source and target data are collected from the same probability distribution. However, this assumption is usually too strict to be satisfied in many real-world applications, such as big data analysis, natural language processing, and computer vision classification problems. Because of distribution discrepancies between these domains, directly training the model on the source domain cannot be expected to generate satisfactory results on the target domain. Therefore, the problem of minimizing these data distribution discrepancies is the main challenge with which modern machine learning is now faced. To address this problem, domain adaptation (DA) aims to identify domain-invariant features between two different but related domains. This thesis proposes a state-of-the-art DA approach that overcomes the limitations of traditional DA methods. To capture fine-grained information for each category, I deploy centroid-to-centroid alignment to perform domain adaptation. An Exponential Moving Average strategy (EMA) is used to ensure we can form robust source and target centroids. A Gaussian-uniform mixture model is trained using an Expectation-Maximization (EM) algorithm to infer the robustness of the target pseudo-labels. With the help of target pseudo-labels, I propose two novel types of classifiers: (1) a target-oriented classifier (TO); and (2) a centroid-oriented classifier (CO). Extensive experiments show that these two classifiers exhibit superior performance on a variety of DA benchmarks when compared to standard baseline methods.
- Learning-to-Learn to Guide Random Search: Derivative-Free Meta Blackbox Optimization on Manifold
  Sel, Bilgehan; Tawaha, Ahmad; Ding, Yuhao; Jia, Ruoxi; Ji, Bo; Lavaei, Javad; Jin, Ming (2023-01-01)
  Solving a sequence of high-dimensional, nonconvex, but potentially similar optimization problems poses a computational challenge in engineering applications. We propose the first meta-learning framework that leverages the shared structure among sequential tasks to improve the computational efficiency and sample complexity of derivative-free optimization. Based on the observation that most practical high-dimensional functions lie on a latent low-dimensional manifold, which can be further shared among instances, our method jointly learns the meta-initialization of a search point and a meta-manifold. Theoretically, we establish the benefit of meta-learning in this challenging setting. Empirically, we demonstrate the effectiveness of the proposed algorithm in two high-dimensional reinforcement learning tasks.
- ModelPred: A Framework for Predicting Trained Model from Training Data
  Zeng, Yingyan (Virginia Tech, 2024-06-06)
  In this work, we propose ModelPred, a framework that helps to understand the impact of changes in training data on a trained model. This is critical for building trust in various stages of a machine learning pipeline: from cleaning poor-quality samples and tracking important ones to be collected during data preparation, to calibrating uncertainty of model prediction, to interpreting why certain behaviors of a model emerge during deployment. Specifically, ModelPred learns a parameterized function that takes a dataset S as the input and predicts the model obtained by training on S. Our work differs from the recent work of Datamodels as we aim for predicting the trained model parameters directly instead of the trained model behaviors. We demonstrate that a neural network-based set function class is capable of learning the complex relationships between the training data and model parameters. We introduce novel global and local regularization techniques to prevent overfitting and we rigorously characterize the expressive power of neural networks (NN) in approximating the end-to-end training process. Through extensive empirical investigations, we show that ModelPred enables a variety of applications that boost the interpretability and accountability of machine learning (ML), such as data valuation, data selection, memorization quantification, and model calibration.
- Narcissus: A Practical Clean-Label Backdoor Attack with Limited Information
  Zeng, Yi; Pan, Minzhou; Just, Hoang Anh; Lyu, Lingjuan; Qiu, Meikang; Jia, Ruoxi (ACM, 2023-11-15)
  Backdoor attacks introduce manipulated data into a machine learning model's training set, causing the model to misclassify inputs with a trigger during testing to achieve a desired outcome by the attacker. For backdoor attacks to bypass human inspection, it is essential that the injected data appear to be correctly labeled. Attacks with this property are often referred to as "clean-label attacks." The success of current clean-label backdoor methods largely depends on access to the complete training set. Yet, accessing the complete dataset is often challenging or unfeasible since it frequently comes from varied, independent sources, like images from distinct users. It remains a question whether backdoor attacks still present real threats. In this paper, we provide an affirmative answer to this question by designing an algorithm to launch clean-label backdoor attacks using only samples from the target class and public out-of-distribution data. By inserting carefully crafted malicious examples totaling less than 0.5% of the target class size and 0.05% of the full training set size, we can manipulate the model to misclassify arbitrary inputs into the target class when they contain the backdoor trigger. Importantly, the trained poisoned model retains high accuracy for regular test samples without the trigger, as if the model were trained on untainted data. Our technique is consistently effective across various datasets and models, and even when the trigger is injected into the physical world. We explore the space of defenses and find that Narcissus can evade the latest state-of-the-art defenses in their vanilla form or after a simple adaptation.
  We analyze the effectiveness of our attack: the synthesized Narcissus trigger contains durable features as persistent as the original target class features. Attempts to remove the trigger inevitably hurt model accuracy first.
- Parkinson's Disease Automated Hand Tremor Analysis from Spiral Images
  DeSipio, Rebecca E. (Virginia Tech, 2023-05)
  Parkinson’s Disease is a neurological degenerative disease affecting more than six million people worldwide. It is a progressive disease, impacting a person’s movements and thought processes. In recent years, computer vision and machine learning researchers have been developing techniques to aid in the diagnosis. This thesis is motivated by the exploration of hand tremor symptoms in Parkinson’s patients from the Archimedean Spiral test, a paper-and-pencil test used to evaluate hand tremors. This work presents a novel Fourier Domain analysis technique that transforms the pencil content of hand-drawn spiral images into frequency features. Our technique is applied to an image dataset consisting of spirals drawn by healthy individuals and people with Parkinson’s Disease. The Fourier Domain analysis technique achieves 81.5% accuracy predicting images drawn by someone with Parkinson’s, a result 6% higher than previous methods. We compared this method against the results using extracted features from the ResNet-50 and VGG16 pre-trained deep network models. The VGG16 extracted features achieve 95.4% accuracy classifying images drawn by people with Parkinson’s Disease. The extracted features of both methods were also used to develop a tremor severity rating system which scores the spiral images on a scale from 0 (no tremor) to 1 (severe tremor). The results show correlation to the Unified Parkinson’s Disease Rating Scale (MDS-UPDRS) developed by the International Parkinson and Movement Disorder Society. These results can be useful for aiding in early detection of tremors, the medical treatment process, and symptom tracking to monitor the progression of Parkinson’s Disease.
- PrivMon: A Stream-Based System for Real-Time Privacy Attack Detection for Machine Learning Models
  Ko, Myeongseob; Yang, Xinyu; Ji, Zhengjie; Just, Hoang Anh; Gao, Peng; Kumar, Anoop; Jia, Ruoxi (ACM, 2023-10-16)
  Machine learning (ML) models can expose the private information of training data when confronted with privacy attacks. Specifically, a malicious user with black-box access to an ML-as-a-service platform can reconstruct the training data (i.e., model inversion attacks) or infer the membership information (i.e., membership inference attacks) simply by querying the ML model. Despite the pressing need for effective defenses against privacy attacks with black-box access, existing approaches have mostly focused on enhancing the robustness of the ML model via modifying the model training process or the model prediction process. These defenses can compromise model utility and require the cooperation of the underlying AI platform (i.e., platform-dependent). These constraints largely limit the real-world applicability of existing defenses. Despite the prevalent focus on improving the model’s robustness, none of the existing works have focused on the continuous protection of already deployed ML models from privacy attacks by detecting privacy leakage in real-time. This defensive task becomes increasingly important given the vast deployment of ML-as-a-service platforms these days. To bridge the gap, we propose PrivMon, a new stream-based system for real-time privacy attack detection for ML models. To facilitate wide applicability and practicality, PrivMon defends black-box ML models against a wide range of privacy attacks in a platform-agnostic fashion: PrivMon only passively monitors model queries without requiring the cooperation of the model owner or the AI platform.
  Specifically, PrivMon takes as input a stream of ML model queries and provides an efficient attack detection engine that continuously monitors the stream to detect the privacy attack in real-time, by identifying self-similar malicious queries. We show empirically and theoretically that PrivMon can detect a wide range of realistic privacy attacks within a practical time frame and successfully mitigate the attack success rate. Code is available at https://github.com/ruoxi-jia-group/privmon.