Browsing by Author "Wang, Ning"
Now showing 1 - 6 of 6
- Building trustworthy machine learning systems in adversarial environments
  Wang, Ning (Virginia Tech, 2023-05-26)
  Modern AI systems, particularly with the rise of big data and deep learning in the last decade, have greatly improved our daily life and at the same time created a long list of controversies. AI systems are often subject to malicious and stealthy subversion that jeopardizes their efficacy. Many of these issues stem from the data-driven nature of machine learning. While big data and deep models significantly boost the accuracy of machine learning models, they also create opportunities for adversaries to tamper with models or extract sensitive data. Malicious data providers can compromise machine learning systems by supplying false data and intermediate computation results. Even a well-trained model can be deceived into misbehaving by an adversary who provides carefully designed inputs. Furthermore, curious parties can derive sensitive information about the training data by interacting with a machine learning model. These adversarial scenarios, known as poisoning attacks, adversarial example attacks, and inference attacks, have demonstrated that security, privacy, and robustness have become more important than ever for AI to gain wider adoption and societal trust. To address these problems, we propose the following solutions: (1) FLARE, which detects and mitigates stealthy poisoning attacks by leveraging latent space representations; (2) MANDA, which detects adversarial examples by utilizing evaluations from diverse sources, i.e., model-based prediction and data-based evaluation; (3) FeCo, which enhances the robustness of machine learning-based network intrusion detection systems by introducing a novel representation learning method; and (4) DP-FedMeta, which preserves data privacy and improves the privacy-accuracy trade-off in machine learning systems through a novel adaptive clipping mechanism.
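The idea of cross-checking a model-based prediction against a data-based evaluation, as in the MANDA component described above, can be illustrated with a minimal sketch. The k-NN vote, the feature extractor, and all names below are illustrative assumptions, not the dissertation's implementation.

```python
# Hypothetical sketch: flag a possible adversarial example when the classifier's
# prediction disagrees with a data-based k-NN vote in feature space.
# The feature extractor, k, and the disagreement rule are illustrative assumptions.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def detect_adversarial(model, feature_fn, x, train_feats, train_labels, k=5):
    """Flag x as suspicious when model-based and data-based predictions disagree."""
    model_pred = model.predict(x.reshape(1, -1))[0]            # model-based view
    knn = KNeighborsClassifier(n_neighbors=k).fit(train_feats, train_labels)
    data_pred = knn.predict(feature_fn(x).reshape(1, -1))[0]   # data-based view
    return model_pred != data_pred                             # disagreement -> suspicious
```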
- Evaluation of the U.S. Peanut Germplasm Mini-Core Collection in the Virginia-Carolina Region Using Traditional and New High-Throughput Methods
  Sarkar, Sayantan; Oakes, Joseph; Cazenave, Alexandre-Brice; Burow, Mark D.; Bennett, Rebecca S.; Chamberlin, Kelly D.; Wang, Ning; White, Melanie; Payton, Paxton; Mahan, James; Chagoya, Jennifer; Sung, Cheng-Jung; McCall, David S.; Thomason, Wade E.; Balota, Maria (MDPI, 2022-08-18)
  Peanut (Arachis hypogaea L.) is an important food crop for the U.S. and the world. The Virginia-Carolina (VC) region (Virginia, North Carolina, and South Carolina) is an important peanut-growing region of the U.S. and is affected by numerous biotic and abiotic stresses. Identification of stress-resistant germplasm, along with improved phenotyping methods, is an important step toward developing improved cultivars. Our objective in 2017 and 2018 was to assess the U.S. mini-core collection, a valuable source of resistant germplasm, for desirable traits under limited water conditions. Accessions were evaluated using traditional and high-throughput phenotyping (HTP) techniques, and the suitability of HTP methods as indirect selection tools was assessed. Traditional phenotyping methods included stand count, plant height, lateral branch growth, normalized difference vegetation index (NDVI), canopy temperature depression (CTD), leaf wilting, fungal and viral disease, thrips rating, post-digging in-shell sprouting, and pod yield. The HTP method included 48 aerial vegetation indices (VIs), derived from red, blue, green, and near-infrared reflectance and color space indices, collected with an octocopter drone at the same time as the traditional phenotyping. Both types of phenotyping were performed 10 times between 4 and 16 weeks after planting. Accessions had yields comparable to high-yielding checks. Correlation coefficients of up to 0.8 were identified between several VIs and yield, indicating their suitability for indirect phenotyping. Broad-sense heritability (H2) was further calculated to assess the suitability of particular VIs to enable genetic gains. VIs could be used successfully as surrogates for physiological and agronomic trait selection in peanut. Further, this study indicates that UAV-based sensors have potential for measuring physiological and agronomic characteristics for peanut breeding, variable-rate input application, real-time decision making, and precision agriculture applications.
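As one concrete example of a reflectance-based vegetation index named in this abstract, NDVI is computed from near-infrared and red reflectance. The sketch below assumes per-pixel reflectance arrays and a small epsilon guard; it is not tied to the authors' processing pipeline.

```python
# Minimal sketch: per-pixel NDVI from near-infrared (NIR) and red reflectance bands.
# The band arrays and the epsilon guard against division by zero are assumptions.
import numpy as np

def ndvi(nir: np.ndarray, red: np.ndarray, eps: float = 1e-9) -> np.ndarray:
    """NDVI = (NIR - Red) / (NIR + Red), computed element-wise."""
    return (nir - red) / (nir + red + eps)

# Example: two pixels; dense, healthy canopy tends toward values near 1.
print(ndvi(np.array([0.55, 0.40]), np.array([0.08, 0.20])))
```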
- FLARE: Defending Federated Learning against Model Poisoning Attacks via Latent Space Representations
  Wang, Ning; Xiao, Yang; Chen, Yimin; Hu, Yang; Lou, Wenjing; Hou, Y. Thomas (ACM, 2022-05-30)
  Federated learning (FL) has been shown vulnerable to a new class of adversarial attacks, known as model poisoning attacks (MPA), where one or more malicious clients try to poison the global model by sending carefully crafted local model updates to the central parameter server. Existing defenses, which focus on analyzing model parameters, show limited effectiveness in detecting such carefully crafted poisonous models. In this work, we propose FLARE, a robust model aggregation mechanism for FL that is resilient against state-of-the-art MPAs. Instead of relying solely on model parameters, FLARE leverages the penultimate layer representations (PLRs) of the model to characterize the adversarial influence on each local model update. PLRs demonstrate a better capability to differentiate malicious models from benign ones than model parameter-based solutions. We further propose a trust evaluation method that estimates a trust score for each model update based on pairwise PLR discrepancies among all model updates. Under the assumption that honest clients make up the majority, FLARE assigns trust scores such that updates far from the benign cluster receive low scores. FLARE then aggregates the model updates weighted by their trust scores and finally updates the global model. Extensive experimental results demonstrate the effectiveness of FLARE in defending FL against various MPAs, including semantic backdoor attacks, trojan backdoor attacks, and untargeted attacks, and in safeguarding the accuracy of FL.
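The PLR-based trust weighting described in this abstract can be sketched as follows. The Euclidean discrepancy measure, the softmax-style score normalization, and the function names are simplifying assumptions, not FLARE's exact mechanism.

```python
# Hypothetical sketch of trust-weighted aggregation from penultimate-layer
# representations (PLRs). The distance metric and score normalization are assumptions.
import numpy as np

def trust_weighted_aggregate(updates, plrs, temperature=1.0):
    """updates: list of flattened model updates; plrs: one PLR vector per client."""
    plrs = np.stack(plrs)
    # Pairwise PLR discrepancies: clients far from the majority accumulate large sums.
    dists = np.linalg.norm(plrs[:, None, :] - plrs[None, :, :], axis=-1)
    discrepancy = dists.sum(axis=1)
    # Lower discrepancy -> higher trust score (softmax over negative discrepancy).
    scores = np.exp(-discrepancy / temperature)
    weights = scores / scores.sum()
    # Aggregate updates weighted by trust, then use the result to update the global model.
    return np.average(np.stack(updates), axis=0, weights=weights)
```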
- GLR Control Charts for Monitoring Correlated Binary Processes
  Wang, Ning (Virginia Tech, 2013-12-27)
  When monitoring a binary process proportion p, it is usually assumed that the binary observations are independent. However, it is very common that the observations are correlated, with ρ being the correlation between two successive observations. The first part of this research investigates the problem of monitoring p when the binary observations follow a first-order two-state Markov chain model with ρ remaining unchanged. A Markov Binary GLR (MBGLR) chart with an upper bound on the estimate of p is proposed to monitor a continuous stream of autocorrelated binary observations, treating each observation as a sample of size n=1. The MBGLR chart with a large upper bound has good overall performance over a wide range of shifts. The MBGLR chart is optimized using the extra number of defectives (END) over a range of upper bounds for the MLE of p. The numerical results show that the optimized MBGLR chart has a smaller END than the optimized Markov binary CUSUM. The second part of this research develops a CUSUM-pρ chart and a GLR-pρ chart to monitor p and ρ simultaneously. The CUSUM-pρ chart with two tuning parameters is designed to detect shifts in p and ρ when the shifted values are known. We apply two CUSUM-pρ charts as a chart combination to detect increases in p and increases or decreases in ρ. The GLR-pρ chart, with an upper bound on the estimate of p and an upper bound and a lower bound on the estimate of ρ, works well when the shifts are unknown. We find that the GLR-pρ chart has better overall performance. The last part of this research investigates the problem of monitoring p, with ρ remaining at the target value, when the correlated binary observations are aggregated into samples with n>1. We assume that samples are independent and that there is correlation between the observations within a sample. We propose several GLR and CUSUM charts to monitor p, and the performance of the charts is compared. The simulation results show that the MBNGLR chart has overall better performance than the other charts.
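A log-likelihood-ratio CUSUM for a two-state Markov binary stream of the kind studied here can be sketched as below. The in-control (p0, rho0) and shifted (p1, rho1) values, the control limit h, and the function names are placeholder assumptions, not values or code from the thesis; only the standard parameterization of a stationary binary Markov chain by its mean and lag-1 correlation is taken as given.

```python
# Hypothetical sketch: CUSUM on log-likelihood ratios for a first-order two-state
# Markov binary stream. The in-control (p0, rho0), shifted (p1, rho1), and control
# limit h are placeholder assumptions.
import numpy as np

def transition_probs(p, rho):
    """P(1|1) and P(1|0) for a stationary binary Markov chain with mean p and lag-1 correlation rho."""
    return p + rho * (1.0 - p), p * (1.0 - rho)

def markov_cusum(x, p0=0.05, rho0=0.2, p1=0.10, rho1=0.2, h=5.0):
    """Return the first index at which the CUSUM statistic exceeds h, or None."""
    q11_0, q10_0 = transition_probs(p0, rho0)   # in-control transition probabilities
    q11_1, q10_1 = transition_probs(p1, rho1)   # shifted transition probabilities
    c = 0.0
    for t in range(1, len(x)):
        q0 = q11_0 if x[t - 1] == 1 else q10_0
        q1 = q11_1 if x[t - 1] == 1 else q10_1
        llr = np.log(q1 / q0) if x[t] == 1 else np.log((1 - q1) / (1 - q0))
        c = max(0.0, c + llr)                    # reflected CUSUM recursion
        if c > h:
            return t
    return None
```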
- Hermes: Boosting the Performance of Machine-Learning-Based Intrusion Detection System through Geometric Feature Learning
  Zhang, Chaoyu; Shi, Shanghao; Wang, Ning; Xu, Xiangxiang; Li, Shaoyu; Zheng, Lizhong; Marchany, Randy; Gardner, Mark; Hou, Y. Thomas; Lou, Wenjing (ACM, 2024-10-14)
  Anomaly-based Intrusion Detection Systems (IDSs) have been extensively researched for their ability to detect zero-day attacks. These systems establish a baseline of normal behavior using benign traffic data and flag deviations from this norm as potential threats. They generally experience higher false alarm rates than signature-based IDSs. Unlike image data, where the observed features provide immediate utility, raw network traffic necessitates additional processing for effective detection. It is challenging to learn useful patterns directly from raw traffic data or simple traffic statistics (e.g., connection duration, packet inter-arrival time) because the complex relationships among them are difficult to distinguish. Therefore, some feature engineering becomes imperative to extract and transform raw data into new feature representations that can directly improve the detection capability and reduce the false positive rate. We propose a geometric feature learning method to optimize the feature extraction process. We employ contrastive feature learning to learn a feature space where normal traffic instances reside in a compact cluster. We further utilize H-Score feature learning to maximize the compactness of the cluster representing the normal behavior, enhancing the subsequent anomaly detection performance. Our evaluations using the NSL-KDD and N-BaIoT datasets demonstrate that the proposed IDS powered by feature learning can consistently outperform state-of-the-art anomaly-based IDS methods by significantly lowering the false positive rate. Furthermore, we deploy the proposed IDS on a Raspberry Pi 4 and demonstrate its applicability on resource-constrained Internet of Things (IoT) devices, highlighting its versatility for diverse application scenarios.
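The anomaly-detection idea of mapping benign traffic into a compact cluster and scoring deviations from it can be illustrated with a centroid-distance sketch. The embedding function, the percentile-based threshold, and the names below are assumptions for illustration, not Hermes's contrastive or H-Score learning objectives.

```python
# Hypothetical sketch: score traffic by its distance to the centroid of learned
# benign-traffic embeddings; large distances are flagged as anomalies.
# The embedding space and the percentile-based threshold are assumptions.
import numpy as np

def fit_normal_profile(benign_embeddings, percentile=99.0):
    """Calibrate a centroid and distance threshold on benign embeddings only."""
    centroid = benign_embeddings.mean(axis=0)
    dists = np.linalg.norm(benign_embeddings - centroid, axis=1)
    threshold = np.percentile(dists, percentile)
    return centroid, threshold

def is_anomalous(embedding, centroid, threshold):
    """Flag an embedding as anomalous when it falls outside the benign cluster."""
    return np.linalg.norm(embedding - centroid) > threshold
```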
- Squeezing More Utility via Adaptive Clipping on Differentially Private Gradients in Federated Meta-Learning
  Wang, Ning; Xiao, Yang; Chen, Yimin; Zhang, Ning; Lou, Wenjing; Hou, Y. Thomas (ACM, 2022-12-05)
  Federated meta-learning has emerged as a promising AI framework for today's mobile computing scenarios involving distributed clients. It enables collaborative model training using the data located at distributed mobile clients and accommodates clients that need fast model customization with limited new data. However, federated meta-learning solutions are susceptible to inference-based privacy attacks, since the global model encoded with clients' training data is open to all clients and the central server. Meanwhile, differential privacy (DP) has been widely used as a countermeasure against privacy inference attacks in federated learning. The adoption of DP in federated meta-learning is complicated by the model accuracy-privacy trade-off and the model hierarchy attributed to the meta-learning component. In this paper, we introduce DP-FedMeta, a new differentially private federated meta-learning architecture that addresses such data privacy challenges. DP-FedMeta features an adaptive gradient clipping method and a one-pass meta-training process to improve the model utility-privacy trade-off. At the core of DP-FedMeta are two DP mechanisms, namely DP-AGR and DP-AGR-LR, which provide two notions of privacy protection for the hierarchical models. Extensive experiments in an emulated federated meta-learning scenario on well-known datasets (Omniglot, CIFAR-FS, and Mini-ImageNet) demonstrate that DP-FedMeta accomplishes better privacy protection while maintaining comparable model accuracy compared to the state-of-the-art solution that directly applies DP-based meta-learning to the federated setting.
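Adaptive clipping of per-client gradient updates before adding calibrated Gaussian noise can be sketched as follows. The quantile-based clipping bound, the noise multiplier, and the function name are illustrative assumptions, not the exact DP-AGR mechanisms of DP-FedMeta.

```python
# Hypothetical sketch: adaptively clip per-client gradient updates to a quantile of
# their norms, then add Gaussian noise scaled to that bound before averaging.
# The quantile and noise multiplier are placeholder assumptions.
import numpy as np

def dp_aggregate(client_grads, clip_quantile=0.5, noise_multiplier=1.0, rng=None):
    """client_grads: list of flattened per-client gradient updates."""
    if rng is None:
        rng = np.random.default_rng()
    grads = np.stack(client_grads)
    norms = np.linalg.norm(grads, axis=1)
    clip_bound = np.quantile(norms, clip_quantile)            # adaptive clipping bound
    scale = np.minimum(1.0, clip_bound / (norms + 1e-12))
    clipped = grads * scale[:, None]                          # clip each client update
    noise = rng.normal(0.0, noise_multiplier * clip_bound, size=grads.shape[1])
    return clipped.mean(axis=0) + noise / len(client_grads)   # noisy average of updates
```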