Browsing by Author "Chen, Si"
- Active Learning Under Limited Interaction with Data Labeler. Chen, Si (Virginia Tech, 2021). Active learning (AL) aims to reduce labeling effort by identifying the most valuable unlabeled data points in a large pool. Traditional AL frameworks have two limitations. First, they select data over multiple rounds, which is time-consuming and impractical. Second, they usually assume that a small number of labeled data points are available in the same domain as the unlabeled pool. In this thesis, we initiate the study of one-round active learning to address the first issue. We propose DULO, a general framework for the one-round setting based on the notion of data utility functions, which map a set of data points to some performance measure of the model trained on that set. We formulate the one-round active learning problem as data utility function maximization. We then propose D²ULO, built on DULO, as a solution that addresses both issues. Specifically, D²ULO leverages the idea of domain adaptation (DA) to train a data utility model on labeled source data. The trained utility model can then be used to select high-utility data in the target domain and, at the same time, to estimate the utility of the selected data. Our experiments show that the proposed frameworks achieve better performance than state-of-the-art baselines in the same setting. In particular, D²ULO is applicable to the scenario where the source and target labels are mismatched, which existing work does not support.
- Adversarial Unlearning of Backdoors via Implicit Hypergradient. Zeng, Yi; Chen, Si; Park, Won; Mao, Morley; Jin, Ming; Jia, Ruoxi (2022). We propose a minimax formulation for removing backdoors from a given poisoned model based on a small set of clean data. This formulation encompasses much of the prior work on backdoor removal. We propose the Implicit Backdoor Adversarial Unlearning (I-BAU) algorithm to solve the minimax problem. Unlike previous work, which breaks the minimax down into separate inner and outer problems, our algorithm uses the implicit hypergradient to account for the interdependence between inner and outer optimization. We theoretically analyze its convergence and the generalizability of the robustness gained by solving the minimax on clean data to unseen test data. In our evaluation, we compare I-BAU with six state-of-the-art backdoor defenses on seven backdoor attacks over two datasets and various attack settings, including the common setting where the attacker targets one class as well as important but underexplored settings where multiple classes are targeted. I-BAU's performance is comparable to, and most often significantly better than, the best baseline. In particular, its performance is more robust to variations in triggers, attack settings, poison ratio, and clean data size. Moreover, I-BAU requires less computation to take effect; it is more than 13X faster than the most efficient baseline in the single-target attack setting. Furthermore, it remains effective in the extreme case where the defender can access only 100 clean samples, a setting where all the baselines fail to produce acceptable results.
- Role of DNA methylation on the association between physical activity and cardiovascular diseases: results from the longitudinal multi-ethnic study of atherosclerosis (MESA) cohort. Shi, Hangchuan; Ossip, Deborah J.; Mayo, Nicole L.; Lopez, Daniel A.; Block, Robert C.; Post, Wendy S.; Bertoni, Alain G.; Ding, Jingzhong; Chen, Si; Yan, Chen; Xie, Zidian; Hoeschele, Ina; Liu, Yongmei; Li, Dongmei (2021-11-03). Background: The complex interaction between physical activity (PA) and DNA methylation in the development of cardiovascular disease (CVD) has rarely been investigated within a single study. We examined the role of DNA methylation in the association between PA and CVD. Results: The Multi-Ethnic Study of Atherosclerosis (MESA) cohort Exam 5 data, with 1065 participants free of CVD, were used for the final analysis. The categorical total PA variable was created by dividing activity intensity (METs/week) into quartiles. During a median follow-up of 4.0 years, 69 participants developed CVD. The Illumina HumanMethylation450 BeadChip was used to provide genome-wide DNA methylation profiles in purified human monocytes (CD14+). We identified 23 candidate DNA methylation loci associated with both PA and CVD. We used the structural equation modeling (SEM) approach to test the complex relationships among multiple variables and the roles of mediators. Three of the 23 identified loci (corresponding to genes VPS13D, PIK3CD and VPS45) remained significant mediators in the final SEM model along with other covariates. Bridged by the three genes, the 2nd PA quartile (β = −0.959; 95% CI: −1.554 to −0.449) and the 3rd PA quartile (β = −0.944; 95% CI: −1.628 to −0.413) showed the greatest inverse associations with CVD development, while the 4th PA quartile had a relatively weaker inverse association (β = −0.355; 95% CI: −0.713 to −0.124).
Conclusions: The current study is among the first to simultaneously examine the relationships among PA, DNA methylation, and CVD in a large cohort with long-term exposure. We identified three DNA methylation loci that bridged the association between PA and CVD. The function of the identified genes warrants further investigation in the pathogenesis of CVD.
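
The abstract of "Active Learning Under Limited Interaction with Data Labeler" above frames one-round active learning as data utility function maximization. The following is only a minimal sketch of that framing, not the thesis's method: it assumes a simple greedy selection rule, and the names `greedy_select`, `coverage_utility`, and `FEATURES` are hypothetical. A toy coverage count stands in for a learned utility model.

```python
from typing import Callable, Dict, FrozenSet, List, Sequence, Set

# Hypothetical toy data: each unlabeled point "covers" some abstract features.
FEATURES: Dict[int, Set[str]] = {
    0: {"a", "b"},
    1: {"b"},
    2: {"c", "d", "e"},
    3: {"a"},
    4: {"e", "f"},
}


def coverage_utility(subset: FrozenSet[int]) -> float:
    """Toy stand-in for a data utility function: number of distinct features covered."""
    covered: Set[str] = set()
    for i in subset:
        covered |= FEATURES[i]
    return float(len(covered))


def greedy_select(
    pool: Sequence[int],
    utility: Callable[[FrozenSet[int]], float],
    budget: int,
) -> List[int]:
    """Pick up to `budget` points in a single pass, each time adding the point
    with the largest marginal gain in utility (an assumed selection rule)."""
    selected: List[int] = []
    remaining = list(pool)
    for _ in range(min(budget, len(remaining))):
        current = utility(frozenset(selected))
        best = max(
            remaining,
            key=lambda i: utility(frozenset(selected + [i])) - current,
        )
        selected.append(best)
        remaining.remove(best)
    return selected


chosen = greedy_select(list(FEATURES), coverage_utility, budget=3)
print(chosen, coverage_utility(frozenset(chosen)))
```

The whole batch is chosen before any label is requested, which is what distinguishes the one-round setting from multi-round selection; in practice the utility would be a model's estimate rather than a fixed coverage count.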