Browsing by Author "Chen, Xi"
- Advanced Machine Learning for Surrogate Modeling in Complex Engineering Systems
  Lee, Cheol Hei (Virginia Tech, 2023-08-02)
  Surrogate models are indispensable in the analysis of engineering systems. The quality of a surrogate model is determined by the data quality and the model class, but achieving a high standard in both is challenging in complex engineering systems. Heterogeneity, implicit constraints, and extreme events are typical examples of the factors that complicate systems, yet they have been underestimated or disregarded in machine learning. This dissertation is dedicated to tackling the challenges in surrogate modeling of complex engineering systems by developing the following machine learning methodologies. (i) Partitioned active learning partitions the design space according to heterogeneity in response features, thereby exploiting localized models to measure the informativeness of unlabeled data. (ii) For systems with implicit constraints, failure-averse active learning incorporates constraint outputs to estimate the safe region and avoid undesirable failures in learning the target function. (iii) Multi-output extreme spatial learning enables modeling and simulating extreme events in composite fuselage assembly. The proposed methods were applied to real-world case studies and outperformed benchmark methods.
- Bayesian Integration and Modeling for Next-generation Sequencing Data Analysis
  Chen, Xi (Virginia Tech, 2016-07-01)
  Computational biology currently faces challenges in a big data world with thousands of data samples across multiple disease types including cancer. The challenging problem is how to extract biologically meaningful information from large-scale genomic data. Next-generation Sequencing (NGS) can now produce high quality data at DNA and RNA levels. However, in cells there are many non-specific (background) signals that affect the detection accuracy of true (foreground) signals. In this dissertation work, under a Bayesian framework, we aim to develop and apply approaches to learn the distribution of genomic signals in each type of NGS data for reliable identification of specific foreground signals. We propose a novel Bayesian approach (ChIP-BIT) to reliably detect transcription factor (TF) binding sites (TFBSs) within promoter or enhancer regions by jointly analyzing the sample and input ChIP-seq data for one specific TF. Specifically, a Gaussian mixture model is used to capture both binding and background signals in the sample data, and background signals are modeled by a local Gaussian distribution that is accurately estimated from the input data. An Expectation-Maximization algorithm is used to learn the model parameters according to the distributions on binding signal intensity and binding locations. Extensive simulation studies and experimental validation both demonstrate that ChIP-BIT has a significantly improved performance on TFBS detection over conventional methods, particularly on weak binding signal detection. To infer cis-regulatory modules (CRMs) of multiple TFs, we propose a Bayesian integration approach, namely BICORN, to integrate ChIP-seq and RNA-seq data of the same tissue. Each TFBS identified from ChIP-seq data can be either a functional binding event mediating target gene transcription or a non-functional binding.
The functional bindings of a set of TFs usually work together as a CRM to regulate the transcription processes of a group of genes. We develop a Gibbs sampling approach to learn the distribution of CRMs (a joint distribution of multiple TFs) based on their functional bindings and target gene expression. The robustness of BICORN has been validated on simulated regulatory network and gene expression data with respect to different noise settings. BICORN is further applied to breast cancer MCF-7 ChIP-seq and RNA-seq data to identify CRMs functional in promoter or enhancer regions. In tumor cells, the normal regulatory mechanism may be interrupted by genome mutations, especially those somatic mutations that uniquely occur in tumor cells. Focused on a specific type of genome mutation, structural variation (SV), we develop a novel pattern-based probabilistic approach, namely PSSV, to identify somatic SVs from whole genome sequencing (WGS) data. PSSV features a mixture model with hidden states representing different mutation patterns; PSSV can thus differentiate heterozygous and homozygous SVs in each sample, enabling the identification of those somatic SVs with a heterozygous status in the normal sample and a homozygous status in the tumor sample. Simulation studies demonstrate that PSSV outperforms existing tools. PSSV has been successfully applied to breast cancer patient WGS data for identifying somatic SVs of key factors associated with breast cancer development. In this dissertation research, we demonstrate the advantage of the proposed distributional learning-based approaches over conventional methods for NGS data analysis. Distributional learning is a very powerful approach to gain biological insights from high quality NGS data. 
Successful applications of the proposed Bayesian methods to breast cancer NGS data shed light on underlying molecular mechanisms of breast cancer, enabling biologists or clinicians to identify major cancer drivers and develop new therapeutics for cancer treatment.
- Bayesian Optimization for Engineering Design and Quality Control of Manufacturing Systems
  AlBahar, Areej Ahmad (Virginia Tech, 2022-04-14)
  Manufacturing systems are usually nonlinear, nonstationary, highly corrupted with outliers, and oftentimes constrained by physical laws. Modeling and approximation of their underlying response surface functions are extremely challenging. Bayesian optimization is a statistical tool, based on Bayes' rule, used to optimize and model these expensive-to-evaluate functions. Bayesian optimization comprises two important components: a surrogate model, often the Gaussian process, and an acquisition function, often the expected improvement. The Gaussian process, known for its outstanding modeling and uncertainty quantification capabilities, is used to represent the underlying response surface function, while the expected improvement is used to select the next point to be evaluated by trading off exploitation and exploration. Although Bayesian optimization has been extensively used in optimizing unknown and expensive-to-evaluate functions and in hyperparameter tuning of deep learning models, the difficulty of modeling highly outlier-corrupted, nonstationary, and stress-induced response surface functions hinders the use of conventional Bayesian optimization models in manufacturing systems. To overcome these limitations, we propose a series of systematic methodologies to improve Bayesian optimization for engineering design and quality control of manufacturing systems. Specifically, the contributions of this dissertation can be summarized as follows.
  1. A novel asymmetric robust kernel function, called AEN-RBF, is proposed to model highly outlier-corrupted functions. Two new hyperparameters are introduced to improve the flexibility and robustness of the Gaussian process model.
  2. A nonstationary surrogate model that utilizes deep multi-layer Gaussian processes, called MGP-CBO, is developed to improve the modeling of complex anisotropic constrained nonstationary functions.
  3. A Stress-Aware Optimal Actuator Placement framework is designed to model and optimize stress-induced nonlinear constrained functions.
  Through extensive evaluations, the proposed methodologies have shown outstanding and significant improvements when compared to state-of-the-art models. Although these proposed methodologies have been applied to certain manufacturing systems, they can be easily adapted to other broad ranges of problems.
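The expected improvement acquisition function named in the abstract above can be illustrated with a minimal sketch. This is the standard closed-form expression for a minimization problem, not the dissertation's AEN-RBF or MGP-CBO methods; the function names and the candidate tuples are hypothetical, and the surrogate's predictive mean and standard deviation are assumed to be supplied by some Gaussian process model that is not shown here.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_improvement(mean, std, best_observed):
    """Expected improvement over best_observed when minimizing.

    mean and std are the surrogate's predictive mean and standard
    deviation at a candidate point (assumed inputs, hypothetical names).
    """
    if std <= 0.0:
        # No predictive uncertainty: improvement is deterministic.
        return max(best_observed - mean, 0.0)
    z = (best_observed - mean) / std
    # First term rewards a low predicted mean (exploitation),
    # second term rewards high uncertainty (exploration).
    return (best_observed - mean) * normal_cdf(z) + std * normal_pdf(z)


def pick_next(candidates, best_observed):
    """Return the (label, mean, std) candidate with the largest EI."""
    return max(
        candidates,
        key=lambda c: expected_improvement(c[1], c[2], best_observed),
    )


if __name__ == "__main__":
    # Illustrative values only: (label, predictive mean, predictive std).
    candidates = [("a", 0.9, 0.05), ("b", 1.2, 0.8), ("c", 1.0, 0.0)]
    print(pick_next(candidates, best_observed=1.0))
```

In this toy setting a candidate with a slightly worse predicted mean but large uncertainty can score higher than one with a slightly better mean and little uncertainty, which is the exploitation versus exploration trade-off the abstract refers to.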
- BICORN: An R package for integrative inference of de novo cis-regulatory modules
  Chen, Xi; Gu, Jinghua; Neuwald, Andrew F.; Hilakivi-Clarke, Leena; Clarke, Robert; Xuan, Jianhua (Springer Nature, 2020-05-14)
  Genome-wide transcription factor (TF) binding signal analyses reveal co-localization of TF binding sites based on inferred cis-regulatory modules (CRMs). CRMs play a key role in understanding the cooperation of multiple TFs under specific conditions. However, the functions of CRMs and their effects on nearby gene transcription are highly dynamic and context-specific and therefore are challenging to characterize. BICORN (Bayesian Inference of COoperative Regulatory Network) builds a hierarchical Bayesian model and infers context-specific CRMs based on TF-gene binding events and gene expression data for a particular cell type. BICORN automatically searches for a list of candidate CRMs based on the input TF bindings at regulatory regions associated with genes of interest. Applying Gibbs sampling, BICORN iteratively estimates model parameters of CRMs, TF activities, and corresponding regulation on gene transcription, which it models as a sparse network of functional CRMs regulating target genes. The BICORN package is implemented in R (version 3.4 or later) and is publicly available on the CRAN server at https://cran.r-project.org/web/packages/BICORN/index.html.
- ChIP-BIT2: a software tool to detect weak binding events using a Bayesian integration approach
  Chen, Xi; Shi, Xu; Neuwald, Andrew F.; Hilakivi-Clarke, Leena; Clarke, Robert; Xuan, Jianhua (2021-04-15)
  Background: ChIP-seq combines chromatin immunoprecipitation assays with sequencing and identifies genome-wide binding sites for DNA binding proteins. While many binding sites have strong ChIP-seq ‘peak’ observations and are well captured, there are still regions bound by proteins weakly, with a relatively low ChIP-seq signal enrichment. These weak binding sites, especially those at promoters and enhancers, are functionally important because they also regulate nearby gene expression. Yet, it remains a challenge to accurately identify weak binding sites in ChIP-seq data due to the ambiguity in differentiating these weak binding sites from the amplified background DNAs.
  Results: ChIP-BIT2 (http://sourceforge.net/projects/chipbitc/) is a software package for ChIP-seq peak detection. ChIP-BIT2 employs a mixture model integrating protein and control ChIP-seq data and predicts strong or weak protein binding sites at promoters, enhancers, or other genomic locations. For binding sites at gene promoters, ChIP-BIT2 simultaneously predicts their target genes. ChIP-BIT2 has been validated on benchmark regions and tested using large-scale ENCODE ChIP-seq data, demonstrating its high accuracy and wide applicability.
  Conclusion: ChIP-BIT2 is an efficient ChIP-seq peak caller. It provides a better lens to examine weak binding sites and can refine or extend the existing binding site collection, providing additional regulatory regions for decoding the mechanism of gene expression regulation.
- ChIP-BIT: Bayesian inference of target genes using a novel joint probabilistic model of ChIP-seq profiles
  Chen, Xi; Jung, Jin-Gyoung; Shajahan-Haq, Ayesha N.; Clarke, Robert; Shih, Ie-Ming; Wang, Yue; Magnani, Luca; Wang, Tian-Li; Xuan, Jianhua (Oxford, 2015-12-23)
  Chromatin immunoprecipitation with massively parallel DNA sequencing (ChIP-seq) has greatly improved the reliability with which transcription factor binding sites (TFBSs) can be identified from genome-wide profiling studies. Many computational tools have been developed to detect binding events or peaks; however, the robust detection of weak binding events remains a challenge for current peak calling tools. We have developed a novel Bayesian approach (ChIP-BIT) to reliably detect TFBSs and their target genes by jointly modeling binding signal intensities and binding locations of TFBSs. Specifically, a Gaussian mixture model is used to capture both binding and background signals in sample data. As a unique feature of ChIP-BIT, background signals are modeled by a local Gaussian distribution that is accurately estimated from the input data. Extensive simulation studies showed a significantly improved performance of ChIP-BIT in target gene prediction, particularly for detecting weak binding signals at gene promoter regions. We applied ChIP-BIT to find target genes from NOTCH3 and PBX1 ChIP-seq data acquired from MCF-7 breast cancer cells. TF knockdown experiments have initially validated about 30% of co-regulated target genes identified by ChIP-BIT as being differentially expressed in MCF-7 cells. Functional analysis on these genes further revealed the existence of crosstalk between Notch and Wnt signaling pathways.
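The idea of a two-component Gaussian mixture whose background component is fixed in advance can be sketched as follows. This is a deliberately simplified, one-dimensional illustration and not the ChIP-BIT model itself: it ignores binding locations, uses a single global background Gaussian rather than a local one estimated from input data, and all names (`fit_foreground`, `bg_mean`, `bg_std`) and the simulated values are assumptions made for the example.

```python
import math
import random


def normal_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and std."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def fit_foreground(signals, bg_mean, bg_std, n_iter=50):
    """EM for a mixture of a fixed background and a learned foreground.

    The background Gaussian (bg_mean, bg_std) is held fixed, standing in
    for a distribution estimated from control data. Only the foreground
    mean, foreground std, and mixing weight are updated.
    Returns (weight, fg_mean, fg_std).
    """
    fg_mean = max(signals)  # start the foreground at the strongest signal
    fg_std = bg_std
    weight = 0.5
    for _ in range(n_iter):
        # E-step: posterior probability that each signal is foreground.
        resp = []
        for x in signals:
            fg = weight * normal_pdf(x, fg_mean, fg_std)
            bg = (1.0 - weight) * normal_pdf(x, bg_mean, bg_std)
            resp.append(fg / (fg + bg) if fg + bg > 0.0 else 0.0)
        total = sum(resp)
        if total < 1e-12:
            break  # no mass assigned to the foreground; stop updating
        # M-step: re-estimate the foreground parameters and mixing weight.
        weight = total / len(signals)
        fg_mean = sum(r * x for r, x in zip(resp, signals)) / total
        var = sum(r * (x - fg_mean) ** 2 for r, x in zip(resp, signals)) / total
        fg_std = max(math.sqrt(var), 1e-3)  # floor to avoid collapse
    return weight, fg_mean, fg_std


if __name__ == "__main__":
    rng = random.Random(0)
    # Simulated signals: 300 background draws and 100 foreground draws.
    data = [rng.gauss(0.0, 1.0) for _ in range(300)]
    data += [rng.gauss(4.0, 1.0) for _ in range(100)]
    print(fit_foreground(data, bg_mean=0.0, bg_std=1.0))
```

Fixing the background component is what lets weak foreground signals be separated from noise in this toy setup: the E-step compares each signal against a background that is not free to drift toward the foreground.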
- ChIP-GSM: Inferring active transcription factor modules to predict functional regulatory elements
  Chen, Xi; Neuwald, Andrew F.; Hilakivi-Clarke, Leena; Clarke, Robert; Xuan, Jianhua (PLoS, 2021-07-01)
  Transcription factors (TFs) often function as a module including both master factors and mediators binding at cis-regulatory regions to modulate nearby gene transcription. ChIP-seq profiling of multiple TFs makes it feasible to infer functional TF modules. However, when inferring TF modules based on co-localization of ChIP-seq peaks, often many weak binding events are missed, especially for mediators, resulting in incomplete identification of modules. To address this problem, we develop a ChIP-seq data-driven Gibbs Sampler to infer Modules (ChIP-GSM) using a Bayesian framework that integrates ChIP-seq profiles of multiple TFs. ChIP-GSM samples read counts of module TFs iteratively to estimate the binding potential of a module to each region and, across all regions, estimates the module abundance. Using inferred module-region probabilistic bindings as feature units, ChIP-GSM then employs logistic regression to predict active regulatory elements. Validation of ChIP-GSM-predicted regulatory regions on multiple independent datasets sharing the same context confirms the advantage of using TF modules for predicting regulatory activity. In a case study of K562 cells, we demonstrate that the ChIP-GSM-inferred modules form as groups, activate gene expression at different time points, and mediate diverse functional cellular processes. Hence, ChIP-GSM infers biologically meaningful TF modules and improves the prediction accuracy of regulatory region activities.
- A Comprehensive Analysis of Deep Learning for Interference Suppression, Sample and Model Complexity in Wireless Systems
  Oyedare, Taiwo Remilekun (Virginia Tech, 2024-03-12)
  The wireless spectrum is limited and the demand for its use is increasing due to technological advancements in wireless communication, resulting in persistent interference issues. Despite progress in addressing interference, it remains a challenge for effective spectrum usage, particularly in the use of license-free and managed shared bands and other opportunistic spectrum access solutions. Therefore, efficient and interference-resistant spectrum usage schemes are critical. In the past, most interference solutions have relied on avoidance techniques and expert system-based mitigation approaches. Recently, researchers have utilized artificial intelligence/machine learning techniques at the physical (PHY) layer, particularly deep learning, which suppress or compensate for the interfering signal rather than simply avoiding it. In addition, deep learning has been utilized by researchers in recent years to address various difficult problems in wireless communications such as transmitter classification, interference classification, and modulation recognition, amongst others. To this end, this dissertation presents a thorough analysis of deep learning techniques for interference classification and suppression, and it thoroughly examines complexity (sample and model) issues that arise from using deep learning. First, we address the knowledge gap in the literature with respect to the state-of-the-art in deep learning-based interference suppression. To account for the limitations of deep learning-based interference suppression techniques, we discuss several challenges, including lack of interpretability, the stochastic nature of the wireless channel, issues with open set recognition (OSR), and challenges with implementation.
We also provide a technical discussion of the prominent deep learning algorithms proposed in the literature and also offer guidelines for their successful implementation. Next, we investigate convolutional neural network (CNN) architectures for interference and transmitter classification tasks. In particular, we utilize a CNN architecture to classify interference, investigate model complexity of CNN architectures for classifying homogeneous and heterogeneous devices and then examine their impact on test accuracy. Next, we explore the issues with sample size and sample quality with regards to the training data in deep learning. In doing this, we also propose a rule-of-thumb for transmitter classification using CNN based on the findings from our sample complexity study. Finally, in cases where interference cannot be avoided, it is important to suppress such interference. To achieve this, we build upon autoencoder work from other fields to design a convolutional neural network (CNN)-based autoencoder model to suppress interference thereby ensuring coexistence of different wireless technologies in both licensed and unlicensed bands.
- Consistency and Uniform Bounds for Heteroscedastic Simulation Metamodeling and Their Applications
  Zhang, Yutong (Virginia Tech, 2023-09-05)
  Heteroscedastic metamodeling has gained popularity as an effective tool for analyzing and optimizing complex stochastic systems. A heteroscedastic metamodel provides an accurate approximation of the input-output relationship implied by a stochastic simulation experiment whose output is subject to input-dependent noise variance. Several challenges remain unsolved in this field. First, in-depth investigations into the consistency of heteroscedastic metamodeling techniques, particularly from the sequential prediction perspective, are lacking. Second, sequential heteroscedastic metamodel-based level-set estimation (LSE) methods are scarce. Third, the increasingly high computational cost required by heteroscedastic Gaussian process-based LSE methods in the sequential sampling setting is a concern. Additionally, when constructing a valid uniform bound for a heteroscedastic metamodel, the impact of noise variance estimation is not adequately addressed. This dissertation aims to tackle these challenges and provide promising solutions. First, we investigate the information consistency of a widely used heteroscedastic metamodeling technique, stochastic kriging (SK). Second, we propose SK-based LSE methods leveraging novel uniform bounds for input-point classification. Moreover, we incorporate the Nyström approximation and a principled budget allocation scheme to improve the computational efficiency of SK-based LSE methods. Lastly, we investigate empirical uniform bounds that take into account the impact of noise variance estimation, ensuring an adequate coverage capability.
- Designing Acrylic Block Copolymers with Multiple Hydrogen Bonding or Multiple Ionic Bonding
  Chen, Xi (Virginia Tech, 2018-09-05)
  The dynamic characteristics of hydrogen and ionic bonding contribute to the reversible properties of acrylic polymers, opening new avenues for designing materials with mechanical strength and processability. These non-covalent interactions function as physical crosslinks, which provide enhanced structural and mechanical integrity to acrylic block copolymers. The strong hydrogen bonding or ionic interaction also directs self-assembly to hierarchical microstructures, which enables many applications including thermoplastic elastomers and energy storage devices. Inspired by complementary hydrogen bonding interactions between nucleobase pairs in DNA, a series of bioinspired nucleobase-acrylate monomers such as adenine acrylate (AdA), thymine acrylate (ThA), and cytosine acrylate (CyA) were designed, whose synthesis was afforded by aza-Michael addition. Among those nucleobases, cytosine arises as a unique category. It is not only able to self-associate via weak hydrogen bonds, but also forms quadruple hydrogen-bond bearing units (ureido-cytosine) when functionalized with isocyanates. Reversible addition-fragmentation chain transfer polymerization (RAFT) yielded acrylic ABA triblock copolymers with CyA external hard blocks. A subsequent post-functionalization using hexyl-isocyanate generated the corresponding ureido-cytosine acrylate (UCyA)-containing triblock copolymers. The self-complementary quadruple hydrogen bonding in the UCyA polymers achieved a broader service temperature window, while the alkyl chain ends of UCyA units allowed tunability of the mechanical strength to apply as thermoplastic elastomers. In addition, quadruple hydrogen bonding induced a stronger propensity for self-assembly and denser packing of the polymers, which contributed to a well-defined ordered morphology and enhanced resistance to moisture uptake.
A facile 2-step synthesis provided a doubly-charged styrenic DABCO salt monomer (VBDC₁₈BrCl) containing an octadecyl tail. RAFT polymerization allowed the preparation of DABCO ABA block copolymers with defined molecular weights and low polydispersity. Thermal analysis revealed a melting transition of the VBDC₁₈BrCl block copolymer resulting from the side-chain crystallization of the long alkyl tail. Systematic mechanical comparisons between DABCO salt-containing copolymers and the corresponding singly-charged polymer controls demonstrated superior mechanical properties attributable to a stronger ionic interaction between the doubly-charged groups. Morphological characterizations revealed a well-ordered lamellar microstructure and a unique three-phase morphology of the DABCO block copolymers, which involves a soft phase, a hard phase, and an ionic aggregate domain dispersed within the hard domain.
- Development of Novel Attention-Aware Deep Learning Models and Their Applications in Computer Vision and Dynamical System Calibration
  Maftouni, Maede (Virginia Tech, 2023-07-12)
  In recent years, deep learning has revolutionized computer vision and natural language processing tasks, but the black-box nature of these models poses significant challenges for their interpretability and reliability, especially in critical applications such as healthcare. To address this, attention-based methods have been proposed to enhance the focus and interpretability of deep learning models. In this dissertation, we investigate the effectiveness of attention mechanisms in improving prediction and modeling tasks across different domains. We propose three essays that utilize task-specific designed trainable attention modules in manufacturing, healthcare, and system identification applications. In essay 1, we introduce a novel computer vision tool that tracks the melt pool in X-ray images of laser powder bed fusion using attention modules. In essay 2, we present a mask-guided attention (MGA) classifier for COVID-19 classification on lung CT scan images. The MGA classifier incorporates lesion masks to improve both the accuracy and interpretability of the model, outperforming state-of-the-art models with limited training data. Finally, in essay 3, we propose a Transformer-based model, utilizing self-attention mechanisms, for parameter estimation in system dynamics models that outpaces the conventional system calibration methods. Overall, our results demonstrate the effectiveness of attention-based methods in improving deep learning model performance and reliability in diverse applications.
- A Dual Metamodeling Perspective for Design and Analysis of Stochastic Simulation Experiments
  Wang, Wenjing (Virginia Tech, 2019-07-17)
  Fueled by a growing number of applications in science and engineering, the development of stochastic simulation metamodeling methodologies has gained momentum in recent years. A majority of the existing methods, such as stochastic kriging (SK), only focus on efficiently metamodeling the mean response surface implied by a stochastic simulation experiment. As the simulation outputs are stochastic with the simulation variance varying significantly across the design space, suitable methods for variance modeling are required. This thesis takes a dual metamodeling perspective and aims at exploiting the benefits of fitting the mean and variance functions simultaneously for achieving an improved predictive performance. We first explore the effects of replacing the sample variances with various smoothed variance estimates on the performance of SK and propose a dual metamodeling approach to obtain an efficient simulation budget allocation rule. Second, we articulate the links between SK and least-squares support vector regression and propose to use a "dense and shallow" initial design to facilitate selection of important design points and efficient allocation of the computational budget. Third, we propose a variational Bayesian inference-based Gaussian process (VBGP) metamodeling approach to accommodate the situation where either one or multiple simulation replications are available at every design point. VBGP can fit the mean and variance response surfaces simultaneously, while taking into full account the uncertainty in the heteroscedastic variance. Lastly, we generalize VBGP for handling large-scale heteroscedastic datasets based on the idea of "transductive combination of GP experts."
- The Dynamics of Chinese Media Practices and Regulation: Explanations and Interpretations
  Chen, Xi (Virginia Tech, 2007-08-21)
  Based on the understanding that a country's media system can provide important insights into its politics, this dissertation reexamines the development of Chinese politics in the reform era through the media lens, and television in particular. Given that Chinese media have been a marker of the nation's socio-political developments, the media perspective is believed to be particularly useful in interpreting China's changing political circumstances. By tracing the dynamics of television news reporting practices and government regulation of the news media, this analysis will map out the evolving roles of television in today's China to use them as subtle indications of how Chinese politics are evolving in the reform era. Chinese television adopted a Soviet TASS style from its very beginnings due to the heavy Soviet influence that placed an emphasis on imparting heavily ideological messages and propagating government policies and rules. This practice, however, has been substantially changed during the reform era. Television news reporting in today's China is moving towards the liberal media style in both format and content. What specific changes have taken place in the television industry? To what extent have Chinese media departed from the Soviet style? What are the implications of these media changes for China's politics? To answer these questions, I conducted content analysis of the China Radio and Television Broadcasting Awards news reports and television regulations in the reform era, which revealed that Chinese media were developing towards a hybrid of Soviet and liberal models in which both control and liberalization trends can be identified.
While encouraging and authorizing increased managerial, editorial, and programming freedom and autonomy, the Party-State has managed to retain its control over political content through increasingly indirect and sophisticated means. The continued marginalization of alternative political voices confirms that democracy with political pluralism, free flow of information, and rule of law has not yet materialized after more than two decades of economic reform. By collaborating with market and technology, the Communist Party of China has actually managed to consolidate its control over both political and economic power while authorizing increased freedom in individual, cultural, and social domains.
- Efficient Global Optimization of Multidisciplinary System using Variable Fidelity Analysis and Dynamic Sampling Method
  Park, Jangho (Virginia Tech, 2019-07-22)
  Work in this dissertation is motivated by reducing the design cost at the early design stage while maintaining high design accuracy throughout all design stages. It presents four key design methods to improve the performance of Efficient Global Optimization for multidisciplinary problems. First, a fidelity-calibration method is developed and applied to lower-fidelity samples. Function values analyzed by lower fidelity analysis methods are updated to have equivalent accuracy to that of the highest fidelity samples, and these calibrated data sets are used to construct a variable-fidelity Kriging model. For the design of experiment (DOE), a dynamic sampling method is developed and includes filtering and infilling data based on mathematical criteria on the model accuracy. In the sample infilling process, multi-objective optimization for exploitation and exploration of design space is carried out. To indicate the fidelity of function analysis for additional samples in the variable-fidelity Kriging model, a dynamic fidelity indicator with the overlapping coefficient is proposed. For multidisciplinary design problems, where multiple physics are tightly coupled with different coupling strengths, a multi-response Kriging model is introduced that utilizes the method of iterative Maximum Likelihood Estimation (iMLE). Through the iMLE process, a large number of hyper-parameters in multi-response Kriging can be calculated with great accuracy and improved numerical stability. The optimization methods developed in the study are validated with analytic functions and show considerable performance improvement. Subsequently, three practical design optimization problems are solved: the NACA0012 airfoil, the multi-element NLR 7301 airfoil, and the all-moving-wingtip control surface of a tailless aircraft.
The results are compared with those of existing methods, and it is concluded that these methods guarantee equivalent design accuracy at significantly reduced computational cost.
- Exploring Multiple Hydrogen Bonding and Ionic Bonding in the Design of Supramolecular Polymers
  Chen, Xi (Virginia Tech, 2020-06-03)
  Supramolecular polymers represent a family of polymeric materials that are held together with dynamic, noncovalent interactions. In contrast to conventional functional polymers that usually have high melt-viscosity due to their covalent nature and chain entanglement, supramolecular polymers combine excellent physical properties with low melt-viscosity, allowing for less energy-intensive processability and recyclability. Dynamic bonding with multiple binding sites, such as multiple hydrogen bonding or multiple ionic bonding, exhibits much stronger binding strength compared to the counterparts containing only a single binding site, thereby enhancing the mechanical integrity of the polymers and facilitating self-assembly. This dissertation focuses on the design of novel supramolecular polymers built from doubly-charged or quadruple hydrogen bonding (QHB) scaffolds utilizing chain-growth polymerization or step-growth polymerization, and on elucidating the structure-property-morphology relationships of the polymers. A 2-step nucleophilic substitution reaction afforded a series of 1,4-diazabicyclo[2.2.2]octane (DABCO)-based styrenic monomers with two pairs of charged groups. An optimized 2-step reversible-addition-fragmentation chain-transfer (RAFT) polymerization synthesized ABA triblock thermoplastic elastomers (TPEs) with a low Tg poly (n-butyl acrylate) central block and high Tg external charged blocks. Strong ionic interactions between doubly-charged units drove molecular self-assembly to form densely packed, hierarchical microstructures, which contributed to a robust, crosslinked physical network that allows the polymer to retain thermomechanical integrity until degradation.
High-resolution single-crystal X-ray diffraction (SCXRD) coupled with powder X-ray diffraction (PXRD) further disclosed a detailed 3-D structural information of molecular arrangement and ion distribution within the charged phase through comparing DABCO-salt monomer single-crystal structure and the corresponding homopolymer XRD pattern. It was found that the physical properties of the DABCO-salt copolymers not only relied on their charge content and architectures but also dependent on their electrostatically-bonded counterions. The size and structure of the counterion determined the strength of dipole-dipole interaction, which significantly impact on thermal property, (thermo)mechanical performance, water affinity, and microstructure. A cytosine-functionalized monomer, cytosine acrylate (CyA), allowed the synthesis of acrylic ABA triblock TPEs with pendant nucleobase moieties in the external blocks and a low Tg central polymer matrix through RAFT polymerization. Post-functionalization of cytosine (Cyt) bidentate hydrogen bonding sites with alkyl isocyanate, allowed the formation of ureido-cytosine (UCyt) groups in the external block that were readily dimerized through QHB interactions. The UCyt units in the external block enhanced mechanical strength and induced stronger phase-separation of the block copolymers compared to the corresponding Cyt-containing TPE analogs. Facile conventional free-radical polymerization using CyA and subsequent post-functionalization enabled accessibility to random copolymers containing pendant UCyt QHB moieties in the soft polymer matrix. The synergy of the flexible polymer matrix and the dynamic character of QHB groups contributed to the ultra-high elasticity of the polymer and rapid self-healing properties. QHB interactions enabled efficient mechanical recovery upon deformation by facilitating elastic chain retraction to regenerate the original physical network. 
Finally, one-pot step-growth polymerization through chain extending a novel bis-Cyt monomer and a commercially available polyether diamine using a di-isocyanate extender afforded segmented polyurea series for extrusion additive manufacturing. The molecular design of the polyureas featured soft segments containing flexible polyether chain and a relatively weak urea hydrogen bonding sites in the soft segment and rigid UCyt hydrogen bonding groups in the hard segment. The reversible characteristics of QHB enabled low viscosity at the processing temperature while providing mechanical integrity after processing and reinforced bonding between the interlayers, which contributed to the remarkable strength, elasticity, toughness, and interlayer adhesion of the printed parts.
- High-resolution computational modeling of immune responses in the gutVerma, Meghna; Bassaganya-Riera, Josep; Leber, Andrew; Tubau-Juni, Nuria; Hoops, Stefan; Abedi, Vida; Chen, Xi; Hontecillas, Raquel (Oxford University Press, 2019-06-01)Background: Helicobacter pylori causes gastric cancer in 1-2% of cases but is also beneficial for protection against allergies and gastroesophageal diseases. An estimated 85% of H. pylori-colonized individuals experience no detrimental effects. To study the mechanisms promoting host tolerance to the bacterium in the gastrointestinal mucosa and systemic regulatory effects, we investigated the dynamics of immunoregulatory mechanisms triggered by H. pylori using a high-performance computing-driven ENteric Immunity SImulator multiscale model. Immune responses were simulated by integrating an agent-based model with ordinary and partial differential equations. Results: The outputs were analyzed in 2 sequential stages: the first used a partial rank correlation coefficient regression-based global sensitivity analysis and the second a metamodel-based global sensitivity analysis. The influential parameters screened in the first stage were selected to be varied in the second stage. The outputs from both stages were combined as a training dataset to build a spatiotemporal metamodel. The Sobol indices measured the time-varying impact of input parameters during the initiation, peak, and chronic phases of infection. The study identified epithelial cell proliferation and epithelial cell death as key parameters that control infection outcomes. In silico validation showed that colonization with H. pylori decreased with a decrease in epithelial cell proliferation, which was linked to regulatory macrophages and tolerogenic dendritic cells. Conclusions: The hybrid model of H. pylori infection identified epithelial cell proliferation as a key factor for successful colonization of the gastric niche and highlighted the role of tolerogenic dendritic cells and regulatory macrophages in modulating the host responses and shaping infection outcomes.
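As a rough illustration of the Sobol indices mentioned in the abstract above, the following sketch estimates first-order indices for a toy function with a standard Monte Carlo (pick-freeze) estimator. It is not the authors' pipeline: the function name `first_order_sobol`, the uniform inputs on [0, 1], and the toy model are all assumptions made for the example, and the study's metamodel and time-varying analysis are not reproduced.

```python
import random


def first_order_sobol(model, dim, n=20000, seed=0):
    """Estimate first-order Sobol indices for inputs uniform on [0, 1].

    Hypothetical helper for illustration; uses the pick-freeze estimator
    V_i ~ mean(f(B) * (f(AB_i) - f(A))), where AB_i is sample matrix A
    with column i replaced by the matching column of B.
    """
    rng = random.Random(seed)
    a = [[rng.random() for _ in range(dim)] for _ in range(n)]
    b = [[rng.random() for _ in range(dim)] for _ in range(n)]
    ya = [model(x) for x in a]
    yb = [model(x) for x in b]

    # Total output variance, pooled over both sample sets.
    pooled = ya + yb
    mean = sum(pooled) / len(pooled)
    var = sum((y - mean) ** 2 for y in pooled) / (len(pooled) - 1)

    indices = []
    for i in range(dim):
        yab = [
            model(row_a[:i] + [row_b[i]] + row_a[i + 1:])
            for row_a, row_b in zip(a, b)
        ]
        # Centering f(B) leaves the expectation unchanged and reduces noise.
        vi = sum(
            (fb - mean) * (fab - fa) for fa, fb, fab in zip(ya, yb, yab)
        ) / n
        indices.append(vi / var)
    return indices


# Toy model: y = 2*x0 + x1, so x0 should explain about 80% of the variance.
print(first_order_sobol(lambda x: 2 * x[0] + x[1], dim=2))
```

For the toy model the analytical values are 0.8 and 0.2, so the printed estimates should land close to those numbers; a time-varying analysis would repeat the estimate at each output time point.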
- Identifying intracellular signaling modules and exploring pathways associated with breast cancer recurrenceChen, Xi; Gu, Jinghua; Neuwald, Andrew F.; Hilakivi-Clarke, Leena; Clarke, Robert; Xuan, Jianhua (2021-01-11)Exploring complex modularization of intracellular signal transduction pathways is critical to understanding aberrant cellular responses during disease development and drug treatment. IMPALA (Inferred Modularization of PAthway LAndscapes) integrates information from high-throughput gene expression experiments and genome-scale knowledge databases to identify aberrant pathway modules, thereby providing a powerful sampling strategy to reconstruct and explore pathway landscapes. Here IMPALA identifies pathway modules associated with breast cancer recurrence and Tamoxifen resistance. Focusing on estrogen-receptor (ER) signaling, IMPALA identifies alternative pathways from gene expression data of Tamoxifen-treated ER-positive breast cancer patient samples. These pathways were often interconnected through cytoplasmic genes such as IRS1/2, JAK1, YWHAZ, CSNK2A1, MAPK1 and HSP90AA1 and significantly enriched with ErbB, MAPK, and JAK-STAT signaling components. Characterization of the pathway landscape revealed key modules associated with ER signaling and with cell cycle and apoptosis signaling. We validated IMPALA-identified pathway modules using data from four different breast cancer cell lines, including models sensitive and resistant to Tamoxifen. Results showed that a majority of genes in cell cycle/apoptosis modules that were up-regulated in breast cancer patients with short survival (<5 years) were also over-expressed in drug-resistant cell lines, whereas the transcription factors JUN, FOS, and STAT3 were down-regulated in both patients and drug-resistant cell lines. Hence, IMPALA-identified pathways were associated with Tamoxifen resistance and an increased risk of breast cancer recurrence.
The IMPALA package is available at https://dlrl.ece.vt.edu/software/.
- The Impact of Group Discussion and Formation on Student Performance: An Experience Report in a Large CS1 CourseWu, Tong; Tang, Xiaohang; Wong, Sam; Chen, Xi; Shaffer, Clifford A.; Chen, Yan (ACM, 2025-02-12)Programming instructors often conduct collaborative learning activities, such as Peer Instruction (PI), to enhance student motivation, engagement, and learning gains. However, the impact of group discussion and formation mechanisms on student performance remains unclear. To investigate this, we conducted an 11-session experiment in a large, in-person CS1 course. We employed both random and expertise-balanced grouping methods to examine the efficacy of different group mechanisms and the impact of expert students’ presence on collaborative learning. Our observations revealed complex dynamics within the collaborative learning environment. Among 255 groups, 146 actively engaged in discussions, with 96 of these groups demonstrating improvement for poor-performing students. Interestingly, our analysis revealed that different grouping methods (expertise-balanced or random) did not significantly influence discussion engagement or poor-performing students’ improvement. In our deeper qualitative analysis, we found that struggling students often derived benefits from interactions with expert peers, but this positive effect was not consistent across all groups. We identified challenges that expert students face in peer instruction interactions, highlighting the complexity of leveraging expertise within group discussions.
- Inactivation of Arid1a in the endometrium is associated with endometrioid tumorigenesis through transcriptional reprogrammingRahmanto, Yohan Suryo; Shen, Wenjing; Shi, Xu; Chen, Xi; Yu, Yu; Yu, Zheng-Cheng; Miyamoto, Tsutomu; Lee, Meng-Horng; Singh, Vivek; Asaka, Ryoichi; Shimberg, Geoffrey; Vitolo, Michele I.; Martin, Stuart S.; Wirtz, Denis; Drapkin, Ronny; Xuan, Jianhua; Wang, Tian-Li; Shih, Ie-Ming (2020-06-01)Somatic inactivating mutations of ARID1A, a SWI/SNF chromatin remodeling gene, are prevalent in human endometrium-related malignancies. To elucidate the mechanisms underlying how ARID1A deleterious mutation contributes to tumorigenesis, we establish genetically engineered murine models with Arid1a and/or Pten conditional deletion in the endometrium. Transcriptomic analyses on endometrial cancers and precursors derived from these mouse models show a close resemblance to human uterine endometrioid carcinomas. We identify transcriptional networks that are controlled by Arid1a and have an impact on endometrial tumor development. To verify findings from the murine models, we analyze ARID1A(WT) and ARID1A(KO) human endometrial epithelial cells. Using a systems biology approach and functional studies, we demonstrate that ARID1A deficiency leads to loss of TGF-beta tumor suppressive function and that inactivation of the ARID1A/TGF-beta axis promotes migration and invasion of PTEN-deleted endometrial tumor cells. These findings provide molecular insights into how ARID1A inactivation accelerates endometrial tumor progression and dissemination, the major causes of cancer mortality.
- mAPC-GibbsOS: an integrated approach for robust identification of gene regulatory networksShi, Xu; Gu, Jinghua; Chen, Xi; Shajahan, Ayesha; Hilakivi-Clarke, Leena; Clarke, Robert; Xuan, Jianhua (2013-12-09)Background: Identification of cooperative gene regulatory networks is an important topic for biological study, especially in cancer research. Traditional approaches suffer from large noise in gene expression data and false positive connections in motif binding data; they also fail to identify the modularized structure of gene regulatory networks. Methods that can reveal the underlying modularized structure and are robust to noise and false positives need to be developed. Results: We proposed and developed an integrated approach to identify gene regulatory networks, which consists of a novel clustering method (namely motif-guided affinity propagation clustering (mAPC)) and a sampling-based method (called Gibbs sampler based on outlier sum statistic (GibbsOS)). mAPC is used in the first step to obtain co-regulated gene modules by clustering genes with a similarity measurement that takes into account both gene expression data and binding motif information. This clustering method can reduce the noise effect from microarray data to obtain modularized gene clusters. However, due to many false positives in motif binding data, some genes not regulated by certain transcription factors (TFs) will be falsely clustered with true target genes. To overcome this problem, GibbsOS is applied in the second step to refine each cluster for the identification of true target genes. To evaluate the performance of the proposed method, we generated simulation data under different signal-to-noise ratios and false positive ratios. The experimental results show improved accuracy in terms of clustering and transcription factor identification. Moreover, improved performance is demonstrated in target gene identification as compared with GibbsOS.
Finally, we applied the proposed method to two breast cancer patient datasets to identify cooperative transcriptional regulatory networks associated with recurrence of breast cancer, as supported by their functional annotations. Conclusions: We have developed a two-step approach for gene regulatory network identification, featuring an integrated method to identify modularized regulatory structures and subsequently refine their target genes. Simulation studies have shown the robustness of the method against noise in gene expression data and false positives in motif binding data. The proposed method has been applied to two breast cancer gene expression datasets to infer the hidden regulation mechanisms. The experimental results demonstrate the efficacy of the method in identifying key regulatory networks related to the progression and recurrence of breast cancer.
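The abstract above names an outlier sum statistic without defining it. The sketch below shows the outlier sum as it is commonly formulated (median/MAD standardization, then summing case-group values above the third quartile plus one interquartile range); whether GibbsOS uses exactly this form is an assumption, and the Gibbs sampling and mAPC clustering steps are not shown. The function name `outlier_sum` and the toy data are hypothetical.

```python
import statistics


def outlier_sum(values, is_case):
    """Outlier sum for one gene across samples (illustrative sketch).

    values  : expression values, one per sample
    is_case : parallel booleans, True for samples in the case group
    """
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    scale = mad if mad > 0 else 1.0  # guard against a zero MAD
    z = [(v - med) / scale for v in values]

    # Outlier cutoff: third quartile plus one interquartile range.
    q1, _, q3 = statistics.quantiles(z, n=4)
    cutoff = q3 + (q3 - q1)

    # Sum only the case-group values that exceed the cutoff.
    return sum(zi for zi, case in zip(z, is_case) if case and zi > cutoff)


groups = [False] * 8 + [True] * 2
with_outliers = [1, 2, 3, 4, 5, 6, 7, 8, 50, 60]
without_outliers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(outlier_sum(with_outliers, groups), outlier_sum(without_outliers, groups))
```

A gene whose case samples contain extreme values receives a large positive score, while a gene with no values above the cutoff scores zero, which is what makes the statistic useful for separating true targets from falsely clustered genes.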