Browsing by Author "Zhang, Lin"
- Discrete Second Order Adjoints in Atmospheric Chemical Transport Modeling
  Sandu, Adrian; Zhang, Lin (Department of Computer Science, Virginia Polytechnic Institute & State University, 2007)
  Atmospheric chemical transport models (CTMs) are essential tools for the study of air pollution, for environmental policy decisions, for the interpretation of observational data, and for producing air quality forecasts. Many air quality studies require sensitivity analyses, i.e., the computation of derivatives of the model output with respect to model parameters. The derivatives of a cost functional (defined on the model output) with respect to a large number of model parameters can be calculated efficiently through adjoint sensitivity analysis. While the traditional (first order) adjoint models give the gradient of the cost functional with respect to parameters, second order adjoint models give second derivative information in the form of products between the Hessian of the cost functional and a user-defined vector. In this paper we discuss the mathematical foundations of the discrete second order adjoint sensitivity method and present a complete set of computational tools for performing second order sensitivity studies in three-dimensional atmospheric CTMs. The tools include discrete second order adjoints of Runge-Kutta and of Rosenbrock time stepping methods for stiff equations together with efficient implementation strategies. Numerical examples illustrate the use of these computational tools in important applications like sensitivity analysis, optimization, uncertainty quantification, and the calculation of directions of maximal error growth in three-dimensional atmospheric CTMs.
- Efficient Web Archive Searching
  Cheng, Ming; Wu, Yijing; Zhou, Xiaolin; Li, Jinyang; Zhang, Lin (Virginia Tech, 2020-05)
  The field of efficient web archive searching is at a turning point. In the early years of web archive searching, organizations used only the URL as a key to search through the dataset, which was inefficient but acceptable. In recent years, as the volume of data in web archives has grown, these ordinary searching methods have gradually been replaced by more efficient ones. This project addresses the theoretical and methodological implications of choosing and running suitable hashing algorithms locally, with the eventual aim of improving the overall time complexity of web archive searching. The project also introduces the design and implementation of various hashing algorithms that convert URLs to a sortable and shortened format, and demonstrates the corresponding improvement in search efficiency with benchmark results.
- Evaluating Geologic Sources of Arsenic in Well Water in Virginia (USA)
  VanDerwerker, Tiffany; Zhang, Lin; Ling, Erin; Benham, Brian L.; Schreiber, Madeline E. (MDPI, 2018-04-18)
  We investigated whether geologic factors are linked to elevated arsenic (As) concentrations above 5 μg/L in well water in the state of Virginia, USA. Using geologic unit data mapped within GIS and two datasets of measured As concentrations in well water (one from public wells, the other from private wells), we evaluated occurrences of elevated As (above 5 μg/L) based on geologic unit. We also constructed a logistic regression model to examine statistical relationships between elevated As and geologic units. Two geologic units, including Triassic-aged sedimentary rocks and Triassic-Jurassic intrusives of the Culpeper Basin in north-central Virginia, had higher occurrences of elevated As in well water than other geologic units in Virginia. Model results support these patterns, showing a higher probability for As occurrence above 5 μg/L in well water in these two units. Because few observations (<5%) in our data set had elevated As concentrations, our model cannot be used to predict As concentrations in other parts of the state. However, our results are useful for identifying areas of Virginia, defined by underlying geology, that are more likely to have elevated As concentrations in well water. Due to the ease of obtaining publicly available data and the accessibility of GIS, this study approach can be applied to other areas with existing datasets of As concentrations in well water and accessible data on geology.
- The involvement of IRAK-1 in the regulation of NFATc2 in T cells
  Zhang, Lin (Virginia Tech, 2008-08-29)
  Interleukin-1 receptor associated kinase-1 (IRAK-1) is a protein kinase pivotal in mediating signals for innate immune responses. Here, I report that IRAK-1 also regulates cell-mediated immune responses. NFATc2 (nuclear factor of activated T cells) was found to be associated with IRAK-1 in T cells in vitro, and its activity was elevated in the absence of IRAK-1. In addition, IRAK-1-/- mice had increased naturally occurring regulatory T cells and inducible regulatory T cells, as well as increased Th1 responses, compared to WT mice. The findings suggest that activated T cells might employ IRAK-1 to mediate the regulation of acquired immunity. Therefore, IRAK-1 may participate in direct signaling cross talk between innate and acquired immunity.
- Large-Scale Simulations Using First and Second Order Adjoints with Applications in Data Assimilation
  Zhang, Lin (Virginia Tech, 2007-06-09)
  In large-scale air quality simulations we are interested in the factors that influence changes in pollutants and in optimization methods that improve forecasts. These problems can be addressed by incorporating adjoint models, which are efficient in computing the derivatives of a functional with respect to a large number of model parameters. In this research we employ first order adjoints in air quality simulations. Moreover, we explore theoretically the computation of second order adjoints for chemical transport models, and illustrate their feasibility in several aspects. We apply first order adjoints to sensitivity analysis and data assimilation. Through sensitivity analysis, we can discover the area that has the largest influence on changes of ozone concentrations at a receptor. For data assimilation with optimization methods that use first order adjoints, we assess their performance under different scenarios. The results indicate that the L-BFGS method is the most efficient. Unlike first order adjoints, second order adjoints have not been used to date in air quality simulation. To explore their utility, we show the construction of second order adjoints for chemical transport models and demonstrate several applications including sensitivity analysis, optimization, uncertainty quantification, and Hessian singular vectors. Since second order adjoints provide second order information in the form of Hessian-vector products instead of the entire Hessian matrix, it is possible to implement applications for large-scale models that require second order derivatives. Finally, we conclude that second order adjoints for chemical transport models are computationally feasible and effective.
- Semiparametric Bayesian Kernel Survival Model for Highly Correlated High-Dimensional Data.
  Zhang, Lin (Virginia Tech, 2018-05-01)
  We are living in an era in which many mysteries related to science, technologies and design can be answered by "learning" from the huge amount of data accumulated over the past few decades. In the course of those endeavors, highly-correlated high-dimensional data are frequently observed in many areas, including predicting shelf life, controlling manufacturing processes, and identifying important pathways related to diseases. We define a "set" as a group of highly-correlated high-dimensional (HCHD) variables that possess a certain practical meaning or control a certain process, and define an "element" as one of the HCHD variables within a certain set. Such an elements-within-a-set structure is very complicated because: (i) the dimensions of elements in different sets can vary dramatically, ranging from two to hundreds or even thousands; (ii) the true relationships, including element-wise associations, set-wise interactions, and element-set interactions, are unknown; and (iii) the sample size (n) is usually much smaller than the dimension of the elements (p). The goal of this dissertation is to provide a systematic way to identify both the set effects and the element effects associated with survival outcomes from heterogeneous populations using Bayesian survival kernel models. By connecting kernel machines with semiparametric Bayesian hierarchical models, the proposed unified model frameworks can identify significant elements as well as sets regardless of mis-specifications of distributions or kernels. The proposed methods can potentially be applied to a vast range of fields to solve real-world problems.
- Tumour heterogeneity revealed by unsupervised decomposition of dynamic contrast-enhanced magnetic resonance imaging is associated with underlying gene expression patterns and poor survival in breast cancer patients
  Fan, Ming; Xia, Pingping; Liu, Bin; Zhang, Lin; Wang, Yue; Gao, Xin; Li, Lihua (2019-10-17)
  Background: Heterogeneity is a common finding within tumours. We evaluated the imaging features of tumours based on the decomposition of tumoural dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) data to identify their prognostic value for breast cancer survival and to explore their biological importance.
  Methods: Imaging features (n = 14), such as texture, histogram distribution and morphological features, were extracted to determine their associations with recurrence-free survival (RFS) in patients in the training cohort (n = 61) from The Cancer Imaging Archive (TCIA). The prognostic value of the features was evaluated in an independent dataset of 173 patients (i.e. the reproducibility cohort) from the TCIA I-SPY 1 TRIAL dataset. Radiogenomic analysis was performed in an additional cohort, the radiogenomic cohort (n = 87), using DCE-MRI from TCGA-BRCA and corresponding gene expression data from The Cancer Genome Atlas (TCGA). The MRI tumour area was decomposed by convex analysis of mixtures (CAM), resulting in 3 components that represent plasma input, fast-flow kinetics and slow-flow kinetics. The prognostic MRI features were associated with the gene expression module in which the pathway was analysed. Furthermore, a multigene signature for each prognostic imaging feature was built, and the prognostic value for RFS and overall survival (OS) was confirmed in an additional cohort from TCGA.
  Results: Three image features (i.e. the maximum probability from the precontrast MR series, the median value from the second postcontrast series and the overall tumour volume) were independently correlated with RFS (p values of 0.0018, 0.0036 and 0.0032, respectively). The maximum probability feature from the fast-flow kinetics subregion was also significantly associated with RFS and OS in the reproducibility cohort. Additionally, this feature had a high correlation with the gene expression module (r = 0.59), and the pathway analysis showed that Ras signalling, a breast cancer-related pathway, was significantly enriched (corrected p value = 0.0044). Gene signatures (n = 43) associated with the maximum probability feature were assessed for associations with RFS (p = 0.035) and OS (p = 0.027) in an independent dataset containing 1010 gene expression samples. Among the 43 gene signatures, Ras signalling was also significantly enriched.
  Conclusions: Dynamic pattern deconvolution revealed that tumour heterogeneity was associated with poor survival and cancer-related pathways in breast cancer.