Browsing by Author "Tang, Man"
- Bayesian population dynamics modeling to guide population restoration and recovery of endangered mussels in the Clinch River, Tennessee and Virginia
  Tang, Man (Virginia Tech, 2013-01-16)
  Freshwater mussels have played an important role in the history of human culture and in ecosystem functioning. During the past several decades, however, the abundance and diversity of mussel species have declined all over the world. To address the urgent need to maintain and restore populations of endangered freshwater mussels, quantitative population dynamics modeling is needed to evaluate population status and guide management. One endangered species, the oyster mussel (Epioblasma capsaeformis), was selected for this study of population dynamics. The analysis was based on two datasets: length frequency data from annual surveys conducted at three sites in the Clinch River, namely Wallen Bend (Clinch River Mile [CRM] 192) from 2004 to 2010, Frost Ford (CRM 182) from 2005 to 2010, and Swan Island (CRM 172) from 2005 to 2010; and age-length data based on shell thin-sections. Three hypothetical scenarios were assumed in model estimation: (1) constant natural mortality; (2) one constant natural mortality rate for young mussels and another for adult mussels; (3) age-specific natural mortality. A Bayesian approach was used to analyze the age-structured models, and a Bayesian model averaging approach was applied to average the results by weighting each model using the deviance information criterion (DIC). A risk assessment was conducted to evaluate alternative restoration strategies for E. capsaeformis. The results indicated that releasing adult mussels was the quickest way to increase mussel population size, and that increasing the survival and fertility of young mussels was a suitable way to restore mussel populations in the long term. The population of E. capsaeformis at Frost Ford had a lower risk of decline than the populations at Wallen Bend and Swan Island. Passive integrated transponder (PIT) tags were applied in fieldwork to monitor the translocation efficiency of E. capsaeformis and Actinonaias pectorosa at Cleveland Islands (CRM 270.8). Hierarchical Bayesian models were developed to address individual variability and sex-related differences in growth. In model selection, the model considering individual variability and sex-related differences (if a species has sexual dimorphism) yielded the lowest DIC value. The results from the best model showed that the mean asymptotic length and mean growth rate of female E. capsaeformis were 45.34 mm and 0.279, which were higher than the values estimated for males (42.09 mm and 0.216). The mean asymptotic length and mean growth rate for A. pectorosa were 104.2 mm and 0.063, respectively. To test for individual and sex-related variability in survival and recapture rates, Bayesian models were developed for the analysis of the mark-recapture data of E. capsaeformis and A. pectorosa, and DIC was used to compare the models. The median survival rates of male E. capsaeformis, female E. capsaeformis and A. pectorosa were high (>87%, >74% and >91%, respectively), indicating that the habitat at Cleveland Islands was suitable for these two mussel species over the survey period. In addition, the median recapture rates for E. capsaeformis and A. pectorosa were >93% and >96%, indicating that the PIT tag technique provided an efficient monitoring approach. According to the model comparison results, the non-hierarchical model, or the model with sex-related differences in survival rate (if a species is sexually dimorphic), is suggested for analyzing mark-recapture data when sample sizes are small.
- Identifying Transcriptional Regulatory Modules Among Different Chromatin States in Mouse Neural Stem Cells
  Banerjee, Sharmi; Zhu, Hongxiao; Tang, Man; Feng, Wu-chun; Wu, Xiaowei; Xie, Hehuang David (Frontiers, 2019-01-15)
  Gene expression regulation is a complex process involving the interplay between transcription factors (TFs) and chromatin states. Significant progress has been made toward understanding the impact of chromatin states on gene expression. Nevertheless, how transcription factors bind combinatorially in different chromatin states to enable selective regulation of gene expression remains an open research question. We introduce a nonparametric Bayesian clustering method for inhomogeneous Poisson processes to detect heterogeneous binding patterns of multiple proteins, including transcription factors, that form regulatory modules in different chromatin states. We applied this approach to ChIP-seq data for 21 proteins in mouse neural stem cells and observed different groups, or modules, of proteins clustered within different chromatin states. These chromatin-state-specific regulatory modules were found to have a significant influence on gene expression. We also observed different motif preferences for certain TFs between chromatin states. Our results reveal a degree of interdependency between chromatin states and combinatorial binding of proteins in the complex transcriptional regulatory process. The software package is available on GitHub at https://github.com/BSharmi/DPM-LGCP.
- Statistical methods for variant discovery and functional genomic analysis using next-generation sequencing data
  Tang, Man (Virginia Tech, 2020-01-03)
  The development of high-throughput next-generation sequencing (NGS) techniques produces massive amounts of data, allowing the identification of biomarkers for early disease diagnosis and driving the transformation of most disciplines in biology and medicine. Greater effort is needed to develop novel, powerful, and efficient tools for NGS data analysis. This dissertation focuses on modeling "omics" data in various NGS applications, with the primary goal of developing novel statistical methods to identify sequence variants, find transcription factor (TF) binding patterns, and decode the relationship between TF binding and gene expression levels. Accurate and reliable identification of sequence variants, including single nucleotide polymorphisms (SNPs) and insertion-deletion polymorphisms (INDELs), plays a fundamental role in NGS applications. Existing methods for calling these variants often make the simplifying assumption of positional independence and fail to leverage the dependence of genotypes at nearby loci induced by linkage disequilibrium. We propose vi-HMM, a hidden Markov model (HMM)-based method for calling SNPs and INDELs in mapped short-read data. Simulation experiments show that, under various sequencing depths, vi-HMM outperforms existing methods in terms of sensitivity and F1 score. When applied to human whole genome sequencing data, vi-HMM demonstrates higher accuracy in calling SNPs and INDELs. One important NGS application is chromatin immunoprecipitation followed by sequencing (ChIP-seq), which characterizes protein-DNA relations through genome-wide mapping of TF binding sites. Multiple TFs binding to DNA sequences often show complex binding patterns, which indicate how TFs with similar functionalities work together to regulate the expression of target genes. To help uncover the transcriptional regulation mechanism, we propose a novel nonparametric Bayesian method to detect the clustering pattern of multiple-TF bindings from ChIP-seq datasets. A simulation study demonstrates that our method performs best with regard to precision, recall, and F1 score in comparison with traditional methods. We also apply the method to real data and observe several TF clusters that have been recognized previously in mouse embryonic stem cells. Recent advances in ChIP-seq and RNA sequencing (RNA-Seq) technologies provide more reliable and accurate characterization of TF binding sites and gene expression measurements, which serve as a basis for studying the regulatory functions of TFs on gene expression. We propose a log Gaussian Cox process with a wavelet-based functional model to quantify the relationship between TF binding site locations and gene expression levels. Through a simulation study, we demonstrate that our method performs well, especially with large sample sizes and small variance. It also shows a remarkable ability to distinguish real local features in the function estimates.
- vi-HMM: a novel HMM-based method for sequence variant identification in short-read data
  Tang, Man; Hasan, Mohammad Shabbir; Zhu, Hongxiao; Zhang, Liqing; Wu, Xiaowei (2019-02-13)
  Background: Accurate and reliable identification of sequence variants, including single nucleotide polymorphisms (SNPs) and insertion-deletion polymorphisms (INDELs), plays a fundamental role in next-generation sequencing (NGS) applications. Existing methods for calling these variants often make simplified assumptions of positional independence and fail to leverage the dependence between genotypes at nearby loci that is caused by linkage disequilibrium (LD). Results and conclusion: We propose vi-HMM, a hidden Markov model (HMM)-based method for calling SNPs and INDELs in mapped short-read data. This method allows transitions between hidden states (defined as “SNP,” “Ins,” “Del,” and “Match”) of adjacent genomic bases and determines an optimal hidden state path by using the Viterbi algorithm. The inferred hidden state path provides a direct solution to the identification of SNPs and INDELs. Simulation studies show that, under various sequencing depths, vi-HMM outperforms commonly used variant calling methods in terms of sensitivity and F1 score. When applied to real data, vi-HMM demonstrates higher accuracy in calling SNPs and INDELs.
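
The vi-HMM abstract above describes choosing an optimal path through the hidden states "Match," "SNP," "Ins," and "Del" with the Viterbi algorithm. The Python sketch below illustrates that decoding step on a toy problem. It is not the authors' implementation: the start, transition, and emission probabilities are made-up values, and the four observation symbols (`agree`, `mismatch`, `extra`, `gap`) are a hypothetical per-base summary of how the mapped reads compare with the reference, introduced only for illustration.

```python
import math

# Hidden states named in the vi-HMM abstract.
STATES = ["Match", "SNP", "Ins", "Del"]

# All probabilities below are hypothetical toy values, not taken from the paper.
START = {"Match": 0.85, "SNP": 0.05, "Ins": 0.05, "Del": 0.05}
TRANS = {
    "Match": {"Match": 0.85, "SNP": 0.05, "Ins": 0.05, "Del": 0.05},
    "SNP":   {"Match": 0.90, "SNP": 0.08, "Ins": 0.01, "Del": 0.01},
    "Ins":   {"Match": 0.70, "SNP": 0.01, "Ins": 0.28, "Del": 0.01},
    "Del":   {"Match": 0.70, "SNP": 0.01, "Ins": 0.01, "Del": 0.28},
}
# Hypothetical observation symbols summarizing the reads at one position.
EMIT = {
    "Match": {"agree": 0.97, "mismatch": 0.01, "extra": 0.01, "gap": 0.01},
    "SNP":   {"agree": 0.05, "mismatch": 0.93, "extra": 0.01, "gap": 0.01},
    "Ins":   {"agree": 0.05, "mismatch": 0.01, "extra": 0.93, "gap": 0.01},
    "Del":   {"agree": 0.05, "mismatch": 0.01, "extra": 0.01, "gap": 0.93},
}


def viterbi(observations):
    """Return the most probable hidden-state path for a list of observations."""
    if not observations:
        return []

    # Log-probability of the best path ending in each state at position 0.
    score = {
        s: math.log(START[s]) + math.log(EMIT[s][observations[0]])
        for s in STATES
    }
    backpointers = []  # one dict per later position: state -> best previous state

    for obs in observations[1:]:
        new_score, pointers = {}, {}
        for cur in STATES:
            # Best predecessor for `cur`, working in log space to avoid underflow.
            best_prev = max(
                STATES, key=lambda p: score[p] + math.log(TRANS[p][cur])
            )
            new_score[cur] = (
                score[best_prev]
                + math.log(TRANS[best_prev][cur])
                + math.log(EMIT[cur][obs])
            )
            pointers[cur] = best_prev
        score = new_score
        backpointers.append(pointers)

    # Trace back from the best final state to recover the full path.
    state = max(STATES, key=score.get)
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


if __name__ == "__main__":
    toy = ["agree", "agree", "mismatch", "agree", "gap", "gap", "agree"]
    print(viterbi(toy))
```

In this toy setup, positions decoded as "SNP," "Ins," or "Del" would be read off as variant calls, which mirrors the abstract's statement that the inferred path directly identifies SNPs and INDELs. Because neighboring states are linked through the transition table, a single noisy observation does not automatically produce a call.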