Chen, Xi; Neuwald, Andrew F.; Hilakivi-Clarke, Leena; Clarke, Robert; Xuan, Jianhua
2022-02-22
2022-02-22
2021-07-01
1553-734X
PCOMPBIOL-D-20-01578 (PII)
http://hdl.handle.net/10919/108817
Transcription factors (TFs) often function as a module including both master factors and mediators binding at cis-regulatory regions to modulate nearby gene transcription. ChIP-seq profiling of multiple TFs makes it feasible to infer functional TF modules. However, when inferring TF modules based on co-localization of ChIP-seq peaks, often many weak binding events are missed, especially for mediators, resulting in incomplete identification of modules. To address this problem, we develop a ChIP-seq data-driven Gibbs Sampler to infer Modules (ChIP-GSM) using a Bayesian framework that integrates ChIP-seq profiles of multiple TFs. ChIP-GSM samples read counts of module TFs iteratively to estimate the binding potential of a module to each region and, across all regions, estimates the module abundance. Using inferred module-region probabilistic bindings as feature units, ChIP-GSM then employs logistic regression to predict active regulatory elements. Validation of ChIP-GSM predicted regulatory regions on multiple independent datasets sharing the same context confirms the advantage of using TF modules for predicting regulatory activity. In a case study of K562 cells, we demonstrate that the ChIP-GSM inferred modules form as groups, activate gene expression at different time points, and mediate diverse functional cellular processes.
Hence, ChIP-GSM infers biologically meaningful TF modules and improves the prediction accuracy of regulatory region activities.
22 page(s)
application/pdf
en
Creative Commons Attribution 4.0 International
Biochemical Research Methods; Mathematical & Computational Biology; Biochemistry & Molecular Biology
CHROMATIN-STATE DISCOVERY; SEQ; EXPRESSION; ENHANCERS; LINEAGE; ROLES
01 Mathematical Sciences; 06 Biological Sciences; 08 Information and Computing Sciences; Bioinformatics
K562 Cells; Chromatin; Humans; Transcription Factors; Models, Statistical; Bayes Theorem; Computational Biology; Gene Expression Regulation; Epigenesis, Genetic; Binding Sites; Regulatory Sequences, Nucleic Acid; Gene Regulatory Networks; Enhancer Elements, Genetic; Promoter Regions, Genetic; MCF-7 Cells; Chromatin Immunoprecipitation Sequencing
ChIP-GSM: Inferring active transcription factor modules to predict functional regulatory elements
Article - Refereed
2022-02-22
PLOS Computational Biology
https://doi.org/10.1371/journal.pcbi.1009203
17
7
34292930
1553-7358