Recent Advances on Statistical Network Analysis and Multi-task Learning for Complex Data
Abstract
Real-world data are increasingly complex, arising from diverse domains that demand sophisticated analytical approaches. This dissertation develops statistical methods to address challenges in analyzing social network data and clinical trial data. First, I propose a novel exponential random graph model (ERGM) to study the common knowledge (CK) phenomenon in Facebook social networks. Unlike traditional contagion models, CK allows individuals to coordinate their activation as a group, thereby facilitating both the initiation and propagation of information. To investigate how network structure influences CK-based contagion, the ERGM generates networks while controlling for bicliques, the graph substructures that characterize the generation of CK. Second, according to FDA guidance, prognostic variables (baseline covariates associated with clinical trial study outcomes) must be pre-specified at the study design stage to improve the precision of treatment effect estimation. To support this, I develop a multi-task learning approach that leverages historical trials of the treatment being studied to identify prognostic variables, which can guide the design and analysis of new studies. Its performance is validated through simulations and demonstrated on real-world clinical trial data. In addition, I propose a frequentist dynamic borrowing approach that borrows information from the control arms of historical trials similar to the current study. This approach augments the control arm, improving both the precision of the treatment effect estimate and the efficiency of randomized controlled trials.