Representation Learning Based Causal Inference in Observational Studies

Virginia Tech

This dissertation investigates novel statistical approaches to causal effect estimation in observational settings, where controlled experimentation is infeasible and confounding is the main obstacle to estimating causal effects. Deconfounding is therefore the central subject of this dissertation, with two aims: (i) to restore covariate balance between treatment groups, and (ii) to attenuate spurious correlations in the training data so that the resulting causal conclusions are valid and generalize. By incorporating ideas from representation learning, adversarial matching, generative causal estimation, and invariant risk minimization, this dissertation establishes a causal framework that balances the covariate distribution in a latent representation space to yield individualized estimates, and further contributes novel perspectives on causal effect estimation based on invariance principles.

The dissertation begins with a systematic review and examination of classical propensity-score-based balancing schemes for population-level causal effect estimation, presented in Chapter 2. Three causal estimands that target different subpopulations are considered: the average treatment effect on the whole population (ATE), the average treatment effect on the treated population (ATT), and the average treatment effect on the overlap population (ATO). The procedure is demonstrated on a naturalistic driving study (NDS) to evaluate the causal effect of cellphone distraction on crash risk. While highlighting the importance of adopting a causal perspective when analyzing risk factors, the chapter discusses the limitations of these schemes in balance efficiency and in robustness to high-dimensional data and complex interactions, as well as the need for individualization, to motivate subsequent developments.
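As an illustration of how the three estimands differ, the sketch below computes the standard balancing weights for ATE, ATT, and ATO from given propensity scores and forms a normalized weighted difference in outcome means. It is a minimal sketch, not the dissertation's implementation: the function names `balancing_weights` and `weighted_effect` are hypothetical, and the propensity scores are assumed to be already estimated (for example, by logistic regression).

```python
from typing import List, Sequence


def balancing_weights(
    treatment: Sequence[int],
    propensity: Sequence[float],
    estimand: str = "ATE",
) -> List[float]:
    """Return one balancing weight per unit for the chosen estimand.

    treatment[i] is 1 for treated and 0 for control units;
    propensity[i] is the (assumed known) probability of treatment given covariates.
    """
    weights = []
    for z, e in zip(treatment, propensity):
        if not 0.0 < e < 1.0:
            raise ValueError("propensity scores must lie strictly in (0, 1)")
        if estimand == "ATE":    # inverse probability weights
            w = 1.0 / e if z == 1 else 1.0 / (1.0 - e)
        elif estimand == "ATT":  # treated kept as-is, controls reweighted by the odds
            w = 1.0 if z == 1 else e / (1.0 - e)
        elif estimand == "ATO":  # overlap weights
            w = 1.0 - e if z == 1 else e
        else:
            raise ValueError(f"unknown estimand: {estimand}")
        weights.append(w)
    return weights


def weighted_effect(
    outcome: Sequence[float],
    treatment: Sequence[int],
    propensity: Sequence[float],
    estimand: str = "ATE",
) -> float:
    """Difference of normalized weighted outcome means, treated minus control."""
    w = balancing_weights(treatment, propensity, estimand)
    num1 = sum(wi * yi for wi, yi, zi in zip(w, outcome, treatment) if zi == 1)
    den1 = sum(wi for wi, zi in zip(w, treatment) if zi == 1)
    num0 = sum(wi * yi for wi, yi, zi in zip(w, outcome, treatment) if zi == 0)
    den0 = sum(wi for wi, zi in zip(w, treatment) if zi == 0)
    return num1 / den1 - num0 / den0
```

The three weighting schemes differ only in which target population they emphasize. Overlap weights, for instance, stay bounded between 0 and 1 even when propensity scores approach 0 or 1, whereas inverse probability weights can become very large.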

Chapter 3 presents a novel generative Bayesian causal estimation framework named Balancing Variational Neural Inference of Causal Effects (BV-NICE). By appealing to the Robinson factorization and a latent Bayesian model, a novel variational bound on the likelihood is derived, explicitly characterized by the causal effect and the propensity score. Notably, by treating observed variables as noisy proxies of unmeasurable latent confounders, the variational posterior approximation is repurposed as a stochastic feature encoder that fully acknowledges representation uncertainties. To resolve the imbalance in representations, BV-NICE enforces KL-regularization on the respective representation marginals using Fenchel minimax learning, justified by a new generalization bound on the counterfactual prediction accuracy. The robustness and effectiveness of this framework are demonstrated through an extensive set of tests against competing solutions on semi-synthetic and real-world datasets.
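For readers unfamiliar with the Robinson factorization, its standard form for a binary treatment can be written as follows. The notation here is an assumption for illustration and may differ from the dissertation's: $Y$ is the outcome, $T \in \{0, 1\}$ the treatment, $X$ the covariates, $\mu_0(x)$ the control outcome mean, $\tau(x)$ the conditional treatment effect, $e(x)$ the propensity score, and $m(x)$ the conditional mean outcome.

```latex
% Outcome model with additive noise, E[\varepsilon \mid X, T] = 0:
Y = \mu_0(X) + T\,\tau(X) + \varepsilon
% Conditional mean outcome and propensity score:
m(X) = \mathbb{E}[Y \mid X] = \mu_0(X) + e(X)\,\tau(X), \qquad e(X) = \Pr(T = 1 \mid X)
% Subtracting m(X) from Y gives the Robinson factorization:
Y - m(X) = \bigl(T - e(X)\bigr)\,\tau(X) + \varepsilon
```

In this form the outcome residual depends only on the causal effect $\tau$ and the propensity score $e$, which is why a likelihood built on it can be characterized by those two quantities.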

Recognizing that causal conclusions can become unreliable when extended beyond the training distribution, Chapter 4 argues that ascertaining causal stability is key and introduces a novel procedure called Risk Invariant Causal Estimation (RICE). By carefully re-examining the relationship between statistical invariance and causality, RICE leverages observed data disparities to enable the identification of stable causal effects. Concretely, the causal inference objective is reformulated under the framework of invariant risk minimization (IRM), where a population-optimality penalty is enforced to filter out effects that do not generalize across heterogeneous populations. Importantly, RICE makes counterfactual reasoning feasible in settings with unobserved confounding or biased sampling designs. The effectiveness of this new proposal is verified across a variety of study designs on real and synthetic data.
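To make the idea of an invariance penalty concrete, the sketch below implements the generic IRMv1-style penalty for squared loss: for each environment, it squares the gradient of that environment's risk with respect to a scalar multiplier on the predictions, evaluated at 1. This is the penalty from the general IRM literature, not necessarily the population-optimality penalty used by RICE, and the names `irm_penalty` and `irm_objective` are hypothetical.

```python
from typing import Sequence, Tuple


def irm_penalty(predictions: Sequence[float], targets: Sequence[float]) -> float:
    """Squared gradient of one environment's risk w.r.t. a scalar multiplier w, at w = 1.

    With risk R(w) = mean((w * f_i - y_i) ** 2), the derivative at w = 1 is
    mean(2 * (f_i - y_i) * f_i). A zero gradient means the predictor is
    already optimal for this environment up to rescaling.
    """
    n = len(predictions)
    grad = sum(2.0 * (f - y) * f for f, y in zip(predictions, targets)) / n
    return grad ** 2


def irm_objective(
    environments: Sequence[Tuple[Sequence[float], Sequence[float]]],
    penalty_weight: float = 1.0,
) -> float:
    """Sum over environments of the mean squared error plus the weighted penalty."""
    total = 0.0
    for predictions, targets in environments:
        n = len(predictions)
        risk = sum((f - y) ** 2 for f, y in zip(predictions, targets)) / n
        total += risk + penalty_weight * irm_penalty(predictions, targets)
    return total
```

A predictor that is simultaneously optimal in every environment incurs zero penalty. Predictors that exploit environment-specific correlations are penalized in the environments where those correlations fail.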

In summary, this dissertation presents a flexible causal inference framework that accounts for representation uncertainty and data heterogeneity. It offers three merits: improved balance under complex covariate interactions, enhanced robustness to unobserved latent confounders, and better generalizability to novel populations.

Causal Inference, Representation Learning, Naturalistic Driving Study, Propensity Score, Representation Balancing, Invariant Risk Minimization