Bayesian Hierarchical Latent Model for Gene Set Analysis
Pathway is a set of genes which are predefined and serve a particular celluar or physiological function. Ranking pathways relevant to a particular phenotype can help researchers focus on a few sets of genes in pathways. In this thesis, a Bayesian hierarchical latent model was proposed using generalized linear random effects model. The advantage of the approach was that it can easily incorporate prior knowledges when the sample size was small and the number of genes was large. For the covariance matrix of a set of random variables, two Gaussian random processes were considered to construct the dependencies among genes in a pathway. One was based on the polynomial kernel and the other was based on the Gaussian kernel. Then these two kernels were compared with constant covariance matrix of the random effect by using the ratio, which was based on the joint posterior distribution with respect to each model. For mixture models, log-likelihood values were computed at different values of the mixture proportion, compared among mixtures of selected kernels and point-mass density (or constant covariance matrix). The approach was applied to a data set (Mootha et al., 2003) containing the expression profiles of type II diabetes where the motivation was to identify pathways that can discriminate between normal patients and patients with type II diabetes.