Frequentist-Bayesian Hybrid Tests in Semi-parametric and Non-parametric Models with Low/High-Dimensional Covariate

Date

2014-12-03

Publisher

Virginia Tech

Abstract

In this dissertation we develop Frequentist-Bayesian hybrid test statistics for two testing problems. The first is to test for significant differences between non-parametric functions. The second is to design a test that is sensitive to any departure from a constant in the function of the high-dimensional predictors X. We also describe how to implement the proposed test statistics for both problems.

For the first testing problem, the statistical difference among massive outcomes or signals is of interest in many diverse fields, including neurophysiology, imaging, and engineering. However, such data often arise from nonlinear systems, exhibit row/column patterns, follow non-normal distributions, and have other internal relationships that are hard to identify. The unknown relationships and the high dimensionality both make it difficult to test the significance of the differences between them. In this dissertation, we propose an Adaptive Bayes Sum Test capable of testing the significance of the difference between two nonlinear systems, based on universal non-parametric mathematical decomposition/smoothing components. Our approach adapts the Bayes sum test statistic of Hart (2009). Any internal pattern is handled through a Fourier transformation. Resampling techniques are applied to construct the empirical distribution of the test statistic, which reduces the effect of non-normality. A simulation study suggests that our approach performs better than the alternative method, the Adaptive Neyman Test of Fan and Lin (1998). The usefulness of our approach is demonstrated with an application to the identification of electronic chips and an application to testing for changes in precipitation patterns.
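
As a rough illustration of the ingredients named above (a Fourier transform of the difference between two sets of curves, a sum-type statistic over the leading coefficients, and resampling to calibrate it), the following Python sketch implements the comparison method, the Adaptive Neyman statistic in its commonly stated unnormalized form, with a permutation null distribution. It is not the dissertation's Adaptive Bayes Sum Test. The function names, the pooled-variance standardization, and the permutation scheme are all assumptions made for illustration.

```python
import numpy as np


def real_fourier(x):
    """Orthonormal real Fourier coefficients of x, ordered low to high frequency."""
    x = np.asarray(x, dtype=float)
    n = x.size
    f = np.fft.rfft(x)
    # Frequencies that contribute both a cosine (real) and a sine (imaginary) term.
    paired = f[1:(n + 1) // 2]
    parts = [[f[0].real],
             np.sqrt(2.0) * np.column_stack([paired.real, paired.imag]).ravel()]
    if n % 2 == 0:
        parts.append([f[-1].real])  # Nyquist term exists only for even n
    return np.concatenate(parts) / np.sqrt(n)


def adaptive_neyman(z):
    """max over m of sum_{j<=m} (z_j^2 - 1) / sqrt(2m), for roughly N(0, 1) inputs."""
    z = np.asarray(z, dtype=float)
    m = np.arange(1, z.size + 1)
    return np.max(np.cumsum(z ** 2 - 1.0) / np.sqrt(2.0 * m))


def statistic(a, b):
    """Statistic for the difference between the mean curves of a and b.

    a and b have shape (number of curves, number of time points).
    Assumption: one pooled noise variance shared by all time points.
    """
    diff = a.mean(axis=0) - b.mean(axis=0)
    resid = np.vstack([a - a.mean(axis=0), b - b.mean(axis=0)])
    dof = (a.shape[0] + b.shape[0] - 2) * a.shape[1]
    sigma2 = (resid ** 2).sum() / dof
    scale = np.sqrt(sigma2 * (1.0 / a.shape[0] + 1.0 / b.shape[0]))
    return adaptive_neyman(real_fourier(diff / scale))


def permutation_pvalue(a, b, n_perm=500, seed=0):
    """Calibrate the statistic by shuffling curves between the two groups."""
    rng = np.random.default_rng(seed)
    pooled = np.vstack([a, b])
    n_a = a.shape[0]
    observed = statistic(a, b)
    count = 0
    for _ in range(n_perm):
        idx = rng.permutation(pooled.shape[0])
        count += int(statistic(pooled[idx[:n_a]], pooled[idx[n_a:]]) >= observed)
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (1 + count) / (1 + n_perm)
```

Calling `permutation_pvalue(a, b)` on two arrays of curves returns the observed statistic and a resampling-based p-value. Because the null distribution comes from the permutations rather than from a normal approximation, the calibration does not rely on the noise being Gaussian, which is the motivation for resampling given in the abstract.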

For the second testing problem, numerous statistical methods have been developed for analyzing high-dimensional data. These methods mainly focus on variable selection; they are of limited use for testing with high-dimensional data and often require likelihood functions with explicit derivatives. In this dissertation, we propose the "Hybrid Omnibus Test" for testing with high-dimensional data under far fewer requirements. The Hybrid Omnibus Test is developed under a semi-parametric framework in which a likelihood function is no longer necessary. It is a Frequentist-Bayesian hybrid score-type test for a functional generalized partial linear single index model, in which the link is a functional of the predictors through a generalized partially linear single index. We propose an efficient score based on estimating equations to overcome the mathematical difficulty of deriving the likelihood, and use it to construct the Hybrid Omnibus Test. In a simulation study, we compare our approach with an empirical likelihood ratio test and with Bayesian inference based on the Bayes factor in terms of false positive rate and true positive rate. The simulation results suggest that our approach outperforms both in terms of false positive rate, true positive rate, and computational cost, in both the high-dimensional and low-dimensional cases. The advantage of our approach is also demonstrated in an application to genetic pathway data for type II diabetes, with support from published biological results.

Keywords

Bayes Factor, Bayes Sum Test, Discrete Fourier Transform, Hybrid, Laplace approximation, Neyman Test, Omnibus, Resampling, Score, Single index, Spline Approximation
