Variable screening and graphical modeling for ultra-high dimensional longitudinal data

Zhang, Yafei

dc.contributor.author	Zhang, Yafei	en
dc.contributor.committeechair	Du, Pang	en
dc.contributor.committeemember	Wu, Xiaowei	en
dc.contributor.committeemember	Kim, Inyoung	en
dc.contributor.committeemember	Hong, Yili	en
dc.contributor.department	Statistics	en
dc.date.accessioned	2020-12-24T07:00:36Z	en
dc.date.available	2020-12-24T07:00:36Z	en
dc.date.issued	2019-07-02	en
dc.description.abstract	Ultrahigh-dimensional variable selection is of great importance in statistical research, and independence screening is a powerful tool for selecting important variables when the number of variables is massive. Commonly used independence screening procedures are based on single-replicate data and are not applicable to longitudinal data. This motivates us to propose a new Sure Independence Screening (SIS) procedure that brings the dimension from ultra-high down to a relatively large scale that is similar to or smaller than the sample size. In Chapter 2, we provide two types of SIS, together with their iterative extensions (iterative SIS), to enhance finite-sample performance. An upper bound on the number of variables to be included is derived, and assumptions are given under which sure screening is applicable. The proposed procedures are assessed by simulations, and an application to a study on systemic lupus erythematosus illustrates their practical use. After the variable screening process, we explore the relationships among the variables. Graphical models are commonly used to explore the association network for a set of variables, which could be genes or other objects under study. However, the graphical models currently in use are designed only for single-replicate data, not for longitudinal data. In Chapter 3, we propose a penalized likelihood approach to identify the edges in a conditional independence graph for longitudinal data. We use pairwise coordinate descent combined with second-order cone programming to optimize the penalized likelihood and estimate the parameters. Furthermore, we extend the nodewise regression method to the longitudinal data case. Simulation and real data analysis demonstrate the competitive performance of the penalized likelihood method.	en
dc.description.abstractgeneral	Longitudinal data have received considerable attention in health science studies. The information from this type of data can help with disease detection and control. In addition, a graph of the factors related to a disease can be built to represent the relationships among them. In this dissertation, we develop a framework for identifying, from thousands of factors in longitudinal data, the important factors that are related to the disease. We also develop a graphical method that shows the relationships among the important factors identified by the screening. In practice, combining these two methods can identify the important factors for a disease as well as the relationships among them, and thus provide a deeper understanding of the disease.	en
dc.description.degree	Doctor of Philosophy	en
dc.format.medium	ETD	en
dc.identifier.other	vt_gsexam:20483	en
dc.identifier.uri	http://hdl.handle.net/10919/101662	en
dc.publisher	Virginia Tech	en
dc.rights	In Copyright	en
dc.rights.uri	http://rightsstatements.org/vocab/InC/1.0/	en
dc.subject	graphical model	en
dc.subject	variable screening	en
dc.subject	longitudinal data analysis	en
dc.title	Variable screening and graphical modeling for ultra-high dimensional longitudinal data	en
dc.type	Dissertation	en
thesis.degree.discipline	Statistics	en
thesis.degree.grantor	Virginia Polytechnic Institute and State University	en
thesis.degree.level	doctoral	en
thesis.degree.name	Doctor of Philosophy	en

Files

Original bundle

Name: Zhang_Y_D_2019.pdf
Size: 546.61 KB
Format: Adobe Portable Document Format

Collections

Doctoral Dissertations