Statistical Methods for Variability Management in High-Performance Computing

dc.contributor.author | Xu, Li | en
dc.contributor.committeechair | Hong, Yili | en
dc.contributor.committeechair | Watson, Layne T. | en
dc.contributor.committeemember | Smith, Eric P. | en
dc.contributor.committeemember | Gramacy, Robert B. | en
dc.contributor.committeemember | Deng, Xinwei | en
dc.contributor.department | Statistics | en
dc.date.accessioned | 2021-07-16T08:00:08Z | en
dc.date.available | 2021-07-16T08:00:08Z | en
dc.date.issued | 2021-07-15 | en
dc.description.abstract | High-performance computing (HPC) variability management is an important topic in computer science. Research topics include experimental designs for efficient data collection, surrogate models for predicting the performance variability, and system configuration optimization. Due to the complex architecture of HPC systems, a comprehensive study of HPC variability needs large-scale datasets, and experimental design techniques are useful for improved data collection. Surrogate models are essential to understand the variability as a function of system parameters, which can be obtained by mathematical and statistical models. After predicting the variability, optimization tools are needed for future system designs. This dissertation focuses on HPC input/output (I/O) variability through three main chapters. After the general introduction in Chapter 1, Chapter 2 focuses on the prediction models for the scalar description of I/O variability. A comprehensive comparison study is conducted, and major surrogate models for computer experiments are investigated. In addition, a tool is developed for system configuration optimization based on the chosen surrogate model. Chapter 3 conducts a detailed study for the multimodal phenomena in I/O throughput distribution and proposes an uncertainty estimation method for the optimal number of runs for future experiments. Mixture models are used to identify the number of modes for throughput distributions at different configurations. This chapter also addresses the uncertainty in parameter estimation and derives a formula for sample size calculation. The developed method is then applied to HPC variability data. Chapter 4 focuses on the prediction of functional outcomes with both qualitative and quantitative factors. Instead of a scalar description of I/O variability, the distribution of I/O throughput provides a comprehensive description of I/O variability. We develop a modified Gaussian process for functional prediction and apply the developed method to the large-scale HPC I/O variability data. Chapter 5 contains some general conclusions and areas for future work. | en
dc.description.abstractgeneral | This dissertation focuses on three projects that are all related to statistical methods in performance variability management in high-performance computing (HPC). HPC systems are computer systems that create high performance by aggregating a large number of computing units. The performance of HPC is measured by the throughput of a benchmark called the IOZone Filesystem Benchmark. The performance variability is the variation among throughputs when the system configuration is fixed. Variability management involves studying the relationship between performance variability and the system configuration. In Chapter 2, we use several existing prediction models to predict the standard deviation of throughputs given different system configurations and compare the accuracy of predictions. We also conduct HPC system optimization using the chosen prediction model as the objective function. In Chapter 3, we use the mixture model to determine the number of modes in the distribution of throughput under different system configurations. In addition, we develop a model to determine the number of additional runs for future benchmark experiments. In Chapter 4, we develop a statistical model that can predict the throughput distributions given the system configurations. We also compare the prediction of summary statistics of the throughput distributions with existing prediction models. | en
dc.description.degree | Doctor of Philosophy | en
dc.format.medium | ETD | en
dc.identifier.other | vt_gsexam:31861 | en
dc.identifier.uri | http://hdl.handle.net/10919/104184 | en
dc.publisher | Virginia Tech | en
dc.rights | In Copyright | en
dc.rights.uri | http://rightsstatements.org/vocab/InC/1.0/ | en
dc.subject | computer experiments | en
dc.subject | functional prediction | en
dc.subject | Gaussian process | en
dc.subject | Machine learning | en
dc.subject | prediction model | en
dc.subject | performance variability | en
dc.subject | mixture model | en
dc.subject | quantile regression | en
dc.title | Statistical Methods for Variability Management in High-Performance Computing | en
dc.type | Dissertation | en
thesis.degree.discipline | Statistics | en
thesis.degree.grantor | Virginia Polytechnic Institute and State University | en
thesis.degree.level | doctoral | en
thesis.degree.name | Doctor of Philosophy | en

Files

Original bundle
Name: Xu_L_D_2021.pdf
Size: 3.21 MB
Format: Adobe Portable Document Format

Name: Xu_L_D_2021_support_1.pages
Size: 169.59 KB
Format: Unknown data format
Description: Supporting documents