Automatic Scheduling of Compute Kernels Across Heterogeneous Architectures

dc.contributor.author: Lyerly, Robert Frantzen
dc.contributor.committeechair: Ravindran, Binoy [en]
dc.contributor.committeemember: Plassmann, Paul [en]
dc.contributor.committeemember: Patterson, Cameron D. [en]
dc.contributor.department: Electrical and Computer Engineering [en]
dc.date.accessioned: 2017-06-13T19:44:10Z [en]
dc.date.adate: 2014-06-24 [en]
dc.date.available: 2017-06-13T19:44:10Z [en]
dc.date.issued: 2014-05-07 [en]
dc.date.rdate: 2014-06-24 [en]
dc.date.sdate: 2014-05-20 [en]
dc.description.abstract: The world of high-performance computing has shifted from increasing single-core performance to extracting performance from heterogeneous multi- and many-core processors due to the power, memory, and instruction-level parallelism walls. All trends point towards increased processor heterogeneity as a means for increasing application performance, from smartphones to servers. These architectures are designed for different types of applications: traditional "big" CPUs (like the Intel Xeon) are optimized for low latency, while other architectures (such as the NVIDIA Tesla K20X) are optimized for high throughput. They have different tradeoffs and performance profiles, which can mean large performance gains for the right types of applications. However, applications that are ill-suited for a given architecture may experience significant slowdown; therefore, it is imperative that applications be scheduled onto the correct processor. In order to perform this scheduling, applications must be analyzed to determine their execution characteristics. Traditionally, this application-to-hardware mapping was determined statically by the programmer. However, this requires intimate knowledge of the application and the underlying architecture, and precludes load balancing by the system. We demonstrate and empirically evaluate a system for automatically scheduling compute kernels by extracting program characteristics and applying machine learning techniques. We develop a machine learning process that is system-agnostic and works in a variety of contexts (e.g., embedded, desktop/workstation, server). Finally, we perform scheduling in a workload-aware and workload-adaptive manner for these compute kernels. [en]
dc.description.degree: Master of Science [en]
dc.identifier.other: etd-05202014-193503 [en]
dc.identifier.sourceurl: http://scholar.lib.vt.edu/theses/available/etd-05202014-193503/ [en]
dc.identifier.uri: http://hdl.handle.net/10919/78130 [en]
dc.language.iso: en_US [en]
dc.publisher: Virginia Tech [en]
dc.rights: In Copyright [en]
dc.rights.uri: http://rightsstatements.org/vocab/InC/1.0/ [en]
dc.subject: High-Performance Computing [en]
dc.subject: Runtime Systems [en]
dc.subject: Heterogeneous Architectures [en]
dc.subject: Compilers [en]
dc.subject: Scheduling [en]
dc.title: Automatic Scheduling of Compute Kernels Across Heterogeneous Architectures [en]
dc.type: Thesis [en]
dc.type.dcmitype: Text [en]
thesis.degree.discipline: Electrical and Computer Engineering [en]
thesis.degree.grantor: Virginia Polytechnic Institute and State University [en]
thesis.degree.level: masters [en]
thesis.degree.name: Master of Science [en]

Files

Original bundle
Name: etd-05202014-193503_Lyerly_RF_T_2014_2.pdf
Size: 895.24 KB
Format: Adobe Portable Document Format
