Scalable and Energy Efficient Execution Methods for Multicore Systems

dc.contributor.author: Li, Dong [en]
dc.contributor.committeechair: Cameron, Kirk W. [en]
dc.contributor.committeecochair: Nikolopoulos, Dimitrios S. [en]
dc.contributor.committeemember: de Supinski, Bronis R. [en]
dc.contributor.committeemember: Feng, Wu-chun [en]
dc.contributor.committeemember: Ma, Xiaosong [en]
dc.contributor.department: Computer Science [en]
dc.date.accessioned: 2014-03-14T20:07:05Z [en]
dc.date.adate: 2011-02-16 [en]
dc.date.available: 2014-03-14T20:07:05Z [en]
dc.date.issued: 2011-01-26 [en]
dc.date.rdate: 2011-02-16 [en]
dc.date.sdate: 2011-02-02 [en]
dc.description.abstract: Multicore architectures impose great pressure on resource management. The exploration space available for resource management grows explosively, especially for large-scale high-end computing systems, and the abundant parallelism raises scalability concerns at all levels. Multicore architectures also impose pressure on power management, because growth in the number of cores causes continuous growth in power consumption. In this dissertation, we introduce methods and techniques to enable scalable and energy-efficient execution of parallel applications on multicore architectures. We study strategies and methodologies that combine dynamic concurrency throttling (DCT) and dynamic voltage and frequency scaling (DVFS) for the hybrid MPI/OpenMP programming model. Our algorithms yield substantial energy savings (8.74% on average and up to 13.8%) with either negligible performance loss or performance gain (up to 7.5%). To save additional energy on high-end computing systems, we propose a power-aware MPI task aggregation framework. The framework predicts the effect of task aggregation on both the computation and communication phases, and its impact on the execution time and energy of MPI programs. Its predictions are accurate and lead to substantial energy savings through aggregation (64.87% on average and up to 70.03%) with tolerable performance loss (under 5%). Aggregating multiple MPI tasks within the same node raises a scalability concern about memory registration for high-performance networking. We propose a new memory registration/deregistration strategy that uses helper threads to reduce registered memory on multicore architectures, and we investigate the design policies and performance implications of the helper thread approach. Our method reduces registered memory (23.62% on average and up to 49.39%) and avoids memory registration/deregistration costs for reused communication memory. Our system enables the execution of application input sets that could not run to completion under the memory registration limitation. [en]
dc.description.degree: Ph. D. [en]
dc.identifier.other: etd-02022011-182442 [en]
dc.identifier.sourceurl: http://scholar.lib.vt.edu/theses/available/etd-02022011-182442/ [en]
dc.identifier.uri: http://hdl.handle.net/10919/26098 [en]
dc.publisher: Virginia Tech [en]
dc.relation.haspart: Li_Dong_D_2011.pdf [en]
dc.rights: In Copyright [en]
dc.rights.uri: http://rightsstatements.org/vocab/InC/1.0/ [en]
dc.subject: Performance Modeling and Analysis [en]
dc.subject: Multicore Processors [en]
dc.subject: Power-Aware Computing [en]
dc.subject: Concurrency Throttling [en]
dc.subject: High-Performance Computing [en]
dc.title: Scalable and Energy Efficient Execution Methods for Multicore Systems [en]
dc.type: Dissertation [en]
thesis.degree.discipline: Computer Science [en]
thesis.degree.grantor: Virginia Polytechnic Institute and State University [en]
thesis.degree.level: doctoral [en]
thesis.degree.name: Ph. D. [en]

Files

Original bundle
Name: Li_Dong_D_2011.pdf
Size: 7.68 MB
Format: Adobe Portable Document Format