Compute Overlap Stall (COS): Predicting Performance of Power Management for Shared Memory Codes When Throttling Processors, Memory, and Thread Concurrency

dc.contributor.author | McCoy, Alexandra Kirin | en
dc.contributor.committeechair | Cameron, Kirk W. | en
dc.contributor.committeemember | Ellis, Margaret O.'Neil | en
dc.contributor.committeemember | Back, Godmar Volker | en
dc.contributor.department | Computer Science & Applications | en
dc.date.accessioned | 2026-06-03T08:00:25Z | en
dc.date.available | 2026-06-03T08:00:25Z | en
dc.date.issued | 2026-06-02 | en
dc.description.abstract | Maximizing performance under power constraints is a priority for highly parallel scientific applications. Modern systems offer control over operating modes, including processor speed (DVFS), memory speed (DMT), and concurrency level (DCT). Throttling speed and core usage reduces energy consumption at the cost of possible performance loss. Accurate execution time prediction mechanisms are useful for choosing system configurations that yield workload efficiency. The Compute Overlap Stall model predicts execution time of parallel applications across these operating modes. The key insight of the model is that pure compute time, pure stall time, and compute-memory overlap are discretely affected by these three operating modes. We validate and update the model with an emergent architecture and reduce the size of the training set with negligible loss in prediction accuracy. We extend the model to support performance prediction for heterogeneous multi-core processors. We employ the optimized COS model on three architectures for 14 application benchmarks. We observe a mean prediction error within 10% for the homogeneous model, and within 13% for the heterogeneous-aware model for most applications. | en
dc.description.abstractgeneral | Large-scale computers incur high monetary and environmental costs to power and use. Because of this, power itself has become the bottleneck in data-center level systems. Modern computers expose interfaces for system administrators to slow down the machine and therefore save power. These techniques must be used in intelligent coordination with the desired workload to avoid sacrificing a timely job completion in the pursuit of power savings. This document analyzes two analytical performance models that predict the impacts of throttling three machine characteristics on job completion time. The document first adapts an existing model to a contemporary machine, and then extends the model to support accurate completion time prediction on an emergent computer architecture. | en
dc.description.degree | Master of Science | en
dc.format.medium | ETD | en
dc.identifier.other | vt_gsexam:46701 | en
dc.identifier.uri | https://hdl.handle.net/10919/143232 | en
dc.language.iso | en | en
dc.publisher | Virginia Tech | en
dc.rights | In Copyright | en
dc.rights.uri | http://rightsstatements.org/vocab/InC/1.0/ | en
dc.subject | Dynamic Voltage and Frequency Scaling | en
dc.subject | Dynamic Memory Throttling | en
dc.subject | Dynamic Concurrency Throttling | en
dc.subject | Execution Time Prediction | en
dc.subject | Linear Regression | en
dc.subject | Performance Prediction | en
dc.subject | Asymmetric Multiprocessing | en
dc.title | Compute Overlap Stall (COS): Predicting Performance of Power Management for Shared Memory Codes When Throttling Processors, Memory, and Thread Concurrency | en
dc.type | Thesis | en
thesis.degree.discipline | Computer Science & Applications | en
thesis.degree.grantor | Virginia Polytechnic Institute and State University | en
thesis.degree.level | masters | en
thesis.degree.name | Master of Science | en

Files

Original bundle
Name: McCoy_AK_T_2026.pdf
Size: 5.64 MB
Format: Adobe Portable Document Format
