Compute Overlap Stall (COS): Predicting Performance of Power Management for Shared Memory Codes When Throttling Processors, Memory, and Thread Concurrency

Date

2026-06-02

Publisher

Virginia Tech

Abstract

Maximizing performance under power constraints is a priority for highly parallel scientific applications. Modern systems offer control over three operating modes: processor speed through dynamic voltage and frequency scaling (DVFS), memory speed through dynamic memory throttling (DMT), and concurrency level through dynamic concurrency throttling (DCT). Throttling speed and core usage reduces energy consumption, possibly at the cost of performance. Accurate execution time prediction is therefore useful for choosing efficient system configurations. The Compute Overlap Stall (COS) model predicts the execution time of parallel applications across these operating modes. The key insight of the model is that pure compute time, pure stall time, and compute-memory overlap are each affected separately by the three operating modes. We validate and update the model on an emerging architecture and reduce the size of the training set with negligible loss in prediction accuracy. We extend the model to support performance prediction for heterogeneous multi-core processors. We apply the optimized COS model to 14 application benchmarks on three architectures. For most applications, we observe a mean prediction error within 10% for the homogeneous model and within 13% for the heterogeneous-aware model.
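
As a rough illustration of the decomposition described in the abstract, the sketch below splits a measured execution time into pure compute, compute-memory overlap, and pure stall, and rescales each component on its own. The scaling rules here are simplifying assumptions made for illustration, not the fitted COS model: compute is assumed to scale with the processor slowdown, stall with the memory slowdown, and overlap with the larger of the two. Concurrency throttling is left out. The names `Components` and `predict_time` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Components:
    """Execution time split into three parts, in seconds (hypothetical type)."""
    compute: float  # pure compute time
    overlap: float  # time where compute and memory access overlap
    stall: float    # pure memory stall time


def predict_time(base: Components, cpu_slowdown: float, mem_slowdown: float) -> float:
    """Estimate execution time after throttling.

    cpu_slowdown = baseline CPU frequency / target CPU frequency
    mem_slowdown = baseline memory frequency / target memory frequency

    Assumed scaling rules (illustrative only, not taken from the source):
    compute follows the CPU slowdown, stall follows the memory slowdown,
    and overlap follows whichever slowdown is larger.
    """
    compute = base.compute * cpu_slowdown
    stall = base.stall * mem_slowdown
    overlap = base.overlap * max(cpu_slowdown, mem_slowdown)
    return compute + overlap + stall


# Made-up baseline measurement: 4 s compute, 2 s overlap, 1 s stall.
baseline = Components(compute=4.0, overlap=2.0, stall=1.0)
half_cpu = predict_time(baseline, cpu_slowdown=2.0, mem_slowdown=1.0)
half_mem = predict_time(baseline, cpu_slowdown=1.0, mem_slowdown=2.0)
```

With these made-up numbers, halving the processor speed costs more than halving the memory speed because the baseline is compute-dominated. That difference in sensitivity is what a per-component model can express and a single scaling factor cannot.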

Keywords

Dynamic Voltage and Frequency Scaling, Dynamic Memory Throttling, Dynamic Concurrency Throttling, Execution Time Prediction, Linear Regression, Performance Prediction, Asymmetric Multiprocessing
