An Experimental Evaluation of the Scalability of Real-Time Scheduling Algorithms on Large-Scale Multicore Platforms
This thesis experimentally evaluates the scaling behavior of existing multicore real-time task scheduling algorithms on large-scale multicore platforms. As chip manufacturers rapidly increase the core count of processors, it becomes imperative that multicore real-time scheduling algorithms keep pace. Thus, it must be determined whether existing algorithms can scale to these new high core-count platforms. Significant research exists on the theoretical performance of multicore real-time scheduling algorithms, but the vast majority of this research ignores the effects of scalability. It has been demonstrated that multicore real-time scheduling algorithms are feasible for small core-count systems (e.g., 8 cores or fewer), but thus far most of these algorithms have not been tested on high core-count systems (e.g., 48 cores or more).
We present an experimental analysis of the scalability of 16 multicore real-time scheduling algorithms. These include global, clustered, and partitioned algorithms, and span both deadline-based and utility accrual scheduling. The algorithms are compared under metrics including schedulability, tardiness, deadline satisfaction ratio, and utility accrual ratio. We consider multicore platforms ranging from 8 to 48 cores. The algorithms are implemented in ChronOS, a real-time Linux kernel that we developed. ChronOS is based on the Linux kernel's PREEMPT_RT patch, which provides the underlying operating system kernel with real-time capabilities such as full kernel preemptibility and priority inheritance for kernel locking primitives. ChronOS extends these capabilities with a flexible, scalable real-time scheduling framework.
Our study shows that it is possible to implement global fixed-priority, global dynamic-priority, and simple global utility accrual real-time scheduling algorithms that scale to large-scale multicore platforms. Interestingly, and in contrast to the conclusions of prior research, our results reveal that some global scheduling algorithms (e.g., G-NP-EDF) are actually scalable to large core counts (e.g., 48 cores). In our implementation, scalability is restricted by lock contention over the global schedule and the cost of inter-processor communication, rather than by the global task queue implementation. We also demonstrate that certain classes of utility accrual algorithms, such as the GUA class, are inherently not scalable. Finally, we show that algorithms implemented with scalability as a first-order goal are able to provide real-time guarantees on our 48-core platform.