Collaborative Scheduling and Synchronization of Distributable Real-Time Threads
Fahmy, Sherif Fadel
MetadataShow full item record
In this dissertation, we consider the problem of scheduling and synchronization of distributable real-time threads --- Real-Time CORBA's first-class abstraction for programming real-time, multi-node sequential behaviors. Distributable real-time threads can be scheduled, broadly, using two paradigms: node independent scheduling, in which nodes independently construct thread schedules, based on node-level decomposition of distributable thread (or DT) scheduling parameters, and collaborative scheduling, in which nodes collaborate to construct system-wide thread schedules, which may or may not involve scheduling parameter decomposition. While significant literature exists on node independent scheduling, little is known about collaborative scheduling and its concomitant tradeoffs. We design three collaborative scheduling algorithms, called ACUA, QBUA, and DQBUA. ACUA uses theory of consensus and QBUA uses theory of quorums for distributable thread schedule construction. DQBUA extends QBUA with lock-based, local and distributed concurrency control. The algorithms consider a model where distributable threads arrive arbitrarily, have time/utility function time constraints, access resources in an arbitrary way (e.g., arbitrary lock acquire/release order, arbitrary nestings), and are subject to arbitrary node crash failures and message losses. We analytically establish several properties of the algorithms including probabilistic end-to-end termination time satisfactions, timeliness optimality during underloads, bounded exception handling time, and correctness of the algorithms in partially synchronous systems. We implement distributable real-time threads in the Linux kernel as a first-class programming and scheduling abstraction. The resulting kernel, called ChronOS, provides application interfaces for creating and manipulating distributable threads, as well as kernel interfaces and mechanisms for scheduling them (using both independent and collaborative approaches). ChronOS also has failure detector mechanisms for detecting and recovering from distributable thread failures. We implement the proposed scheduling algorithms and their competitors in ChronOS and compare their behavior. Our studies reveal that the collaborative scheduling algorithms are superior to independent scheduling algorithms for certain thread sets, in particular, when thread sections have significantly varying execution time. This variability, especially if the variability is not consistent among the threads, may cause each node to make conflicting decisions in the absence of global information. We observe that collaborative schedulers outperform independent schedulers (e.g., EDF augmented with PIP) in terms of accrued utility by as much as 75%. We identify distributed dependencies as one of the major sources of overhead in collaborative scheduling. In particular, the cost of distributed lock-based concurrency control (e.g., lock management, distributed deadlock detection/resolution) can significantly reduce the problem space for which collaborative scheduling is beneficial. To mitigate this, we consider the use of software transactional memory (or STM), an optimistic, non-blocking synchronization alternative to lock-based concurrency control which has been extensively studied in non real-time contexts. We consider distributable real-time threads with STM concurrency control, and develop techniques for analyzing and bounding their end-to-end response times on distributed single-processor and distributed multiprocessor systems. We also develop contention management techniques, a key component of STM, which are driven by threads' real-time scheduling parameters, and establish their tradeoffs against non-real-time contention managers.
- Doctoral Dissertations