Designing, Modeling, and Optimizing Transactional Data Structures
Transactional memory (TM) has emerged as a promising synchronization abstraction for multi-core architectures. Unlike traditional lock-based approaches, TM shifts the burden of implementing thread synchronization from the programmer to an underlying framework that uses hardware (HTM) and/or software (STM) components.
Although TM can be leveraged to implement transactional data structures (i.e., those where multiple operations are allowed to execute atomically, all-or-nothing, according to the transaction paradigm), its intensive speculation may result in significantly lower performance than optimized concurrent data structures. This poor performance motivates the search for more effective alternatives for designing transactional data structures without losing the simple programming abstraction offered by TM.
To do so, we identify three major challenges that must be addressed to design efficient transactional data structures. The first challenge is composability, namely allowing an atomic execution of two or more data structure operations in the same way as TM provides, but without its high overheads. The second challenge is integration, which enables the execution of data structure operations within generic transactions that may contain other memory-based operations. The last challenge is modeling, which requires defining a unified formal methodology to reason about the correctness of transactional data structures.
In this dissertation, we propose different approaches to address the above challenges. First, we address the composability challenge by introducing an optimistic methodology to efficiently convert concurrent data structures into transactional ones. Second, we address the integration challenge by injecting the semantic operations of those transactional data structures into TM frameworks, and by presenting two novel STM algorithms that enhance the overall performance of those frameworks. Finally, we address the modeling challenge by presenting two models for concurrent and transactional data structure designs.
• Our first main contribution in this dissertation is Optimistic Transactional Boosting (OTB), a methodology to design transactional versions of highly concurrent optimistic (i.e., lazy) data structures. An earlier (pessimistic) boosting proposal added a layer of abstract locks on top of existing concurrent data structures. Instead, we propose an optimistic boosting methodology, which allows greater data structure-specific optimizations, easier integration with TM frameworks, and fewer restrictions on the operations than the original pessimistic boosting methodology.
Based on the proposed OTB methodology, we implement transactional versions of two list-based data structures (i.e., set and priority queue). Then, we present TxCF-Tree, a balanced tree whose design is optimized to support transactional accesses. The core optimizations of TxCF-Tree's operations are: providing a traversal phase that uses neither locks nor speculation, and deferring lock acquisition and physical modification to the transaction's commit phase; isolating the structural operations (such as re-balancing) in an interference-less housekeeping thread; and minimizing the interference between structural operations and the critical path of semantic operations (i.e., additions and removals on the tree).
• Our second main contribution is to integrate OTB with both STM and HTM algorithms. For STM, we extend the design of both DEUCE, a Java STM framework, and RSTM, a C++ STM framework, to support integration with OTB. Using our extension, programmers can include both OTB data structure operations and traditional memory reads/writes in the same transaction. Results show that OTB performance is closer to that of the optimal lazy (non-transactional) data structures than the original boosting algorithm is.
On the HTM side, we introduce a methodology to inject semantic operations into the well-known hybrid transactional memory algorithms (e.g., HTM-GL, HyNOrec, and NOrecRH). In addition, we enhance the proposed semantically-enabled HTM algorithms with a lightweight adaptation mechanism that allows bypassing the HTM paths if the overhead of the semantic operations causes repeated HTM aborts. Experiments on micro- and macro-benchmarks confirm that our proposals outperform the other TM solutions in almost all the tested workloads.
• Our third main contribution is to enhance the performance of TM frameworks in general by introducing two novel STM algorithms. Remote Transaction Commit (RTC) is a mechanism for executing commit phases of STM transactions in dedicated server cores. RTC shows significant improvements compared to its corresponding validation-based STM algorithm (up to 4x better) as it decreases the overhead of spin locking during commit, in terms of cache misses, blocking of lock holders, and CAS operations. Remote Invalidation (RInval) applies the same idea as RTC to invalidation-based STM algorithms. Furthermore, it allows more concurrency by executing commit and invalidation routines concurrently in different servers. RInval performs up to 10x better than its corresponding invalidation-based STM algorithm (InvalSTM), and up to 2x better than its corresponding validation-based algorithm (NOrec).
• Our fourth and final main contribution is to provide a theoretical model for concurrent and transactional data structures. We exploit the similarities of the OTB-based data structures and provide a unified model to reason about the correctness of those designs. Specifically, we extend a recent approach that models data structures with concurrent readers and a single writer (called SWMR), and we propose two novel models that additionally allow multiple writers and transactional execution. Those models are more practical because they cover a wider set of data structures than the original SWMR model.
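The pattern shared by the OTB-based designs summarized above (a traversal that takes no locks, semantic read and write sets, and lock acquisition, validation, and physical modification deferred to commit) can be illustrated with a minimal Python sketch of a boosted sorted-list set. This is an illustration under stated assumptions, not the dissertation's implementation: the names `Node`, `BoostedSet`, `Transaction`, and `Abort` are hypothetical, the validation rule is deliberately conservative, locks are acquired in a fixed order rather than with try-locks, and each key is assumed to be touched at most once per transaction.

```python
import threading


class Node:
    def __init__(self, key, nxt=None):
        self.key = key
        self.next = nxt
        self.deleted = False            # logical-deletion mark, as in lazy lists
        self.lock = threading.Lock()


class Abort(Exception):
    """Commit-time validation failed; the caller may retry the transaction."""


class BoostedSet:
    def __init__(self):
        # Sentinel head and tail nodes bracket the sorted list.
        self.head = Node(float("-inf"), Node(float("inf")))

    def locate(self, key):
        # Traversal phase: no locks and no speculative bookkeeping.
        pred, curr = self.head, self.head.next
        while curr.key < key:
            pred, curr = curr, curr.next
        return pred, curr

    def keys(self):
        out, node = [], self.head.next
        while node.next is not None:    # stop at the tail sentinel
            out.append(node.key)
            node = node.next
        return out


class Transaction:
    # Assumption: each key is touched at most once per transaction.
    def __init__(self, target):
        self.target = target
        self.reads = []     # semantic read set: (pred, curr) seen per operation
        self.writes = {}    # semantic write set: key -> (op, pred, curr)

    def _lookup(self, key):
        pred, curr = self.target.locate(key)
        self.reads.append((pred, curr))
        return pred, curr, (curr.key == key and not curr.deleted)

    def contains(self, key):
        return self._lookup(key)[2]

    def add(self, key):
        pred, curr, present = self._lookup(key)
        if not present:
            self.writes[key] = ("add", pred, curr)      # deferred to commit
        return not present

    def remove(self, key):
        pred, curr, present = self._lookup(key)
        if present:
            self.writes[key] = ("remove", pred, curr)   # deferred to commit
        return present

    def commit(self):
        touched = {id(n): n for _, p, c in self.writes.values() for n in (p, c)}
        ordered = sorted(touched.values(), key=lambda n: (n.key, id(n)))
        for node in ordered:            # fixed global order avoids deadlock
            node.lock.acquire()
        try:
            # Conservative validation: every observed link must be unchanged.
            for pred, curr in self.reads:
                if pred.deleted or curr.deleted or pred.next is not curr:
                    raise Abort()
            # Descending key order keeps each saved pred usable while the
            # transaction's other writes are being applied.
            for key in sorted(self.writes, reverse=True):
                op, pred, curr = self.writes[key]
                if op == "add":
                    pred.next = Node(key, pred.next)
                else:
                    curr.deleted = True
                    pred.next = curr.next
        finally:
            for node in ordered:
                node.lock.release()
```

In this sketch, a transaction that calls `add(3)`, `add(4)`, and `add(9)` leaves the shared list untouched until `commit()`, at which point all three keys become visible together. If two transactions both traverse past the same node and the first to commit removes that node, the second raises `Abort` at commit and can be retried from scratch.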