Optimizing Distributed Transactions: Speculative Client Execution, Certified Serializability, and High Performance Run-Time

TR Number

Date

2016-09-01

Journal Title

Journal ISSN

Volume Title

Publisher

Virginia Tech

Abstract

On-line services already form an important part of modern life with an immense potential for growth. Most of these services are supported by transactional systems, which are backed by database management systems (DBMS) in many cases. Many on-line services use replication to ensure high-availability, fault tolerance and scalability. Replicated systems typically consist of different nodes running the service co-ordinated by a distributed algorithm which aims to drive all the nodes along the same sequence of states by providing a total order to their operations. Thus optimization of both local DBMS operations through concurrency control and the distributed algorithm driving replicated services can lead to enhancing the performance of the on-line services.

Deferred Update Replication (DUR) is a well-known approach to design scalable replicated systems. In this method, the database is fully replicated on each distributed node. User threads perform transactions locally and optimistically before a total order is reached. DUR based systems find their best usage when remote transactions rarely conflict. Even in such scenarios, transactions may abort due to local contention on nodes. A generally adopted method to alleviate the local contention is to invoke a local certification phase to check if a transaction conflicts with other local transactions already completed. If so, the given transaction is aborted locally without burdening the ordering layer. However, this approach still results in many local aborts which significantly degrades the performance.

The first main contribution of this thesis is PXDUR, a DUR based transactional system, which enhances the performance of DUR based systems by alleviating local contention and increasing the transaction commit rate. PXDUR alleviates local contention by allowing speculative forwarding of shared objects from locally committed transactions awaiting total order to running transactions. PXDUR allows transactions running in parallel to use speculative forwarding, thereby enabling the system to utilize the highly parallel multi-core platforms. PXDUR also enhances the performance by optimizing the transaction commit process. It allows the committing transactions to skip read-set validation when it is safe to do so. PXDUR achieves performance gains of an order of magnitude over closest competitors under favorable conditions.

Transactions also form an important part of centralized DBMS, which tend to support multi-threaded access to utilize the highly parallel hardware platforms. The applications can be wrapped in transactions which can then access the DBMS as per the rules of concurrency control. This allows users to develop applications that can run on DBMSs without worrying about synchronization. texttt{Serializability} is the de-facto standard form of isolation required by transactions for many applications. The existing methods employed by DBMSs to enforce serializability employ explicit fine-grained locking. The eager-locking based approach is pessimistic and can be too conservative for many applications.

The locking approach can severely limit the performance of DBMSs especially for scenarios with moderate to high contention. This leads to the second major contribution of this thesis is TSAsR, an adaptive transaction processing framework, which can be applied to DBMSs to improve performance. TSAsR allows the DBMS's internal synchronization to be more relaxed and enforces serializability through the processng of external meta-data in an optimistic manner. It does not require any changes in the application code and achieves orders of magnitude performance improvements for high and moderate contention cases.

The replicated transaction processing systems require a distributed algorithm to keep the system consistent by ensuring that each node executes the same sequence of deterministic commands. These algorithms generally employ texttt{State Machine Replication (SMR)}. Enhancing the performance of such algorithms is a potential way to increase the performance of distributed systems. However, developing new SMR algorithms is limited in production settings because of the huge verification cost involved in proving their correctness.

There are frameworks that allow easy specification of SMR algorithms and subsequent verification. However, algorithms implemented in such framework, give poor performance. This leads to the third major contribution of this thesis Verified JPaxos, a JPaxos based runtime system which can be integrated to an easy to verify I/O automaton based on Multipaxos protocol. Multipaxos is specified in Higher Order Logic (HOL) for ease of verification which is used to generates executable code representing the Multipaxos state changes (I/O Automaton). The runtime drives the HOL generated code and interacts with the service and network to create a fully functional replicated Multipaxos system. The runtime inherits its design from JPaxos along with some optimizations. It achieves significant improvement over a state-of-art SMR verification framework while still being comparable to the performance of non-verified systems.

Description

Keywords

Distributed systems, Transactions, Databases, Verification, Run-time, Concurrency, Paxos, Replication.

Citation

Collections