Extracting Parallelism from Legacy Sequential Code Using Transactional Memory

Saad Ibrahim, Mohamed Mohamed

Extracting Parallelism from Legacy Sequential Code Using Transactional Memory

dc.contributor.author	Saad Ibrahim, Mohamed Mohamed	en
dc.contributor.committeechair	Ravindran, Binoy	en
dc.contributor.committeemember	Palmieri, Roberto	en
dc.contributor.committeemember	Broadwater, Robert P.	en
dc.contributor.committeemember	Vullikanti, Anil Kumar S.	en
dc.contributor.committeemember	Plassmann, Paul E.	en
dc.contributor.committeemember	Riad, Sedki Mohamed	en
dc.contributor.department	Electrical and Computer Engineering	en
dc.date.accessioned	2016-07-27T08:00:13Z	en
dc.date.available	2016-07-27T08:00:13Z	en
dc.date.issued	2016-07-26	en
dc.description.abstract	Increasing the number of processors has become the mainstream for the modern chip design approaches. However, most applications are designed or written for single core processors; so they do not benefit from the numerous underlying computation resources. Moreover, there exists a large base of legacy software which requires an immense effort and cost of rewriting and re-engineering to be made parallel. In the past decades, there has been a growing interest in automatic parallelization. This is to relieve programmers from the painful and error-prone manual parallelization process, and to cope with new architecture trend of multi-core and many-core CPUs. Automatic parallelization techniques vary in properties such as: the level of paraellism (e.g., instructions, loops, traces, tasks); the need for custom hardware support; using optimistic execution or relying on conservative decisions; online, offline or both; and the level of source code exposure. Transactional Memory (TM) has emerged as a powerful concurrency control abstraction. TM simplifies parallel programming to the level of coarse-grained locking while achieving fine-grained locking performance. This dissertation exploits TM as an optimistic execution approach for transforming a sequential application into parallel. The design and the implementation of two frameworks that support automatic parallelization: Lerna and HydraVM, are proposed, along with a number of algorithmic optimizations to make the parallelization effective. HydraVM is a virtual machine that automatically extracts parallelism from legacy sequential code (at the bytecode level) through a set of techniques including code profiling, data dependency analysis, and execution analysis. HydraVM is built by extending the Jikes RVM and modifying its baseline compiler. Correctness of the program is preserved through exploiting Software Transactional Memory (STM) to manage concurrent and out-of-order memory accesses. Our experiments show that HydraVM achieves speedup between 2×-5× on a set of benchmark applications. Lerna is a compiler framework that automatically and transparently detects and extracts parallelism from sequential code through a set of techniques including code profiling, instrumentation, and adaptive execution. Lerna is cross-platform and independent of the programming language. The parallel execution exploits memory transactions to manage concurrent and out-of-order memory accesses. This scheme makes Lerna very effective for sequential applications with data sharing. This thesis introduces the general conditions for embedding any transactional memory algorithm into Lerna. In addition, the ordered version of four state-of-art algorithms have been integrated and evaluated using multiple benchmarks including RSTM micro benchmarks, STAMP and PARSEC. Lerna showed great results with average 2.7× (and up to 18×) speedup over the original (sequential) code. While prior research shows that transactions must commit in order to preserve program semantics, placing the ordering enforces scalability constraints at large number of cores. In this dissertation, we eliminates the need for commit transactions sequentially without affecting program consistency. This is achieved by building a cooperation mechanism in which transactions can forward some changes safely. This approach eliminates some of the false conflicts and increases the concurrency level of the parallel application. This thesis proposes a set of commit order algorithms that follow the aforementioned approach. Interestingly, using the proposed commit-order algorithms the peak gain over the sequential non-instrumented execution in RSTM micro benchmarks is 10× and 16.5× in STAMP. Another main contribution is to enhance the concurrency and the performance of TM in general, and its usage for parallelization in particular, by extending TM primitives. The extended TM primitives extracts the embedded low level application semantics without affecting TM abstraction. Furthermore, as the proposed extensions capture common code patterns, it is possible to be handled automatically through the compilation process. In this work, that was done through modifying the GCC compiler to support our TM extensions. Results showed speedups of up to 4× on different applications including micro benchmarks and STAMP. Our final contribution is supporting the commit-order through Hardware Transactional Memory (HTM). HTM contention manager cannot be modified because it is implemented inside the hardware. Given such constraint, we exploit HTM to reduce the transactional execution overhead by proposing two novel commit order algorithms, and a hybrid reduced hardware algorithm. The use of HTM improves the performance by up to 20% speedup.	en
dc.description.degree	Ph. D.	en
dc.format.medium	ETD	en
dc.identifier.other	vt_gsexam:8464	en
dc.identifier.uri	http://hdl.handle.net/10919/71861	en
dc.publisher	Virginia Tech	en
dc.rights	In Copyright	en
dc.rights.uri	http://rightsstatements.org/vocab/InC/1.0/	en
dc.subject	Transaction Memory	en
dc.subject	Automatic Parallelization	en
dc.subject	Low-Level Virtual Machine	en
dc.subject	Optimistic Concurrency	en
dc.subject	Speculative Execution	en
dc.subject	Legacy Systems	en
dc.subject	Age Commitment Order	en
dc.subject	Low-Level TM Semantics	en
dc.subject	TM Friendly Semantics	en
dc.title	Extracting Parallelism from Legacy Sequential Code Using Transactional Memory	en
dc.type	Dissertation	en
thesis.degree.discipline	Computer Engineering	en
thesis.degree.grantor	Virginia Polytechnic Institute and State University	en
thesis.degree.level	doctoral	en
thesis.degree.name	Ph. D.	en

Files

Original bundle

Now showing 1 - 2 of 2

Name:: Saad_Ibrahim_MM_D_2016.pdf
Size:: 4.03 MB
Format:: Adobe Portable Document Format

Download

Name:: Saad_Ibrahim_MM_D_2016_support_1.pdf
Size:: 193.02 KB
Format:: Adobe Portable Document Format
Description:: Supporting documents

Download

Collections

Doctoral Dissertations