Evaluating the Design and Performance of a Single-Chip Parallel Computer Using System-Level Models and Methodology
MetadataShow full item record
As single-chip systems are predicted to soon contain over a billion transistors, design methodologies are evolving dramatically to account for the fast evolution of technologies and product properties. Novel methodologies feature the exploration of design alternatives early in development, the support for IPs, and early error detection â all with a decreasing time-to-market. In order to accommodate these product complexities and development needs, the modeling levels at which designers are working have quickly changed, as development at higher levels of abstraction allows for faster simulations of system models and earlier estimates of system performance while considering design trade-offs.
Recent design advancements to exploit instruction-level parallelism on single-processor computer systems have become exceedingly complex, and modern applications are presenting an increasing potential to be partitioned and parallelized at the thread level. The new Single-Chip, Message-Passing (SCMP) parallel computer is a tightly coupled mesh of processing nodes that is designed to exploit thread-level parallelism as efficiently as possible. By minimizing the latency of communication among processors, memory access time, and the time for context switching, the system designer will undoubtedly observe an overall performance increase. This study presents in-depth evaluations and quantitative analyses of various design and performance aspects of SCMP through the development of abstract hardware models by following a formalized, well-defined methodology. The performance evaluations are taken through benchmark simulation while taking into account system-level communication and synchronization among nodes as well as node-level timing and interaction amongst node components. Through the exploration of alternatives and optimization of the components within the SCMP models, maximum system performance in the hardware implementation can be achieved.
- Masters Theses