Browsing by Author "Schaumont, Patrick R."
- Abstraction Guided Semi-formal Verification. Parikh, Ankur (Virginia Tech, 2007-06-15). Abstraction-guided simulation is a promising semi-formal framework for design validation in which an abstract model of the design is used to guide a logic simulator towards a target property. However, key issues still need to be addressed before this framework can truly deliver on its promise. Concretizing, or finding a real trace from an abstract trace, remains a hard problem. Abstract traces are often spurious, meaning no corresponding real trace exists. This is a direct consequence of the abstraction being an over-approximation of the real design. Further, the way in which the abstract model is constructed is an open-ended problem that has a great impact on the performance of the simulator. In this work, we propose novel approaches to address these issues. First, we present a genetic algorithm to select sets of state variables directly from the gate-level net-list of the design that are highly correlated to the target property. The sets of selected variables are used to build Partition Navigation Tracks (PNTs). PNTs capture the behavior of expanded portions of the state space as they relate to the target property. Moreover, the computation and storage costs of the PNTs are small, making them scale well to large designs. Our experiments show that we are able to reach many more hard-to-reach states using our proposed techniques, compared to state-of-the-art methods. Next, we propose a novel abstraction strengthening technique, in which the abstract design is constrained to make it more closely resemble the concrete design. Abstraction strengthening greatly reduces the need to refine the abstract model for hard-to-reach properties. To achieve this, we efficiently identify sequentially unreachable partial states in the concrete design via intelligent partitioning, resolution, and cube enlargement. Then, these partial states are added as constraints in the abstract model. Our experiments show that the cost to compute these constraints is low and the abstract traces obtained from the strengthened abstract model are far easier to concretize.
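A minimal sketch of the kind of genetic-algorithm selection loop the abstract describes, in Python: candidate sets of state variables evolve under a fitness function. The fitness() here is a placeholder; the dissertation's actual metric is correlation to the target property, computed from the gate-level net-list.

```python
import random

NUM_VARS = 64        # state variables in a hypothetical gate-level net-list
SET_SIZE = 8         # variables per candidate set
POP_SIZE = 20
GENERATIONS = 50

def fitness(var_set):
    """Placeholder score; the real metric is correlation to the target
    property, derived from simulation data."""
    return sum(1 for v in var_set if v % 3 == 0)

def random_set():
    return tuple(random.sample(range(NUM_VARS), SET_SIZE))

def crossover(a, b):
    pool = list(set(a) | set(b))           # child draws from both parents
    return tuple(random.sample(pool, SET_SIZE))

def mutate(s):
    s = list(s)
    s[random.randrange(SET_SIZE)] = random.randrange(NUM_VARS)
    return tuple(set(s)) if len(set(s)) == SET_SIZE else random_set()

population = [random_set() for _ in range(POP_SIZE)]
for _ in range(GENERATIONS):
    population.sort(key=fitness, reverse=True)
    parents = population[:POP_SIZE // 2]   # keep the fittest half
    children = [mutate(crossover(random.choice(parents), random.choice(parents)))
                for _ in range(POP_SIZE - len(parents))]
    population = parents + children

print("best set:", max(population, key=fitness))
```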
- Accelerating Hardware Simulation on Multi-cores. Nanjundappa, Mahesh (Virginia Tech, 2010-05-04). Electronic design automation (EDA) tools play a central role in bridging the productivity gap for designing complex hardware systems. However, with an increase in the size and complexity of today's design requirements, current methodologies and EDA tools are unable to effectively mitigate the further widening of the productivity gap. It is estimated that testing and verification take two-thirds of the total development time of complex hardware systems. Functional simulation forms the mainstay of the testing and verification process and is the most widely used technique for testing and verification. Most simulation algorithms and their implementations are designed for uniprocessor systems and cannot easily leverage the parallelism in multi-core and GPU platforms. For example, logic simulation often uses levelized sequential algorithms, whereas the discrete-event simulation frameworks for Verilog, VHDL, and SystemC employ concurrency in the form of multi-threading to give an illusion of the inherent parallelism present in circuits. However, the discrete-event model of computation requires a global notion of an event queue, which makes improving its simulation performance via parallelization even more challenging. This work investigates automatic parallelization of simulation algorithms used to simulate hardware models. In particular, we focus on parallelizing the simulation of hardware designs described at the RTL using SystemC/HDL, with examples to clearly describe the parallelization. Even though multi-cores and GPUs offer parallelism, efficiently exploiting it through their programming models is not straightforward. To overcome this, we also focus our research on building intelligent translators to map simulation applications onto multi-cores and GPUs such that the complexity of the low-level programming models is hidden from the designers.
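To make the parallelism concrete, here is a hedged Python sketch of levelized logic simulation on a toy netlist: levels must run in order, but every gate within one level is independent, which is exactly the work a translator can spread across cores or GPU threads. The netlist and gate set are illustrative, not the dissertation's framework.

```python
from concurrent.futures import ThreadPoolExecutor

# Each gate: (output_net, op, input_nets); levels are topologically ordered.
netlist = [
    [("n1", "AND", ("a", "b")), ("n2", "OR", ("b", "c"))],  # level 1
    [("n3", "XOR", ("n1", "n2"))],                          # level 2
]
values = {"a": 1, "b": 0, "c": 1}

OPS = {"AND": lambda x, y: x & y, "OR": lambda x, y: x | y,
       "XOR": lambda x, y: x ^ y}

def eval_gate(gate):
    out, op, ins = gate
    return out, OPS[op](*(values[i] for i in ins))

with ThreadPoolExecutor() as pool:
    for level in netlist:                 # levels stay strictly sequential
        # Gates in one level have no mutual dependencies, so this map is
        # safe to run in parallel; results are committed after the level.
        for out, val in pool.map(eval_gate, level):
            values[out] = val

print(values)  # {'a': 1, 'b': 0, 'c': 1, 'n1': 0, 'n2': 1, 'n3': 1}
```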
- Acceleration of Hardware Testing and Validation Algorithms using Graphics Processing Units. Li, Min (Virginia Tech, 2012-09-17). With the advances of very large scale integration (VLSI) technology, the feature size has been shrinking steadily together with the increase in the design complexity of logic circuits. As a result, the efforts taken for designing, testing, and debugging digital systems have increased tremendously. Although electronic design automation (EDA) algorithms have been studied extensively to accelerate such processes, some computationally intensive applications still take long execution times. This is especially the case for testing and validation. In order to meet time-to-market constraints and to come up with a bug-free design or product, the work presented in this dissertation studies the acceleration of EDA algorithms on graphics processing units (GPUs). This dissertation concentrates on a subset of EDA algorithms related to testing and validation. In particular, within the area of testing, fault simulation, diagnostic simulation, and reliability analysis are explored. We also investigated approaches to parallelize state justification on GPUs, which is one of the most difficult problems in the validation area. First, we present an efficient parallel fault simulator, FSimGP2, which exploits the high degree of parallelism supported by a state-of-the-art GPU with the NVIDIA Compute Unified Device Architecture (CUDA). A novel three-dimensional parallel fault simulation technique is proposed to achieve extremely high computational efficiency on the GPU. The experimental results demonstrate a speedup of up to 4X compared to another GPU-based fault simulator. Then, another GPU-based simulator is used to tackle an even more computation-intensive task, diagnostic fault simulation. The simulator is based on a two-stage framework which exploits high computational efficiency on the GPU. We introduce a fault-pair-based approach to alleviate the limited memory capacity on GPUs. Also, multi-fault-signature and dynamic load balancing techniques are introduced for the best usage of on-board computing resources. With continued feature-size scaling and the advent of innovative nano-scale devices, the reliability analysis of digital systems is becoming more important nowadays. However, the computational cost to accurately analyze a large digital system is very high. We propose a high-performance reliability analysis tool on GPUs. To achieve high memory bandwidth on GPUs, two algorithms for simulation scheduling and memory arrangement are proposed. Experimental results demonstrate that the parallel analysis tool is efficient, reliable, and scalable. In the area of design validation, we investigate state justification. By employing swarm intelligence and the power of parallelism on GPUs, we are able to efficiently find a trace that could help us reach corner cases during the validation of a digital system. In summary, the work presented in this dissertation demonstrates that several applications in the area of digital design testing and validation can be successfully rearchitected to achieve maximal performance on GPUs and obtain significant speedups. The proposed algorithms based on GPU parallelism collectively aim to improve the performance of EDA tools in the computer-aided design (CAD) community on GPUs and other many-core platforms.
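As a rough illustration of the data parallelism a GPU fault simulator exploits, the following Python sketch packs many test patterns into one machine word and simulates a good and a faulty circuit side by side; it substitutes word-level bit parallelism for GPU threads and is not FSimGP2's actual three-dimensional technique.

```python
import random

WIDTH = 32  # test patterns packed per machine word

def simulate(a, b, stuck_n1=None):
    """Toy circuit c = (a AND b) OR b, evaluated on WIDTH patterns at once;
    optionally inject a stuck-at fault on the internal net n1 = a AND b."""
    n1 = a & b
    if stuck_n1 == 0:
        n1 = 0                        # stuck-at-0 on every pattern
    elif stuck_n1 == 1:
        n1 = (1 << WIDTH) - 1         # stuck-at-1 on every pattern
    return n1 | b

a = random.getrandbits(WIDTH)
b = random.getrandbits(WIDTH)
good = simulate(a, b)
faulty = simulate(a, b, stuck_n1=1)
detecting = good ^ faulty             # bit i set => pattern i detects the fault
print(f"{bin(detecting).count('1')} of {WIDTH} patterns detect n1 stuck-at-1")
```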
- Achieving Security and Privacy in the Internet Protocol Version 6 Through the Use of Dynamically Obscured Addresses. Dunlop, Matthew William (Virginia Tech, 2012-03-15). Society's increased use of network applications, such as email, social networking, and web browsing, creates a massive amount of information floating around in cyberspace. An attacker can collect this information to build a profile of where people go, what their interests are, and even what they are saying to each other. For certain government and corporate entities, the exposure of this information could risk national security or loss of capital. This work identifies vulnerabilities in the way the Internet Protocol version 6 (IPv6) forms addresses. These vulnerabilities provide attackers with the ability to track a node's physical location, correlate network traffic with specific users, and even launch attacks against users' systems. A Moving Target IPv6 Defense (MT6D) that rotates through dynamically obscured network addresses while maintaining existing connections was developed to prevent these addressing vulnerabilities. MT6D is resistant to the IPv6 addressing vulnerabilities since addresses are not tied to host identities and continuously change. MT6D leverages the immense address space of IPv6 to provide an environment that is infeasible to search efficiently. Address obscuration in MT6D occurs throughout ongoing sessions to provide continued anonymity, confidentiality, and security to communicating hosts. Rotating addresses mid-session prevents an attacker from determining that the same two hosts are communicating. The dynamic addresses also force an attacker to repeatedly reacquire the target node before he or she can launch a successful attack. A proof of concept was developed that demonstrates the feasibility of MT6D and its ability to seamlessly bind new IPv6 addresses. Also demonstrated is MT6D's ability to rotate addresses mid-session without dropping or renegotiating sessions. This work makes three contributions to state-of-the-art IPv6 research. First, it fully explores the security vulnerabilities associated with IPv6 address formation and demonstrates them on a production IPv6 network. Second, it provides a method for dynamically rotating network addresses that defeats these vulnerabilities. Finally, a functioning prototype is presented that proves how network addresses can be dynamically rotated without losing established network connections. If IPv6 is to be globally deployed, it must not provide additional attack vectors that expose user information.
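A hedged sketch of the core MT6D idea, assuming addresses are derived by hashing a host identifier, a shared secret, and the current time window so that both endpoints can compute each other's next address without announcing it. The prefix, rotation interval, and field sizes below are illustrative.

```python
import hashlib
import ipaddress
import time

PREFIX = int(ipaddress.IPv6Address("2001:db8::"))  # documentation prefix (illustrative)
ROTATION_SECONDS = 10                              # address lifetime (assumed)

def mt6d_address(host_id: bytes, secret: bytes, t: float) -> ipaddress.IPv6Address:
    """Both peers derive the same address from shared inputs, so no
    address-announcement traffic is needed when the address rotates."""
    window = int(t // ROTATION_SECONDS)
    digest = hashlib.sha256(host_id + secret + window.to_bytes(8, "big")).digest()
    iid = int.from_bytes(digest[:8], "big")        # low 64 bits: interface ID
    return ipaddress.IPv6Address(PREFIX | iid)

secret = b"pre-shared-key"
now = time.time()
print(mt6d_address(b"host-a", secret, now))                     # current address
print(mt6d_address(b"host-a", secret, now + ROTATION_SECONDS))  # next address
```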
- Advances in the Side-Channel Analysis of Symmetric Cryptography. Taha, Mostafa Mohamed Ibrahim (Virginia Tech, 2014-06-10). Side-Channel Analysis (SCA) is an implementation attack where an adversary exploits unintentional outputs of a cryptographic module to reveal secret information. Unintentional outputs, also called side-channel outputs, include power consumption, electromagnetic radiation, execution time, photonic emissions, acoustic waves, and many more. The real threat of SCA lies in the ability to mount attacks over small parts of the key and to aggregate information over many different traces. The cryptographic community acknowledges that SCA can break any security module if adequate protection is not implemented. In this dissertation, we propose several advances in side-channel attacks and countermeasures. We focus on symmetric cryptographic primitives, namely block ciphers and hashing functions. In the first part, we focus on improving side-channel attacks. First, we propose a new method to profile highly parallel cryptographic modules. Profiling, in the context of SCA, characterizes the power consumption of a fully-controlled module to extract power signatures. Then, the power signatures are used to attack a similar module. Parallel designs show excessive algorithmic noise in the power trace. Hence, we propose a novel attack that takes design parallelism into consideration, which results in a more powerful attack. Also, we propose the first comprehensive SCA of the new secure hashing function SHA-3. Although the main application of SHA-3 is hashing, there are other keyed applications including Message Authentication Codes (MACs), where protection against SCA is required. We study the SCA properties of all the operations involved in SHA-3. We also study the effect of changing the key length on the difficulty of mounting attacks. Indeed, changing the key length changes the attack methodology. Hence, we propose complete attacks against five different case studies, and propose a systematic algorithm to choose an attack methodology based on the key length. In the second part, we propose different techniques for protection against SCA. Indeed, the threat of SCA can be mitigated if the secret key changes before every execution. Although many contributions in the domain of leakage-resilient cryptography tried to achieve this goal, the proposed solutions were inefficient and required very high implementation cost. Hence, we highlight a generic framework for efficient leakage resiliency through lightweight key-updating. Then, we propose two complete solutions for protecting AES modes of operation. One uses a dedicated circuit for key-updating, while the other uses the underlying AES block cipher itself. The first one requires small area (for the additional circuit) but achieves negligible performance overhead. The second one has no area overhead but incurs a small performance overhead. Also, we address the problem of executing all the applications of hashing functions, e.g., the unkeyed application of regular hashing and the keyed application of generating MACs, on the same core. We observe that running an unkeyed application on an SCA-protected core involves a huge loss of performance (3x to 4x). Hence, we propose a novel SCA-protected core for hashing. Our core has no overhead in unkeyed applications, and negligible overhead in keyed ones.
Our research provides a better understanding of side-channel analysis and supports the cryptographic community with lightweight and efficient countermeasures.
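A minimal sketch of the lightweight key-updating principle, assuming a simple hash-chain update; the dissertation's two solutions use a dedicated update circuit or the AES core itself, which this Python fragment only approximates.

```python
import hashlib

def update_key(key: bytes) -> bytes:
    """Derive the next one-time key; old keys are never reused, so an
    attacker cannot aggregate side-channel traces across executions."""
    return hashlib.sha256(b"update" + key).digest()[:16]

key = bytes(16)  # initial shared session key (illustrative)
for block in [b"msg-block-1", b"msg-block-2", b"msg-block-3"]:
    # encrypt(block, key) would go here; each block sees a fresh key.
    print(block.decode(), "->", key.hex())
    key = update_key(key)
```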
- Algorithms and Low Cost Architectures for Trace Buffer-Based Silicon Debug. Prabhakar, Sandesh (Virginia Tech, 2009-12-01). An effective silicon debug technique uses a trace buffer to monitor and capture a portion of the circuit response during its functional, post-silicon operation. Due to the limited space of the available trace buffer, selection of the critical trace signals plays an important role in both minimizing the number of signals traced and maximizing the observability/restorability of other untraced signals during post-silicon validation. In this thesis, a new method is proposed for trace buffer signal selection for the purpose of post-silicon debug. The selection is performed by favoring those signals with the most implications that are not implied by other signals. Then, based on the values of the traced signals during silicon debug, an algorithm which uses a SAT-based multi-node implication engine is introduced to restore the values of untraced signals across multiple time-frames. A new multiplexer-based trace signal interconnection scheme and a new heuristic for trace signal selection based on implication-based correlation are also described. By this approach, we can effectively trace twice as many signals with the same trace buffer width. A SAT-based greedy heuristic is also proposed to prune the selected trace signal list further to take those multi-node implications into account. A state restoration algorithm is developed for the multiplexer-based trace signal interconnection scheme. Experimental results show that the proposed approaches select the trace signals effectively, giving a high restoration percentage compared with other techniques. We finally propose a lossless compression technique to increase the capacity of the trace buffer. We propose real-time compression of the trace data using the Frequency-Directed Run-Length (FDR) code. In addition, we also propose source transformation functions, namely difference vector computation, efficient ordering of trace flip-flops, and alternate vector reversal, that reduce the entropy of the trace data, making it more amenable to compression. The order of the trace flip-flops is computed off-chip using a probabilistic algorithm. The difference vector computation and alternate vector reversal are implemented on-chip and incur negligible hardware overhead. Experimental results for sequential benchmark circuits show that this method gives a better compression percentage than dictionary-based techniques and yields up to 3X improvement in diagnostic capability. We also observe that the area overhead of the proposed approach is lower than that of dictionary-based compression techniques.
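The compression flow lends itself to a short sketch: XOR-ing consecutive trace vectors yields a zero-dominated stream (the difference-vector transformation), and runs of 0s terminated by a 1 are then encoded with FDR codewords. The trace data below is a toy example.

```python
def fdr_codeword(run_len: int) -> str:
    """FDR code: group k covers run lengths [2**k - 2, 2**(k+1) - 3]; the
    codeword is a prefix of (k-1) ones plus '0', then a k-bit group offset."""
    k = 1
    while run_len > 2 ** (k + 1) - 3:
        k += 1
    prefix = "1" * (k - 1) + "0"
    tail = format(run_len - (2 ** k - 2), f"0{k}b")
    return prefix + tail

def compress(bits: str) -> str:
    out, run = [], 0
    for b in bits:
        if b == "0":
            run += 1
        else:
            out.append(fdr_codeword(run))  # run of 0s terminated by a 1
            run = 0
    return "".join(out)  # a real encoder would also flush the final run

trace = ["1011", "1010", "1110", "1110"]  # captured trace vectors (toy data)
diff = [trace[0]] + [format(int(a, 2) ^ int(b, 2), "04b")
                     for a, b in zip(trace, trace[1:])]
stream = "".join(diff)                    # difference vectors: mostly zeros
print(compress(stream), "vs raw", stream)
```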
- Analysis and Enforcement of Properties in Software Systems. Wu, Meng (Virginia Tech, 2019-07-02). Due to the lack of effective techniques for detecting and mitigating property violations, existing approaches to ensure the safety and security of software systems are often labor intensive and error prone. Furthermore, they focus primarily on the functional correctness of the software code while ignoring micro-architectural details of the underlying processor, such as cache and speculative execution, which may undermine their soundness guarantees. To fill the gap, I propose a set of new methods and tools for ensuring the safety and security of software systems. Broadly speaking, these methods and tools fall into three categories. The first category is concerned with static program analysis. Specifically, I develop a novel abstract interpretation framework that considers both speculative execution and a cache model, and is guaranteed to be sound for estimating the execution time of a program and detecting side-channel information leaks. The second category is concerned with static program transformation. The goal is to eliminate side channels by equalizing the number of CPU cycles and the number of cache misses along all program paths for all sensitive variables. The third category is concerned with runtime safety enforcement. Given a property that may be violated by a reactive system, the goal is to synthesize an enforcer, called the shield, to correct the erroneous behaviors of the system instantaneously, so that the property is always satisfied by the combined system. I develop techniques to make the shield practical by handling both burst error and real-valued signals. The proposed techniques have been implemented and evaluated on realistic applications to demonstrate their effectiveness and efficiency.
- Analysis of Firmware Security in Embedded ARM Environments. Brown, Dane Andrew (Virginia Tech, 2019-09-30). Modern enterprise-grade systems with virtually unlimited resources have many options when it comes to implementing state-of-the-art intrusion prevention and detection solutions. These solutions are costly in terms of energy, execution time, circuit board area, and capital. Sustainable Internet of Things devices and power-constrained embedded systems are thus forced to make suboptimal security trade-offs. One such trade-off is the design of architectures which prevent execution of injected shell code, yet have allowed Return Oriented Programming (ROP) to emerge as a more reliable way to execute malicious code following attacks. ROP is a method used to take over the execution of a program by causing the return address of a function to be modified through an exploit vector, then returning to small segments of otherwise innocuous code located in executable memory one after the other to carry out the attacker's aims. We show that the Tiva TM4C123GH6PM microcontroller, which utilizes an ARM Cortex-M4F processor, can be fully controlled with this technique. Firmware code is pre-loaded into a ROM on Tiva microcontrollers which can be subverted to erase and rewrite the flash memory where the program resides. That same firmware is searched for a Turing-complete gadget set which allows for arbitrary execution. We then design and evaluate a method for verifying the integrity of firmware on embedded systems, in this case Solid State Drives (SSDs). Some manufacturers make firmware updates available, but their proprietary protections leave end users unable to verify the authenticity of the firmware post-installation. This means that attackers who are able to get a malicious firmware version installed on a victim SSD are able to operate with impunity, as the owner will have no tools for detection. We have devised a method for performing side-channel analysis of the current drawn by an SSD, which can compare its behavior while running genuine firmware against its behavior when running modified firmware. We train a binary classifier with samples of both versions and are able to consistently discriminate between genuine firmware and modified firmware, even despite changes in external factors such as temperature and supplied power.
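The firmware-integrity idea reduces to a small classification pipeline, sketched below in Python with synthetic current traces and a nearest-centroid classifier standing in for the dissertation's trained model and oscilloscope data.

```python
import random

def trace(mean):
    """Synthetic current samples (amps); a stand-in for measured traces."""
    return [random.gauss(mean, 0.01) for _ in range(256)]

genuine = [trace(0.50) for _ in range(20)]   # label: genuine firmware
modified = [trace(0.53) for _ in range(20)]  # label: modified (extra work => more current)

def centroid(traces):
    return sum(sum(t) / len(t) for t in traces) / len(traces)

c_gen, c_mod = centroid(genuine), centroid(modified)

def classify(t):
    m = sum(t) / len(t)                      # mean current as the lone feature
    return "genuine" if abs(m - c_gen) < abs(m - c_mod) else "modified"

print(classify(trace(0.50)), classify(trace(0.53)))
```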
- Applications of TORC: An Open Toolkit for Reconfigurable Computing. Couch, Jacob Donald (Virginia Tech, 2011-08-05). Two research projects are proposed that rely on Tools for Open Reconfigurable Computing (TORC) and the openness of the Xilinx tool chain. The first project, the Embedded FPGA Transmitter, relies on the ability to add arbitrary routes to a physical FPGA which serve no obvious purpose. These routes can then mimic an antenna and transmit directly from the FPGA. This mechanism is not supported by standard hardware description languages, so the Embedded FPGA Transmitter requires measurements on a real FPGA to determine success. The second project is a back-end tools accelerator designed to reduce compilation times for FPGA designs. As the capacity of FPGAs has grown beyond a million logic cells, the compilation problem size has greatly expanded. The open-source project TORC provides an excellent framework for new FPGA research that produces physical, real-world results to ensure the applicability of the research.
- An Architecture Study on a Xilinx Zynq Cluster with Software Defined Radio Applications. Dobson, Christopher Vaness (Virginia Tech, 2014-07-16). The rapid rise in computational performance offered by computer systems has greatly increased the number of practical software defined radio applications. The addition of FPGAs to these flexible systems has resulted in platforms that can address a multitude of applications with performance levels that were once known only to ASICs. This work presents an embedded, heterogeneous, scalable cluster platform with software defined radio applications. The Xilinx Zynq chip provides a hybrid platform consisting of an embedded ARM general-purpose processing core and a low-power FPGA. The ARM core provides all of the benefits and ease of use common to modern high-level software languages, while the FPGA segment offers high performance for computationally intensive components of the application. Four of these chips were combined in a scalable cluster, and a task assigner was written to automatically place data flows across the FPGAs and ARM cores. The rapid reconfiguration software tFlow was used to dynamically build arbitrary FPGA images out of a library of pre-built modules.
- Architecture Support for Countermeasures against Side-Channel Analysis and Fault Attack. Kiaei, Pantea (Virginia Tech, 2019). Cryptographic algorithms are designed to be mathematically secure; however, side-channel analysis attacks go beyond mathematics by taking measurements of the device's electrical activity to reveal the secret data of a cipher. These attacks also go hand in hand with fault analysis techniques to disclose the secret key used in cryptographic ciphers with even fewer measurements. This is of practical concern due to the ubiquity of embedded systems that allow physical access to the adversary, such as smart cards, ATMs, etc. Researchers through the years have come up with techniques to block physical attacks on the hardware or make such attacks less likely to succeed. Most of the conducted research considers one or the other of side-channel analysis and fault injection attacks, whereas, in a real setting, the adversary can simultaneously take advantage of both to retrieve the secret data with less effort. Furthermore, very little work considers software implementations of these ciphers although, with the availability of small and affordable or free microarchitectures and the flexibility and simplicity of software implementations, it is at times more practical to have a software implementation of ciphers instead of dedicated hardware chips. In this project, we come up with a modular presentation, suitable for software implementations of ciphers, that allows simultaneous resistance against side-channel and fault analysis attacks. We also present an extension at the microarchitecture level to make our proposed countermeasures more robust and efficient.
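A hedged Python sketch of how the two countermeasure families compose: Boolean masking hides intermediate values from side-channel measurement, and redundant execution with a consistency check catches injected faults. The "round" here is a toy linear operation, not one of the project's ciphers; real nonlinear S-boxes need more careful masking.

```python
import secrets

def round_func(x: int) -> int:
    """Toy 8-bit rotate-left round; being linear over XOR, it commutes
    with the masking below (rot(x ^ m) == rot(x) ^ rot(m))."""
    return ((x << 1) | (x >> 7)) & 0xFF

def masked_round(x: int) -> int:
    m = secrets.randbits(8)               # fresh random mask each call
    share0, share1 = x ^ m, m             # split the secret into two shares
    y0, y1 = round_func(share0), round_func(share1)
    return y0 ^ y1                        # recombine; x never appears in the clear

def protected_round(x: int) -> int:
    a, b = masked_round(x), masked_round(x)  # redundant computation
    if a != b:                               # mismatch => injected fault
        raise RuntimeError("fault detected; suppress output")
    return a

assert protected_round(0x3C) == round_func(0x3C)
```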
- Battery-Sensing Intrusion Protection System (B-SIPS). Buennemeyer, Timothy Keith (Virginia Tech, 2008-12-05). This dissertation investigates using instantaneous battery current sensing techniques as a means of detecting IEEE 802.15.1 Bluetooth and 802.11b (Wi-Fi) attacks and anomalous activity on small mobile wireless devices. This research explores alternative intrusion detection methods in an effort to better understand computer networking threats. This research applies to Personal Digital Assistants (PDAs) and smart phones, operating with sensing software in wireless network environments to relay diagnostic battery readings and threshold breaches that indicate possible battery exhaustion attack, intrusion, virus, and worm activity detections. The system relies on host-based software to collect smart battery data to sense instantaneous current characteristics of anomalous network activity directed against small mobile devices. This effort sought to develop a methodology, design and build a net-centric system, and then further explore this non-traditional intrusion detection system (IDS) approach. This research implements the Battery-Sensing Intrusion Protection System (B-SIPS) client detection capabilities for small mobile devices, a server-based Correlation Intrusion Detection Engine (CIDE) for attack correlation with Snort's network-based IDS, device power profiling, graph views, security administrator alert notification, and a database for robust data storage. Additionally, the server-based CIDE provides the interface and filtering tools for a security administrator to further mine our database and conduct forensic analysis. A separate system was developed using a digital oscilloscope to observe Bluetooth, Wi-Fi, and blended attack traces and to create unique signatures. The research endeavor makes five significant contributions to the security field of intrusion detection. First, this B-SIPS work creates an effective intrusion detection approach that can operate on small, mobile host devices in networking environments to sense anomalous patterns in instantaneous battery current as an indicator of malicious activity, using an innovative Dynamic Threshold Calculation (DTC) algorithm. Second, the Current Attack Signature Identification and Matching System (CASIMS) provides a means for high-resolution current measurements and supporting analytical tools. This system investigates Bluetooth, Wi-Fi, and blended exploits using an oscilloscope to gather high-fidelity data. Instantaneous current changes were examined on mobile devices during representative attacks to determine unique attack traces and recognizable signatures. Third, two supporting theoretical models for B-SIPS are presented to investigate static and dynamic smart battery polling. These analytical models are employed to examine smart battery characteristics to support the theoretical intrusion detection limits and capabilities of B-SIPS. Fourth, a new genre of attack, known as a Battery Polling Cycle Timing Attack, is introduced. Today's smart battery technology polling rates are designed to support Advanced Power Management needs. Every PDA and smart phone has a polling rate that is determined by the device and smart battery original equipment manufacturers. If an attacker knows the precise timing of the polling rate of the battery's chipset, then the attacker could attempt to craft intrusion packets to arrive within those limited time windows and between the battery's polling intervals.
Fifth, this research adds to the body of knowledge about non-traditional attack sensing and correlation by providing a component of an intrusion detection strategy. This work expands today's research knowledge towards a more robust multilayered network defense by creating a novel design and methodology for employing mobile computing devices as a first line of defense to improve overall network security, potentially extending to other communication media in need of defensive capabilities. Mobile computing and communications devices such as PDAs, smart phones, and ultra-small general-purpose computing devices are the typical targets for the results of this work. Additionally, field-deployed, battery-operated sensors and sensor networks will also benefit by incorporating the security mechanisms developed and described here.
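The DTC idea can be illustrated with a few lines of Python: an alert threshold that adapts to a rolling baseline of instantaneous current and reports breaches. The window size, multiplier, and floor below are assumptions; the dissertation's actual DTC algorithm differs.

```python
from collections import deque

WINDOW, K = 20, 3.0            # baseline window and deviation multiplier (assumed)

def detect(samples_mA):
    """Yield (index, sample) for every reading above the dynamic threshold."""
    history = deque(maxlen=WINDOW)
    for i, s in enumerate(samples_mA):
        if len(history) == WINDOW:
            mean = sum(history) / WINDOW
            dev = (sum((x - mean) ** 2 for x in history) / WINDOW) ** 0.5
            if s > mean + K * dev + 1.0:   # +1 mA floor avoids zero-dev noise
                yield i, s                 # breach: would be relayed to the CIDE
        history.append(s)

idle = [100 + (i % 5) for i in range(40)]  # normal current draw (mA)
attack = idle + [160, 165, 170]            # burst of anomalous activity
for i, s in detect(attack):
    print(f"sample {i}: {s} mA exceeds dynamic threshold")
```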
- Biological Agent Sensing Integrated Circuit (BASIC): A New Complementary Metal-Oxide-Semiconductor (CMOS) Magnetic Biosensor System. Zheng, Yi (Virginia Tech, 2014-06-10). Fast and accurate diagnosis is always in demand by modern medical professionals and in the area of national defense. At present, limitations of testing speed, sample conditions, and levels of precision exist under current technologies, which are usually slow and involve testing the specimen under laboratory conditions. Typically, these methods also involve several biochemical processing steps and subsequent detection of low-energy luminescence or electrical changes, all of which reduce the speed of the test as well as limit the precision. In order to solve these problems and improve sensing performance, this project proposes an innovative CMOS magnetic biological sensor system for rapidly testing for the presence of potential pathogens and bioterrorism agents (zoonotic microorganisms) both in specimens and especially in the environment. The sensor uses an electromagnetic detection mechanism to measure changes in the number of microorganisms--tagged by iron nanoparticles--that are placed on the surface of an integrated circuit (IC) chip. Measured magnetic effects are transformed into electronic signals that count the number and type of organisms present. This biosensor introduces a novel design of a conical-shaped inductor, which achieves ultra-accurate sensing of biological pathogens. The whole system is integrated on a single chip based on the IBM 180 nm fabrication process (CMOS_IBM_7RF), which makes the sensor small, portable, high-speed, and low-cost. The results of designing, simulating, and fabricating the sensor are reported in this dissertation.
- Characterization of FPGA-based High Performance Computers. Pimenta Pereira, Karl Savio (Virginia Tech, 2011-08-09). As CPU clock frequencies plateau and the doubling of CPU cores per processor exacerbates the memory wall, hybrid-core computing, utilizing CPUs augmented with FPGAs and/or GPUs, holds the promise of addressing high-performance computing demands, particularly with respect to performance, power, and productivity. While traditional approaches to benchmarking high-performance computers, such as SPEC, took an architecture-based approach, they do not completely express the parallelism that exists in FPGA and GPU accelerators. This thesis follows an application-centric approach, comparing the sustained performance of two key computational idioms with respect to performance, power, and productivity. Specifically, a complex, single-precision, floating-point, 1D Fast Fourier Transform (FFT) and a molecular dynamics modeling application are implemented on state-of-the-art FPGA and GPU accelerators. As results show, FPGA floating-point FFT performance is highly sensitive to a mix of dedicated FPGA resources: DSP48E slices, block RAMs, and FPGA I/O banks in particular. Estimated results show that for the floating-point FFT benchmark on FPGAs, these resources are the performance-limiting factor. Fixed-point FFTs are important in many high-performance embedded applications. For a fixed-point (integer) FFT, FPGAs exploit a flexible data path width to trade off circuit cost and speed of computation, improving performance and resource utilization. GPUs cannot fully take advantage of this, having a fixed data-width architecture. For the molecular dynamics application, FPGAs benefit from the flexibility in creating a custom, tightly-pipelined datapath and a highly optimized memory subsystem of the accelerator. This can provide a 250-fold improvement over an optimized CPU implementation and a 2-fold improvement over an optimized GPU implementation, along with massive power savings. Finally, to extract the maximum performance out of the FPGA, each implementation requires a balance between the formulation of the algorithm on the platform, the optimum use of available external memory bandwidth, and the availability of computational resources, at the expense of a greater programming effort.
- Circuit Design Methods with Emerging Nanotechnologies. Zheng, Yexin (Virginia Tech, 2009-12-08). As complementary metal-oxide semiconductor (CMOS) technology faces more and more severe physical barriers down the path of continued feature-size scaling, innovative nano-scale devices and other post-CMOS technologies have been developed to enhance future circuit design and computation. These nanotechnologies have shown promising potential to achieve magnitude improvements in performance and integration density. The substitution of CMOS transistors with nano-devices is expected not only to continue along the exponential projection of Moore's Law, but also to raise significant challenges and opportunities, especially in the field of electronic design automation. The major obstacles that designers are experiencing with emerging nanotechnology design include: i) the existing computer-aided design (CAD) approaches in the context of conventional CMOS Boolean design cannot be directly employed in the nanoelectronic design process, because the intrinsic electrical characteristics of many nano-devices are not best suited for Boolean implementations but demonstrate strong capability for implementing non-conventional logic such as threshold logic and reversible logic; ii) due to the density and size factors of nano-devices, the defect rate of nanoelectronic systems is much higher than that of conventional CMOS systems, so existing design paradigms cannot guarantee design quality and can lead to even worse results with high failure ratios. Motivated by the compelling potential and design challenges of emerging post-CMOS technologies, this dissertation focuses on fundamental design methodologies to effectively and efficiently achieve high-quality nanoscale design. A novel programmable logic element (PLE) is first proposed to explore the versatile functionalities of threshold gates (TGs) and multi-threshold threshold gates (MTTGs). This PLE structure can realize all three- or four-variable logic functions through configuring binary control bits. This is the first single threshold logic structure that provides complete Boolean logic implementation. Based on the PLEs, a reconfigurable architecture is constructed to offer dynamic reconfigurability with little or no reconfiguration overhead, due to the intrinsic self-latching property of nanopipelining. Our reconfiguration data generation algorithm can further reduce the reconfiguration cost. To fully take advantage of such threshold logic design using emerging nanotechnologies, we also developed a combinational equivalence checking (CEC) framework for threshold logic design. Based on the features of threshold logic gates and circuits, different techniques for formulating a given threshold logic circuit in conjunctive normal form (CNF) are introduced to facilitate efficient SAT-based verification. Evaluated with mainstream benchmarks, our hybrid algorithm, which takes into account both the input symmetry and the input weight order of threshold gates, can efficiently generate CNF formulas in terms of both SAT solving time and CNF generating time. Then the reversible logic synthesis problem is considered, as we focus on efficient synthesis heuristics which can provide high-quality synthesis results within a reasonable computation time. We have developed a weighted directed graph model for function representation and complexity measurement. An atomic transformation is constructed to associate the function complexity variation with reversible gates.
The efficiency of our heuristic lies in maximally decreasing the function complexity during synthesis steps as well as in the capability to climb out of local optima. Thereafter, swarm intelligence, a machine learning technique, is employed in the search space exploration for reversible logic synthesis, which achieves further performance improvement. To tackle the high defect rate of the emerging nanotechnology manufacturing process, we have developed a novel defect-aware logic mapping framework for nanowire-based PLA architectures via Boolean satisfiability (SAT). PLA defects of various types are formulated as covering and closure constraints. The defect-aware logic mapping is then solved efficiently using available SAT solvers. This approach can generate a valid logic mapping with a defect rate as high as 20%. The proposed method is universally suitable for various nanoscale PLAs, including AND/OR, NOR/NOR structures, etc. In summary, this work provides some initial attempts to address two major problems confronting future nanoelectronic system designs: the development of electronic design automation tools and the reliability issues. However, many challenging open questions remain in this emerging and promising area. We hope our work can lay down stepping stones for nano-scale circuit design optimization through exploiting the distinctive characteristics of emerging nanotechnologies.
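A miniature version of the CEC question the framework answers, sketched in Python: does a threshold gate with given weights and threshold implement a target Boolean function? Exhaustive enumeration stands in for the SAT-based check, which scales the same question to real circuits.

```python
from itertools import product

def threshold_gate(inputs, weights, T):
    """Output 1 iff the weighted sum of inputs meets the threshold T."""
    return int(sum(w * x for w, x in zip(weights, inputs)) >= T)

def majority3(a, b, c):
    return int(a + b + c >= 2)

# The gate [w = (1, 1, 1); T = 2] should realize 3-input majority.
equivalent = all(
    threshold_gate((a, b, c), (1, 1, 1), 2) == majority3(a, b, c)
    for a, b, c in product((0, 1), repeat=3)
)
print("equivalent:", equivalent)  # True; a SAT-based checker replaces this
                                  # brute-force loop for large circuits
```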
- Compiler-Directed Error Resilience for Reliable Computing. Liu, Qingrui (Virginia Tech, 2018-08-08). Error resilience has become as important as power and performance in modern computing architecture. There are various sources of errors that can paralyze real-world computing systems. Of particular interest to this dissertation are single-event errors. They can be the result of an energetic particle strike or an abrupt power outage that corrupts the program state, leading to system failures. Specifically, energetic particle strikes are the major cause of soft errors, while an abrupt power outage can result in memory inconsistency in nonvolatile memory systems. Unfortunately, existing techniques to handle those single-event errors are either resource consuming (e.g., hardware approaches) or heavy-weight (e.g., software approaches). To address this problem, this dissertation identifies idempotent processing as an alternative recovery technique to handle the system failures in an efficient and low-cost manner. This dissertation first proposes to design and develop a compiler-directed lightweight methodology which leverages idempotent processing and state-of-the-art sensor-based detection to achieve soft error resilience at low cost. This dissertation also introduces a lightweight soft-error-tolerant hardware design that redefines idempotent processing, in which idempotent regions can be created, verified, and recovered from the processor's point of view. Furthermore, this dissertation proposes a series of compiler optimizations that significantly reduce the hardware and runtime overhead of idempotent processing. Lastly, this dissertation proposes a failure-atomic system integrated with idempotent processing to resolve another type of single-event error, i.e., failure-induced memory inconsistency in nonvolatile memory systems.
- Constraint Based Program Synthesis for Embedded Software. Eldib, Hassan Shoukry (Virginia Tech, 2015-07-30). In the world that we live in today, we rely greatly on software in nearly every aspect of our lives. In many critical applications, such as transportation and medical systems, catastrophic consequences could occur in the case of buggy software. As the computational power and storage capacity of computer hardware keep increasing, so do the size and complexity of the software. This makes testing and verification increasingly challenging in practice, and consequently creates a chance for software with critical bugs to find its way into the consumer market. In this dissertation, I present a set of innovative new methods for automatically verifying, as well as synthesizing, critical software and hardware in embedded computing applications. Based on a set of rigorous formal analysis techniques, my methods can guarantee that the resulting software is efficient and secure as well as provably correct.
- Constraint-Based Thread-Modular Abstract Interpretation. Kusano, Markus Jan Urban (Virginia Tech, 2018-07-25). In this dissertation, I present a set of novel constraint-based thread-modular abstract-interpretation techniques for the static analysis of concurrent programs. Specifically, I integrate a lightweight constraint solver into a thread-modular abstract interpreter to reason about inter-thread interference more accurately. Then, I show how to extend the new analyzer from programs running on sequentially consistent memory to programs running on weak memory. Finally, I show how to perform incremental abstract interpretation, with and without the previously mentioned constraint solver, by analyzing only regions of the program impacted by a program modification. I also demonstrate, through experiments, that these new constraint-based static analyzers are significantly more accurate than prior abstract-interpretation-based static analyzers, with lower runtime overhead, and that the incremental technique can drastically reduce runtime overhead in the presence of small program modifications.
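A hedged sketch of the thread-modular idea on the interval domain: each thread is analyzed alone, and reads of a shared variable are joined with an interference set of values other threads may write. This version joins all interferences, which is sound but coarse; the dissertation's constraint solver prunes the infeasible ones.

```python
def join(a, b):
    """Join of two intervals (lo, hi): the smallest interval covering both."""
    return (min(a[0], b[0]), max(a[1], b[1]))

# Thread 1 may write x = 1; Thread 2 may write x = 5; both start from x = 0.
interference_on_x = [(1, 1), (5, 5)]

def abstract_read_of_x(local):
    """Abstract value of x seen by one thread: its own view joined with
    every possible interfering write from the other threads."""
    val = local
    for w in interference_on_x:
        val = join(val, w)
    return val

print(abstract_read_of_x((0, 0)))  # (0, 5): any read sees x in [0, 5]
```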
- Cybersecurity for the Internet of Things: A Micro Moving Target IPv6 Defense. Zeitz, Kimberly Ann (Virginia Tech, 2019-09-04). As the use of low-power, low-resource embedded devices continues to increase dramatically with the introduction of new Internet of Things (IoT) devices, security techniques that are compatible with these devices are necessary. This research advances the knowledge in the area of cybersecurity for the IoT through the exploration of a moving target defense that limits the time attackers have to conduct reconnaissance on embedded systems, while considering the challenges presented by IoT devices, such as resource and performance constraints. We introduce the design and optimizations of µMT6D, a Micro Moving Target IPv6 Defense, including a description of its modes of operation and its use of lightweight hash algorithms. Through simulations and experiments, µMT6D is shown to be viable for use on low-power, low-resource embedded devices: its footprint, power consumption, and energy consumption increases are small in comparison to the security benefits it provides. Finally, this work discusses other future considerations and possible avenues of further experimentation and research.
- Demonstration of Vulnerabilities in Globally Distributed Additive Manufacturing. Norwood, Charles Ellis (Virginia Tech, 2020-06-24). Globally distributed additive manufacturing is a relatively new frontier in the field of product lifecycle management. Designers are independent of additive manufacturing services, often thousands of miles apart. Manufacturing data must be transmitted electronically from designer to manufacturer to realize the benefits of such a system. Unalterable blockchain ledgers can record transactions between customers, designers, and manufacturers, allowing each to trust the other two without needing to be familiar with each other. Although trust can be established, malicious printers or customers still have an incentive to produce unauthorized or pirated parts. To prevent this, machine instructions are encrypted and electronically transmitted to the printing service, where an authorized printer decrypts the data and prints an approved number of parts or products. The encrypted data may include G-Code machine instructions, which contain every motion of every motor on a 3D printer. Once these instructions are decrypted, motor drivers send control signals along wires to the printer's stepper motors. The transmission along these wires is no longer encrypted. If the signals along the wires are read, the motion of the motor can be analyzed and the G-Code reverse engineered. This thesis demonstrates such a threat through a simulated attack on a G-Code-controlled device. A computer running a numeric controller and G-Code interpreter is connected to standard stepper motors. As G-Code commands are delivered, the magnetic field generated by the transmitted signals is read by a Hall-effect sensor. The rapid oscillation of the magnetic field corresponds to the stepper motor control signals which rhythmically move the motor. The oscillating signals are recorded by a high-speed analog-to-digital converter attached to a second computer. The two systems are completely electronically isolated. The recorded signals are saved as a string of voltage data with matching time stamps. The voltage data is processed through a Matlab script which analyzes the direction the motor spins and the number of steps the motor takes. With these two pieces of data, the G-Code instructions which produced the motion can be recreated. The demonstration shows the exposure of previously encrypted data, allowing for the unauthorized production of parts and revealing a security flaw in a distributed additive manufacturing environment.
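The signal-processing core of the attack fits in a short sketch (Python here, standing in for the thesis's Matlab script): recover a step count from a thresholded waveform and map it back to a G-Code move. The steps-per-mm calibration and voltage threshold are assumptions, and a synthetic square wave replaces the Hall-effect capture.

```python
STEPS_PER_MM = 80.0   # typical printer calibration (assumed)
THRESHOLD = 1.5       # volts; splits high/low coil-drive states (assumed)

def count_steps(samples):
    """Count rising edges of the coil drive signal; each edge is one step."""
    steps, prev_high = 0, samples[0] > THRESHOLD
    for v in samples[1:]:
        high = v > THRESHOLD
        if high and not prev_high:
            steps += 1
        prev_high = high
    return steps

# Synthetic capture: 400 samples, alternating 5 low / 5 high readings.
samples = [3.0 if (i // 5) % 2 else 0.1 for i in range(400)]
steps = count_steps(samples)
# Direction would come from the phase order of a second coil's signal.
print(f"G1 X{steps / STEPS_PER_MM:.2f} ; reconstructed move, {steps} steps")
```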