Browsing by Author "Hsiao, Michael S."
- Abstraction Guided Semi-formal Verification. Parikh, Ankur (Virginia Tech, 2007-06-15). Abstraction-guided simulation is a promising semi-formal framework for design validation in which an abstract model of the design is used to guide a logic simulator towards a target property. However, key issues still need to be addressed before this framework can truly deliver on its promise. Concretizing, or finding a real trace from an abstract trace, remains a hard problem. Abstract traces are often spurious, for which no corresponding real trace exists. This is a direct consequence of the abstraction being an over-approximation of the real design. Further, the way in which the abstract model is constructed is an open-ended problem which has a great impact on the performance of the simulator. In this work, we propose novel approaches to address these issues. First, we present a genetic algorithm to select sets of state variables directly from the gate-level netlist of the design that are highly correlated to the target property. The sets of selected variables are used to build the Partition Navigation Tracks (PNTs). PNTs capture the behavior of expanded portions of the state space as they relate to the target property. Moreover, the computation and storage costs of the PNTs are small, making them scale well to large designs. Our experiments show that we are able to reach many more hard-to-reach states using our proposed techniques, compared to state-of-the-art methods. Next, we propose a novel abstraction strengthening technique, where the abstract design is constrained to make it more closely resemble the concrete design. Abstraction strengthening greatly reduces the need to refine the abstract model for hard-to-reach properties. To achieve this, we efficiently identify sequentially unreachable partial states in the concrete design via intelligent partitioning, resolution and cube enlargement. Then, these partial states are added as constraints in the abstract model. Our experiments show that the cost to compute these constraints is low and the abstract traces obtained from the strengthened abstract model are far easier to concretize.
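The genetic-algorithm-based variable selection described in this entry can be illustrated with a minimal sketch. The fitness function, population size, and the stand-in notion of "correlation to the target property" below are hypothetical, not the dissertation's actual formulation.

```python
import random

def ga_select_variables(num_vars, subset_size, fitness, generations=50,
                        pop_size=30, mutation_rate=0.1):
    """Evolve subsets of state-variable indices that maximize `fitness`.

    `fitness` maps a frozenset of variable indices to a score; here it is
    a stand-in for "correlation of the selected variables to the target
    property" used when building Partition Navigation Tracks (PNTs).
    """
    def random_subset():
        return frozenset(random.sample(range(num_vars), subset_size))

    def crossover(a, b):
        pool = list(a | b)
        return frozenset(random.sample(pool, min(subset_size, len(pool))))

    def mutate(s):
        s = set(s)
        if random.random() < mutation_rate:
            s.discard(random.choice(list(s)))
            s.add(random.randrange(num_vars))
        return frozenset(s)

    population = [random_subset() for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]          # truncation selection
        children = [mutate(crossover(random.choice(parents),
                                     random.choice(parents)))
                    for _ in range(pop_size - len(parents))]
        population = parents + children
    return max(population, key=fitness)

# Toy usage: pretend variables 3, 7, and 12 are the ones correlated
# to the target property.
target = {3, 7, 12}
best = ga_select_variables(20, 3, lambda s: len(s & target))
print(sorted(best))
```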
- Acceleration of Hardware Testing and Validation Algorithms using Graphics Processing Units. Li, Min (Virginia Tech, 2012-09-17). With the advances of very large scale integration (VLSI) technology, the feature size has been shrinking steadily together with the increase in the design complexity of logic circuits. As a result, the efforts taken for designing, testing, and debugging digital systems have increased tremendously. Although electronic design automation (EDA) algorithms have been studied extensively to accelerate such processes, some computationally intensive applications still take long execution times. This is especially the case for testing and validation. In order to meet the time-to-market constraints and also to come up with a bug-free design or product, the work presented in this dissertation studies the acceleration of EDA algorithms on Graphics Processing Units (GPUs). This dissertation concentrates on a subset of EDA algorithms related to testing and validation. In particular, within the area of testing, fault simulation, diagnostic simulation and reliability analysis are explored. We also investigated approaches to parallelize state justification on GPUs, which is one of the most difficult problems in the validation area. First, we present an efficient parallel fault simulator, FSimGP2, which exploits the high degree of parallelism supported by a state-of-the-art graphics processing unit (GPU) with the NVIDIA Compute Unified Device Architecture (CUDA). A novel three-dimensional parallel fault simulation technique is proposed to achieve extremely high computation efficiency on the GPU. The experimental results demonstrate a speedup of up to 4X compared to another GPU-based fault simulator. Then, another GPU-based simulator is used to tackle an even more computation-intensive task, diagnostic fault simulation. The simulator is based on a two-stage framework which exploits high computation efficiency on the GPU. We introduce a fault-pair based approach to alleviate the limited memory capacity on GPUs. Also, multi-fault-signature and dynamic load balancing techniques are introduced for the best usage of on-board computing resources. With continuous feature-size scaling and the advent of innovative nano-scale devices, the reliability analysis of digital systems is becoming more important nowadays. However, the computational cost to accurately analyze a large digital system is very high. We propose a high-performance reliability analysis tool on GPUs. To achieve high memory bandwidth on GPUs, two algorithms for simulation scheduling and memory arrangement are proposed. Experimental results demonstrate that the parallel analysis tool is efficient, reliable and scalable. In the area of design validation, we investigate state justification. By employing swarm intelligence and the power of parallelism on GPUs, we are able to efficiently find a trace that could help us reach the corner cases during the validation of a digital system. In summary, the work presented in this dissertation demonstrates that several applications in the area of digital design testing and validation can be successfully rearchitected to achieve maximal performance on GPUs and obtain significant speedups. The proposed algorithms based on GPU parallelism collectively aim to improve the performance of EDA tools in the computer-aided design (CAD) community on GPUs and other many-core platforms.
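The data-parallel style of fault simulation that GPU simulators like the one above accelerate can be illustrated in miniature: test patterns are packed into machine words so one bitwise operation evaluates a gate for many patterns at once. The tiny circuit and the injected fault below are made-up examples, not FSimGP2's actual data structures.

```python
# Parallel-pattern (bit-parallel) fault simulation on a toy circuit:
#   n1 = a AND b,  n2 = NOT a AND c,  out = n1 OR n2
# Each Python int packs one bit per test pattern, so a single bitwise
# operation simulates all patterns simultaneously.

WIDTH = 8                      # number of packed test patterns
MASK = (1 << WIDTH) - 1

def simulate(a, b, c, stuck=None):
    n1 = a & b
    if stuck == ("n1", 0):     # inject stuck-at-0 on net n1
        n1 = 0
    n2 = (~a & MASK) & c
    return (n1 | n2) & MASK

# Eight patterns for (a, b, c), one per bit position.
a, b, c = 0b10110100, 0b11010010, 0b01101001

good = simulate(a, b, c)
faulty = simulate(a, b, c, stuck=("n1", 0))
detecting = good ^ faulty      # bit i set -> pattern i detects the fault
print(f"detected by patterns: {detecting:08b}")
```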
- Algorithms and Low Cost Architectures for Trace Buffer-Based Silicon Debug. Prabhakar, Sandesh (Virginia Tech, 2009-12-01). An effective silicon debug technique uses a trace buffer to monitor and capture a portion of the circuit response during its functional, post-silicon operation. Due to the limited space of the available trace buffer, selection of the critical trace signals plays an important role in both minimizing the number of signals traced and maximizing the observability/restorability of other untraced signals during post-silicon validation. In this thesis, a new method is proposed for trace buffer signal selection for the purpose of post-silicon debug. The selection is performed by favoring those signals with the largest number of implications that are not implied by other signals. Then, based on the values of the traced signals during silicon debug, an algorithm which uses a SAT-based multi-node implication engine is introduced to restore the values of untraced signals across multiple time-frames. A new multiplexer-based trace signal interconnection scheme and a new heuristic for trace signal selection based on implication-based correlation are also described. By this approach, we can effectively trace twice as many signals with the same trace buffer width. A SAT-based greedy heuristic is also proposed to prune the selected trace signal list further to take into account those multi-node implications. A state restoration algorithm is developed for the multiplexer-based trace signal interconnection scheme. Experimental results show that the proposed approaches select the trace signals effectively, giving a high restoration percentage compared with other techniques. We finally propose a lossless compression technique to increase the capacity of the trace buffer. We propose real-time compression of the trace data using the Frequency-Directed Run-Length (FDR) code. In addition, we also propose source transformation functions, namely difference vector computation, efficient ordering of trace flip-flops and alternate vector reversal, that reduce the entropy of the trace data, making it more amenable to compression. The order of the trace flip-flops is computed off-chip using a probabilistic algorithm. The difference vector computation and alternate vector reversal are implemented on-chip and incur negligible hardware overhead. Experimental results for sequential benchmark circuits show that this method gives a better compression percentage compared to dictionary-based techniques and yields up to 3X improvement in diagnostic capability. We also observe that the area overhead of the proposed approach is lower than that of dictionary-based compression techniques.
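This entry pairs source transformations (difference vectors, flip-flop reordering, vector reversal) with FDR coding. The sketch below shows only the simplest piece, computing difference vectors between successive trace words and counting zero runs, as a hedged illustration of why the transformation makes trace data more compressible; it is not the FDR encoder itself, and the example trace is invented.

```python
def difference_vectors(trace):
    """XOR each trace word with its predecessor.

    Successive trace-buffer samples tend to differ in only a few bits,
    so the difference stream is dominated by runs of zeros, which
    run-length style codes (such as FDR) compress well.
    """
    prev = 0
    out = []
    for word in trace:
        out.append(word ^ prev)
        prev = word
    return out

def zero_runs(words, width=8):
    """Lengths of consecutive zero-bit runs in the concatenated bitstream."""
    bits = "".join(f"{w:0{width}b}" for w in words)
    runs, count = [], 0
    for bit in bits:
        if bit == "0":
            count += 1
        else:
            runs.append(count)
            count = 0
    runs.append(count)
    return runs

trace = [0b00001111, 0b00001110, 0b00001110, 0b10001110]
print(zero_runs(trace))                        # raw trace
print(zero_runs(difference_vectors(trace)))    # longer zero runs
```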
- Analysis and Enforcement of Properties in Software Systems. Wu, Meng (Virginia Tech, 2019-07-02). Due to the lack of effective techniques for detecting and mitigating property violations, existing approaches to ensure the safety and security of software systems are often labor intensive and error prone. Furthermore, they focus primarily on functional correctness of the software code while ignoring micro-architectural details of the underlying processor, such as cache and speculative execution, which may undermine their soundness guarantees. To fill the gap, I propose a set of new methods and tools for ensuring the safety and security of software systems. Broadly speaking, these methods and tools fall into three categories. The first category is concerned with static program analysis. Specifically, I develop a novel abstract interpretation framework that considers both speculative execution and a cache model, and is guaranteed to be sound for estimating the execution time of a program and detecting side-channel information leaks. The second category is concerned with static program transformation. The goal is to eliminate side channels by equalizing the number of CPU cycles and the number of cache misses along all program paths for all sensitive variables. The third category is concerned with runtime safety enforcement. Given a property that may be violated by a reactive system, the goal is to synthesize an enforcer, called the shield, to correct the erroneous behaviors of the system instantaneously, so that the property is always satisfied by the combined system. I develop techniques to make the shield practical by handling both burst error and real-valued signals. The proposed techniques have been implemented and evaluated on realistic applications to demonstrate their effectiveness and efficiency.
- Anti-Counterfeit and Anti-Tamper Hardware Implementation using Hardware Obfuscation. Desai, Avinash R. (Virginia Tech, 2013-09-06). Tampering and reverse engineering of a chip to extract the hardware Intellectual Property (IP) core or to inject malicious alterations is a major concern. First, offshore chip manufacturing allows the design secrets of the IP cores to be transparent to the foundry and other entities along the production chain. Second, small malicious modifications to the design may not be detectable after fabrication without anti-tamper mechanisms. Counterfeit Integrated Circuits (ICs) have also become an important security issue in recent years: counterfeit ICs that perform incorrectly or below expectations can lead to catastrophic consequences in safety- and/or mission-critical applications, in addition to the tremendous economic toll they take on the semiconductor industry. Some techniques have been developed in the past to improve the defense against such attacks, but they tend to fall prey to the increasing power of the attacker. We present a new way to protect against tampering by a clever obfuscation of the design, which can be unlocked with a specific, dynamic path traversal. Hence, the functional mode of the controller is hidden with the help of obfuscated states, and the functional mode is made operational only on the formation of a specific interlocked Code-Word during state transitions. A novel time-stamp is proposed that can provide the date at which the IC was manufactured, for counterfeit detection. Furthermore, we propose a second layer of tamper resistance for the time-stamp circuit to make it even more difficult to modify. Results show that the proposed methods offer higher levels of security with small area overhead. A side benefit is that any small alteration will be magnified by the obfuscated design proposed in these methods.
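The locked-controller idea in this entry, where functional mode is entered only after a specific interlocked code word is traversed, can be sketched as a small state machine. The key sequence and state names below are hypothetical; the dissertation's actual scheme interleaves the code word with the controller's normal state transitions.

```python
class ObfuscatedController:
    """Toy locked FSM: functional mode is reachable only by applying the
    correct key sequence, one symbol per clock; any wrong symbol sends
    the machine back into the obfuscated states. The key below is a
    made-up example."""

    KEY = (0b1011, 0b0110, 0b1101)     # hypothetical interlocked code word

    def __init__(self):
        self.stage = 0                  # how many key symbols matched so far
        self.unlocked = False

    def clock(self, inp):
        if self.unlocked:
            return "FUNCTIONAL"
        if inp == self.KEY[self.stage]:
            self.stage += 1
            if self.stage == len(self.KEY):
                self.unlocked = True
                return "FUNCTIONAL"
        else:
            self.stage = 0              # fall back into obfuscated states
        return f"OBFUSCATED_{self.stage}"

ctrl = ObfuscatedController()
for word in (0b1011, 0b0110, 0b1101, 0b0000):
    print(ctrl.clock(word))
```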
- Application of Computer Vision Techniques for Railroad Inspection using UAVs. Harekoppa, Pooja Puttaswamygowda (Virginia Tech, 2016-08-16). The task of railroad inspection is a tedious one. It requires a lot of skilled experts and long hours of frequent on-field inspection. Automated ground equipment systems that have been developed to address this problem have the drawback of blocking the rail service during the inspection process. As an alternative, using aerial imagery from a UAV, Computer Vision and Machine Learning based techniques were developed in this thesis to analyze two kinds of defects on the rail tracks. The defects targeted were missing spikes on tie plates and cracks on ties. In order to perform this inspection, the rail region was identified in the image and then the tie plate and tie regions on the track were detected. These steps were performed using morphological operations, filtering and intensity analysis. Once the tie plate was localized, the regions of interest on the plate were used to train a machine learning model to recognize missing spikes. Classification using SVM resulted in an accuracy of around 96%, which varied greatly with tie plate illumination and ROI alignment for the Lampasas and Chickasha subdivision datasets. Many other classifiers were also used for training and testing, and an ensemble method with a majority vote scheme was also explored for classification. The second category of learning model used was a multi-layered neural network. The major drawback of this method was that it required a lot of images for training. However, it performed better than feature-based classifiers when a larger training dataset was available. As a second kind of defect, tie conditions were analyzed. From the localized tie region, tie cracks were detected using thresholding and morphological operations. A machine learning classifier was developed to predict the condition of a tie based on training examples of images with extracted features. The multi-class classification accuracy obtained was around 83% and there were no misclassifications between the two extreme classes of tie condition on the test data.
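The missing-spike classifier described in this entry can be illustrated with scikit-learn. The feature vectors below are random stand-ins for the ROI features extracted from tie-plate images; the feature dimensionality, class balance, and RBF kernel choice are assumptions for the sketch, not the thesis's exact setup.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Stand-in ROI feature vectors: class 0 = spike present, class 1 = missing.
X = np.vstack([rng.normal(0.0, 1.0, size=(200, 16)),
               rng.normal(1.5, 1.0, size=(200, 16))])
y = np.array([0] * 200 + [1] * 200)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Standardize features, then fit an RBF-kernel SVM.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))
```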
- Architecture Support for Countermeasures against Side-Channel Analysis and Fault Attack. Kiaei, Pantea (Virginia Tech, 2019). Cryptographic algorithms are designed to be mathematically secure; however, side-channel analysis attacks go beyond mathematics by taking measurements of the device's electrical activity to reveal the secret data of a cipher. These attacks also go hand in hand with fault analysis techniques to disclose the secret key used in cryptographic ciphers with even fewer measurements. This is of practical concern due to the ubiquity of embedded systems that allow physical access to the adversary, such as smart cards, ATMs, etc. Researchers through the years have come up with techniques to block physical attacks on the hardware or make such attacks less likely to succeed. Most of the conducted research considers one or the other of side-channel analysis and fault injection attacks whereas, in a real setting, the adversary can simultaneously take advantage of both to retrieve the secret data with less effort. Furthermore, very little work considers software implementations of these ciphers although, with the availability of small and affordable or free microarchitectures, and the flexibility and simplicity of software implementations, it is at times more practical to have a software implementation of ciphers instead of dedicated hardware chips. In this project, we come up with a modular presentation, suitable for software implementations of ciphers, to allow simultaneous resistance against side-channel and fault analysis attacks. We also present an extension at the microarchitecture level to make our proposed countermeasures more intact and efficient.
- The Art of SRAM Security: Tactics for Remanence-based Attack and Strategies for Defense. Mahmod, Jubayer (Virginia Tech, 2024-05-02). The importance of securing hardware, particularly in the context of the Internet of Things (IoT), cannot be overstated in light of the increasing prevalence of low-level attacks. As the IoT industry continues to expand, security has become a more holistic concern, as evidenced by the wide range of attacks that we observe, from large-scale distributed denial-of-service attacks to data theft through monitoring a device's low-level behavior, such as power consumption. Traditional software-based security measures fall short in defending against the full spectrum of attacks, particularly those involving physical tampering with system hardware. This underscores the critical importance of proactively addressing attack vectors that encompass both hardware and software domains, with particular emphasis on considering both the analog and digital characteristics of hardware. This thesis investigates system security from a hardware perspective, specifically examining how low-level circuit behavior and architectural design choices impact SRAM's data remanence and its implications for security. This dissertation not only identifies new vulnerabilities due to SRAM data remanence but also paves the way for novel security solutions in the ongoing "security arms race". I present an attack, Volt boot, that achieves cold-boot-style short-term data remanence in on-chip SRAM without using temperature effects. This attack exploits the fact that SRAM's power bus is externally accessible and allows data retention using a simple voltage probe. Next, I present a steganography method that hides information in SRAM by exploiting long-term data remanence. This approach leverages aging-induced degradation to imprint data in SRAM's analog domain, ultimately resulting in hidden and plausibly deniable information storage in the hardware. Finally, I show how an adversary weaponizes SRAM data remanence to develop an attack on a hardware-backed security isolation mechanism. The following provides a brief overview of the three major contributions of this thesis: 1. Volt boot is an attack that demonstrates the vulnerability of on-chip SRAM due to the physical separation common in modern SoCs' power distribution networks. By probing the external power pins (to the cache) of an SoC while simultaneously shutting down the main system power, Volt boot creates data retention across power cycles. On-chip SRAM can be a safe memory when the threat model considers only traditional off-chip cold-boot-style attacks. This research demonstrates an alternative method for preserving information in on-chip SRAM through power cycles, expanding our understanding of data retention capabilities. Volt boot leverages asymmetrical power states (e.g., on vs. off) to force SRAM state retention across power cycles, eliminating the need for traditional cold-boot attack enablers such as low temperature or intrinsic data retention time. 2. Invisible Bits is a hardware steganography technique that hides secret messages in the analog domain of SRAM embedded within a computing device. Exploiting accelerated transistor aging, Invisible Bits stores hidden data along with system data in an on-chip cache and provides a plausible deniability guarantee against statistical analysis. Aging changes the transistor's behavior, which I exploit to store data permanently (i.e., long-term data remanence) in an SRAM.
Invisible Bits presents unique opportunities for safeguarding electronic devices when they are subjected to inspections by authorities. 3. UntrustZone utilizes long-term data remanence to exfiltrate secrets from on-chip SRAM. An attacker application must be able to read the retained states in the SRAM across power cycles, but this requires changing the security privilege. Hardware security schemes, such as ARM TrustZone, erase a memory block before changing its security attributes and releasing it to other applications, making short-term data remanence attacks ineffective. That is, attacks such as Volt boot fail when hardware-backed isolation such as a TEE is enforced. UntrustZone unveils a new threat to all forms of on-chip SRAM even when backed by hardware isolation: long-term data remanence. I show how an attacker systematically accelerates data imprinting on SRAM's analog domain to effectively burn in on-chip secrets and bypass TrustZone isolation.
- ATPG and DFT Algorithms for Delay Fault Testing. Liu, Xiao (Virginia Tech, 2004-06-10). With ever-shrinking geometries, growing metal density and increasing clock rates on chips, delay testing is becoming a necessity in industry to maintain test quality for speed-related failures. The purpose of delay testing is to verify that the circuit operates correctly at the rated speed. However, functional tests for delay defects are usually unacceptable for large-scale designs due to the prohibitive cost of functional test patterns and the difficulty in achieving very high fault coverage. Scan-based delay testing, which can ensure a high delay fault coverage at reasonable development cost, provides a good alternative to at-speed functional test. This dissertation addresses several key challenges in scan-based delay testing and develops efficient Automatic Test Pattern Generation (ATPG) and Design-for-Testability (DFT) algorithms for delay testing. In the dissertation, two algorithms are first proposed for computing and applying transition test patterns using stuck-at test vectors, thus avoiding the need for a transition fault test generator. The experimental results show that we can improve both test data volume and test application time by 46.5% over a commercial transition ATPG tool. Secondly, we propose a hybrid scan-based delay testing technique for a compact and high-fault-coverage test set, which combines the advantages of both the skewed-load and broadside test application methods. On average, about 4.5% improvement in fault coverage is obtained by the hybrid approach over the broadside approach, with very little hardware overhead. Thirdly, we propose and develop a constrained ATPG algorithm for scan-based delay testing, which addresses the overtesting problem due to the possible detection of functionally untestable faults in scan-based testing. The experimental results show that our method efficiently generates a test set for functionally testable transition faults and reduces the yield loss due to overtesting of functionally untestable transition faults. Finally, a new approach to identifying functionally untestable transition faults in non-scan sequential circuits is presented. We formulate a new dominance relationship for transition faults and use it to help identify more untestable transition faults on top of a fault-independent method based on static implications. The experimental results for ISCAS89 sequential benchmark circuits show that our approach can identify many more functionally untestable transition faults than previously reported.
- ATPG based Preimage Computation: Efficient Search Space Pruning using ZBDD. Chandrasekar, Kameshwar (Virginia Tech, 2003-07-28). Preimage computation is a fundamental step in formal verification of VLSI designs. Conventional OBDD-based methods for formal verification suffer from spatial explosion, since large designs can blow up in terms of memory. On the other hand, SAT/ATPG based methods are less demanding on memory, but their run-time can be huge, since they must explore an exponential search space. In order to reduce this temporal explosion of SAT/ATPG based methods, efficient learning techniques are needed. Conventional ATPG aims at computing a single solution for its objective. In preimage computation, we must enumerate all solutions for the target state during the search. Similar sub-problems often occur during preimage computation that can be identified by the internal state of the circuit. Therefore, it is highly desirable to learn from these search-states and avoid repeated search of identical solution/conflict subspaces, for better performance. In this thesis, we present a new ZBDD based method to compactly store and efficiently search previously explored search-states. We learn from these search-states and avoid repeating subsets and supersets of previously encountered search spaces. Both solution and conflict subspaces are pruned based on simple set operations using ZBDDs. We integrate our techniques into a PODEM based ATPG engine and demonstrate their efficiency on ISCAS '89 benchmark circuits. Experimental results show that up to 90% of the search space is pruned due to the proposed techniques and we are able to compute preimages for target states where a state-of-the-art technique fails.
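The pruning described in this entry stores previously explored search-states and skips any new state subsumed by a stored one. The sketch below mimics the subset/superset checks with plain Python frozensets purely for illustration; the dissertation stores these states compactly in a ZBDD and prunes both solution and conflict subspaces, whereas the example tracks only conflicts.

```python
class SearchStateCache:
    """Remember explored partial assignments (as sets of literals) and
    prune any new search state subsumed by a stored one. Plain sets are
    used here only to illustrate the subset/superset pruning; a real
    implementation would use a ZBDD package for compactness."""

    def __init__(self):
        self.conflicts = []      # partial assignments known to conflict

    def record_conflict(self, literals):
        literals = frozenset(literals)
        # Keep only minimal conflicts: drop stored supersets of the new one.
        self.conflicts = [c for c in self.conflicts if not c >= literals]
        self.conflicts.append(literals)

    def is_pruned(self, state):
        state = frozenset(state)
        # Any superset of a known conflict must also conflict: skip it.
        return any(c <= state for c in self.conflicts)

cache = SearchStateCache()
cache.record_conflict({"a=1", "b=0"})
print(cache.is_pruned({"a=1", "b=0", "c=1"}))   # True: subsumed, skip
print(cache.is_pruned({"a=1", "b=1"}))          # False: must be explored
```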
- Automatic Instantiation and Timing-Aware Placement of Bus Macros for Partially Reconfigurable FPGA Designs. Subbarayan, Guruprasad (Virginia Tech, 2010-11-19). FPGA design implementation and debug tools have not kept pace with the advances in FPGA device density. The emphasis on area optimization and circuit speed has resulted in longer runtimes of the implementation tools. We address the implementation problem using a divide-and-conquer approach in which some device area and circuit speed are sacrificed for improved implementation turnaround time. The PATIS floorplanner enables dynamic modular design that accelerates implementation for incremental changes to a design. While the existing implementation flows facilitate timing closure late in the design cycle by reusing the layout of unmodified blocks, dynamic modular design accelerates implementation by achieving timing closure for each block independently. A complete re-implementation is still rapid as the design blocks can be processed by independent and concurrent invocations of the standard tools. PATIS creates the floorplan for implementing modules in the design. Bus macros serve as module interfaces and enable independent implementation of the modules. The dynamic modular design flow achieves around 10x speedup over the standard design flow for our benchmark designs.
- Bayesian Integration and Modeling for Next-generation Sequencing Data Analysis. Chen, Xi (Virginia Tech, 2016-07-01). Computational biology currently faces challenges in a big data world with thousands of data samples across multiple disease types including cancer. The challenging problem is how to extract biologically meaningful information from large-scale genomic data. Next-generation Sequencing (NGS) can now produce high quality data at DNA and RNA levels. However, in cells there exist a lot of non-specific (background) signals that affect the detection accuracy of true (foreground) signals. In this dissertation work, under a Bayesian framework, we aim to develop and apply approaches to learn the distribution of genomic signals in each type of NGS data for reliable identification of specific foreground signals. We propose a novel Bayesian approach (ChIP-BIT) to reliably detect transcription factor (TF) binding sites (TFBSs) within promoter or enhancer regions by jointly analyzing the sample and input ChIP-seq data for one specific TF. Specifically, a Gaussian mixture model is used to capture both binding and background signals in the sample data; and background signals are modeled by a local Gaussian distribution that is accurately estimated from the input data. An Expectation-Maximization algorithm is used to learn the model parameters according to the distributions on binding signal intensity and binding locations. Extensive simulation studies and experimental validation both demonstrate that ChIP-BIT has a significantly improved performance on TFBS detection over conventional methods, particularly on weak binding signal detection. To infer cis-regulatory modules (CRMs) of multiple TFs, we propose to develop a Bayesian integration approach, namely BICORN, to integrate ChIP-seq and RNA-seq data of the same tissue. Each TFBS identified from ChIP-seq data can be either a functional binding event mediating target gene transcription or a non-functional binding. The functional bindings of a set of TFs usually work together as a CRM to regulate the transcription processes of a group of genes. We develop a Gibbs sampling approach to learn the distribution of CRMs (a joint distribution of multiple TFs) based on their functional bindings and target gene expression. The robustness of BICORN has been validated on simulated regulatory network and gene expression data with respect to different noise settings. BICORN is further applied to breast cancer MCF-7 ChIP-seq and RNA-seq data to identify CRMs functional in promoter or enhancer regions. In tumor cells, the normal regulatory mechanism may be interrupted by genome mutations, especially those somatic mutations that uniquely occur in tumor cells. Focused on a specific type of genome mutation, structural variation (SV), we develop a novel pattern-based probabilistic approach, namely PSSV, to identify somatic SVs from whole genome sequencing (WGS) data. PSSV features a mixture model with hidden states representing different mutation patterns; PSSV can thus differentiate heterozygous and homozygous SVs in each sample, enabling the identification of those somatic SVs with a heterozygous status in the normal sample and a homozygous status in the tumor sample. Simulation studies demonstrate that PSSV outperforms existing tools. PSSV has been successfully applied to breast cancer patient WGS data for identifying somatic SVs of key factors associated with breast cancer development.
In this dissertation research, we demonstrate the advantage of the proposed distributional learning-based approaches over conventional methods for NGS data analysis. Distributional learning is a very powerful approach to gain biological insights from high quality NGS data. Successful applications of the proposed Bayesian methods to breast cancer NGS data shed light on underlying molecular mechanisms of breast cancer, enabling biologists or clinicians to identify major cancer drivers and develop new therapeutics for cancer treatment.
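ChIP-BIT's core idea in the entry above, modeling binding and background read intensities with a Gaussian mixture fit by Expectation-Maximization, can be sketched with scikit-learn's GaussianMixture. The synthetic intensities and the simple two-component setup below are illustrative only and omit ChIP-BIT's use of matched input data and binding-location information.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(1)

# Synthetic log read intensities: most regions are background (low mean),
# a minority are true TF binding events (high mean).
background = rng.normal(loc=2.0, scale=0.6, size=1800)
binding = rng.normal(loc=5.0, scale=0.8, size=200)
intensity = np.concatenate([background, binding]).reshape(-1, 1)

# Two-component mixture fit by EM, standing in for the binding-vs-background
# model learned from the sample ChIP-seq data.
gmm = GaussianMixture(n_components=2, random_state=0).fit(intensity)

binding_comp = int(np.argmax(gmm.means_.ravel()))
posterior = gmm.predict_proba(intensity)[:, binding_comp]
print("estimated means:", np.sort(gmm.means_.ravel()))
print("regions called as binding:", int((posterior > 0.5).sum()))
```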
- BitMaT - Bitstream Manipulation Tool for Xilinx FPGAs. Morford, Casey Justin (Virginia Tech, 2005-12-15). With the introduction of partially reconfigurable FPGAs, we are now able to perform dynamic changes to hardware running on an FPGA without halting the operation of the design. Module based partial reconfiguration allows the hardware designer to create multiple hardware modules that perform different tasks and swap them in and out of designated dynamic regions on an FPGA. However, the current mainstream partial reconfiguration flow provides a limited and inefficient approach that requires a strict set of guidelines to be met. This thesis introduces BitMaT, a tool that provides the low-level bitstream manipulation as a member tool of an alternative, automated, modular partial reconfiguration flow.
- Branch Guided Metrics for Functional and Gate-level Testing. Acharya, Vineeth Vadiraj (Virginia Tech, 2015-03-31). With the increasing complexity of modern-day processors and systems-on-a-chip (SOCs), designers invest a lot of time and resources into testing and validating these designs. To reduce the time-to-market and cost, the techniques used to validate these designs have to constantly improve. Since most of the design activity has moved to the register transfer level (RTL), test methodologies at the RTL have been gaining momentum. We present a novel framework for functional test generation at the RTL. A popular software-based metric for measuring the effectiveness of an RTL test suite is branch coverage, but exercising hard-to-reach branches is still a challenge and requires a good understanding of the design semantics. The proposed framework uses static analysis to extract certain semantics of the circuit and uses several data structures to model these semantics. Using these data structures, we assist the branch-guided search to exercise these hard-to-reach branches. Since the correlation between high branch coverage and detecting defects and bugs is not clear, we present a new metric at the RTL which augments RTL branch coverage with state values. Vectors which score higher on the new metric achieve higher branch and state coverage, and can therefore be applied at different levels of abstraction such as post-silicon validation. Experimental results show that use of the new metric in our test generation framework can achieve a high level of branch and fault coverage for several benchmark circuits, while reducing the length of the vector sequence. This work was supported in part by NSF grant 1016675.
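A hedged sketch of the kind of metric this entry describes: branch coverage augmented with the distinct state values a vector sequence reaches. The weighting, the normalization constant, and the way states are counted below are assumptions made for illustration, not the thesis's actual definition.

```python
def coverage_score(branches_hit, total_branches, states_seen, alpha=0.5):
    """Combine branch coverage with state coverage for a vector sequence.

    branches_hit : set of branch identifiers exercised by the sequence
    states_seen  : set of distinct register-state values observed
    alpha        : hypothetical weight between the two components
    """
    branch_cov = len(branches_hit) / total_branches
    # Normalize the state count by a saturation constant (hypothetical)
    # so the combined score stays in [0, 1].
    state_cov = min(1.0, len(states_seen) / 100.0)
    return alpha * branch_cov + (1 - alpha) * state_cov

seq_a = coverage_score({"b1", "b2", "b3"}, 10, {0x00, 0x1F, 0x3C})
seq_b = coverage_score({"b1", "b2"}, 10, {0x00})
print(seq_a > seq_b)   # the richer sequence scores higher
```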
- Building a Cognitive Radio: From Architecture Definition to Prototype Implementation. Le, Bin (Virginia Tech, 2007-06-11). Cognitive radio (CR) technology introduces a revolutionary wireless communication mechanism in terminals and network segments, so that they are able to learn their environment and adapt intelligently to the most appropriate way of providing the service for the user's exact need. By supporting multi-band, multi-mode cognitive applications, the cognitive radio addresses an interactive way of managing the spectrum that harmonizes technology, market and regulation. This dissertation gives a complete story of building a cognitive radio. It goes through concept clarification, architecture definition, functional block building, system integration, and finally the implementation of a fully functional cognitive radio node prototype that can be directly packaged for application use. This dissertation starts with a comprehensive review of CR research from its origin to today. Several fundamental research issues are then addressed to let the reader know what makes CR a challenging and interesting research area. Then the CR system solution is introduced with the details of its hierarchical functional architecture called the Egg Model, its modular software system called the cognitive engine, and its kernel machine learning mechanism called the cognition cycle. Next, this dissertation discusses the design of specific functional building blocks which incorporate environment awareness, solution making, and adaptation. These building blocks are designed to focus on the radio domain, which mainly concerns the radio environment and the radio platform. Awareness of the radio environment is achieved by extracting the key environmental features and applying statistical pattern recognition methods including artificial neural networks and k-nearest-neighbor clustering. Solutions for the radio behavior are made according to the recognized environment and previous knowledge through case-based reasoning, and further adapted or optimized through genetic algorithm solution search. New experiences are gained through the practice of the new solution, and thus the CR's knowledge evolves for future use; therefore, the CR's performance continues improving with this reinforcement learning approach. To deploy the solved solution in terms of the radio's parameters, a platform-independent radio interface is designed. With this general radio interface, the algorithms in the cognitive engine software system can be applied to various radio hardware platforms. To support and verify the designed cognitive algorithms and cognitive functionalities, a complete reconfigurable SDR platform, called the CWT2 waveform framework, is designed in this dissertation. In this waveform framework, a hierarchical configuration and control system is constructed to support flexible, real-time waveform reconfigurability. Integrating all the building blocks described above yields a complete CR node system. Based on this general CR node structure, a fully functional Public Safety Cognitive Radio (PSCR) node is prototyped to provide universal interoperability for public safety communications.
Although the complete PSCR node software system has been packaged into an official release including an installation guide and user/developer manuals, the process of building a cognitive radio from concept to a functional prototype is not the end of the CR research; ongoing and future research issues are addressed in the last chapter of the dissertation.
- Cellular Automata for Structural Optimization on Reconfigurable Computers. Hartka, Thomas Ryan (Virginia Tech, 2004-05-12). Structural analysis and design optimization are important to a wide variety of disciplines. The current methods for these tasks require significant time and computing resources. Reconfigurable computers have shown the ability to speed up many applications, but are unable to handle efficiently the precision requirements of traditional analysis and optimization techniques. Cellular automata theory provides a method to model these problems in a format conducive to representation on a reconfigurable computer. The calculations do not need to be executed with high precision and can be performed in parallel. By implementing cellular automata simulations on a reconfigurable computer, structural analysis and design optimization can be performed significantly faster than with conventional methods.
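The cellular-automata style of analysis in this entry replaces a global solve with repeated local, low-precision updates, which maps well onto parallel reconfigurable hardware. The sketch below uses a simple neighbor-averaging rule on a 1-D bar with fixed ends as an illustrative analogue; the actual update rules used for structural analysis in the thesis are more involved.

```python
def ca_relax(cells, left, right, iterations=200):
    """Iteratively update each interior cell from its neighbors.

    A toy local rule (neighbor averaging) standing in for a
    cellular-automata update; `left` and `right` play the role of fixed
    boundary conditions, and every cell uses only local information.
    """
    state = [left] + list(cells) + [right]
    for _ in range(iterations):
        nxt = state[:]
        for i in range(1, len(state) - 1):
            nxt[i] = 0.5 * (state[i - 1] + state[i + 1])
        state = nxt
    return state[1:-1]

# Ten interior cells between boundary values 0.0 and 1.0 converge toward
# a linear profile through purely local updates.
print([round(v, 3) for v in ca_relax([0.0] * 10, 0.0, 1.0)])
```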
- Circuit Design Methods with Emerging Nanotechnologies. Zheng, Yexin (Virginia Tech, 2009-12-08). As complementary metal-oxide semiconductor (CMOS) technology faces more and more severe physical barriers down the path of continuous feature-size scaling, innovative nano-scale devices and other post-CMOS technologies have been developed to enhance future circuit design and computation. These nanotechnologies have shown promising potential to achieve order-of-magnitude improvements in performance and integration density. The substitution of CMOS transistors with nano-devices is expected not only to continue along the exponential projection of Moore's Law, but also to raise significant challenges and opportunities, especially in the field of electronic design automation. The major obstacles that designers are experiencing with emerging nanotechnology design include: i) the existing computer-aided design (CAD) approaches in the context of conventional CMOS Boolean design cannot be directly employed in the nanoelectronic design process, because the intrinsic electrical characteristics of many nano-devices are not best suited for Boolean implementations but demonstrate strong capability for implementing non-conventional logic such as threshold logic and reversible logic; ii) due to the density and size factors of nano-devices, the defect rate of nanoelectronic systems is much higher than that of conventional CMOS systems, so existing design paradigms cannot guarantee design quality and can even lead to worse results with high failure ratios. Motivated by the compelling potential and design challenges of emerging post-CMOS technologies, this dissertation work focuses on fundamental design methodologies to effectively and efficiently achieve high-quality nanoscale design. A novel programmable logic element (PLE) is first proposed to explore the versatile functionalities of threshold gates (TGs) and multi-threshold threshold gates (MTTGs). This PLE structure can realize all three- or four-variable logic functions through configuring binary control bits. This is the first single threshold logic structure that provides complete Boolean logic implementation. Based on the PLEs, a reconfigurable architecture is constructed to offer dynamic reconfigurability with little or no reconfiguration overhead, due to the intrinsic self-latching property of nanopipelining. Our reconfiguration data generation algorithm can further reduce the reconfiguration cost. To fully take advantage of such threshold logic design using emerging nanotechnologies, we also developed a combinational equivalence checking (CEC) framework for threshold logic design. Based on the features of threshold logic gates and circuits, different techniques for formulating a given threshold logic circuit in conjunctive normal form (CNF) are introduced to facilitate efficient SAT-based verification. Evaluated with mainstream benchmarks, our hybrid algorithm, which takes into account both the input symmetry and the input weight order of threshold gates, can efficiently generate CNF formulas in terms of both SAT solving time and CNF generating time. Then the reversible logic synthesis problem is considered as we focus on efficient synthesis heuristics which can provide high-quality synthesis results within a reasonable computation time. We have developed a weighted directed graph model for function representation and complexity measurement. An atomic transformation is constructed to associate the function complexity variation with reversible gates.
The efficiency of our heuristic lies in maximally decreasing the function complexity during synthesis steps as well as in its capability to climb out of local optima. Thereafter, swarm intelligence, one of the machine learning techniques, is employed in the space search for reversible logic synthesis, which achieves further performance improvement. To tackle the high defect rate of the emerging nanotechnology manufacturing process, we have developed a novel defect-aware logic mapping framework for nanowire-based PLA architectures via Boolean satisfiability (SAT). PLA defects of various types are formulated as covering and closure constraints. The defect-aware logic mapping is then solved efficiently by using available SAT solvers. This approach can generate a valid logic mapping with a defect rate as high as 20%. The proposed method is universally suitable for various nanoscale PLAs, including AND/OR, NOR/NOR structures, etc. In summary, this work provides some initial attempts to address two major problems confronting future nanoelectronic system designs: the development of electronic design automation tools and the reliability issues. However, many challenging open questions remain in this emerging and promising area. We hope our work can lay down stepping stones for nano-scale circuit design optimization through exploiting the distinctive characteristics of emerging nanotechnologies.
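The CNF formulation of threshold logic mentioned in this entry can be illustrated with a deliberately naive sketch: enumerate a gate's input assignments and emit one clause per assignment forcing the output to the correct value. This brute-force encoding is exponential in the number of inputs and is only an illustration; the dissertation's hybrid algorithm instead exploits input symmetry and weight ordering to produce compact CNF.

```python
from itertools import product

def threshold_gate_cnf(weights, threshold, out_var, in_vars):
    """Naive CNF for: out <-> (sum(w_i * x_i) >= threshold).

    Positive integers denote variables, negative integers their negations.
    For every input assignment, one clause forces the output literal to
    the gate's value under that assignment.
    """
    clauses = []
    for assignment in product([0, 1], repeat=len(in_vars)):
        fires = sum(w * x for w, x in zip(weights, assignment)) >= threshold
        # Clause: (assignment holds) -> output has the right polarity,
        # i.e. OR of negated input literals plus the correct output literal.
        clause = [-v if x else v for v, x in zip(in_vars, assignment)]
        clause.append(out_var if fires else -out_var)
        clauses.append(clause)
    return clauses

# Majority-of-three gate: weights (1, 1, 1), threshold 2,
# inputs are variables 1..3, output is variable 4.
for c in threshold_gate_cnf([1, 1, 1], 2, 4, [1, 2, 3]):
    print(c)
```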
- Cognitive Gateway to Promote Interoperability, Coverage and Throughput in Heterogeneous Communication Systems. Chen, Qinqin (Virginia Tech, 2009-12-08). With the reality that diverse air interfaces and dissimilar access networks coexist, accompanied by the trend that dynamic spectrum access (DSA) is allowed and will be gradually employed, cognition and cooperation form a promising framework to achieve the ideal of seamless ubiquitous connectivity in future communication networks. In this dissertation, the cognitive gateway (CG), conceived as a special cognitive radio (CR) node, is proposed and designed to facilitate universal interoperability among incompatible waveforms. A proof-of-concept prototype is built and tested. Located in places where various communication nodes and diverse access networks coexist, the CG can be easily set up and works like a network server with differentiated service (Diffserv) architecture to provide automatic traffic relaying and link establishment. The author extracts a scalable "source-CG-destination" snapshot from the entire network and investigates the key enabling technologies for such a snapshot. The CG provides universal interoperability, which is enabled by a generic waveform representation format and the reconfigurable software defined radio platform. Following the trend of an all-IP solution for future communication systems, the term "waveform" in this dissertation is defined as a protocol stack specification suite. The author gives a generic waveform representation format based on the five-layer TCP/IP protocol stack architecture. This format can represent the waveforms used by Ethernet, WiFi, cellular systems, P25, cognitive radios, etc. A significant advantage of the CG over other interoperability solutions lies in its autonomy, which is supported by appropriate signaling processes and automatic waveform identification. The service process in a CG is usually initiated by the users, who send requests via their own waveforms. These requests are transmitted during the signaling procedures. The complete operating procedure of a CG is depicted as a waveform-oriented cognition loop, which is primarily executed by the waveform identifier, scenario analyzer, central controller, and waveform converter together. The author details the service process initiated by a primary user (e.g. a legacy public safety radio) and that initiated by a secondary user (e.g. a CR), and describes the signaling procedures between CG and clients for the accomplishment of CG discovery, user registration and un-registration, link establishment, communication resumption, service termination, route discovery, etc. From the waveforms conveyed during the signaling procedures, the waveform identifier extracts the parameters that can be used by a CG to identify the source waveform and the destination waveform. These parameters are called "waveform indicators." The author analyzes the four types of waveforms of interest and outlines the waveform indicators for different types of communication initiators. In particular, a multi-layer waveform identifier is designed for a CG to extract the waveform indicators from the signaling messages. For physical layer signal recognition, a Universal Classification Synchronization (UCS) system has been invented. UCS is conceived as a self-contained system which can detect, classify, and synchronize with a received signal and provide all parameters needed for physical layer demodulation without prior information from the transmitter.
Currently, it can accommodate modulations including AM, FM, FSK, MPSK, QAM and OFDM. The design and implementation details of UCS are presented. The designed system has been verified by over-the-air (OTA) experiments and its performance has been evaluated by theoretical analysis and software simulation. UCS can be ported to different platforms and applied to various scenarios. An underlying assumption for UCS is that the target signal is transmitted continually; however, this is not the case for a CG, since the detection objects of a CG are signaling messages. In order to ensure higher recognition accuracy, signaling efficiency, and lower signaling overhead, the author addresses the key issues for signaling scheme design and their dependence on the waveform identification strategy. In a CG, waveform transformation (WT) is the last step of the link establishment process. The resources required for transformation of waveform pairs, together with the application priority, constitute the major factors that determine the link control and scheduling scheme in a CG. The author sorts different WTs into five categories and describes the details of implementing the four typical types of WT (including a physical layer analog-to-analog gateway, an up-to-link-layer digital-to-digital gateway, an up-to-network-layer digital gateway, and Voice over IP (VoIP), an up-to-transport-layer gateway) in a practical CG prototype. Issues including resource management and link scheduling have also been addressed. This dissertation presents a CG prototype implemented on the basis of GNU Radio plus multiple USRPs. In particular, the service process of a CG is modeled as a two-stage tandem queue, where the waveform identifier queues at the first stage can be described as M/D/1/1 models and the waveform converter queue at the second stage can be described as a G/M/K/K model. Based on these models, the author derives the theoretical blocking probability and throughput of a CG. Although the "source-CG-destination" snapshot considers only neighboring nodes which are one hop away from the CG, it is scalable to form larger networks. A CG can work in either ad-hoc or infrastructure mode. Utilizing its capabilities, CG nodes can be placed in different network architectures/topologies to provide auxiliary connectivity. Multi-hop cooperative relaying via CGs will be an interesting research topic deserving further investigation.
- Communication Synthesis for MIMO Decoder Matrices. Quesenberry, Joshua Daniel (Virginia Tech, 2011-08-09). The design in this work provides an easy and cost-efficient way of performing an FPGA implementation of a specific algorithm through use of a custom hardware design language and communication synthesis. The framework is designed to optimize performance with matrix-type mathematical operations. The largest matrices used in this process are 4x4 matrices. The primary example modeled in this work is MIMO decoding. Making this possible are 16 functional unit containers within the framework, with generalized interfaces, which can hold custom user hardware and IP cores. This framework, which is controlled by a microsequencer, is centered on a matrix-based memory structure comprised of 64 individual dual-ported memory blocks. The microsequencer uses an instruction word that can control every element of the architecture during a single clock cycle. Routing to and from the memory structure uses an optimized form of a crossbar switch with predefined routing paths supporting any combination of input/output pairs needed by the algorithm. A goal at the start of the design was to achieve a clock speed of over 100 MHz; a clock speed of 183 MHz has been achieved. This design is capable of performing a 4x4 matrix inversion within 335 clock cycles, or 1,829 ns. The power efficiency of the design is measured at 17.15 MFLOPS/W.
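The 4x4 matrix inversion that the architecture above completes in 335 clock cycles can be expressed in software as Gauss-Jordan elimination with partial pivoting. The sketch below is a plain Python reference for the arithmetic only; it says nothing about the microsequencer, crossbar routing, or the actual hardware algorithm, and the example matrix is invented.

```python
def invert4x4(m):
    """Gauss-Jordan inversion with partial pivoting for a 4x4 matrix."""
    n = 4
    # Build the augmented matrix [M | I].
    a = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        a[col], a[pivot] = a[pivot], a[col]        # partial pivoting
        p = a[col][col]
        a[col] = [x / p for x in a[col]]           # normalize pivot row
        for r in range(n):
            if r != col:
                f = a[r][col]
                a[r] = [x - f * y for x, y in zip(a[r], a[col])]
    return [row[n:] for row in a]

m = [[4.0, 7.0, 2.0, 3.0],
     [0.0, 5.0, 1.0, 1.0],
     [2.0, 0.0, 3.0, 1.0],
     [1.0, 1.0, 0.0, 2.0]]
inv = invert4x4(m)
# Check: M * inv(M) should be (numerically) the identity matrix.
print([[round(sum(m[i][k] * inv[k][j] for k in range(4)), 6)
        for j in range(4)] for i in range(4)])
```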
- Comparison and Investigation of Solar Spectral Irradiance with Solar Aspect Monitor. Lin, Ying-Tsen (Virginia Tech, 2014-09-30). On board the International Space Station (ISS), the Remote Atmospheric and Ionospheric Detection System (RAIDS) is a suite of limb-scanning monitors taking measurements from the extreme ultraviolet (EUV) to the near infrared (NIR). A single-scattering Rayleigh model is developed to eliminate the scattered brightness below 90 km, and an inversion technique is applied to limb-scanned radiance profiles at 236.5 nm, the NO (0,1) gamma band. The ISS orbit allows observations from 7:00 to 16:00 local hours over a one-month period from mid-June to mid-July of 2010, and the local-time variation of NO abundance in the lower thermosphere is derived. The uniquely stable solar activity during 2010 allows the local-time variation of NO to be observed with limited influence of solar variability. The comparison with a 1D model shows good agreement at altitudes above 120 km, suggesting that most of the local-time variation of NO is due to solar illumination, radiation, chemistry, and vertical diffusion. Solar soft X-rays are the major driver of the variability observed in the ionospheric and thermospheric constituents at the equatorial region. Over the years, measurements at these wavelengths have been scarce and discrepancies exist among the available data. The Solar Aspect Monitor (SAM) is a pinhole camera on the Extreme-ultraviolet Variability Experiment (EVE) flying on the Solar Dynamics Observatory (SDO). Every 10 seconds SAM projects the solar disk onto the CCD through a metallic filter designed to allow only solar photons shortward of 7 nm to pass. Contamination from energetic particles and out-of-band irradiance is, however, present. The broadband (BB) technique is developed for isolating the 0.1 to 7 nm integrated irradiance to produce broadband irradiance. The results agree with the zeroth-order product from the EUV SpectroPhotometer (ESP) within 25% regardless of solar activity level. Active regions in the solar atmosphere are tracked by the Apertural Progression Procedure for Light Estimate (APPLE). The photon event detection (PED) algorithm takes both BB and APPLE results as prior information to extract in-band photons. Applications of the PED products, including solar feature studies and spectrally resolved irradiance, are demonstrated.