Browsing by Author "Hicks, Matthew"
- Algorithms and Frameworks for Accelerating Security Applications on HPC Platforms
  Yu, Xiaodong (Virginia Tech, 2019-09-09)
  Typical cybersecurity solutions emphasize achieving defense functionalities. However, execution efficiency and scalability are equally important, especially for real-world deployment. Straightforward mappings of cybersecurity applications onto HPC platforms may significantly underutilize the HPC devices' capacities. On the other hand, sophisticated implementations are quite difficult: they require an in-depth understanding of both cybersecurity domain-specific characteristics and the HPC architecture and system model. In our work, we investigate three sub-areas in cybersecurity: mobile software security, network security, and system security. They have the following performance issues, respectively: 1) Flow- and context-sensitive static analysis of large and complex Android APKs is incredibly time-consuming. Existing CPU-only frameworks/tools have to set a timeout threshold that ceases the program analysis, trading precision for performance. 2) Network intrusion detection systems (NIDS) use automata processing as their searching core and require line-speed processing. However, achieving high-speed automata processing is exceptionally difficult in both algorithm and implementation aspects. 3) It is unclear how cache configurations impact the performance of time-driven cache side-channel attacks. This question remains open because it is difficult to conduct comparative measurements to study the impacts. In this dissertation, we demonstrate how application-specific characteristics can be leveraged to optimize implementations on various types of HPC platforms for faster and more scalable cybersecurity executions. For example, we present a new GPU-assisted framework and a collection of optimization strategies for fast Android static data-flow analysis that achieve up to 128X speedups against the plain GPU implementation.
For network intrusion detection systems (NIDS), we design and implement an algorithm capable of eliminating the state explosion in out-of-order packet situations, which reduces memory overhead by up to 400X. We also present tools for improving the usability of Micron's Automata Processor. To study the impact of cache configurations on the performance of time-driven cache side-channel attacks, we design an approach to conducting comparative measurements. We propose a quantifiable success rate metric to measure the performance of time-driven cache attacks and utilize the GEM5 platform to emulate the configurable cache.
- The Art of SRAM Security: Tactics for Remanence-based Attack and Strategies for Defense
  Mahmod, Jubayer (Virginia Tech, 2024-05-02)
  The importance of securing hardware, particularly in the context of the Internet of Things (IoT), cannot be overstated in light of the increasing prevalence of low-level attacks. As the IoT industry continues to expand, security has become a more holistic concern, as evidenced by the wide range of attacks that we observed, from large-scale distributed denial-of-service attacks to data theft through monitoring a device's low-level behavior, such as power consumption. Traditional software-based security measures fall short in defending against the full spectrum of attacks, particularly those involving physical tampering with system hardware. This underscores the critical importance of proactively integrating attack vectors that encompass both hardware and software domains, with a particular emphasis on considering both the analog and digital characteristics of hardware. This thesis investigates system security from a hardware perspective, specifically examining how low-level circuit behavior and architectural design choices impact SRAM's data remanence and its implications for security. This dissertation not only identifies new vulnerabilities due to SRAM data remanence but also paves the way for novel security solutions in the ongoing "security arms race". I present an attack, Volt boot, that executes cold-boot-style short-term data remanence in on-chip SRAM without relying on temperature effects. This attack exploits the fact that SRAM's power bus is externally accessible and allows data retention using a simple voltage probe. Next, I present a steganography method that hides information in SRAM by exploiting long-term data remanence. This approach leverages aging-induced degradation to imprint data in SRAM's analog domain, ultimately resulting in hidden and plausibly deniable information storage in the hardware.
Finally, I show how an adversary weaponizes SRAM data remanence to develop an attack on a hardware-backed security isolation mechanism. The following provides a brief overview of the three major contributions of this thesis: 1. Volt boot is an attack that demonstrates the vulnerability of on-chip SRAM due to the physical separation common in modern SoCs' power distribution networks. By probing external power pins (to the cache) of an SoC while simultaneously shutting down the main system power, Volt boot creates data retention across power cycles. On-chip SRAM can be a safe memory when the threat model considers traditional off-chip cold-boot-style attacks. This research demonstrates an alternative method for preserving information in on-chip SRAM through power cycles, expanding our understanding of data retention capabilities. Volt boot leverages asymmetrical power states (e.g., on vs. off) to force SRAM state retention across power cycles, eliminating the need for traditional cold boot attack enablers, such as low temperature or intrinsic data retention time. 2. Invisible Bits is a hardware steganography technique that hides secret messages in the analog domain of SRAM embedded within a computing device. Exploiting accelerated transistor aging, Invisible Bits stores hidden data along with system data in an on-chip cache and provides a plausible deniability guarantee from statistical analysis. Aging changes the transistor's behavior, which I exploit to store data permanently (i.e., long-term data remanence) in an SRAM. Invisible Bits presents unique opportunities for safeguarding electronic devices when subjected to inspections by authorities. 3. UntrustZone utilizes long-term data remanence to exfiltrate secrets from on-chip SRAM. An attacker application must be able to read retained states in the SRAM upon power cycles, but this requires changing the security privilege.
Hardware security schemes, such as ARM TrustZone, erase a memory block before changing its security attributes and releasing it to other applications, making short-term data remanence attacks ineffective. That is, attacks such as Volt boot fail when hardware-backed isolation such as TEE is enforced. UntrustZone unveils a new threat to all forms of on-chip SRAM even when backed by hardware isolation: long-term data remanence. I show how an attacker systematically accelerates data imprinting on SRAM's analog domain to effectively burn in on-chip secrets and bypass TrustZone isolation.
- Chameleon Interference: Assessing Vulnerability of Magnetic Sensors to Spoofing and Signal injection attacks through Environmental interference in Mobile Devices
  Gleason, David Theodore (Virginia Tech, 2023-01-06)
  Embedded sensors are a fixture of most devices in the current computer industry. These small devices are used for a variety of purposes throughout many fields to collect whatever kind of information is needed by the user. From data on device acceleration to data on position relative to the Earth's magnetic field, embedded sensors can provide it for any number of tasks. The advent of these devices has made work and research in the computer industry significantly easier, but they are not without their drawbacks. Most of these sensors operate by drawing external data from the environment through send and receive signals. This mode of operation leaves them vulnerable to external malicious users who seek access to the data being stored and handled by the sensors. The security and privacy of embedded sensor data have become a topic of great concern with the continued digitization of sensitive personal data. Within the last five years, studies have shown the ability to manipulate embedded magnetic sensors in order to gain access to various forms of sensitive personal data. This is of great concern to the developers of mobile devices, as most mobile devices possess embedded magnetic sensors. The vulnerability of sensors to external influence raises concerns about both data privacy and the degradation of public trust in the ability of devices to keep personal information safe and out of the wrong hands. Degradation of public trust in security methodologies is a major concern to many in the research and tech industry, as much of the work conducted to advance both security and technology depends on large amounts of public data.
If the public loses trust in the ability of the devices used by researchers to protect and ensure the safety of the data provided to them, then they may stop providing data, which would make the work of researchers and other tech workers considerably more difficult. To address these concerns, this thesis will present an introduction to magnetic sensor devices (a prominent tool for data collection), how these sensors work, and the ways they handle data. We shall then examine the techniques used to interfere with the functioning and output of magnetic sensors employed by mobile devices. Finally, we shall examine existing techniques for defending against these kinds of attacks as well as propose potential new techniques. The end goal of this work is to provide a broader perspective on the nature of environmental/natural interference and its relationship to scientific study and technological advancement. Literature on this topic does exist; however, all existing works focus exclusively on one form of interference, i.e., light, which leads to a narrower perspective that this work seeks to remedy. The end result is meant to give a broader view of multiple forms of interference and their interrelations than current, narrowly focused perspectives allow.
- Circuit Support for Practical and Performant Batteryless Systems
  Williams, Harrison Ridgway (Virginia Tech, 2024-06-03)
  Tiny, ultra-low-power embedded processors enable sophisticated computing deployments in a myriad of areas previously off limits to computing power, ranging from intelligent medical implants to massive-scale 'smart dust'-type sensing deployments. While today's computing and sensing hardware is well-suited for these next-generation deployments, the batteries powering them are not: the size and weight of today's mobile and Internet-of-Things devices are dominated by their batteries, which also limit systems' lifespans and potential for deployment in sensitive contexts. Academic efforts have demonstrated the feasibility of harvesting energy on-demand from the environment as a practical alternative to classical battery power, instead buffering harvested energy in a capacitor to power intermittent bursts of operation. Energy harvesting circuits are miniaturizable, inexpensive, and enable effectively indefinite operation when compared to batteries, but introduce new problems stemming from the lack of a reliable power source. Unfortunately, these problems have so far confined batteryless systems to small-scale research deployments. The central design challenge for effective batteryless operation is efficiently using scarce input power from the energy harvesting frontend. Despite advances in both harvester and processor efficiency, digital systems often consume orders of magnitude more power than can be supplied by harvesting circuits, forcing systems to operate in short bursts punctuated by power failure and a long recharge period.
Today's batteryless systems pay a steep price to sustain operation across these common-case power losses: current platforms depend on high-performance non-volatile memory to quickly and efficiently checkpoint program state before power loss, limiting batteryless operation to a small selection of devices which integrate these novel memory technologies. Choosing exactly when to checkpoint to non-volatile memory represents a challenge in itself: the hardware required to detect impending power failure often represents a large proportion of the system's overall energy consumption, forcing designers to choose between the energy overhead of voltage monitoring or the runtime overhead of 'energy-oblivious' checkpointing models. Finally, the choice of buffer capacitor size has a large impact on overall energy efficiency, but the optimal choice depends on runtime energy dynamics which are difficult to predict at design time, leaving designers to make at best educated guesses about future environmental conditions. This work approaches energy harvesting system design from a circuits perspective, answering the following research questions towards practical and performant batteryless operation: 1. Can the emergent properties of today's low-power systems be used to enable efficient intermittent operation on new classes of devices? 2. What compromises can we make in voltage monitor design to minimize power consumption while maintaining just enough functionality for batteryless operation? 3. How can we buffer harvested energy in a way that maximizes energy efficiency despite unpredictable system-level power dynamics? This work answers these questions by producing the following research artifacts: 1. The first non-volatile memory invariant system to enable intermittent operation on embedded devices lacking high-performance memory (Chapter 2). 2. The first voltage monitoring circuit designed for batteryless systems to enable energy-aware operation without sacrificing efficiency (Chapter 3). 3. The first highly efficient power-adaptive energy buffer to store harvested energy without compromising on efficiency or performance (Chapter 4).
- Closure: Transforming Source Code for Faster Fuzzing
  Paterson, Ian G. (Virginia Tech, 2022-05-27)
  Fuzzing, the method of generating inputs to run on a target program while monitoring its execution, is a widely adopted and pragmatic methodology for bug hunting as a means of software hardening. Technical improvements in throughput have time and time again been shown to be critical to increasing the rate at which new bugs can be discovered. Persistent fuzzing, which keeps the fuzz target alive via looping, provides increased throughput at the cost of manually developing harnesses that account for invalid states and coverage of the program's code base, while relying on forking to reset the state accrued by looping over the same piece of code multiple times. Stale state can lead to wasted fuzzing efforts, as certain areas of code may be conditionally ignored due to a stale global. I propose Closure, a toolset which enables programs to run at persistent speeds while avoiding the downsides of stale state and other bottlenecks associated with persistent fuzzing.
- CLOSUREX: Transforming Source Code for Correct Persistent Fuzzing
  Ranjan, Rishi (Virginia Tech, 2024-05-29)
  Fuzzing is a popular technique which has been adopted for automated vulnerability research for software hardening. Research reveals that increasing fuzzing throughput directly increases bug discovery rate. Given that fuzzing revolves around executing a large number of test cases, test case execution rate is the dominant component of overall fuzzing throughput. To increase test case execution rate, researchers provide techniques that reduce the amount of time spent performing work that is independent of specific test case data. The highest performance approach is persistent fuzzing, which reuses a single process for all test cases by looping back to the start instead of exiting. This eliminates all process initialization and tear-down costs. Unfortunately, persistent fuzzing leads to semantically inconsistent program states because process state changes from one test case remain for subsequent test cases. This semantic inconsistency results in both missed crashes and false crashes, undermining fuzzing effectiveness. I observe that existing fuzzing execution mechanisms exist on a continuum, based on the amount of state that gets discarded and restored between test cases. I present a fuzzing execution mechanism that sits at a new spot on this state restoration continuum, where only test-case-execution-specific state is reset. This fine-grained state restoration provides near-persistent performance with the correctness of heavyweight state restoration. I construct CLOSUREX as a set of LLVM compiler passes that integrate with AFL++. Our evaluation on ten popular open-source fuzzing targets shows that CLOSUREX maintains semantic correctness all while increasing test case execution rate by over 3.5x, on average, compared to AFL++. CLOSUREX also finds bugs more consistently and 1.9x faster than AFL++, with CLOSUREX discovering 15 0-day bugs (4 CVEs).
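The stale-state problem described in the two fuzzing abstracts above can be illustrated with a toy example. The sketch below is not CLOSUREX or Closure; the target function, the `seen_header` global, and the `reset_globals` helper are all hypothetical, and the reset is written by hand rather than generated by compiler passes. It only shows why a persistent loop can report a result that a fresh process never would.

```python
# Hypothetical toy target: a parser with a global flag that is never reset.
seen_header = False  # global state that survives across test cases


def target(data: bytes) -> str:
    """Toy fuzz target: accepts a body only after a header was seen."""
    global seen_header
    if data.startswith(b"HDR"):
        seen_header = True
        return "header"
    if seen_header:
        return "body accepted"
    return "rejected"


def reset_globals() -> None:
    """Restore the global state the target mutates (assumed known here)."""
    global seen_header
    seen_header = False


def run_persistent(inputs, restore):
    """Run all inputs in one process, optionally restoring state between them."""
    reset_globals()
    results = []
    for data in inputs:
        results.append(target(data))
        if restore:
            reset_globals()
    return results


inputs = [b"HDR1", b"payload"]
stale = run_persistent(inputs, restore=False)
fresh = run_persistent(inputs, restore=True)
print(stale)  # ['header', 'body accepted']
print(fresh)  # ['header', 'rejected']
```

Without restoration, the second test case is accepted only because the first one left `seen_header` set; a fresh process given `b"payload"` alone would reject it. Resetting just the state that a test case touched recovers the fresh-process result without restarting the process.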
- Compiler Support for Long-life, Low-overhead Intermittent Computation on Energy Harvesting Flash-based Devices
  Ahmad, Saim (Virginia Tech, 2021-05-19)
  With the advent of energy harvesters, supporting fast and efficient computation on energy harvesting devices has become a key challenge in the field of energy harvesting on ubiquitous devices. Computation on energy harvesting devices is equivalent to spreading the execution time of a long-running application over short, frequent cycles of power. However, we must ensure that intermittently executing an application produces results congruent to those produced by executing the application on a device with a continuous source of power. The current state-of-the-art systems that enable intermittent computation on energy harvesters make use of novel compiler analysis techniques as well as on-board hardware on devices to measure the energy remaining for useful computation. However, currently available programming models, which mostly target devices with FRAM as the NVM, would cause failure on devices that employ Flash as the primary NVM, thereby resulting in a non-universal solution that is restricted by the choice of NVM. This is primarily the result of Flash's limited read/write endurance. This research aims to contribute to the world of energy harvesting devices by providing solutions that would enable intermittent computation regardless of the choice of NVM on a device by utilizing only the SRAM to save state and perform computation. Utilizing the SRAM further reduces run-time overhead, as SRAM reads/writes are less costly than NVM reads/writes. Our proposed solutions rely on programmer guidance and compiler analysis to achieve correct and efficient intermittent computation. We then extend our system to provide a complete compiler-based solution without programmer intervention.
Our system is able to run applications that would otherwise render any device with Flash as NVM useless in a matter of hours.
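The correctness requirement stated above, that intermittent execution must produce the same result as continuous execution, can be sketched with a generic checkpoint-and-resume loop. This is an illustration under assumptions, not the thesis's system: the `PowerFailure` exception, the `checkpoint` dictionary (standing in for any memory that survives a power loss), and the assumption that each checkpoint update is atomic are all hypothetical.

```python
class PowerFailure(Exception):
    """Simulated loss of power."""


def sum_squares(n, checkpoint, fail_at=None):
    """Sum i*i for i < n, resuming from `checkpoint`; optionally fail once."""
    i = checkpoint.get("i", 0)
    total = checkpoint.get("total", 0)
    while i < n:
        if fail_at is not None and i == fail_at:
            raise PowerFailure
        total += i * i
        i += 1
        # Commit progress after each step (assumed atomic in this sketch).
        checkpoint.update(i=i, total=total)
    return total


def run_intermittently(n, failures):
    """Re-run the task after each simulated power failure until it finishes."""
    checkpoint = {}  # stands in for memory that survives power loss
    pending = list(failures)
    while True:
        try:
            return sum_squares(n, checkpoint, pending[0] if pending else None)
        except PowerFailure:
            pending.pop(0)  # power returns; resume from the last checkpoint


continuous = sum(i * i for i in range(10))
intermittent = run_intermittently(10, failures=[3, 7])
print(continuous, intermittent)  # 285 285
```

Checkpointing after every iteration is the simplest policy that keeps the result congruent; real systems trade checkpoint frequency against overhead and memory wear, which is the tension the abstract describes.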
- Detecting Hidden Wireless Cameras through Network Traffic Analysis
  Cowan, KC Kaye (Virginia Tech, 2020-10-02)
  Wireless cameras dominate the home surveillance market, providing an additional layer of security for homeowners. Cameras are not limited to private residences; retail stores, public bathrooms, and public beaches represent only some of the possible locations where wireless cameras may be monitoring people's movements. When cameras are deployed into an environment, one would typically expect the user to disclose the presence of the camera as well as its location, which should be outside of a private area. However, adversarial camera users may withhold information and prevent others from discovering the camera, forcing others to determine on their own whether they are being recorded. To uncover hidden cameras, a wireless camera detection system must be developed that will recognize the camera's network traffic characteristics. We monitor the network traffic within the immediate area using a separately developed packet sniffer, a program that observes and collects information about network packets. We analyze and classify these packets based on how well their patterns and features match those expected of a wireless camera. Using a Support Vector Machine classifier and a secondary level of classification to reduce false positives, we design and implement a system that uncovers the presence of hidden wireless cameras within an area.
- A Development Platform to Evaluate UAV Runtime Verification Through Hardware-in-the-loop Simulation
  Rafeeq, Akhil Ahmed (Virginia Tech, 2020-06-17)
  The popularity of and demand for safe autonomous vehicles are on the rise. Advances in semiconductor technology have led to the integration of a wide range of sensors with high-performance computers, all onboard the autonomous vehicles. The complexity of the software controlling the vehicles has also seen steady growth in recent years. Verifying the control software using traditional verification techniques is difficult and thus raises safety concerns. Runtime verification is an efficient technique to ensure the autonomous vehicle's actions are limited to a set of acceptable behaviors that are deemed safe. The acceptable behaviors are formally described in linear temporal logic (LTL) specifications. The sensor data is actively monitored to verify its adherence to the LTL specifications using monitors. Corrective action is taken if a violation of a specification is found. An unmanned aerial vehicle (UAV) development platform is proposed for the validation of monitors on configurable hardware. A high-fidelity simulator is used to emulate the UAV and the virtual environment, thereby eliminating the need for a real UAV. The platform interfaces the emulated UAV with monitors implemented on configurable hardware and autopilot software running on a flight controller. The proposed platform allows the implementation of monitors in an isolated and scalable manner. Scenarios violating the LTL specifications can be generated in the simulator to validate the functioning of the monitors.
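To make the idea of monitoring sensor data against an LTL specification concrete, here is a minimal software sketch of a monitor for a safety property of the form G(p), "p always holds". It is an assumption-laden illustration: the thesis implements monitors on configurable hardware, whereas the `AlwaysMonitor` class, the `altitude_m` field, and the 120 m limit below are hypothetical.

```python
class AlwaysMonitor:
    """Runtime monitor for the LTL safety property G(predicate)."""

    def __init__(self, predicate):
        self.predicate = predicate
        self.step = 0
        self.violated_at = None  # index of the first violating sample

    def observe(self, sample) -> bool:
        """Feed one sample; return True while the property still holds."""
        if self.violated_at is None and not self.predicate(sample):
            self.violated_at = self.step
        self.step += 1
        return self.violated_at is None


MAX_ALTITUDE_M = 120.0  # hypothetical limit, chosen only for the example

monitor = AlwaysMonitor(lambda s: s["altitude_m"] <= MAX_ALTITUDE_M)
trace = [{"altitude_m": a} for a in (10.0, 80.0, 119.5, 121.0, 90.0)]
verdicts = [monitor.observe(sample) for sample in trace]
print(verdicts)             # [True, True, True, False, False]
print(monitor.violated_at)  # 3
```

Once a safety property is violated on a finite trace it stays violated, which is why the verdict remains `False` even after the altitude drops back below the limit. A real deployment would trigger the corrective action at the first `False`.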
- Energy-Adaptive Buffering for Efficient, Responsive, and Persistent Batteryless Systems
  Williams, Harrison; Hicks, Matthew (ACM, 2024-04-27)
  Batteryless energy harvesting systems enable a wide array of new sensing, computation, and communication platforms untethered by power delivery or battery maintenance demands. Energy harvesters charge a buffer capacitor from an unreliable environmental source until enough energy is stored to guarantee a burst of operation despite changes in power input. Current platforms use a fixed-size buffer chosen at design time to meet constraints on charge time or application longevity, but static energy buffers are a poor fit for the highly volatile power sources found in real-world deployments: fixed buffers waste energy both as heat when they reach capacity during a power surplus and as leakage when they fail to charge the system during a power deficit. To maximize batteryless system performance in the face of highly dynamic input power, we propose REACT: a responsive buffering circuit which varies total capacitance according to net input power. REACT uses a variable capacitor bank to expand capacitance to capture incoming energy during a power surplus and reconfigures internal capacitors to reclaim additional energy from each capacitor as power input falls. Compared to fixed-capacity systems, REACT captures more energy, maximizes usable energy, and efficiently decouples system voltage from stored charge, enabling low-power and high-performance designs previously limited by ambient power. Our evaluation on real-world platforms shows that REACT eliminates the tradeoff between responsiveness, efficiency, and longevity, increasing the energy available for useful work by an average of 25.6% over static buffers optimized for reactivity and capacity, improving event responsiveness by an average of 7.7x without sacrificing capacity, and enabling programmer-directed longevity guarantees.
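The claim that reconfiguring capacitors can reclaim otherwise stranded energy follows from the capacitor energy formula E = ½CV². The sketch below works through one idealized case and is not a model of the REACT circuit: the component values, the minimum operating voltage, and the lossless parallel-to-series switch are all assumptions made for illustration.

```python
import math


def stored_energy(capacitance_f: float, voltage_v: float) -> float:
    """Energy in joules held by an ideal capacitor: E = 1/2 * C * V^2."""
    return 0.5 * capacitance_f * voltage_v ** 2


def usable_energy(capacitance_f: float, voltage_v: float, v_min: float) -> float:
    """Energy extractable before the voltage falls to the system minimum."""
    return max(0.0, stored_energy(capacitance_f, voltage_v)
               - stored_energy(capacitance_f, v_min))


C_UNIT = 100e-6  # hypothetical: two identical 100 uF capacitors
V_MIN = 1.8      # hypothetical minimum operating voltage of the load

# Two capacitors in parallel, discharged down to V_MIN: the load browns out
# even though energy is still stored in the bank.
parallel_c, parallel_v = 2 * C_UNIT, V_MIN
stranded = stored_energy(parallel_c, parallel_v)

# Idealized lossless switch to series: capacitance quarters, voltage doubles.
series_c, series_v = C_UNIT / 2, 2 * V_MIN
reclaimable = usable_energy(series_c, series_v, V_MIN)

print(f"{stranded * 1e6:.0f} uJ stranded in the parallel bank")
print(f"{reclaimable * 1e6:.0f} uJ usable after switching to series")
```

In this idealized case the stored energy is unchanged by the switch, but three quarters of it moves back above the operating threshold. Real switching networks incur losses that this sketch ignores.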
- Exploiting Update Leakage in Searchable Symmetric Encryption
  Haltiwanger, Jacob Sayid (Virginia Tech, 2024-03-15)
  Dynamic Searchable Symmetric Encryption (DSSE) provides efficient techniques for securely searching and updating an encrypted database. However, efficient DSSE schemes leak some sensitive information to the server. Recent works have implemented forward and backward privacy as security properties to reduce the amount of information leaked during update operations. Many attacks have shown that leakage from search operations can be abused to compromise the privacy of client queries. However, the attack literature has not rigorously investigated techniques to abuse update leakage. In this work, we investigate update leakage under DSSE schemes with forward and backward privacy from the perspective of a passive adversary. We propose two attacks based on a maximum likelihood estimation approach, the UFID Attack and the UF Attack, which target forward-private DSSE schemes with no backward privacy and Level 2 backward privacy, respectively. These are the first attacks to show that it is possible to leverage the frequency and contents of updates to recover client queries. We propose a variant of each attack which allows the update leakage to be combined with search pattern leakage to achieve higher accuracy. We evaluate our attacks against a real-world dataset and show that using update leakage can improve the accuracy of attacks against DSSE schemes, especially those without backward privacy.
- FiniteFuzz: Finite State Machine Fuzzer For Industrial Control IoT Devices
  Kaur, Jaskaran (Virginia Tech, 2023-07-03)
  Automated software testing techniques have become increasingly popular in recent years, with fuzzing being one of the most prevalent approaches. However, fuzzing Finite State Machines (FSMs) poses a significant challenge due to state and input dependency, resulting in exponential exploration time required to unlock the Finite State Machine. To address this issue, we present a novel approach in this research paper by introducing FINITEFUZZ, a grey-box fuzzer explicitly designed to fuzz Finite State Machines. Unlike black-box fuzzers, FINITEFUZZ employs a mutational technique that utilizes feedback to steer the fuzzing process. FINITEFUZZ takes a random set of states, compares them with the desired FSM, and records the states that increase the coverage of the Finite State Machine. The next seed incorporates the feedback received from all the previous seed inputs. This avoids exploring the same path multiple times and results in linear performance for all types of Finite State Machines. Our findings reveal that the use of FINITEFUZZ significantly reduces the exploration time required to uncover each state of the machine, making it a promising solution for generating Finite State Machines. We tested FINITEFUZZ on 4 different types of Finite State Machines, with each scenario resulting in at least a 5X performance improvement in FSM generation. The potential applications of FSMs are vast, and our research suggests that the proposed approach can be used to generate any type of Finite State Machine.
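The contrast between exponential black-box exploration and feedback-guided exploration can be sketched on a toy state machine. This is not the FINITEFUZZ implementation: the `SECRET` transition string, the reset-on-wrong-input rule, and the exhaustive (rather than random) candidate ordering are assumptions chosen to keep the example deterministic.

```python
from itertools import product

ALPHABET = "abcd"
SECRET = "cadb"  # hypothetical: symbol needed to advance from state i


def run_fsm(inputs: str) -> int:
    """Return the deepest state reached; a wrong symbol resets to state 0."""
    state = deepest = 0
    for sym in inputs:
        state = state + 1 if state < len(SECRET) and sym == SECRET[state] else 0
        deepest = max(deepest, state)
    return deepest


def guided_fuzz():
    """Grey-box style: keep any input that unlocks a new state as the seed."""
    seed, depth, executions = "", 0, 0
    while depth < len(SECRET):
        for sym in ALPHABET:
            candidate = seed + sym
            executions += 1
            reached = run_fsm(candidate)
            if reached > depth:  # coverage feedback: new state unlocked
                seed, depth = candidate, reached
                break
    return seed, executions


def blackbox_fuzz():
    """Black-box style: try whole sequences with no intermediate feedback."""
    executions = 0
    for candidate in product(ALPHABET, repeat=len(SECRET)):
        executions += 1
        if run_fsm("".join(candidate)) == len(SECRET):
            return "".join(candidate), executions
    return None, executions


print(guided_fuzz())    # ('cadb', 10)
print(blackbox_fuzz())  # ('cadb', 142)
```

With feedback, the cost grows with the number of states times the alphabet size; without it, the search space is the alphabet size raised to the number of states. That difference is the linear-versus-exponential behavior the abstract refers to.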
- Gurthang - A Fuzzing Framework for Concurrent Network Servers
  Shugg, Connor William (Virginia Tech, 2022-06-13)
  The emergence of Internet-connected technologies has given the world a vast number of services easily reachable from our computers and mobile devices. Web servers are one of the dominant types of computer programs that provide these services to the world by serving files and computations to connected users. Because of their accessibility and importance, web servers must be robust to avoid exploitation by hackers and other malicious users. Fuzzing is a software testing technique that seeks to discover bugs in computer programs in an automated fashion. However, most state-of-the-art fuzzing tools (fuzzers) are incapable of fuzzing web servers effectively, due to the servers' reliance on network connections to receive input and other unique constraints they follow. Past research has sought to remedy this situation, and while it has had success, it introduces certain drawbacks in the process. To address this, we created Gurthang, a fuzzing framework that gives state-of-the-art fuzzers the ability to fuzz web servers easily, without having to modify source code or the web server's threading model, or fundamentally change the way a server behaves. We introduce novelty by providing the ability to establish and send data across multiple concurrent connections to the target web server in a single execution of a fuzzing campaign, thus opening the door to the discovery of concurrency-related bugs. We accomplish this through a novel file format and two shared libraries that harness existing state-of-the-art fuzzers. We evaluated Gurthang by performing a research study at Virginia Tech that yielded 48 discovered bugs among 55 web servers written by students. Participants utilized Gurthang to integrate fuzzing into their software development process and discover bugs. In addition, we evaluated Gurthang against Apache and Nginx, two real-world web servers.
We did not discover any bugs on Apache or Nginx, but Gurthang successfully enabled us to fuzz them without needing to modify their source code. Our evaluations show Gurthang is capable of performing fuzz-testing on web servers and discovering real bugs.
- Improving Operating System Security, Reliability, and Performance through Intra-Unikernel Isolation, Asynchronous Out-of-kernel IPC, and Advanced System Servers
  Sung, Mincheol (Virginia Tech, 2023-03-28)
  Computer systems are vulnerable to security exploits, and the security of the operating system (OS) is crucial as it is often a trusted entity that applications rely on. Traditional OSs have a monolithic design where all components are executed in a single privilege layer, but this design is increasingly inadequate as OS code sizes have become larger and expose a large attack surface. Microkernel OSs and multiserver OSs improve security and reliability through isolation, but they come at a performance cost due to crossing privilege layers through IPCs, system calls, and mode switches. Library OSs, on the other hand, implement kernel components as libraries, which avoids crossing privilege layers in performance-critical paths and thereby improves performance. Unikernels are a specialized form of library OSs that consist of a single application compiled with the necessary kernel components, and execute in a single address space, usually atop a hypervisor for strong isolation. Unikernels have recently gained popularity in various application domains due to their better performance and security. Although unikernels offer strong isolation between each instance due to virtualization, there is no isolation within a unikernel. Since the model eliminates the traditional separation between kernel and user parts of the address space, the subversion of a kernel or application component will result in the subversion of the entire unikernel. Thus, a unikernel must be viewed as a single unit of trust, reducing security.
The dissertation's first contribution is intra-unikernel isolation: we use Intel's Memory Protection Keys (MPK) primitive to provide per-thread permission control over groups of virtual memory pages within a unikernel's single address space, allowing different areas of the address space to be isolated from each other. We implement our mechanisms in RustyHermit, a unikernel written in Rust. Our evaluations show that the mechanisms have low overhead and retain unikernel's low system call latency property: 0.6% slowdown on applications including memory/compute intensive benchmarks as well as micro-benchmarks. Multiserver OS, a type of microkernel OS, has high parallelism potential due to its inherent compartmentalization. However, the model suffers from inferior performance. This is due to inter-process communication (IPC) client-server crossings that require context switches for single-core systems, which are more expensive than traditional system calls; on multi-core systems (now ubiquitous), they have poor resource utilization. The dissertation's second contribution is Aoki, a new approach to IPC design for microkernel OSs. Aoki incorporates non-blocking concurrency techniques to eliminate in-kernel blocking synchronization which causes performance challenges for state-of-the-art microkernels. Aoki's non-blocking (i.e., lock-free and wait-free) IPC design not only improves performance and scalability, but also enhances reliability by preventing thread starvation. In a multiserver OS setting, the design also enables the reconnection of stateful servers after failure without loss of IPC states. Aoki solves two problems that have plagued previous microkernel IPC designs: reducing excessive transitions between user and kernel modes and enabling efficient recovery from failures. We implement Aoki in the state-of-the-art seL4 microkernel. 
Results from our experiments show that Aoki outperforms the baseline seL4 in both fastpath IPC and cross-core IPC, with improvements of 2.4x and 20x, respectively. The Aoki IPC design enables the design of system servers for multiserver OSs with higher performance and reliability. The dissertation's third and final contribution is the design of a fault-tolerant storage server and a copy-free file system server. We build both servers using NetBSD OS's rumprun unikernel, which provides robust isolation through hardware virtualization and can handle a wide range of storage devices, including NVMe. Both servers communicate with client applications using Aoki's IPC design, which yields scalable IPC. In the case of the storage server, the IPC also enables the server to transparently recover from server failures and reconnect to client applications, with no loss of IPC state and no significant overhead. In the copy-free file system server's design, applications grant the server direct memory access to file I/O data buffers for high performance. The performance problems solved in the server designs have challenged all prior multiserver/microkernel OSs. Our evaluations show that both servers perform comparably to Linux and the rumprun baseline.
- Invisible Bits: Hiding Secret Messages in SRAM's Analog Domain Mahmod, Jubayer; Hicks, Matthew (ACM, 2022-02-28) Electronic devices are increasingly the subject of inspection by authorities. While encryption hides secret messages, it does not hide the transmission of those secret messages - in fact, it calls attention to them. Thus, an adversary, seeing encrypted data, turns to coercion to extract the credentials required to reveal the secret message. Steganographic techniques hide secret messages in plain sight, providing the user with plausible deniability and removing the threat of coercion. This paper unveils Invisible Bits, a new steganographic technique that hides secret messages in the analog domain of Static Random Access Memory (SRAM) embedded within a computing device. Unlike other memory technologies, the power-on state of SRAM reveals the analog-domain properties of its individual cells. We show how to quickly and systematically change the analog-domain properties of SRAM cells to encode data in the analog domain and how to reveal those changes by capturing SRAM’s power-on state. Experiments with commercial devices show that Invisible Bits provides over 90% capacity - two orders of magnitude more than previous on-chip steganographic approaches - while retaining device functionality, even when the device undergoes subsequent normal operation or is shelved for months. Experiments also show that adversaries cannot differentiate between devices with encoded messages and those without. Lastly, we show how to layer encryption and error correction on top of our message encoding scheme in an end-to-end demonstration.
- Linux Kernel Module Continuous Address Space Re-Randomization Nadeem, Muhammad Hassan (Virginia Tech, 2020-02-28) Address space layout randomization (ASLR) is a technique employed to prevent exploitation of memory corruption vulnerabilities in user-space programs. While this technique is widely studied, its kernel-space counterpart, known as kernel address space layout randomization (KASLR), has received less attention in the research community. KASLR, as it is implemented today, is limited in randomization entropy. Specifically, the kernel image and its modules can only be randomized within a narrow 1GB range. Moreover, KASLR does not protect against memory disclosure vulnerabilities, the presence of which reduces or completely eliminates the benefits of KASLR. In this thesis, we make two major contributions. First, we add support for position-independent kernel modules to Linux so that the modules can be placed anywhere in the 64-bit virtual address space and at any distance apart from each other. Second, we enable continuous KASLR re-randomization for Linux kernel modules by leveraging the position-independent model. Both contributions increase the entropy and reduce the chance of successful ROP attacks. Since prior art tackles only user-space programs, we also solve a number of challenges unique to kernel code. Our experimental evaluation shows that the overhead of position-independent code is very low. Likewise, the cost of re-randomization is small even at very high re-randomization frequencies.
- Memory Turbo Boost: Architectural Support for Using Unused Memory for Memory Replication to Boost Server Memory Performance Zhang, Da (Virginia Tech, 2023-06-28) A significant portion of the memory in servers today is often unused. Our large-scale study of HPC systems finds that more than half of the total memory in active nodes running user jobs is unused for 88% of the time. Google and Azure Cloud studies also report that unused memory accounts for 40% of the total memory in their servers, on average. Leaving so much memory unused is wasteful. To address this problem, we note that in the context of CPUs, Turbo Boost can turn off the unused cores to boost the performance of in-use cores. However, there is no equivalent technology in the context of memory; no matter how much memory is unused, the performance of in-use memory remains the same. This dissertation explores architectural techniques that utilize the unused memory to boost the performance of in-use memory and refers to them collectively as Memory Turbo Boost. Specifically, it explores how to turbo boost memory performance through memory replication: how to efficiently store the replicas in the unused memory, and multiple architectural techniques that use the replicas to enhance memory system performance. Performance simulations show that Memory Turbo Boost can improve node-level performance by 18%, on average, across a wide spectrum of workloads. Our system-wide simulations show that applying Memory Turbo Boost to an HPC system provides a 1.4x average speedup on job turnaround time.
- Message Authentication Codes On Ultra-Low SWaP Devices Liao, Che-Hsien (Virginia Tech, 2022-05-27) This thesis focuses on a specific class of cryptographic algorithms, Message Authentication Codes (MACs), running on ultra-low SWaP devices. The MACs we use are the hash-based message authentication code (HMAC) and the cipher-block-chaining message authentication code (CBC-MAC). The most important property of ultra-low SWaP devices is their energy usage. This thesis measures the execution times of different implementations on ultra-low SWaP devices, so that we can understand which implementation is suitable for a specific device. To explain the algorithms used, this thesis briefly introduces HMAC and CBC-MAC at a high level, including their usage and advantages. The research method is empirical: this thesis determines the execution times of different implementations. These two algorithms (HMAC and CBC-MAC) contain three implementations. The results come from running those implementations on the devices we used.
- Neural Cryptanalysis for Cyber-Physical System Ciphers Meno, Emma Margaret (Virginia Tech, 2021-05-18) A key cryptographic research interest is developing an automatic, black-box method to provide a relative security strength measure for symmetric ciphers, particularly for proprietary cyber-physical systems (CPS) and lightweight block ciphers. This thesis extends the recently developed neural cryptanalysis method, which trains neural networks on a set of plaintext/ciphertext pairs to extract meaningful bitwise relationships and predict corresponding ciphertexts given a set of plaintexts. As opposed to traditional cryptanalysis, the goal is not key recovery but achieving a mimic accuracy greater than a defined base match rate. In addition to reproducing tests run with the Data Encryption Standard, this work applies neural cryptanalysis to round-reduced versions and components of the SIMON/SPECK family of block ciphers and the Advanced Encryption Standard. This methodology generated a metric able to rank the relative strengths of rounds for each cipher as well as algorithmic components within these ciphers. Given the current neural network suite tested, neural cryptanalysis is best suited for analyzing components of ciphers rather than full encryption models. If these models are improved, this method holds promise for measuring the strength of lightweight symmetric ciphers, particularly for CPS.
- Neural Network-based Methodologies for Securing Cryptographic Code Xiao, Ya (Virginia Tech, 2022-08-17) Many studies show that manual code generation is error-prone and results in vulnerabilities. Vulnerability fixing has been shown to be the most time-consuming of the multiple steps of code repair. To help developers repair these security vulnerabilities, my dissertation aims to develop an automatic or semi-automatic secure code generation system with neural-network-based approaches. Trained with huge amounts of good-quality code, I expect the neural network to learn secure usage and produce correct code suggestions. Despite the great success of neural networks, the vision of comprehending and generating programming languages through neural networks has not been fully realized. There are many fundamental questions that need to be answered. These questions include 1) what are the accuracy impacts of the various choices in code embedding? and 2) how can the accuracy challenges caused by programming-language-specific properties be addressed in the task of secure code suggestion? My dissertation work answers the two questions with a systematic measurement study and specialized neural network designs. My experiments show that program analysis is a necessary preprocessing step to guide the code embedding, resulting in a 36.1% accuracy improvement. Furthermore, I identify two previously unreported deficiencies in the cryptographic API suggestion task. To close the gap, I invent a highly accurate API method suggestion solution, referred to as Multi-HyLSTM, with specialized neural network designs to recognize unique programming language characteristics. My work points out the important differences between natural languages and programming languages, which pure data-driven learning approaches may not recognize.