Browsing by Author "Williams, Dan"
Now showing 1 - 12 of 12
Results Per Page
Sort Options
- Broadly Enabling KLEE to Effortlessly Find Unrecoverable Errors in RustZhang, Ying; Li, Peng; Ding, Yu; Wang, Lingxiang; Williams, Dan; Meng, Na (ACM, 2024)Rust is a general-purpose programming language designed for performance and safety. Unrecoverable errors (e.g., Divide by Zero) in Rust programs are critical, as they signal bad program states and terminate programs abruptly. Previous work has contributed to utilizing KLEE, a dynamic symbolic test engine, to verify the program would not panic. However, it is difficult for engineers who lack domain expertise to write test code correctly. Besides, the effectiveness of KLEE in finding panics in production Rust code has not been evaluated. We created an approach, called PanicCheck, to hide the complexity of verifying Rust programs with KLEE. Using PanicCheck, engineers only need to annotate the function-to-verify with #[panic_check]. The annotation guides PanicCheck to generate test code, compile the function together with tests, and execute KLEE for verification. After applying PanicCheck to 21 open-source and 2 closed-source projects, we found 61 test inputs that triggered panics; 59 of the 61 panics have been addressed by developers so far. Our research shows promising verification results by KLEE, while revealing technical challenges in using KLEE. Our experience will shed light on future practice and research in program verification.
- Eliminating eBPF Tracing Overhead on Untraced ProcessesCraun, Milo; Hussain, Khizar; Gautam, Uddhav; Ji, Zhengjie; Rao, Tanuj; Williams, Dan (ACM, 2024-08-04)Current eBPF-based kernel extensions affect entire systems, and are coarse-grained. For some use cases, like tracing, operators are more interested in tracing a subset of processes (e.g., belonging to a container) rather than all processes. While overhead from tracing is expected for targeted processes, we find untraced processes—those that are not the target of tracing—also incur performance overhead. To better understand this overhead, we identify and explore three techniques for per-process filtering for eBPF: post-eBPF, in-eBPF, and pre-eBPF filtering, finding that all three approaches result in excessive overhead on untraced processes. Finally, we propose a system that allows for zero-untraced-overhead per-process eBPF tracing by modifying kernel virtual memory mappings to present per-process kernel views, effectively enabling untraced processes to execute on the kernel as if no eBPF programs are attached.
- Enabling BPF Runtime policies for better BPF managementSahu, Raj; Williams, Dan (ACM, 2023-09-10)As eBPF increasingly and rapidly gains popularity for observability, performance, troubleshooting, and security in production environments, a problem is emerging around how to manage the multitude of BPF programs installed into the kernel. Operators of distributed systems are already beginning to use BPF-orchestration frameworks with which they can set load and access policies for who can load BPF programs and access their resultant data. However, other than a guarantee of eventual termination, operators currently have little to no visibility into the runtime characteristics of BPF programs and thus cannot set policies that ensure their systems still meet crucial performance targets when instrumented with BPF programs. In this paper, we propose that having a runtime estimate will enable better policies that will govern the allowed latency in critical paths. Our key insight is to leverage the existing architecture within the verifier to statically track the runtime cost of all possible branches. Along with dynamically determined runtime estimates for helper functions and knowledge of loop-based helpers’ effects on control flow, we generate an accurate—although broad—range estimate for making runtime policy decisions. We further discuss some of the limitations of this approach, particularly in the case of broad estimate ranges as well as complementary tools for BPF runtime management.
- Enabling eBPF on Embedded Systems Through Decoupled VerificationCraun, Milo; Oswald, Adam; Williams, Dan (ACM, 2023-09-10)eBPF (Extended Berkeley Packet Filter) is a Linux kernel subsystem that aims to allow developers to write safe and efficient kernel extensions by employing an in-kernel verifier and just-in-time compiler (JIT). We find that verification is prohibitively expensive for resource-constrained embedded systems. To solve this we describe a system that allows for verification to occur outside of the embedded kernel and before BPF program load time. The in-kernel verifier and JIT are coupled so they must be decoupled together. A designated verifier kernel accepts a BPF program, then verifies, compiles, and signs a native precompiled executable. The executable can then be loaded onto an embedded device without needing the verifier and JIT on the embedded device. Decoupling verification and JIT from load-time opens the door to much more than running BPF programs on embedded devices. It allows larger and more expressive BPF programs to be verified, provides a way for new approaches to verification to be used without extensive kernel modification and creates the possibility for BPF program verification as a service.
- Introduction to the Special Section on USENIX ATC 2023Williams, Dan; Lawall, Julia (ACM, 2024-02-19)
- KASLR in the age of MicroVMsHolmes, Benjamin; Waterman, Jason; Williams, Dan (ACM, 2022-03-28)Address space layout randomization (ASLR) is a widely used component of computer security aimed at preventing code reuse and/or data-only attacks. Modern kernels utilize kernel ASLR (KASLR) and finer-grained forms, such as functional granular KASLR (FGKASLR), but do so as part of an inefficient bootstrapping process we call bootstrap selfrandomization. Meanwhile, under increasing pressure to optimize their boot times, microVM architectures such as AWS Firecracker have resorted to eliminating bootstrapping steps, particularly decompression and relocation from the guest kernel boot process, leaving them without KASLR. In this paper, we present in-monitor KASLR, in which the virtual machine monitor efficiently implements KASLR for the guest kernel by skipping the expensive kernel self-relocation steps. We prototype in-monitor KASLR and FGKASLR in the opensource Firecracker virtual machine monitor demonstrating, on a microVM configured kernel, boot times 22% and 16% faster than bootstrapped KASLR and FGKASLR methods, respectively. We also show the low overhead of in-monitor KASLR, with only 4% (2 ms) increase in boot times on average compared to a kernel without KASLR. We also discuss the implications and future opportunities for in-monitor approaches.
- Kernel extension verification is untenableJia, Jinghao; Sahu, Raj; Oswald, Adam; Williams, Dan; Le, Michael V.; Xu, Tianyin (ACM, 2023-06-22)The emergence of verified eBPF bytecode is ushering in a new era of safe kernel extensions. In this paper, we argue that eBPF’s verifier—the source of its safety guarantees—has become a liability. In addition to the well-known bugs and vulnerabilities stemming from the complexity and ad hoc nature of the in-kernel verifier, we highlight a concerning trend in which escape hatches to unsafe kernel functions (in the form of helper functions) are being introduced to bypass verifier-imposed limitations on expressiveness, unfortunately also bypassing its safety guarantees. We propose safe kernel extension frameworks using a balance of not just static but also lightweight runtime techniques. We describe a design centered around kernel extensions in safe Rust that will eliminate the need of the in-kernel verifier, improve expressiveness, allow for reduced escape hatches, and ultimately improve the safety of kernel extensions.
- Practical and Flexible Kernel CFI Enforcement using eBPFJia, Jinghao; Le, Michael V.; Ahmed, Salman; Williams, Dan; Jamjoom, Hani (ACM, 2023-09-10)Enforcing control flow integrity (CFI) in the kernel (kCFI) can prevent control-flow hijack attacks. Unfortunately, current kCFI approaches have high overhead or are inflexible and cannot support complex context-sensitive policies. To overcome these limitations, we propose a kCFI approach that makes use of eBPF (eKCFI) as the enforcement mechanism. The focus of this work is to demonstrate through implementation optimizations how to overcome the enormous performance overhead of this approach, thereby enabling the potential benefits with only modest performance tradeoffs.
- Protect the System Call, Protect (Most of) the World with BASTIONJelesnianski, Christopher; Ismail, Mohannad; Jang, Yeongjin; Williams, Dan; Min, Changwoo (ACM, 2023-03-25)System calls are a critical building block in many serious security attacks, such as control-flow hijacking and privilege escalation attacks. Security-sensitive system calls (e.g., execve, mprotect), especially play a major role in completing attacks. Yet, few defense efforts focus to ensure their legitimate usage, allowing attackers to maliciously leverage system calls in attacks. In this paper, we propose a novel System Call Integrity, which enforces the correct use of system calls throughout runtime. We propose three new contexts enforcing (1) which system call is called and how it is invoked (Call Type), (2) how a system call is reached (Control Flow), and (3) that arguments are not corrupted (Argument Integrity). Our defense mechanism thwarts attacks by breaking the critical building block in their attack chains. We implement Bastion, as a compiler and runtime monitor system, to demonstrate the efficacy of the three system call contexts. Our security case study shows that Bastion can effectively stop all the attacks including real-world exploits and recent advanced attack strategies. Deploying Bastion on three popular system callintensive programs, NGINX, SQLite, and vsFTPd, we show Bastion is secure and practical, demonstrating overhead of 0.60%, 2.01%, and 1.65%, respectively.
- Securing Container-based Clouds with Syscall-aware SchedulingLe, Michael V.; Ahmed, Salman; Williams, Dan; Jamjoom, Hani (ACM, 2023-07-10)Container-based clouds—in which containers are the basic unit of isolation—face security concerns because, unlike Virtual Machines, containers directly interface with the underlying highly privileged kernel through the wide and vulnerable system call interface. Regardless of whether a container itself requires dangerous system calls, a compromised or malicious container sharing the host (a bad neighbor) can compromise the host kernel using a vulnerable syscall, thereby compromising all other containers sharing the host. In this paper, rather than attempting to eliminate host compromise, we limit the effectiveness of attacks by bad neighbors to a subset of the cluster. To do this, we propose a new metric dubbed Extraneous System call Exposure (ExS). Scheduling containers to minimize ExS reduces the number of nodes that expose a vulnerable system call and as a result the number of affected containers in the cluster. Experimenting with 42 popular containers on SySched, our greedy scheduler implementation in Kubernetes, we demonstrate that SySched can reduce up to 46% more victim nodes and up to 48% more victim containers compared to the Kubernetes default scheduling while also reducing overall host attack surface by 20%.
- SEVeriFast: Minimizing the root of trust for fast startup of SEV microVMsHolmes, Benjamin; Waterman, Jason; Williams, Dan (ACM, 2024-04-27)Serverless computing platforms rely on fast container initialization to provide low latency and high throughput for requests. While hardware enforced trusted execution environments (TEEs) have gained popularity, confidential computing has yet to be widely adopted by latency-sensitive platforms due to its additional initialization overhead. We investigate the application of AMD’s Secure Encrypted Virtualization (SEV) to microVMs and find that current startup times for confidential VMs are prohibitively slow due to the high cost of establishing a root of trust for each new VM. We present SEVeriFast, a new bootstrap scheme for SEV VMs that reevaluates current microVM techniques for fast boot, such as eliminating bootstrap stages and bypassing guest kernel decompression. Counter-intuitively, we find that introducing an additional bootstrap component and reintroducing kernel compression optimizes the cold boot performance of SEV microVMs by reducing the cost of measurement on the critical boot path and producing a minimal root of trust. To our knowledge, SEVeriFast is the first work to explore the trade-offs associated with booting confidential microVMs and provide a set of guiding principles as a step toward confidential serverless. We show that SEVeriFast improves cold start performance of SEV VMs over current methods by 86-93%.
- Verified Programs Can Party: Optimizing Kernel Extensions via Post-Verification In-Kernel MergingKuo, Hsuan-Chi; Chen, Kai-Hsun; Lu, Yicheng; Williams, Dan; Mohan, Sibin; Xu, Tianyin (ACM, 2022-03-28)Operating system (OS) extensions are more popular than ever. For example, Linux BPF is marketed as a “superpower” that allows user programs to be downloaded into the kernel, verified to be safe and executed at kernel hook points. So, BPF extensions have high performance and are often placed at performance-critical paths for tracing and filtering. However, although BPF extension programs execute in a shared kernel environment and are already individually verified, they are often executed independently in chains.We observe that the chain pattern has large performance overhead, due to indirect jumps penalized by security mitigations (e.g., Spectre), loops, and memory accesses. In this paper, we argue for a separation of concerns. We propose to decouple the execution of BPF extensions from their verification requirements—BPF extension programs can be collectively optimized, after each BPF extension program is individually verified and loaded into the shared kernel. We present KFuse, a framework that dynamically and automatically merges chains of BPF programs by transforming indirect jumps into direct jumps, unrolling loops, and saving memory accesses, without loss of security or flexibility. KFuse can merge BPF programs that are (1) installed by multiple principals, (2) maintained to be modular and separate, (3) installed at different points of time, and (4) split into smaller, verifiable programs via BPF tail calls. KFuse demonstrates 85% performance improvement of BPF chain execution and 7% of application performance improvement over existing BPF use cases (systemd’s Seccomp BPF filters). It achieves more significant benefits for longer chains.