On Reducing the Trusted Computing Base in Binary Verification

An, Xiaoxin

On Reducing the Trusted Computing Base in Binary Verification

dc.contributor.author	An, Xiaoxin	en
dc.contributor.committeechair	Ravindran, Binoy	en
dc.contributor.committeemember	Lee, Dongyoon	en
dc.contributor.committeemember	Chantem, Thidapat	en
dc.contributor.committeemember	Zeng, Haibo	en
dc.contributor.committeemember	Verbeek, Freek	en
dc.contributor.department	Electrical and Computer Engineering	en
dc.date.accessioned	2022-06-16T08:00:21Z	en
dc.date.available	2022-06-16T08:00:21Z	en
dc.date.issued	2022-06-15	en
dc.description.abstract	The translation of binary code to higher-level models has wide applications, including decompilation, binary analysis, and binary rewriting. This calls for high reliability of the underlying trusted computing base (TCB) of the translation methodology. A key challenge is to reduce the TCB by validating its soundness. Both the definition of soundness and the validation method heavily depend on the context: what is in the TCB and how to prove it. This dissertation presents three research contributions. The first two contributions include reducing the TCB in binary verification, and the last contribution includes a binary verification process that leverages a reduced TCB. The first contribution targets the validation of OCaml-to-PVS translation -- commonly used to translate instruction-set-architecture (ISA) specifications to PVS -- where the destination language is non-executable. We present a methodology called OPEV to validate the translation between OCaml and PVS, supporting non-executable semantics. The validation includes generating large-scale tests for OCaml implementations, generating test lemmas for PVS, and generating proofs that automatically discharge these lemmas. OPEV incorporates an intermediate type system that captures a large subset of OCaml types, employing a variety of rules to generate test cases for each type. To prove the PVS lemmas, we develop automatic proof strategies and discharge the test lemmas using PVS Proof-Lite, a powerful proof scripting utility of the PVS verification system. We demonstrate our approach in two case studies that include 259 functions selected from the Sail and Lem libraries. For each function, we generate thousands of test lemmas, all of which are automatically discharged. The dissertation's second contribution targets the soundness validation of a disassembly process where the source language does not have well-defined semantics. Disassembly is a crucial step in binary security, reverse engineering, and binary verification. Various studies in these fields use disassembly tools and hypothesize that the reconstructed disassembly is correct. However, disassembly is an undecidable problem. State-of-the-art disassemblers suffer from issues ranging from incorrectly recovered instructions to incorrectly assessing which addresses belong to instructions and which to data. We present DSV, a systematic and automated approach to validate whether the output of a disassembler is sound with respect to the input binary. No source code, debugging information, or annotations are required. DSV defines soundness using a transition relation defined over concrete machine states: a binary is sound if, for all addresses in the binary that can be reached from the binary's entry point, the bytes of the (disassembled) instruction located at an address are the same as the actual bytes read from the binary. Since computing this transition relation is undecidable, DSV uses over-approximation by preventing false positives (i.e., the existence of an incorrectly disassembled reachable instruction but deemed unreachable) and allowing, but minimizing, false negatives. We apply DSV to 102 binaries of GNU Coreutils with eight different state-of-the-art disassemblers from academia and industry. DSV is able to find soundness issues in the output of all disassemblers. The dissertation's third contribution is WinCheck: a concolic model checker that detects memory-related properties of closed-source binaries. Bugs related to memory accesses are still a major issue for security vulnerabilities. Even a single buffer overflow or use-after-free in a large program may be the cause of a software crash, a data leak, or a hijacking of the control flow. Typical static formal verification tools aim to detect these issues at the source code level. WinCheck is a model-checker that is directly applicable to closed-source and stripped Windows executables. A key characteristic of WinCheck is that it performs its execution as symbolically as possible while leaving any information related to pointers concrete. This produces a model checker tailored to pointer-related properties, such as buffer overflows, use-after-free, null-pointer dereferences, and reading from uninitialized memory. The technique thus provides a novel trade-off between ease of use, accuracy, applicability, and scalability. We apply WinCheck to ten closed-source binaries available in a Windows 10 distribution, as well as the Windows version of the entire Coreutils library. We conclude that the approach taken is precise -- provides only a few false negatives -- but may not explore the entire state space due to unresolved indirect jumps.	en
dc.description.abstractgeneral	Binary verification is a process that verifies a class of properties, usually security-related properties, on binary files, and does not need access to source code. Since a binary file is composed of byte sequences and is not human-readable, in the binary verification process, a number of assumptions are usually made. The assumptions often involve the error-free nature of a set of subsystems used in the verification process and constitute the verification process's trusted computing base (or TCB). The reliability of the verification process therefore depends on how reliable the TCB is. The dissertation presents three research contributions in this regard. The first two contributions include reducing the TCB in binary verification, and the last contribution includes a binary verification process that leverages a reduced TCB. The dissertation's first contribution presents a validation on OCaml-to-PVS translations -- commonly used to translate a computer architecture's instruction specifications to PVS, a language that allows mathematical specifications. To build up a reliable semantical model of assembly instructions, which is assumed to be in the TCB, it is necessary to validate the translation. The dissertation's second contribution validates the soundness of the disassembly process, which translates a binary file to corresponding assembly instructions. Since the disassembly process is generally assumed to be trustworthy in many binary verification works, the TCB of binary verification could be reduced by validating the soundness of the disassembly process. With the reduced TCB, the dissertation introduces WinCheck, the dissertation's third and final contribution: a concolic model checker that validates pointer-related properties of closed-source Windows binaries. The pointer-related properties include absence of buffer overflow, absence of use-after-free, and absence of null-pointer dereference.	en
dc.description.degree	Doctor of Philosophy	en
dc.format.medium	ETD	en
dc.identifier.other	vt_gsexam:35073	en
dc.identifier.uri	http://hdl.handle.net/10919/110791	en
dc.language.iso	en	en
dc.publisher	Virginia Tech	en
dc.rights	In Copyright	en
dc.rights.uri	http://rightsstatements.org/vocab/InC/1.0/	en
dc.subject	Translation Validation	en
dc.subject	Disassembly Soundness	en
dc.subject	Binary Verification	en
dc.subject	Random Testing	en
dc.subject	Symbolic Execution	en
dc.subject	Bounded Model Checking	en
dc.title	On Reducing the Trusted Computing Base in Binary Verification	en
dc.type	Dissertation	en
thesis.degree.discipline	Computer Engineering	en
thesis.degree.grantor	Virginia Polytechnic Institute and State University	en
thesis.degree.level	doctoral	en
thesis.degree.name	Doctor of Philosophy	en

Files

Original bundle

Now showing 1 - 2 of 2

Name:: An_X_D_2022.pdf
Size:: 1.11 MB
Format:: Adobe Portable Document Format

Download

Name:: An_X_D_2022_support_1.pdf
Size:: 67.6 KB
Format:: Adobe Portable Document Format
Description:: Supporting documents

Download

Collections

Doctoral Dissertations