Detecting Persistence Bugs from Non-volatile Memory Programs by Inferring Likely-correctness Conditions

Files

TR Number

Date

2022-03-10

Journal Title

Journal ISSN

Volume Title

Publisher

Virginia Tech

Abstract

Non-volatile main memory (NVM) technologies are revolutionizing the entire computing stack thanks to their storage-and-memory-like characteristics. The ability to persist data in memory provides a new opportunity to build crash-consistent software without paying a storage stack I/O overhead. A crash-consistent NVM program can recover back to a consistent state from a persistent NVM in the event of a software crash or a sudden power loss. In the presence of a volatile cache, data held in a volatile cache is lost after a crash. So NVM programming requires users to manually control the durability and the persistence ordering of NVM writes. To avoid performance overhead, developers have devised customized persistence mechanisms to enforce proper persistence ordering and atomicity guarantees, rendering NVM programs error-prone. The problem statement of this dissertation is how one can effectively detect persistence bugs from NVM programs. However, detecting persistence bugs in NVM programs is challenging because of the huge test space and the manual consistency validation required. The thesis of this dissertation is that we can detect persistence bugs from NVM programs in a scalable and automatic manner by inferring likely-correctness conditions from programs. A likely-correctness condition is a possible correctness condition, which is a condition a program must maintain to make the program crash-consistent. This dissertation proposes to infer two forms of likely-correctness conditions from NVM programs to detect persistence bugs. The first proposed solution is to infer likely-ordering and likely-atomicity conditions by analyzing program dependencies among NVM accesses. The second proposed solution is to infer likely-linearization points to understand a program's operation-level behavior. Using these two forms of likely-correctness conditions, we test only those NVM states and thread interleavings that violate the likely-correctness conditions. This significantly re- duces the test space required to examine. We then leverage the durable linearizability model to validate consistency automatically without manual consistency validation. In this way, we can detect persistence bugs from NVM programs in a scalable and automatic manner. In total, we detect 47 (36 new) persistence correctness bugs and 158 (113 new) persistence performance bugs from 20 single-threaded NVM programs. Additionally, we detect 27 (15 new) persistence correctness bugs from 12 multi-threaded NVM data structures.

Description

Keywords

Non-Volatile Memory, Crash Consistency, Durable Linearizability, Testing, Debugging

Citation