Youssef, Karim; Raqibul Islam, Abdullah A.; Iwabuchi, Keita; Feng, Wu-chun; Pearce, Roger
2024-03-04
2024-03-04
2022-01-01
9781665497862
https://hdl.handle.net/10919/118252
Persistent data structures represent a core component of high-performance data analytics. Multiple data processing systems persist data structures using memory-mapped files. Memory-mapped file I/O provides a productive and unified programming interface to different types of storage systems. However, it suffers from multiple limitations, including performance bottlenecks caused by system-wide configurations and a lack of support for efficient incremental versioning. Therefore, many such systems only support versioning via full-copy snapshots, resulting in poor performance and storage capacity bottlenecks. To address these limitations, we present Privateer 2.0, a virtual memory and storage interface that optimizes performance and storage capacity for versioned persistent data structures. Privateer 2.0 improves over the previous version by supporting userspace virtual memory management and block compression. We integrated Privateer 2.0 into Metall, a C++ persistent data structure allocator, and LMDB, a widely-used key-value store database. Privateer 2.0 yielded up to 7.5× speedup and up to 300× storage space reduction for Metall incremental snapshots and 1.25× speedup with 11.7× storage space reduction for LMDB incremental snapshots.
Pages 1-7
In Copyright
Optimizing Performance and Storage of Memory-Mapped Persistent Data Structures
Conference proceeding
2022 IEEE High Performance Extreme Computing Conference, HPEC 2022
https://doi.org/10.1109/HPEC55821.2022.9926392
Feng, Wu-chun [0000-0002-6015-0727]