Scaling RDMA RPCs with FLOCK

TR Number

Date

2021-11-30

Journal Title

Journal ISSN

Volume Title

Publisher

Virginia Tech

Abstract

RDMA-capable networks are gaining traction with datacenter deployments due to their high throughput, low latency, CPU efficiency, and advanced features, such as remote memory operations. However, efficiently utilizing RDMA capability in a common setting of high fan-in, fan-out asymmetric network topology is challenging. For instance, using RDMA programming features comes at the cost of connection scalability, which does not scale with increasing cluster size. To address that, several works forgo some RDMA features by only focusing on conventional RPC APIs. In this work, we strive to exploit the full capability of RDMA, while scaling the number of connections regardless of the cluster size. We present FLOCK, a communication framework for RDMA networks that uses hardware provided reliable connection. Using a partially shared model, FLOCK departs from the conventional

RDMA design by enabling connection sharing among threads, which provides significant performance improvements contrary to the widely held belief that connection sharing deteriorates performance. At its core, FLOCK uses a connection handle abstraction for connection multiplexing; a new coalescing-based synchronization approach for efficient network utilization; and a load-control mechanism for connections with symbiotic send-recv scheduling, which reduces the synchronization overheads associated with connection sharing along with ensuring fair utilization of network connections.

Description

Keywords

Datacenter networking, Remote Direct Memory Access (RDMA), Scalability

Citation

Collections