Bin Fan Algorithmic Engineering Towards More Efficient Key-Value Systems Degree Type: Ph.D. in Computer Science Advisor(s): David G. Andersen Graduated: December 2013 Abstract: Distributed key-value systems have been widely used as elemental components of many Internet-scale services at sites such as Amazon, Facebook and Twitter. This thesis examines a system design approach to scale existing key-value systems, both horizontally and vertically, by carefully engineering and integrating techniques that are grounded in re- cent theory but also informed by underlying architectures and expected workloads in practice. As a case study, we re-design FAWN-KV–a distributed key-value cluster consisting of "wimpy" key-value nodes–to use less memory but achieve higher throughput even in the worst case. First, to improve the worst-case throughput of a FAWN-KV system, we propose a randomized load balancing scheme that can fully utilize all the nodes regardless of their query distribution. We analytically prove and empirically demonstrate that deploying a very small but extremely fast load balancer at FAWN-KV can effectively prevent uneven or dynamic workloads creating hotspots on individual nodes. Moreover, our analysis provides service designers a mathematically tractable approach to estimate the worst-case throughput and also avoid drastic over- provisioning in similar distributed key-value systems. Second, to implement the high-speed load balancer and also to improve the space efficiency of individual key-value nodes, we propose novel data structures and algorithms, including the cuckoo filter, a Bloom filter replacement that is high-speed, highly compact and delete-supporting, and optimistic cuckoo hashing, a fast and space-efficient hashing scheme that scales on multiple CPUs. Both algorithms are built upon conventional cuckoo hashing but are optimized for our target architectures and workloads. Using them as building blocks, we design and implement MemC3 to serve transient data from DRAM with high throughput and low-latency retrievals, and SILT to provide cost-effective access to persis- tent data on flash storage with extremely small memory footprint (e.g., 0.7 bytes per entry). Thesis Committee: David G. Andersen (Chair) Michael Kaminsky (Intel Labs) Garth A. Gibson Edmund B. Nightingale (Microsoft Research) Frank Pfenning, Head, Computer Science Department Randy Bryant, Dean, School of Computer Science Keywords: Memory Efficiency, Hash Table, Bloom Filter, Caching, Load Balancing CMU-CS-13-126.pdf (1.37 MB) ( 134 pages) Copyright Notice