Computer Science Thesis Oral
In Person and Remote - ET - Traffic21 Classroom, Gates Hillman 6501 and Zoom
FRANCISCO JOSÉ MATURANA SANGUINETI
Ph.D. Candidate, Computer Science Department, Carnegie Mellon University
Designing Storage Codes for Heterogeneity: Theory and Practice
Distributed storage systems support many essential applications, and thus need to be highly reliable. To achieve this goal at a low cost, most systems use erasure codes. The parameters of the erasure code (which affect the cost and level of protection) are set based on the expected operating conditions. However, conditions vary significantly over time and across the system. For example, failure rates, workloads, and density of devices can change over time and differ between locations. Many existing systems fail to accommodate these variations, or do so in inefficient ways. My thesis focuses on making distributed storage systems more robust and efficient by enabling them to automatically adapt to these variations. To make progress towards this goal, I develop and use tools from both Coding Theory and Computer Systems research.

The first part focuses on variations in the system over time. Our main contribution here is the "convertible codes" framework, designed to study and construct erasure codes that can efficiently change their parameters over time. We propose the framework, derive the fundamental limits of this problem, and design optimal codes. Additionally, we propose two distributed storage system designs that automatically decide when and how to convert between codes.

The second part focuses on heterogeneity across the system. Specifically, we consider a geo-distributed storage system, where the density of nodes and the latencies between nodes vary significantly, and the cost of sending data across the wide-area network (WAN) is crucial. Our main contribution is a new class of codes that optimizes both the storage overhead and the WAN bandwidth given the parameters of the system. We additionally propose a new strongly consistent geo-distributed storage system that jointly optimizes its consensus protocol and erasure code.
Gregory R. Ganger
Muriel Médard (Massachusetts Institute of Technology)