Kai Ren

Fast Storage for File System Metadata

Degree Type: Ph.D. in Computer Science
Advisor(s): Garth Gibson
Graduated: December 2017

Abstract:

In an era of big data, the rapid growth of data that many companies and organizations produce and manage continues to drive efforts to improve the scalability of storage systems. The number of objects stored in storage systems continues to grow, making metadata management critical to the overall performance of file systems. Many modern parallel applications are shifting toward shorter durations and larger degrees of parallelism. Such trends continue to expose storage systems to more diverse metadata-intensive workloads.

The goal of this dissertation is to improve metadata management in both local and distributed file systems. The dissertation focuses on two aspects. One is to improve the out-of-core representation of file system metadata by exploring the use of log-structured multi-level approaches to provide a unified and efficient representation for different types of secondary storage devices (e.g., traditional hard disks and solid-state disks). The other is to demonstrate that such a representation can also be flexibly integrated with many namespace distribution mechanisms to scale the metadata performance of distributed file systems and to provide better support for a variety of big data applications in data center environments.

Thesis Committee:
Garth A. Gibson (Chair)
David G. Andersen
Greg R. Ganger
Brent B. Welch (Google)

Frank Pfenning, Head, Computer Science Department
Andrew W. Moore, Dean, School of Computer Science

Keywords:
File System, Metadata Management, Log-Structured Approach, Caching

CMU-CS-17-121.pdf (4.57 MB) (165 pages)