Computer Science Thesis Proposal

— 3:30pm

Location:
In Person and Virtual - ET - Panther Hollow Conference Room, Mehrabian Collaborative Innovation Center 4105

Speaker:
DANIEL LIN-KIT WONG, Ph.D. Student, Computer Science Department, Carnegie Mellon University
https://wonglkd.fi-de.net/

Machine learning for flash caching in bulk storage systems

Flash caches are used to reduce peak backend load for throughput-constrained data center services, reducing the total number of backend servers required. Bulk storage systems are a large-scale example: backed by high-capacity but low-throughput hard disks, they use flash caches to provide a cost-effective storage layer underlying everything from blobstores to data warehouses.

However, flash has limited write endurance, so flash caches must cap the long-term average flash write rate to avoid premature wearout. To do so, most flash caches use admission policies that filter cache insertions, attempting to maximize the workload-reduction value of each flash write.
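The write-rate cap described above can be illustrated with a minimal, non-ML admission filter. This is a hypothetical sketch (the class and its parameters are not from the thesis): it admits an item only if doing so keeps the cumulative average write rate under a long-term budget.

```python
class WriteBudgetAdmission:
    """Hypothetical sketch of a flash admission filter that enforces a
    long-term average write-rate budget (bytes per second)."""

    def __init__(self, budget_bytes_per_sec):
        self.budget = budget_bytes_per_sec
        self.bytes_written = 0
        self.elapsed = 0.0  # seconds of observed time

    def admit(self, item_bytes, dt):
        # Advance the clock by dt seconds, then admit only if writing
        # this item keeps the cumulative average under the budget.
        self.elapsed += dt
        if self.elapsed <= 0:
            return False
        if (self.bytes_written + item_bytes) / self.elapsed <= self.budget:
            self.bytes_written += item_bytes
            return True
        return False
```

A real policy (ML or heuristic) would additionally rank candidate insertions by expected benefit; this sketch shows only the budget-enforcement side.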

We present the Baleen flash cache, which uses coordinated ML admission and prefetching to reduce peak backend load. After learning painful lessons with early ML policy attempts, we exploit a new cache residency model (which we call episodes) to guide model design and training, and we optimize for an end-to-end system metric (disk-head time) that balances IOPS and bandwidth, rather than for hit rate. Evaluation using Meta traces from seven storage clusters shows that Baleen reduces Peak Disk-head Time (and backend capacity required) by 11.8% over state-of-the-art policies.

In proposed work, we apply ML to item placement to improve eviction and to optimize the use of DRAM in hybrid caches. To improve eviction, we will reduce cache dead time by classifying items by their predicted eviction age and placing them into different eviction queues. Rather than letting every item pass through DRAM, we will use ML to select the few items whose placement in DRAM is most helpful for reducing flash writes.
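The eviction-age placement idea above can be sketched as routing items into one of several queues by a predicted eviction age, so short-lived and long-lived items are segregated. The class, queue boundaries, and predictor interface here are all hypothetical illustrations, not the proposed design.

```python
import bisect

class EvictionAgeQueues:
    """Hypothetical sketch: segregate items into queues by predicted
    eviction age (seconds), so items expected to die soon do not sit
    behind long-lived items. Boundaries are illustrative."""

    def __init__(self, boundaries=(60, 3600)):
        self.boundaries = boundaries                      # queue cutoffs (s)
        self.queues = [[] for _ in range(len(boundaries) + 1)]

    def place(self, item, predicted_eviction_age):
        # bisect picks the first queue whose cutoff exceeds the prediction.
        idx = bisect.bisect_right(self.boundaries, predicted_eviction_age)
        self.queues[idx].append(item)
        return idx
```

A classifier supplying `predicted_eviction_age` (e.g., from access history features) would plug in where the second argument is passed.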

Workloads change over time, requiring the cache to adapt to maintain performance. We propose strategies to actively target peak load reduction and to mitigate workload drift. We plan to augment admission to prioritize items based on their benefit during peak load and to adapt to load levels.

Thesis Committee:

Gregory R. Ganger (Chair)
David G. Andersen
Nathan Beckmann
Daniel S. Berger (Microsoft Research / University of Washington)

Additional Information

In Person and Zoom Participation. See announcement.

