Isaac Grosof

Optimal Scheduling in Multiserver Queues

Degree Type: Ph.D. in Computer Science
Advisor(s): Mor Harchol-Balter
Graduated: August 2023

Abstract:

Scheduling theory is a key tool for reducing latency (i.e. response time) in queueing systems. Scheduling, i.e. choosing the order in which to serve jobs, can reduce response time by an order of magnitude with no additional resources. Scheduling theory is well-developed in single-server systems, where one job is processed at a time. However, little is known about scheduling in multiserver systems, where many jobs are processed at once. Results are especially limited in stochastic multiserver scheduling theory. Today's datacenters have thousands of servers, and scheduling theory is unable to analyze such systems.

This thesis proves the first optimality results and first closed-form bounds on mean response time for scheduling policies in stochastic multiserver models which reflect the behavior of modern computing systems. The thesis explores three themes:

I start by studying one-server-per-job multiserver models, and prove the first results on optimal scheduling in that setting. Optimality results are proven for both a central-queue model and a dispatching model. I invent a novel class of dispatching policies, guardrails, to achieve these results.

Next, I study the multiserver-job (MSJ) model, where different jobs require different amounts of resources to be served. I prove the first characterization of mean response time for any scheduling policy in the MSJ model, as well as the first optimality results. I invent novel scheduling policies, ServerFilling and ServerFilling-SRPT, to achieve these results.

Finally, I study the effects of scheduling on the tail of response time, rather than mean response time. The prior state-of-the-art for scheduling for the tail was First-Come First-Served, which was conjectured to achieve optimal asymptotic tail of response time. I invent a novel scheduling policy, Nudge, which I prove to be the first policy to outperform FCFS's asymptotic tail of response time.

Thesis Committee:
Mor Harchol-Balter (Chair)
Alan Scheller-Wolf
Anupam Gupta
Weina Wang
Michael Mitzenmacher (Harvard University)

Srinivasan Seshan, Head, Computer Science Department
Martial Hebert, Dean, School of Computer Science

Keywords: Queueing theory; scheduling; optimal scheduling; response time; multiserver; tails; M/G/1; M/G/k; dispatching; Shortest Remaining Processing Time; SRPT; First-Come First-Served; FCFS; Nudge; Guardrails; work; Multiserver-job; ServerFilling; ServerFilling-SRPT; DivisorFilling; WCFS; Finite Skip; RESET; MARC; Gittins index; heavy-traffic; stochastic improvement; tail probability

CMU-CS-23-128.pdf (2.1 MB) (325 pages)
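
The abstract's point that scheduling alone can lower response time, with no additional resources, can be illustrated in the single-server setting it describes as well-developed. The following is a minimal sketch, not taken from the thesis: it simulates one server under FCFS and under preemptive SRPT on the same job sequence. The function names, the hyperexponential job-size mixture, and the load of 0.8 are all assumptions chosen for illustration, and the size of the gap depends on the load and on job-size variability.

```python
import heapq
import random


def make_jobs(n, load, seed=0):
    """Hypothetical workload: Poisson arrivals, hyperexponential sizes with mean 1."""
    rng = random.Random(seed)
    jobs, t = [], 0.0
    for _ in range(n):
        t += rng.expovariate(load)  # arrival rate equals load because mean size is 1
        # Assumed mixture: 90% small jobs (mean 0.5), 10% large jobs (mean 5.5).
        if rng.random() < 0.9:
            size = rng.expovariate(2.0)
        else:
            size = rng.expovariate(1.0 / 5.5)
        jobs.append((t, size))
    return jobs


def mean_response_fcfs(jobs):
    """Serve jobs in arrival order, one at a time, without preemption."""
    free_at, total = 0.0, 0.0
    for arrival, size in jobs:
        start = max(arrival, free_at)
        free_at = start + size
        total += free_at - arrival
    return total / len(jobs)


def mean_response_srpt(jobs):
    """Always serve the job with the least remaining work, preempting on arrivals."""
    heap = []  # entries are (remaining_size, arrival_time)
    now, total = 0.0, 0.0
    i, n = 0, len(jobs)
    while i < n or heap:
        if not heap:
            now = max(now, jobs[i][0])  # server idles until the next arrival
        while i < n and jobs[i][0] <= now:  # admit every job that has arrived
            heapq.heappush(heap, (jobs[i][1], jobs[i][0]))
            i += 1
        remaining, arrival = heapq.heappop(heap)
        next_arrival = jobs[i][0] if i < n else float("inf")
        if now + remaining <= next_arrival:
            now += remaining  # job completes before anything new arrives
            total += now - arrival
        else:
            # Run until the next arrival, then put the job back and re-decide.
            heapq.heappush(heap, (remaining - (next_arrival - now), arrival))
            now = next_arrival
    return total / n


if __name__ == "__main__":
    jobs = make_jobs(20000, load=0.8, seed=1)
    print("FCFS mean response time:", mean_response_fcfs(jobs))
    print("SRPT mean response time:", mean_response_srpt(jobs))
```

Both policies see exactly the same arrivals and job sizes, so any difference in mean response time comes from the service order alone. This sketch covers only a single server; the multiserver models studied in the thesis (central-queue, dispatching, and multiserver-job) are not modeled here.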