Elfen Scheduling: Fine-Grain Principled Borrowing from Latency-Critical Workloads Using Simultaneous Multithreading

Xi Yang; Stephen M. Blackburn; Kathryn S. McKinley

Authors:

Xi Yang and Stephen M. Blackburn, Australian National University; Kathryn S. McKinley, Microsoft Research

Abstract:

Web services from search to games to stock trading impose strict Service Level Objectives (SLOs) on tail latency. Meeting these objectives is challenging because the computational demand of each request is highly variable and load is bursty. Consequently, many servers run at low utilization (10 to 45%); turn off simultaneous multithreading (SMT); and execute only a single service—wasting hardware, energy, and money. Although co-running batch jobs with latency critical requests to utilize multiple SMT hardware contexts (lanes) is appealing, unmitigated sharing of core resources induces non-linear effects on tail latency and SLO violations.

We introduce principled borrowing to control SMT hardware execution in which batch threads borrow core resources. A batch thread executes in a reserved batch SMT lane when no latency-critical thread is executing in the partner request lane. We instrument batch threads to quickly detect execution in the request lane, step out of the way, and promptly return the borrowed resources. We introduce the nanonap system call to stop the batch thread’s execution without yielding its lane to the OS scheduler, ensuring that requests have exclusive use of the core’s resources. We evaluate our approach for colocating batch workloads with latency-critical requests using the Apache Lucene search engine. A conservative policy that executes batch threads only when the request lane is idle improves utilization between 90% and 25% on one core depending on load, without compromising request SLOs. Our approach is straightforward, robust, and unobtrusive, opening the way to substantially improved resource utilization in datacenters running latency-critical workloads.
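
The conservative borrowing policy described above can be illustrated with a toy discrete-time model. This is only a sketch under stated assumptions, not the authors' implementation: it does not invoke nanonap (it is modeled simply as the batch lane doing no work), and the names `LaneStats`, `simulate_borrowing`, and `check_interval` are hypothetical. The model assumes the batch thread's instrumentation observes the request lane once every `check_interval` ticks, so a larger interval means the batch thread may overlap with a request before it steps aside.

```python
from dataclasses import dataclass


@dataclass
class LaneStats:
    """Tick counts for one simulated core with two SMT lanes (toy model)."""
    request_ticks: int = 0   # ticks in which the request lane executed
    batch_ticks: int = 0     # ticks in which the batch lane executed
    overlap_ticks: int = 0   # ticks in which both lanes executed at once

    def utilization(self, total_ticks: int) -> float:
        """Fraction of ticks in which at least one lane executed."""
        if total_ticks == 0:
            return 0.0
        active = self.request_ticks + self.batch_ticks - self.overlap_ticks
        return active / total_ticks


def simulate_borrowing(request_busy, check_interval=1):
    """Simulate a conservative borrowing policy over a request-lane trace.

    request_busy: sequence of booleans, True when the request lane is
        occupied during that tick (an assumed input trace).
    check_interval: hypothetical number of ticks between the batch
        thread's checks of the request lane.
    """
    if check_interval < 1:
        raise ValueError("check_interval must be at least 1")
    stats = LaneStats()
    batch_running = False
    for tick, busy in enumerate(request_busy):
        # At each check point the batch thread either naps (request lane
        # busy) or borrows the core (request lane idle).
        if tick % check_interval == 0:
            batch_running = not busy
        if busy:
            stats.request_ticks += 1
        if batch_running:
            stats.batch_ticks += 1
            if busy:
                # The batch thread has not yet noticed the request.
                stats.overlap_ticks += 1
    return stats
```

In this model, checking every tick yields no overlap between the lanes, while a coarser `check_interval` trades some overlap (interference with requests) for fewer checks. This mirrors why the abstract emphasizes quick detection, but the numbers the model produces say nothing about the measured results in the paper.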

BibTeX

@inproceedings {196288,
author = {Xi Yang and Stephen M. Blackburn and Kathryn S. McKinley},
title = {Elfen Scheduling: {Fine-Grain} Principled Borrowing from {Latency-Critical} Workloads Using Simultaneous Multithreading},
booktitle = {2016 USENIX Annual Technical Conference (USENIX ATC 16)},
year = {2016},
isbn = {978-1-931971-30-0},
address = {Denver, CO},
pages = {309--322},
url = {https://www.usenix.org/conference/atc16/technical-sessions/presentation/yang},
publisher = {USENIX Association},
month = jun
}