Ahmed Khawaja, Joshua Landgraf, and Rohith Prakash, UT Austin; Michael Wei and Eric Schkufza, VMware Research Group; Christopher J. Rossbach, UT Austin and VMware Research Group
Cloud providers such as Amazon and Microsoft have begun to support on-demand FPGA acceleration in the cloud, and hardware vendors will support FPGAs in future processors. At the same time, technology advancements such as 3D stacking, through-silicon vias (TSVs), and FinFETs have greatly increased FPGA density. The massive parallelism of current FPGAs can support not only extremely large applications, but multiple applications simultaneously as well.
System support for FPGAs, however, is in its infancy. Unlike software, where resource configurations are limited to simple dimensions of compute, memory, and I/O, FPGAs provide a multi-dimensional sea of resources known as the FPGA fabric: logic cells, floating point units, memories, and I/O can all be wired together, leading to spatial constraints on FPGA resources. Current stacks either support only a single application or statically partition the FPGA fabric into fixed-size slots. These designs cannot efficiently support diverse workloads: the size of the largest slot places an artificial limit on application size, and oversized slots result in wasted FPGA resources and reduced concurrency.
This paper presents AmorphOS, which encapsulates user FPGA logic in morphable tasks, or Morphlets. Morphlets provide isolation and protection across mutually distrustful protection domains, extending the guarantees of software processes. Morphlets can morph, dynamically altering their deployed form based on resource requirements and availability. To build Morphlets, developers provide a parameterized hardware design that interfaces with AmorphOS, along with a mesh, which specifies external resource requirements. AmorphOS explores the parameter space, generating deployable Morphlets of varying size and resource requirements. AmorphOS multiplexes Morphlets on the FPGA in both space and time to maximize FPGA utilization.
We implement AmorphOS on Amazon F1 and Microsoft Catapult. We show that protected sharing and dynamic scalability support on workloads such as DNN inference and blockchain mining improves aggregate throughput up to 4x and 23x on Catapult and F1 respectively.
Open Access Media
USENIX is committed to Open Access to the research presented at our events. Papers and proceedings are freely available to everyone once the event begins. Any video, audio, and/or slides that are posted after the event are also free and open to everyone. Support USENIX and our commitment to Open Access.
author = {Ahmed Khawaja and Joshua Landgraf and Rohith Prakash and Michael Wei and Eric Schkufza and Christopher J. Rossbach},
title = {Sharing, Protection, and Compatibility for Reconfigurable Fabric with {AmorphOS}},
booktitle = {13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18)},
year = {2018},
isbn = {978-1-939133-08-3},
address = {Carlsbad, CA},
pages = {107--127},
url = {http://www.usenix.org/conference/osdi18/presentation/khawaja},
publisher = {USENIX Association},
month = oct
}