DistServe: Disaggregating Prefill and Decoding for Goodput-optimized Large Language Model Serving

Title: DistServe: Disaggregating Prefill and Decoding for Goodput-optimized Large Language Model Serving
Publication Type: Conference Paper
Year of Publication: 2024
Authors: Zhong Y, Liu S, Chen J, Hu J, Zhu Y, Liu X, Jin X, Zhang H
Conference Name: 18th USENIX Symposium on Operating Systems Design and Implementation (OSDI 24)
Date Published: 07/2024
Publisher: USENIX Association
Conference Location: Santa Clara, CA
ISBN Number: 978-1-939133-40-3
URL: https://www.usenix.org/conference/osdi24/presentation/zhong-yinmin