HPC Resource Accounting: Progress Against Allocation—Lessons Learned
Ken Schumacher, Fermi National Accelerator Laboratory
The stakeholders who fund and oversee our HPC facilities need to know how our resources are utilized. Based on that usage, we manage priorities and quotas so that all users get their fair share. Each year we allocate normalized CPU-core hours across our clusters. I will describe our usage reporting, including how we incorporate additional "charges" for online storage, offline storage, and dedicated processors. You will learn how we handle credits for failed jobs and for periods of reduced performance due to load-shed events. I will also describe how calculating burn rates helps us reprioritize batch queues.
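As a rough illustration of the burn-rate idea mentioned above, here is a minimal sketch of how progress against an annual allocation of normalized CPU-core hours might be checked. All function names, field names, and numbers are hypothetical examples, not Fermilab's actual accounting code.

    # Minimal sketch of a burn-rate check against an annual allocation.
    # All numbers and names below are hypothetical illustrations.
    from datetime import date

    def burn_rate(used_core_hours: float,
                  allocated_core_hours: float,
                  period_start: date,
                  period_end: date,
                  today: date) -> float:
        """Ratio of the fraction of allocation consumed to the fraction of
        the allocation period elapsed.  A value above 1.0 means the project
        is burning its allocation faster than the calendar."""
        elapsed = (today - period_start).days / (period_end - period_start).days
        consumed = used_core_hours / allocated_core_hours
        return consumed / elapsed

    # Example: a project granted 1,000,000 normalized CPU-core hours for the
    # year that has used 600,000 of them by July 1 is burning at roughly 1.2x.
    rate = burn_rate(600_000, 1_000_000,
                     date(2014, 1, 1), date(2014, 12, 31), date(2014, 7, 1))
    print(f"burn rate: {rate:.2f}")  # > 1.0 -> candidate for lower batch priority

A ratio like this gives a single number per project that can feed directly into batch-queue reprioritization decisions.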
Ken Schumacher, Fermi National Accelerator Laboratory
Ken Schumacher is a computing professional with 35 years of experience who has spent the last 17 years at Fermi National Accelerator Laboratory. He currently helps support several HPC compute clusters, monitoring and reporting resource usage against allocations. Previously he worked with teams supporting lab-wide Unix systems, farm and grid systems, and data storage systems (currently over 300 PB of tape).
Open Access Media
USENIX is committed to Open Access to the research presented at our events. Papers and proceedings are freely available to everyone once the event begins. Any video, audio, and/or slides that are posted after the event are also free and open to everyone.
author = {Ken Schumacher},
title = {{HPC} Resource Accounting: Progress Against {Allocation{\textemdash}Lessons} Learned},
year = {2014},
address = {Seattle, WA},
publisher = {USENIX Association},
month = nov
}