HotSnap: A Hot Distributed Snapshot System For Virtual Machine Cluster
Lei Cui, Bo Li, Yangyang Zhang, and Jianxin Li, Beihang University
The management of a virtual machine cluster (VMC) is challenging owing to reliability requirements such as non-stop service and failure tolerance. Distributed snapshot of a VMC is one promising approach to supporting system reliability: it allows data center system administrators to recover the system from failure and resume execution from an intermediate state rather than the initial state. However, due to the heavyweight nature of virtual machine (VM) technology, applications running in the VMC suffer from long downtime and performance degradation during snapshot. In addition, the discrepancy in snapshot completion times among VMs causes the TCP backoff problem, resulting in network interruption between two communicating VMs. This paper proposes HotSnap, a VMC snapshot approach designed to take hot distributed snapshots with only milliseconds of system downtime and TCP backoff duration. At the core of HotSnap are the transient snapshot, which saves the minimum instantaneous state in a short time, and the full snapshot, which saves the entire VM state during normal operation. We then design a snapshot protocol to coordinate the individual VM snapshots into a globally consistent state of the VMC. We have implemented HotSnap on QEMU/KVM and conducted several experiments to show its effectiveness and efficiency. Compared to the live-migration-based distributed snapshot technique, which incurs seconds of system downtime and network interruption, HotSnap incurs only tens of milliseconds.
author = {Lei Cui and Bo Li and Yangyang Zhang and Jianxin Li},
title = {{HotSnap}: A Hot Distributed Snapshot System For Virtual Machine Cluster},
booktitle = {27th Large Installation System Administration Conference (LISA 13)},
year = {2013},
isbn = {978-1-931971-05-8},
address = {Washington, D.C.},
pages = {59--74},
url = {https://www.usenix.org/conference/lisa13/technical-sessions/presentation/cui},
publisher = {USENIX Association},
month = nov
}