Next: Implementation Details Up: A Clustered Persistent Snapshot Previous: Membership updates in the

Performance

Snapshot metadata is updated under a cluster-wide lock obtained from a global lock manager. The lock manager sends and receives network messages when it transfers access rights to a lock between nodes. Each message incurs a fixed overhead that is independent of its size. If the data associated with a lock can piggy-back on the message that transfers the access right, there is a potential performance benefit. We can therefore divide the map into sections of up to, say, 64K, with each section protected by its own cluster lock.
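
As a rough illustration of the sectioning idea, the sketch below maps an original-disk block number to the map section, and hence the cluster lock, that covers it. The 8-byte entry size, the one-entry-per-block layout, and the function names are assumptions made for the example; the text fixes only the section size of about 64K.

```python
SECTION_BYTES = 64 * 1024   # section size suggested in the text
ENTRY_BYTES = 8             # assumed size of one map entry (not from the paper)
ENTRIES_PER_SECTION = SECTION_BYTES // ENTRY_BYTES  # 8192 entries per section


def section_of(block_no: int) -> int:
    """Return the index of the map section (and its cluster lock) covering block_no."""
    return block_no // ENTRIES_PER_SECTION


def sections_for_range(first_block: int, count: int) -> range:
    """Return the map sections touched by `count` blocks starting at first_block."""
    last_block = first_block + count - 1
    return range(section_of(first_block), section_of(last_block) + 1)
```

Under these assumptions, a write confined to one section needs a single lock transfer, and the section's 64K of map data can travel in the same message as the lock.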

To increase the performance of disk writes when the snapshot is on, we can cache snapshot maps and snapshot blocks. The following methods can be used for better performance:

Method A - Each node that does a COW push first takes a cluster-wide lock and logs the transaction onto the log disk. It then broadcasts the updated map information, along with the original disk block, to all the nodes. Each node updates its map and caches the original block. Whenever a node tries to read a snapshot block (for backup), it checks its local cache for the block's mapping. If there is a mapping and the original block is in the cache, the cached block is used. If the original block is not in the cache, the node broadcasts a message requesting the snapshot block, and nodes that have the block in their cache respond. If no node responds, the block has already been written to the snapshot disk and can be read from there. If there is no mapping for the block, the node can take a cluster-wide lock on the map section and read the block from the original disk. Similarly, if the map section is not in the cache, the node can broadcast a request for it; if no node responds, the map section can be read from the snapshot disk.
Advantages If a block has already been COW pushed, there is no need to take a cluster-wide lock on the snapshot map section to check whether a COW push is needed. Similarly, a snapshot read need not take a cluster-wide lock to read the snapshot block if it has already been COW pushed. This method performs better when a large number of nodes in the cluster do a large number of disk writes in a small, concentrated portion of the original disk.
Disadvantages Each COW push needs one broadcast message. A snapshot read needs one more broadcast message if the map section is not in the cache, and another for the snapshot block if that is not in the cache.
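
The write and read paths of Method A can be sketched as a small single-process simulation. Everything here is illustrative: the `Cluster` and `Node` classes, the dictionaries standing in for disks, and the loop standing in for a broadcast are assumptions, locking is omitted because the simulation is single-threaded, and the snapshot block is written to the snapshot disk immediately for simplicity. The case where the map section itself is missing from the cache is left out.

```python
class Cluster:
    def __init__(self, original_disk):
        self.original_disk = dict(original_disk)  # block number -> data
        self.snapshot_disk = {}                   # snapshot block number -> data
        self.log = []                             # stands in for the log disk
        self.nodes = []


class Node:
    def __init__(self, name, cluster):
        self.name = name
        self.cluster = cluster
        self.map_cache = {}    # original block number -> snapshot block number
        self.block_cache = {}  # original block number -> pre-write data
        cluster.nodes.append(self)

    def cow_push(self, block_no):
        """Log the push, then broadcast the mapping and the original block."""
        c = self.cluster
        # A real node would take the cluster-wide lock before this point.
        original = c.original_disk[block_no]
        snap_block_no = len(c.snapshot_disk)      # naive allocation (assumption)
        c.log.append((self.name, block_no, snap_block_no))
        c.snapshot_disk[snap_block_no] = original
        for node in c.nodes:                      # broadcast, sender included
            node.map_cache[block_no] = snap_block_no
            node.block_cache[block_no] = original

    def write(self, block_no, data):
        if block_no not in self.map_cache:        # first write since the snapshot
            self.cow_push(block_no)
        self.cluster.original_disk[block_no] = data

    def snapshot_read(self, block_no):
        """Return (data, source) following the lookup order described above."""
        c = self.cluster
        if block_no not in self.map_cache:
            # Never COW pushed: the original disk still holds the snapshot data.
            return c.original_disk[block_no], "original disk"
        if block_no in self.block_cache:
            return self.block_cache[block_no], "local cache"
        for peer in c.nodes:                      # stands in for a broadcast request
            if peer is not self and block_no in peer.block_cache:
                return peer.block_cache[block_no], "peer cache"
        # No peer answered: the block must already be on the snapshot disk.
        return c.snapshot_disk[self.map_cache[block_no]], "snapshot disk"
```

In this sketch a snapshot read of an already-pushed block never touches a lock, which is the advantage claimed for Method A; the cost is the loop over all nodes in `cow_push`, which corresponds to one broadcast per COW push.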

Augmentations to Method A - We can keep additional information in the map: the node that COW pushed the block, along with the snapshot block number to which it was copied. For example, if node A does a COW push of block x, the map entry for x also records A, and the snapshot block is in A's cache. Later, if node B tries to read x when the block is not in its cache, it gets the node number from the map (broadcasting a request for the map section first if that is not in its cache either) and asks that node, here node A, for snapshot block x. If A has the block in its cache, it replies with it. Otherwise, the read of snapshot block x can be done in two ways:
1. Node A reads block x and sends it to node B.
2. Node B reads block x and broadcasts that it has block x, so that all other nodes can update the node information in the map entry for snapshot block x.
Advantages The broadcast messages of Method A become unicast messages when the map or snapshot block is not in the cache. This method is useful if many nodes write to the original disk but the writes are not concentrated in one region.
Disadvantages The map grows because each entry also stores node information. This increases the number of locks used to serialize access to the different sections of the map.
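
A minimal sketch of the augmented map entry and the resulting unicast lookup follows. The `MapEntry` type, the `read_via_owner` function, and the dictionaries used for caches and the snapshot disk are hypothetical names chosen for the example; a single shared map stands in for the per-node maps that a broadcast would keep in sync.

```python
from dataclasses import dataclass


@dataclass
class MapEntry:
    snap_block_no: int  # where the original data lives on the snapshot disk
    owner: str          # node believed to hold the block in its cache


def read_via_owner(reader, block_no, snap_map, caches, snapshot_disk,
                   owner_reads=True):
    """Return (data, messages) for a snapshot read of an already-pushed block.

    caches maps a node name to that node's {block_no: data} cache.
    """
    entry = snap_map[block_no]
    messages = []
    if block_no in caches[reader]:
        return caches[reader][block_no], messages
    # Ask only the node recorded in the map entry, instead of broadcasting.
    messages.append(("unicast", reader, entry.owner))
    if block_no in caches[entry.owner]:
        return caches[entry.owner][block_no], messages
    data = snapshot_disk[entry.snap_block_no]
    if not owner_reads:
        # Option 2: the reader reads the block itself, caches it, and
        # announces that it is now the node to ask for this block.
        caches[reader][block_no] = data
        entry.owner = reader
        messages.append(("broadcast", reader, "all"))
    # Option 1 (owner_reads=True): the owner reads the block from disk and
    # sends it to the reader; no map update is needed.
    return data, messages
```

The two branches at the end correspond to the two numbered options: the first keeps the traffic unicast, while the second pays one broadcast so that later readers are directed to a node that actually has the block cached.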

Method B - To do a write, a node x takes a cluster lock on the map section and logs its COW push onto the log disk. Whenever some other node y asks for the lock on that map section, node x transfers the log along with the lock to y. After transferring the log, x records on its own log disk that the log has been transferred to y. If x fails, y need not replay x's log, since it has already been transferred to y's log disk.
Advantages There are no broadcast messages. This method is useful if only a few nodes in the cluster write to the original disk.
Disadvantages A node has to take a cluster-wide lock on the snapshot map section to check whether a COW push is needed, even if the block has already been COW pushed. Similarly, a snapshot read needs a cluster-wide lock to read the snapshot block even if it has already been COW pushed. When a cluster-wide lock is transferred from one node to another, the dirty log has to be transferred as well.
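
The lock hand-off in Method B can be sketched as below. The `Node` class, the record tuples, and the rule that everything logged since the last transfer marker counts as the dirty log are assumptions made for illustration; the text does not specify a log format.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.log_disk = []       # durable log records for this node
        self.holds_lock = False  # whether this node holds the map-section lock

    def cow_push(self, block_no, snap_block_no):
        """Log a COW push; the map-section lock must already be held."""
        assert self.holds_lock, "map-section lock required for a COW push"
        self.log_disk.append(("cow", block_no, snap_block_no))

    def pending(self):
        """Return the records logged since the last transfer marker."""
        records = []
        for rec in reversed(self.log_disk):
            if rec[0] == "transferred":
                break
            records.append(rec)
        return records[::-1]

    def transfer_lock(self, other):
        """Hand the lock to `other`, piggy-backing the dirty log on the transfer."""
        assert self.holds_lock
        dirty = self.pending()
        other.log_disk.extend(dirty)                       # log travels with the lock
        self.log_disk.append(("transferred", other.name))  # marker on the sender's log
        self.holds_lock, other.holds_lock = False, True
```

After `x.transfer_lock(y)`, the records x logged are on y's log disk and x's log ends with a transfer marker, so in this sketch a failure of x leaves nothing that y would need to replay from x's log.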


Suresh Siddha 2001-09-13