Reliability During Disaster
Figure 6: Data loss as a result of disaster and wide-area link failure, varying link loss (50 ms one-way latency and FEC params ).
We measure reliability in two ways:
- In the event of a disaster at the primary site, how much data loss results?
- How much are the primary and mirror sites allowed to diverge?
These questions are closely related; we distinguish between them as
follows. The maximum amount by which the primary and mirror sites can
diverge is the bandwidth-delay product of the link between them; the
amount of data lost in the event of failure, however, depends on how
much of this in-flight data has already been acknowledged to the
application. In other words, how often can we be caught in a lie? For
instance, with a remote-sync solution (synchronous mirroring), the
bandwidth-delay product, and hence primary-to-mirror divergence, may
be high, yet data loss is zero, because the application receives an
acknowledgment only after the mirror has the data. This, of course,
comes at a severe cost to performance. With a local-sync solution
(asynchronous or semi-synchronous mirroring), on the other hand, data
loss is equal to divergence. The following experiments show that the
network-sync solution with SMFS achieves a desirable middle ground
between these two extremes.
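
To make the distinction between divergence and data loss concrete, the following is a minimal sketch of the two extremes described above. It is an illustration under stated assumptions, not part of SMFS: the function names, the 1 Gbps link speed, and the choice of one-way latency as the "delay" term are all hypothetical. Network-sync is deliberately left out, since its loss is measured experimentally rather than given by a closed-form bound.

```python
def bandwidth_delay_product_bytes(bandwidth_bps: float, delay_s: float) -> float:
    """Bytes that can be in flight on the link: bandwidth times delay.

    Assumption: delay_s is the one-way latency; use the round-trip time
    instead if divergence is counted until the mirror's ack returns.
    """
    return bandwidth_bps / 8.0 * delay_s


def worst_case_loss_bytes(mode: str, divergence_bytes: float) -> float:
    """Worst-case acknowledged-but-lost data for the two extreme modes."""
    if mode == "remote-sync":
        # Nothing is acknowledged before it reaches the mirror.
        return 0.0
    if mode == "local-sync":
        # Everything in flight has already been acknowledged locally.
        return divergence_bytes
    # Network-sync falls between the extremes and is determined empirically.
    raise ValueError(f"no closed-form bound modeled for mode: {mode!r}")


# Hypothetical link: 1 Gbps with the 50 ms one-way latency used in Figure 6.
divergence = bandwidth_delay_product_bytes(1e9, 0.050)
remote_loss = worst_case_loss_bytes("remote-sync", divergence)
local_loss = worst_case_loss_bytes("local-sync", divergence)
```

On this hypothetical link, divergence is roughly 6.25 MB: a local-sync mirror can lose all of it in a disaster, while a remote-sync mirror loses none but stalls every write for at least a round trip.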
Hakim Weatherspoon
2009-01-14