Data logging first writes the data, together with its stripe information,
synchronously to a log device, and then writes it to the actual device. Once
the actual write is complete, the transaction is committed on the
log device. The design of the log is depicted in Figure 3.
Figure 3: Design of the log
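The log, write, commit sequence described above can be illustrated with a minimal in-memory sketch. It is not the TSS driver code: the names `LogRecord`, `LoggedDevice`, and `crash_after_log`, and the record layout, are assumptions made for illustration, and a Python list stands in for the synchronous log device.

```python
from dataclasses import dataclass, field


@dataclass
class LogRecord:
    """One logged write (hypothetical layout, not the TSS on-disk format)."""
    txn_id: int
    stripe_no: int       # stripe information stored with the data
    offset: int          # byte offset inside the stripe
    data: bytes
    committed: bool = False


@dataclass
class LoggedDevice:
    """In-memory stand-in for a striped device with a separate log device."""
    stripe_size: int
    stripes: dict = field(default_factory=dict)  # stripe_no -> bytearray
    log: list = field(default_factory=list)      # stands in for the log device
    next_txn: int = 0

    def _apply(self, rec):
        # Write the logged data to the "actual device".
        stripe = self.stripes.setdefault(rec.stripe_no, bytearray(self.stripe_size))
        stripe[rec.offset:rec.offset + len(rec.data)] = rec.data

    def write(self, stripe_no, offset, data, crash_after_log=False):
        if offset < 0 or offset + len(data) > self.stripe_size:
            raise ValueError("write does not fit inside one stripe")
        # Step 1: log the data and its stripe information.
        rec = LogRecord(self.next_txn, stripe_no, offset, bytes(data))
        self.next_txn += 1
        self.log.append(rec)
        if crash_after_log:
            return rec  # simulated crash: no device write, no commit
        # Step 2: write to the actual device.
        self._apply(rec)
        # Step 3: commit the transaction on the log device.
        rec.committed = True
        return rec

    def recover(self):
        """Replay every logged write that was never committed."""
        replayed = []
        for rec in self.log:
            if not rec.committed:
                self._apply(rec)
                rec.committed = True
                replayed.append(rec.txn_id)
        return replayed
```

In this sketch, a write interrupted after step 1 leaves an uncommitted record in the log, and `recover()` replays it, which is the property the commit step exists to provide.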
Some of the issues in logging are:
- Should data logging target only data consistency, or should it also
try to improve performance by delaying the writes to the disks, as
a log-structured file system does? The former approach was chosen in TSS, mainly for
its simplicity and the ease of adding it to the existing TSS device driver.
- At what level should the logging be incorporated in the existing TSS
device driver? The possible options were the request level, the stripe level,
and the physical block level. After weighing the advantages and
disadvantages of the three approaches,
stripe-level logging was chosen as a suitable method for TSS.
- When should the recovery process be initiated? An
ioctl() is provided to trigger the recovery mechanism; an application
program can call it after a crash of the TSS device.
- If recovery is done in a straightforward manner, using the normal
stripe write mechanisms, data corruption is possible. The case arises
from the use of Read-Modify-Write (RMW)
cycles in a RAID5 device. If part of a write was
committed before the crash, the stripe is left in an inconsistent
state. If an RMW cycle is then used during recovery, it may read that inconsistent
data and use it in the recovery, which would leave
permanently inconsistent data on the disk.
This problem can be solved by using reconstructing writes in place of
RMW cycles during recovery. In a reconstructing write, the parity
is recomputed from the data of the whole stripe, so partially committed
data does not affect the recovery and the final data is consistent.
- What should be done when access to the log device fails?
Logging is used for crash recovery, but adding
redundancy to the log device to protect against its failure is likely to be
costly.
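The RAID5 recovery hazard described in the list above can be made concrete with a small parity calculation. The sketch below assumes one parity block per stripe computed as the XOR of the data blocks; the function names and the one-byte block size are illustrative assumptions and are not taken from the TSS driver.

```python
from functools import reduce


def xor(a, b):
    """Byte-wise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def parity(blocks):
    """Parity of a stripe: XOR of all its data blocks."""
    return reduce(xor, blocks)


def rmw_write(data, par, idx, new):
    """Read-modify-write: new parity = old parity ^ old data ^ new data."""
    old = data[idx]          # read from disk; may already be the new data
    data[idx] = new
    return xor(xor(par, old), new)


def reconstruct_write(data, idx, new):
    """Reconstructing write: recompute parity from the whole stripe."""
    data[idx] = new
    return parity(data)


def simulate():
    data = [b"\x01", b"\x02", b"\x04"]
    par = parity(data)       # parity of the stripe before the write
    new = b"\x08"
    # Crash: the data block reached the disk, the parity update did not.
    data[0] = new
    # Recovery A replays the logged write with an RMW cycle.
    data_a = list(data)
    par_a = rmw_write(data_a, par, 0, new)
    # Recovery B replays it with a reconstructing write.
    data_b = list(data)
    par_b = reconstruct_write(data_b, 0, new)
    # Report whether each recovered stripe has consistent parity.
    return par_a == parity(data_a), par_b == parity(data_b)
```

Here `simulate()` returns `(False, True)`: the RMW replay reads the already-updated block as "old data", so the stale parity is carried forward unchanged, while the reconstructing write ignores the stale parity and produces a consistent stripe.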
2001-09-13