Experiment two compares the replay accuracy of //TRACE with that of think-limited. Results are shown in Figure 10, which reproduces Figure 9 with //TRACE added for comparison.
//TRACE offers no significant improvement for Pseudo, and this result is expected given that Pseudo has few data dependencies. However, for both PseudoSync and PseudoSyncDat, //TRACE offers substantial gains. Namely, the maximum replay error is reduced from 82% to 17% for PseudoSync and from 33% to 10% for PseudoSyncDat. These improvements are due to the replayed synchronization: a barrier after every write I/O, which //TRACE approximates with 8 SIGNAL() and 8 WAIT() calls per node (a barrier requires all nodes to signal and wait on all other nodes before proceeding).
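The barrier approximation described above can be illustrated with a minimal Python sketch. It is not the //TRACE replayer: threads stand in for nodes, and the names `make_channels`, `signal`, `wait`, `approx_barrier`, and `run_replay` are hypothetical, as are the default node and write counts. Each node signals every peer and then waits for a signal from every peer, so it issues one SIGNAL()-like and one WAIT()-like call per peer at each barrier.

```python
import threading

def make_channels(num_nodes):
    # Hypothetical signalling fabric: one counting semaphore per
    # ordered (sender, receiver) pair of nodes.
    return {(s, r): threading.Semaphore(0)
            for s in range(num_nodes)
            for r in range(num_nodes) if s != r}

def signal(channels, src, dst):
    # Stand-in for SIGNAL(): src tells dst it has reached the barrier.
    channels[(src, dst)].release()

def wait(channels, src, dst):
    # Stand-in for WAIT(): dst blocks until src has signalled.
    channels[(src, dst)].acquire()

def approx_barrier(channels, me, num_nodes):
    # Approximate a barrier with pairwise signals and waits.
    peers = [n for n in range(num_nodes) if n != me]
    for p in peers:
        signal(channels, me, p)
    for p in peers:
        wait(channels, p, me)
    return len(peers)  # signals (and waits) issued by this node

def run_replay(num_nodes=4, num_writes=3):
    # Each thread plays one node: "write", then barrier, repeated.
    channels = make_channels(num_nodes)
    log, lock = [], threading.Lock()

    def node(me):
        for step in range(num_writes):
            with lock:
                log.append((step, me))  # stand-in for a write I/O
            approx_barrier(channels, me, num_nodes)

    threads = [threading.Thread(target=node, args=(n,))
               for n in range(num_nodes)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log
```

In this sketch, no node can begin write k+1 until every node has finished write k, so the logged step numbers never decrease. That ordering is the property the replayed synchronization restores and that a think-limited replay does not enforce.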
Looking at Fitness, one sees even greater improvement. Namely, the maximum replay error is reduced from 205% to 5%. There are only 3 data dependencies approximated by //TRACE: node 0 signaling node 1 after it completes its read, node 1 signaling node 2, and node 2 signaling node 3. Nonetheless, these dependencies enforce a sequential execution of the I/O (which is what Fitness intended); when ignored, the result is concurrent access from all nodes (a different workload altogether). Therefore, it is not the number of data dependencies discovered that determines replay accuracy, but rather how these dependencies impact the storage system.
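To make the effect of these three dependencies concrete, the following Python sketch models the signaling chain with threads standing in for nodes. It is an illustration under stated assumptions, not the Fitness benchmark or the //TRACE replayer: the function name `replay_chain`, the `honor_dependencies` flag, and the use of `threading.Event` are all hypothetical.

```python
import threading

def replay_chain(num_nodes=4, honor_dependencies=True):
    # ready[i] is set once node i is allowed to issue its read.
    ready = [threading.Event() for _ in range(num_nodes)]
    ready[0].set()  # node 0 has no predecessor
    order, lock = [], threading.Lock()

    def node(me):
        if honor_dependencies:
            ready[me].wait()          # wait for the predecessor's signal
        with lock:
            order.append(me)          # stand-in for this node's read
        if honor_dependencies and me + 1 < num_nodes:
            ready[me + 1].set()       # signal the successor

    threads = [threading.Thread(target=node, args=(n,))
               for n in range(num_nodes)]
    # Start in reverse order so that any sequential ordering observed
    # comes from the signals, not from thread start order.
    for t in reversed(threads):
        t.start()
    for t in threads:
        t.join()
    return order
```

With `honor_dependencies=True`, the three signals (0 to 1, 1 to 2, 2 to 3) force the reads to occur one after another in node order. With `honor_dependencies=False`, all nodes issue their reads without coordination and the order is left to the scheduler, which corresponds to the concurrent access pattern described above.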
The Quake workload highlights how accurately //TRACE replays complex applications with multiple I/O phases that have different mixes of I/O, compute, and synchronization. Relative to think-limited, the maximum replay error is reduced from 26% to 8%.