Next: Discovering data dependencies Up: //TRACE: Parallel trace replay Previous: I/O throttling


Design overview

//TRACE discovers an application's data dependencies and compute time using I/O throttling. Summarizing from Section 2, the design requirements are as follows:

  1. To adapt to the speed of the storage system, the traces must be replayed with a closed model.
  2. To enforce data dependencies, the traces must be annotated with the inter-node synchronization calls.
  3. To model computation, the inter-I/O compute time must be reflected in the traces.
  4. To evaluate different file systems (e.g., log-structured vs. journaled) and different storage systems (e.g., blocks vs. objects [29]), the traces must be file-level traces, including all buffered and non-buffered synchronous POSIX [32] file I/O (e.g., open, fopen, read, fread, write, fwrite, seek).
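
To make requirement 1 concrete, the following is a minimal sketch of closed-model timing, not //TRACE's implementation. It assumes a single node, a constant per-I/O service time, and microsecond units; the function name and parameters are hypothetical. Each I/O is issued only after the previous one completes and the recorded compute time has elapsed, so the issue times stretch or shrink with the speed of the storage system.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Closed-model schedule (illustrative only).
 * compute_us[i] : traced compute time preceding I/O i
 * service_us    : assumed time the target storage system takes per I/O
 * issue_us[i]   : output, time at which I/O i is issued during replay
 */
void closed_model_issue_times(const long *compute_us, size_t n,
                              long service_us, long *issue_us)
{
    assert(compute_us != NULL && issue_us != NULL);

    long now = 0;
    for (size_t i = 0; i < n; i++) {
        now += compute_us[i];  /* replay the computation first */
        issue_us[i] = now;     /* then issue the I/O */
        now += service_us;     /* and wait for it to complete */
    }
}
```

With three I/Os that each follow 10 us of computation, a 5 us service time yields issue times of 10, 25, and 40 us, while a 50 us service time yields 10, 70, and 130 us. An open model would instead issue every I/O at its traced timestamp regardless of how fast the storage system responds.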

//TRACE is both a tracing engine and a replayer, designed not to require semantic knowledge or instrumentation of the application or its synchronization mechanisms. The tracing engine, called the causality engine, is designed as a library interposer [14] (which uses the LD_PRELOAD mechanism) and is run on all nodes in a parallel application. The application does not need to be modified, but must be dynamically linked to the causality engine. Any shared library call issued by the application can be traced and optionally delayed using this mechanism.
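
As an illustration of library interposition in general, the sketch below wraps the libc write call, counts what passes through it, and can delay each call. It is an assumption-laden example rather than the causality engine's code: the counters and the throttle_us variable are hypothetical, and a real tracer would append records to a per-node trace instead of keeping counters.

```c
#define _GNU_SOURCE            /* for RTLD_NEXT */
#include <assert.h>
#include <dlfcn.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical bookkeeping for this sketch. */
static unsigned long traced_calls;
static unsigned long traced_bytes;
static unsigned int throttle_us;   /* optional delay before each write */

/*
 * Same name and signature as libc's write(), so the dynamic linker
 * resolves the application's calls to this wrapper first.
 */
ssize_t write(int fd, const void *buf, size_t count)
{
    static ssize_t (*real_write)(int, const void *, size_t);

    if (real_write == NULL) {
        /* Look up the next definition of write() in search order. */
        real_write = (ssize_t (*)(int, const void *, size_t))
                         dlsym(RTLD_NEXT, "write");
        assert(real_write != NULL);
    }

    if (throttle_us > 0)
        usleep(throttle_us);       /* delay the call if requested */

    ssize_t n = real_write(fd, buf, count);

    traced_calls++;
    if (n > 0)
        traced_bytes += (unsigned long)n;
    return n;
}
```

Assuming the file is saved as interpose.c (a hypothetical name), it could be built with `gcc -shared -fPIC -o libinterpose.so interpose.c -ldl` and loaded with `LD_PRELOAD=./libinterpose.so ./app`, where ./app stands for any dynamically linked application.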

The objectives of the causality engine are to intercept and trace the I/O calls, calculate the computation time between I/Os, and discover any causal relationships (i.e., the data dependencies) across the nodes. All of this information is stored in a per-node annotated I/O trace. A replayer (also distributed) can then mimic the behavior of the traced application by replaying the I/O, the computation, and the synchronization. Although I/O calls to any shared library (e.g., MPI-IO, libc) can be traced and replayed, this work focuses on the POSIX I/O issued by an application through libc.
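
A per-node replay loop might look like the sketch below. The record layout and the replay function are hypothetical and do not reflect //TRACE's trace format. The sketch covers only the I/O and the computation, and omits the replay of inter-node synchronization. Computation is approximated by sleeping for the recorded compute time.

```c
#define _POSIX_C_SOURCE 200809L   /* for nanosleep */
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

enum io_op { OP_READ, OP_WRITE };

/* Hypothetical annotated trace record. */
struct trace_rec {
    enum io_op op;
    long compute_ns;   /* compute time preceding this I/O */
    size_t len;        /* bytes to transfer (capped at the buffer size) */
};

/* Replays n records against fd; returns bytes transferred, or -1 on error. */
long replay(int fd, const struct trace_rec *trace, size_t n)
{
    static char buf[4096];
    long total = 0;

    assert(trace != NULL);
    for (size_t i = 0; i < n; i++) {
        /* Replay the computation by sleeping for the traced duration. */
        struct timespec think = {
            .tv_sec  = trace[i].compute_ns / 1000000000L,
            .tv_nsec = trace[i].compute_ns % 1000000000L
        };
        nanosleep(&think, NULL);

        /* Issue the I/O synchronously; the next record waits for it. */
        size_t len = trace[i].len < sizeof buf ? trace[i].len : sizeof buf;
        ssize_t r = (trace[i].op == OP_WRITE) ? write(fd, buf, len)
                                              : read(fd, buf, len);
        if (r < 0)
            return -1;
        total += r;
    }
    return total;
}
```

Because each call blocks until the I/O completes, the loop follows a closed model: a slower storage system delays every subsequent record by the same amount.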



Michael Mesnier 2006-12-22