Measuring and Characterizing System Behavior Using Kernel-Level Event Logging

Abstract: 

Analyzing the dynamic behavior and performance of complex software systems is difficult. Currently available systems either analyze each process in isolation, provide only system-level cumulative statistics, or provide a fixed and limited number of process-group-related statistics. The Linux Trace Toolkit (LTT) introduced here provides a novel, modular, and extensible way of recording and analyzing complete system behavior. Because all significant system events are recorded, it is possible to analyze any desired subset of the running processes and, for instance, to distinguish between time spent waiting for some relevant event (data from disk or from another process) and time spent waiting for some unrelated process to use up its time slice.
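The distinction drawn above, between waiting for a relevant event and waiting for an unrelated process to give up the CPU, can be illustrated with a small post-processing sketch. This is not LTT's trace format or its analysis code: the `Event` record, the event names (`sched_in`, `preempt`, `block`, `wake`), and the `classify_time` helper are all hypothetical, chosen only to show how a complete event log lets each interval in a process's life be attributed to one state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One record of a hypothetical trace (not the real LTT format)."""
    time: float  # seconds since the start of the trace
    pid: int     # process the event refers to
    kind: str    # "sched_in", "preempt", "block", or "wake" (assumed names)


# State a process enters when each (assumed) event kind is logged.
NEXT_STATE = {
    "sched_in": "running",  # process is given the CPU
    "preempt": "ready",     # time slice ended, process still wants the CPU
    "block": "blocked",     # process waits for disk data, another process, ...
    "wake": "ready",        # awaited event arrived, CPU is still busy elsewhere
}


def classify_time(events, pid, end_time):
    """Return the seconds `pid` spent running, blocked, and ready.

    "blocked" is time waiting for a relevant event; "ready" is time
    waiting for some other process to release the CPU.
    """
    totals = {"running": 0.0, "blocked": 0.0, "ready": 0.0}
    state, since = None, None
    for ev in sorted(events, key=lambda e: e.time):
        if ev.pid != pid or ev.kind not in NEXT_STATE:
            continue  # ignore other processes and unknown event kinds
        if state is not None:
            totals[state] += ev.time - since  # close the previous interval
        state, since = NEXT_STATE[ev.kind], ev.time
    if state is not None:
        totals[state] += end_time - since  # close the final interval
    return totals
```

Given such a log, filtering by a different `pid` yields the same breakdown for any other process, which is the sense in which a single complete trace supports analysis of any chosen subset of processes.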

Despite the extensive information gathered, experimental results show that the time and memory overhead of LTT is minimal (< 2.5% when observing core kernel events). Moreover, because LTT and the Linux kernel are modular and their source code is openly available, the system is easily extended, both in the system events gathered and in later post-processing and graphical presentation.

BibTeX
@inproceedings {271392,
author = {Karim Yaghmour and Michel R. Dagenais},
title = {Measuring and Characterizing System Behavior Using {Kernel-Level} Event Logging},
booktitle = {2000 USENIX Annual Technical Conference (USENIX ATC 00)},
year = {2000},
address = {San Diego, CA},
url = {https://www.usenix.org/conference/2000-usenix-annual-technical-conference/measuring-and-characterizing-system-behavior},
publisher = {USENIX Association},
month = jun
}