Analyzing Log Analysis: An Empirical Study of User Log Mining
S. Alspaugh, University of California, Berkeley and Splunk Inc.; Beidi Chen and Jessica Lin, University of California, Berkeley; Archana Ganapathi, Splunk Inc.; Marti A. Hearst and Randy Katz, University of California, Berkeley
Awarded Best Student Paper!
We present an in-depth study of over 200K log analysis queries from Splunk, a platform for data analytics. Using these queries, we quantitatively describe log analysis behavior to inform the design of analysis tools. This study includes state-machine-based descriptions of typical log analysis pipelines, cluster analysis of the most common transformation types, and survey data about Splunk user roles, use cases, and skill sets. We find that log analysis primarily involves filtering, reformatting, and summarizing data and that non-technical users increasingly need data from logs to drive their decision-making. We conclude with a number of suggestions for future research.
author = {S. Alspaugh and Beidi Chen and Jessica Lin and Archana Ganapathi and Marti Hearst and Randy Katz},
title = {Analyzing Log Analysis: An Empirical Study of User Log Mining},
booktitle = {28th Large Installation System Administration Conference (LISA14)},
year = {2014},
isbn = {978-1-931971-17-1},
address = {Seattle, WA},
pages = {62--77},
url = {https://www.usenix.org/conference/lisa14/conference-program/presentation/alspaugh},
publisher = {USENIX Association},
month = nov
}