In addition to the wide variety of topics covered in the LISA ’12 program, the program committee has created three specific conference themes, or tracks, for those looking to focus on a key subject area:

  • Super Sysadmin
  • Cloud Computing 
  • IPv6 and DNSSEC

Follow the icons throughout the technical sessions below. You can combine days of training or workshops with days of technical sessions content to build the conference that meets your needs. Pick and choose the sessions that best fit your interest—focus on just one topic or mix and match themes.


Wednesday, December 12, 2012

8:45 a.m.–9:00 a.m. Wednesday

Opening Remarks

Grande Ballroom

Program Chair: Carolyn Rowland

9:00 a.m.–10:30 a.m. Wednesday

Keynote Address

Grande Ballroom

The Internet of Things and Sensors and Actuators!

Vint Cerf,
VP and Chief Internet Evangelist, Google

Vinton G. Cerf is vice president and chief Internet evangelist for Google. Widely known as one of the "Fathers of the Internet," Cerf is the co-designer of the TCP/IP protocols and the architecture of the Internet. With his colleague, Robert Kahn, Cerf received the U.S. National Medal of Technology in 1997 for co-founding and developing the Internet. In 1994 and 1998 respectively, Kahn and Cerf were honored as Marconi Fellows. They received the ACM A.M. Turing Award in 2004 for their work on the Internet protocols. In November 2005 they received the Presidential Medal of Freedom. In 2008 they received the Japan Prize.

Vint Cerf served as chairman of the board of the Internet Corporation for Assigned Names and Numbers (ICANN) from 2000 to 2007 and as founding president of the Internet Society. He is a Fellow of the IEEE, the ACM, the American Association for the Advancement of Science, the American Academy of Arts and Sciences, the American Philosophical Society, the International Engineering Consortium, and the Computer History Museum, and a member of the National Academy of Engineering.

Cerf holds a Bachelor of Science degree in Mathematics from Stanford University and Master of Science and Ph.D. degrees in Computer Science from UCLA. He also holds honorary Doctorate degrees from 20 universities.


We’ll look at the rapid influx of devices on the Internet and the increasing need for more address space. IPv6, here we come! Big opportunities await third parties willing to assist users to manage their office, home, personal, and automobile devices.

10:30 a.m.–11:00 a.m. Wednesday

Break

 Grand Ballroom Foyer
11:00 a.m.–12:30 p.m. Wednesday

Papers and Reports: Storage and Data

SPINNAKER

Session Chair:
Marc Chiarini, Harvard University

A Simple File Storage System for Web Applications

Daniel Pollack, AOL Inc.

AOL Technologies has created a scalable object store for web applications. The goal of the object store was to eliminate the creation of a separate storage system for every application we produce while avoiding sending data to external storage services. AOL developers had been devoting a significant amount of time to creating backend storage systems to enable functionality in their applications. These storage systems were all very similar and many relied on difficult-to-scale technologies like network attached file systems. This paper describes our implementation and the operating experience with the storage system. The paper also presents a feature roadmap and our release of an open source version.


IDO: Intelligent Data Outsourcing with Improved RAID Reconstruction Performance in Large-Scale Data Centers

Suzhen Wu, Xiamen University and University of Nebraska-Lincoln; Hong Jiang and Bo Mao, University of Nebraska-Lincoln

Dealing with disk failures has become an increasingly common task for system administrators in the face of high disk failure rates in large-scale data centers consisting of hundreds of thousands of disks. Thus, achieving fast recovery from disk failures in general and high online RAID-reconstruction performance in particular has become crucial. To address the problem, this paper proposes IDO (Intelligent Data Outsourcing), a proactive and zone-based optimization, to significantly improve online RAID-reconstruction performance. IDO moves popular data zones that are proactively identified in the normal state to a surrogate set at the onset of reconstruction. Thus, IDO enables most, if not all, user I/O requests to be serviced by the surrogate set instead of the degraded set during reconstruction.

Extensive trace-driven experiments on our lightweight prototype implementation of IDO demonstrate that, compared with the existing state-of-the-art reconstruction approaches WorkOut and VDF, IDO simultaneously speeds up the reconstruction time and the average user response time. Moreover, IDO can be extended to improve the performance of other background RAID support tasks, such as re-synchronization, RAID reshaping, and disk scrubbing.
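The zone-based idea in the abstract can be sketched in a few lines. This toy model is ours, not the paper's code; the zone size, hot-zone count, and routing policy are invented for illustration. It counts per-zone accesses during normal operation and, when reconstruction begins, redirects reads for the hottest zones to a surrogate device:

```python
from collections import Counter

class HotZoneTracker:
    """Toy sketch of IDO's idea: count accesses per fixed-size zone during
    normal operation, then, at reconstruction onset, 'outsource' the
    hottest zones so requests for them are served by a surrogate set
    instead of the degraded array."""

    def __init__(self, zone_size=1024, hot_zones=2):
        self.zone_size = zone_size
        self.hot_zones = hot_zones
        self.counts = Counter()
        self.outsourced = set()

    def record_access(self, block):
        self.counts[block // self.zone_size] += 1

    def start_reconstruction(self):
        # Proactively copy the most popular zones to the surrogate set.
        self.outsourced = {z for z, _ in self.counts.most_common(self.hot_zones)}

    def route(self, block):
        # During reconstruction, serve hot zones from the surrogate.
        zone = block // self.zone_size
        return "surrogate" if zone in self.outsourced else "degraded"

tracker = HotZoneTracker()
for block in [5, 10, 2000, 5, 7, 5000, 2100]:  # zone 0 hottest, then zone 1
    tracker.record_access(block)
tracker.start_reconstruction()
print(tracker.route(3))      # request in hot zone 0
print(tracker.route(9000))   # request in a cold zone
```

The real system must also keep the surrogate copies consistent with writes, which this sketch ignores.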


Theia: Visual Signatures for Problem Diagnosis in Large Hadoop Clusters

Elmer Garduno, Soila P. Kavulya, Jiaqi Tan, Rajeev Gandhi, and Priya Narasimhan, Carnegie Mellon University
Awarded Best Student Paper!   

Diagnosing performance problems in large distributed systems can be daunting, as the copious volume of monitoring information available can obscure the root cause of the problem. Automated diagnosis tools help narrow down the possible root causes; however, these tools are not perfect, motivating the need for visualization tools that allow users to explore their data and gain insight into the root cause. In this paper we describe Theia, a visualization tool that analyzes application-level logs in a Hadoop cluster and generates visual signatures of each job's performance. These visual signatures provide compact representations of task durations, task status, and data consumption by jobs. We demonstrate the utility of Theia on real incidents experienced by users on a production Hadoop cluster.


Invited Talks 1

GRANDE BALLROOM A

Session Chair:
Rudi van Drunen, Competa IT and Xlexit Technology, The Netherlands

The Evolution of Ethernet

John D’Ambrosia, Ethernet Alliance and Dell; Chauncey Schwartz II, Ethernet Alliance and QLogic

John D’Ambrosia is the Chief Ethernet Evangelist in the CTO Office at Dell. In this capacity John has been an industry leader in the development of Ethernet-related technologies since 1999. Currently, he is chairing the IEEE P802.3bj 100 Gb/s Backplane and Copper Cable Task Force and the IEEE 802.3 Ethernet Bandwidth Assessment Ad Hoc. In addition, John is a founder of the Ethernet Alliance and is currently serving as the Chairman of its Board of Directors. Prior to these efforts, John served as chair of the IEEE P802.3ba Task Force, which developed the specifications for 40 Gb/s and 100 Gb/s Ethernet, as well as chair of the Optical Internetworking Forum's Market Awareness & Education committee. D’Ambrosia also writes the Ethernet Watch blog for EE Times.

Chauncey Schwartz II has had a successful 25+ year career in sales, marketing, and business management. He has a proven ability to position, communicate, and teach markets and channels about creative breakthrough solutions that generate strong business results, while leading highly effective cross-functional teams. Mr. Schwartz is currently working with QLogic to develop strategies for alliances focused on networking connectivity and to find creative ways to incorporate virtualization and fabrics into complex applications. He is also the Marketing Chairperson for the Ethernet Alliance, focused on helping the alliance meet its strategic vision of expanding the Ethernet ecosystem, supporting Ethernet development, and promoting Ethernet to the community.

 

Ethernet is the dominant networking technology, driving an interwoven, interconnected ecosystem that includes cloud computing, data centers, enterprises, high-performance computing, and millions of servers and end users. While solutions range in speed from 10 Megabit to 100 Gigabit, the reality is that there is more to Ethernet than feeds and speeds. Ethernet continues to evolve to meet the needs of the networking industry, and to grow into areas that wish to leverage networking and the benefits of Ethernet. For this session the Ethernet Alliance will bring together the expertise within its membership to provide an overview of the state of Ethernet standards, technology, and deployment.


IPv6: A Guide to Address Planning

Owen DeLong, Hurricane Electric

Owen DeLong is an IPv6 Evangelist at Hurricane Electric and a member of the ARIN Advisory Council. Owen brings more than 25 years of industry experience. He is an active member of the systems administration, operations, and IP policy communities. In the past, Owen has worked at Tellme Networks (Senior Network Engineer); Exodus Communications (Senior Backbone Engineer), where he was part of the team that took Exodus from a pre-IPO startup with two datacenters to a major global provider of hosting services; Netcom Online (Network Engineer), where he worked on a team that moved the Internet from an expensive R&E tool to a widely available public access system accessible to anyone with a computer; Sun Microsystems (Senior Systems Administrator); and more. He can be reached as owend at he dot net.

 

As more organizations begin to embark on their IPv6 journey, the question of how to properly plan an IPv6 network deployment comes up with increasing frequency. This talk goes over real-world address planning techniques based on operational experience with a wide variety of networks. It includes concise examples and simple exercises to get the audience involved.
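Nibble-aligned subnetting, a staple of most IPv6 address plans (the talk's specific recommendations may differ), is easy to experiment with using Python's standard ipaddress module. The /48 below is the documentation prefix, and the region/site hierarchy is invented for illustration:

```python
import ipaddress

# Subnetting on 4-bit "nibble" boundaries keeps each level of the plan
# on a single hex digit, which makes delegations and reverse-DNS zones
# easy to read. Carve a hypothetical /48 into /52 regions, /56 sites,
# and /64 subnets.
org = ipaddress.ip_network("2001:db8:1234::/48")

regions = list(org.subnets(new_prefix=52))       # 16 regions (one hex digit)
sites = list(regions[0].subnets(new_prefix=56))  # 16 sites per region
subnets = list(sites[0].subnets(new_prefix=64))  # 256 /64s per site

print(len(regions), len(sites), len(subnets))    # 16 16 256
print(regions[1])                                # 2001:db8:1234:1000::/52
```

Note how the second region's prefix differs from the first in exactly one hex digit, which is the readability payoff of nibble alignment.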


Invited Talks 2

GRANDE BALLROOM B

Session Chair:
Andrew Hume, AT&T Labs—Research

OpenStack: Leading the Open Source Cloud Revolution

Vish Ishaya, Nebula, Inc.

Vish Ishaya is the Director of Open Source at Nebula, Inc. He was previously a Principal Engineer with Rackspace Cloud Builders. He was also a Senior Systems Engineer with Anso Labs and NASA Nebula Technical Lead during the creation of Nova, one of the founding OpenStack projects.

He is a highly prolific developer who is one of the top contributors to OpenStack. During the November 2010 OpenStack conference, he won an OpenStack award for his development and community efforts, and was also elected to the OpenStack Board, which along with other prominent community members helps guide the vision of the OpenStack project.

Vish has been elected to four consecutive terms as the OpenStack Compute Project Technical Lead. In addition to his excellent programming and systems skills, Vish has spent over a decade teaching, most recently classes in object-oriented analysis and design.

 

The OpenStack project’s mission is to produce the ubiquitous open source cloud computing platform that will meet the needs of public and private cloud providers regardless of size. In this talk, I will give a detailed technical overview of the OpenStack Compute software stack in terms of the three major resources provided by it: compute, block storage, and networking services. I will also provide details of the recent Folsom release, as well as provide a sneak peek at some of the features proposed for inclusion in our next release.


Invited Talks 3

GRANDE BALLROOM C

Session Chair:
Nicole Forsgren Velasquez, Utah State University

Analysis of an Internet-wide Stealth Scan from a Botnet

Alberto Dainotti, Cooperative Association for Internet Data Analysis

kc claffy has played a leading role in Internet research for more than a decade. For the past 15 years she has led the direction, strategy, and overall management of the Cooperative Association for Internet Data Analysis (CAIDA), which she founded at the UC San Diego Supercomputer Center in 1996. CAIDA is an internationally respected Internet research organization, responsive to industry, government, and academic sector needs and interests, providing tools and analyses to promote a robust, scalable global Internet infrastructure. As a research scientist at SDSC and Adjunct Professor of Computer Science & Engineering at UCSD, her research interests include Internet data collection, analysis, visualization, and enabling others to make use of CAIDA data and results. She has been at SDSC since 1991 and holds a Ph.D. in Computer Science from UC San Diego.

 

Alberto Dainotti is a PostDoc at CAIDA (Cooperative Association for Internet Data Analysis) at  UC San Diego. In 2008 he received his Ph.D. in Computer Engineering and Systems at the Department of Computer Engineering and Systems of University of Napoli “Federico II,” Italy. He has co-authored several peer-reviewed papers published at conferences and in scientific journals in the field of Internet measurement, traffic analysis, and network security. He serves as an independent reviewer/evaluator of projects and project proposals co-funded by the European Commission.


Botnets are the most common vehicle of cyber-criminal activity. They are used for spamming, phishing, denial of service attacks, brute-force cracking, stealing private information, and cyber warfare. We present the measurement and analysis of a horizontal scan of the entire IPv4 address space conducted by the Sality botnet last year. This 12-day scan originated from approximately 3 million distinct IP addresses and tried to discover and compromise VoIP-related infrastructure. We observed this event through the UCSD Network Telescope. Sality is one of the largest botnets ever identified by researchers, representing ominous advances in the evolution of modern malware. This talk offers a detailed dissection of the botnet’s scanning behavior, including general methods to correlate, visualize, and extrapolate botnet behavior across the global Internet.


The Guru Is In

MARINA 3

Datacenter Infrastructure

Doug Hughes, D. E. Shaw Research

Doug Hughes is the manager of all things infrastructure at D. E. Shaw Research, a bio-technology research firm located in Manhattan. He was intimately involved in the specifications, design, and implementation of the company's current built-from-scratch datacenter.

12:30 p.m.–2:00 p.m. Wednesday

Lunch, on your own

2:00 p.m.–3:30 p.m. Wednesday

Papers and Reports: Security and Systems Management

SPINNAKER

Session Chair:
Tim Nelson, Worcester Polytechnic Institute

Lessons in iOS Device Configuration Management

Tim Bell, Trinity College, University of Melbourne

After an initial trial, Trinity College has deployed iPads to its 600 international students. Supporting these devices on the Trinity network has presented challenges relating to configuration management, wireless network capacity, and server load.

We addressed the problem of configuration management with a web application written using the Django framework, which enables students to download and install a customized configuration profile to their iPads. 

This paper describes the requirements and implementation of the configuration management solution, the security issues involved, and the experiences and lessons learned from its use.
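The profile-generation step can be sketched with Python's standard plistlib. This is our illustration, not Trinity's code, and all identifiers, SSIDs, and payload choices are hypothetical. Apple configuration profiles are property lists, which a web application would serve with the application/x-apple-aspen-config content type:

```python
import plistlib
import uuid

def make_wifi_profile(username, ssid):
    """Build a minimal configuration profile with one Wi-Fi payload
    customized per student (a hedged sketch, not a complete profile)."""
    payload = {
        "PayloadType": "com.apple.wifi.managed",
        "PayloadVersion": 1,
        "PayloadIdentifier": f"edu.example.wifi.{username}",  # hypothetical
        "PayloadUUID": str(uuid.uuid4()),
        "SSID_STR": ssid,
        "EncryptionType": "WPA",
    }
    profile = {
        "PayloadType": "Configuration",
        "PayloadVersion": 1,
        "PayloadIdentifier": f"edu.example.profile.{username}",  # hypothetical
        "PayloadUUID": str(uuid.uuid4()),
        "PayloadDisplayName": f"College Wi-Fi ({username})",
        "PayloadContent": [payload],
    }
    return plistlib.dumps(profile)  # XML plist bytes

data = make_wifi_profile("jsmith", "CollegeNet")
print(data[:38])
```

A real deployment would also sign the profile and add per-user credentials, which is where most of the security considerations arise.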


A Declarative Approach to Automated Configuration

John A. Hewson and Paul Anderson, University of Edinburgh; Andrew D. Gordon, Microsoft Research and University of Edinburgh

System administrators increasingly use declarative, object-oriented languages to configure their systems. Extending such systems with automated analysis and decision making is an area of active research. We introduce ConfSolve, an object-oriented declarative configuration language, in which logical constraints over a system can be specified. Verification, impact analysis or even the generation of valid configurations can then be performed, by translation to a Constraint Satisfaction Problem (CSP), which is solved with an off-the-shelf solver. We present a full definition of our language and its compilation process, and show that our implementation outperforms previous work utilising an SMT solver, while adding new features such as optimisation.
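To illustrate the kind of problem such a system hands to a solver: assign virtual machines to hosts so that no host's RAM is oversubscribed. This brute-force enumeration is only a stand-in for a real CSP solver, and the hosts, VMs, and sizes are invented:

```python
from itertools import product

# Toy configuration problem: place each VM on a host without exceeding
# the host's RAM. ConfSolve would express this declaratively and compile
# it to a CSP; here we simply enumerate all assignments.
hosts = {"h1": 8, "h2": 4}              # GB of RAM (hypothetical)
vms = {"web": 3, "db": 4, "cache": 2}   # GB required (hypothetical)

def valid(assignment):
    used = {h: 0 for h in hosts}
    for vm, h in assignment.items():
        used[h] += vms[vm]
    return all(used[h] <= hosts[h] for h in hosts)

solutions = []
for combo in product(hosts, repeat=len(vms)):
    assignment = dict(zip(vms, combo))
    if valid(assignment):
        solutions.append(assignment)

print(len(solutions))
print(solutions[0])
```

Enumeration explodes combinatorially, which is exactly why the paper translates to a CSP and hands the search to an off-the-shelf solver, and why optimization (e.g., fewest hosts used) becomes practical there.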


Preventing the Revealing of Online Passwords to Inappropriate Websites with LoginInspector

Chuan Yue, University of Colorado at Colorado Springs
Awarded Best Paper!   

Modern Web browsers do not provide sufficient protection to prevent users from submitting their online passwords to inappropriate websites. As a result, users may accidentally reveal their passwords for high-security websites to inappropriate low-security websites or even phishing websites. In this paper, we address this limitation of modern browsers by proposing LoginInspector, a profiling-based warning mechanism. The key idea of LoginInspector is to continuously monitor a user’s login actions and securely store hashed domain-specific successful login information to an in-browser database. Later on, whenever the user attempts to log into a website that does not have the corresponding successful login record, LoginInspector will warn and enable the user to make an informed decision on whether to really send this login information to the website. LoginInspector can also report users’ insecure password practices to system administrators so that targeted training and technical assistance can be provided to vulnerable users. We implemented LoginInspector as a Firefox browser extension and evaluated it on 30 popular legitimate websites, 30 sample phishing websites, and one new phishing scam discovered by M86 Security Labs. Our evaluation and analysis indicate that LoginInspector is a secure and useful mechanism that can be easily integrated into modern Web browsers to complement their existing protection mechanisms. Security system administrators in our university commented that such a tool could be very helpful for them to strengthen campus IT security.
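The core mechanism, storing only salted hashes of successful (domain, username) logins and warning on submissions to unseen sites, can be sketched as follows. This is our simplification, not the extension's actual code or database schema:

```python
import hashlib
import hmac
import os

class LoginInspector:
    """Sketch of the paper's key idea: keep a salted hash per
    (domain, username) pair that has previously logged in successfully,
    and flag any login attempt with no matching record."""

    def __init__(self):
        self.salt = os.urandom(16)  # per-profile secret
        self.seen = set()

    def _key(self, domain, username):
        msg = f"{domain}\x00{username}".encode()
        return hmac.new(self.salt, msg, hashlib.sha256).hexdigest()

    def record_success(self, domain, username):
        self.seen.add(self._key(domain, username))

    def check(self, domain, username):
        # True -> known site, submit silently; False -> warn the user first.
        return self._key(domain, username) in self.seen

insp = LoginInspector()
insp.record_success("bank.example.com", "alice")
print(insp.check("bank.example.com", "alice"))        # known site
print(insp.check("bank-login.example.net", "alice"))  # would trigger a warning
```

Hashing with a per-profile salt means the stored records reveal neither the sites visited nor the usernames if the database leaks, while still supporting exact-match lookups.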


Invited Talks 1

GRANDE BALLROOM A

Session Chair:
Cory Lueninghoener, Los Alamos National Laboratory

Database Server Safety Nets: Options for Predictive Server Analytics

Joe Conway, credativ USA; Jeff Hamann, Forest Informatics, Inc.

Joe Conway has been involved with PostgreSQL as a contributor since 2001. He is also the author and maintainer of a PostgreSQL procedural language handler for the R language, PL/R. Joe is President/CEO of credativ USA, which specializes in open source software with its "Open Source Support Center" and a comprehensive range of services, including consulting, architectural and technical advice, software development, training, and personalized support.

Jeff Hamann has developed open source analysis and optimization tools for forestry people for over 20 years. He co-authored Forest Analytics with R for the Springer Use-R series, and is a Wiley Science Advisor. He has a Bachelor of Science in Forestry from Humboldt State University, and an MS and Ph.D. in Forest Biometrics and Forest Engineering from Oregon State University. As president of Forest Informatics, Jeff is obsessed with analyzing data from, developing tools for, and presenting collaborative, geek-friendly stories and solutions for forests and people.

 

Server monitoring is usually reactive in nature. Some predefined threshold is exceeded, an alert is sent, and by the time you receive the alert, something bad has already happened. Wouldn’t it be nice to be able to foresee trouble before it rears its ugly head? We present our initial investigation into using analytical tools available within the R statistical environment to easily monitor server activity, predict potential performance problems, and possibly prevent faults on PostgreSQL database servers.
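The talk's examples use R; the pure-Python sketch below (ours, with invented sample data) shows the flavor of predictive monitoring: fit a linear trend to recent disk-usage samples and estimate how long until a threshold is crossed, rather than alerting after the fact:

```python
# Least-squares trend fit over hourly samples, then extrapolate to the
# point where the fitted line reaches the threshold.
def fit_line(samples):
    n = len(samples)
    xs = range(n)
    mx = sum(xs) / n
    my = sum(samples) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, samples)) / \
            sum((x - mx) ** 2 for x in xs)
    return slope, my - slope * mx          # slope, intercept

def hours_until(samples, threshold=100.0):
    slope, intercept = fit_line(samples)
    if slope <= 0:
        return None                        # not trending toward the threshold
    return (threshold - intercept) / slope - (len(samples) - 1)

usage = [70.0, 72.0, 74.0, 76.0, 78.0]     # % disk used, hourly samples
print(hours_until(usage))                  # 11.0 hours of headroom at this trend
```

R's richer toolbox (ARIMA, changepoint detection, and so on) handles seasonality and noise that a straight line cannot, which is the point of the talk.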


Invited Talks 2

GRANDE BALLROOM B

Session Chair:
Narayan Desai, Argonne National Laboratory

Ceph: Managing a Distributed Storage System at Scale

Sage Weil, Inktank

Sage Weil designed Ceph as part of his Ph.D. research in Storage Systems at the University of California, Santa Cruz. Since graduating, he has continued to refine the system with the goal of providing a stable next generation distributed file system for Linux. Prior to his graduate work, he co-founded New Dream Network, the company behind DreamHost.com, a Los Angeles-based Web hosting company.

 

As the size and performance requirements of storage systems have increased, file system designers have looked to new architectures to facilitate system scalability. Ceph is a fully open source distributed object store, network block device, and file system designed for reliability, performance, and scalability from terabytes to exabytes. Fault tolerance is a key challenge for both system design and operations. Ceph is designed to be both highly available and elastic. In large clusters, disk, host, and even network failures are the norm rather than the exception, hardware is heterogeneous and incrementally deployed or deprovisioned, and availability must be continuous. This talk will describe the Ceph architecture and the impact it has on system operations, including failure management, monitoring, and provisioning.


Invited Talks 3

GRANDE BALLROOM C

Session Chair:
Doug Hughes, D. E. Shaw Research, LLC

System Log Analysis Using BigQuery

Gustavo Franco, Google Inc.

Gustavo Franco is the Lead Site Reliability Engineer for the following services, which are part of the Google Cloud Platform: Google Compute Engine, Cloud Storage, and BigQuery. He has been a Debian Developer for more than 10 years. His career spans over 12 years of DevOps-related work including the FIFA World Cup online broadcast, migrating Google to Goobuntu, and more.

This talk presents system log analysis using an OLAP-based, in-the-cloud system, BigQuery, which is heavily optimized for reads. BigQuery is a Dremel-based SaaS that, when combined with a system logger such as syslog-ng, a tool to convert log entries to CSV, and an uploader, which will be discussed in detail in this talk, allows any DevOp to scale her log analysis needs in an elastic manner.
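The log-to-CSV conversion step might look roughly like this. This is a sketch of ours, not the tool presented in the talk; it handles only the classic BSD-style syslog line format:

```python
import csv
import io
import re

# Split classic syslog lines ("Mon DD HH:MM:SS host prog[pid]: msg")
# into columns suitable for a BigQuery load job.
SYSLOG = re.compile(
    r"(?P<ts>\w{3}\s+\d+ \d\d:\d\d:\d\d) (?P<host>\S+) "
    r"(?P<prog>[^:\[]+)(?:\[(?P<pid>\d+)\])?: (?P<msg>.*)")

def to_csv(lines):
    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow(["timestamp", "host", "program", "pid", "message"])
    for line in lines:
        m = SYSLOG.match(line)
        if m:  # silently skip non-matching lines in this sketch
            writer.writerow([m["ts"], m["host"], m["prog"],
                             m["pid"] or "", m["msg"]])
    return out.getvalue()

sample = ["Dec 12 09:00:01 web1 sshd[2203]: Accepted publickey for admin"]
print(to_csv(sample))
```

CSV quoting matters here: log messages routinely contain commas, so letting the csv module handle escaping (rather than joining strings by hand) keeps the uploaded rows parseable.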


The Guru Is In

MARINA 3

Documentation

Janice Gelb, Oracle Corporation

Janice Gelb is a Senior Developmental Editor at Oracle Corporation, where she is responsible for editing print and online documentation for numerous software products. She is a certified trainer for their internal SGML/XML authoring tool. Janice also worked as a technical editor at Ashton-Tate and at Scitex Corporation. She has presented papers at several technical communication conferences and was the project lead for the popular editorial style guide Read Me First! A Style Guide for the Computer Industry.

3:30 p.m.–4:00 p.m. Wednesday

Break

 Grand Ballroom Foyer
4:00 p.m.–5:30 p.m. Wednesday

LISA Game Show

GRANDE BALLROOM AB

Don’t miss the LISA Game Show. Join us as we once again pit attendees against each other in a test of technical knowledge and cultural trivia. Host Rob Kolstad and sidekick Dan Klein will provide the questions and color commentary for this always memorable session.

Thursday, December 13, 2012

9:00 a.m.–10:30 a.m. Thursday

Papers and Reports: Tools You Can Use

SPINNAKER

Session Chair: 
Patrick Cable, MIT Lincoln Laboratory

XUTools: Unix Commands for Processing Next-Generation Structured Text

Gabriel A. Weaver and Sean W. Smith, Dartmouth College

Traditional Unix tools operate on sequences of characters, bytes, fields, lines, and files. However, modern practitioners often want to manipulate files in terms of a variety of language-specific constructs—C functions, Cisco IOS interface blocks, and XML elements, to name a few. These language-specific structures quite often lie beyond the regular languages upon which Unix text-processing tools can practically compute. In this paper, we propose eXtended Unix text-processing tools (xutools) and present implementations that enable practitioners to extract (xugrep), count (xuwc), and compare (xudiff) texts in terms of language-specific structures. We motivate, design, and evaluate our tools around real-world use cases from network and system administrators, security consultants, and software engineers from a variety of domains including the power grid, healthcare, and education.
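The construct-aware extraction that xugrep performs can be approximated for one construct type, Cisco IOS interface blocks, with a short script. This is our illustration, not the authors' implementation, and the sample config is invented:

```python
# An IOS interface block runs from an "interface ..." line to the next
# lone "!". Extracting whole blocks (instead of grepping single lines)
# is the xugrep idea; counting them is the xuwc idea.
def interface_blocks(config):
    blocks, current = [], None
    for line in config.splitlines():
        if line.startswith("interface "):
            current = [line]
        elif current is not None:
            if line.strip() == "!":
                blocks.append("\n".join(current))
                current = None
            else:
                current.append(line)
    return blocks

config = """\
hostname core1
!
interface GigabitEthernet0/1
 ip address 192.0.2.1 255.255.255.0
!
interface GigabitEthernet0/2
 shutdown
!
"""
blocks = interface_blocks(config)
print(len(blocks))                        # xuwc-style count
print([b.splitlines()[0] for b in blocks])
```

A line-oriented `grep interface` would return two isolated lines and lose each block's body; operating on the language construct keeps the configuration context attached.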


Managing User Requests With the Grand Unified Task System (GUTS)

Andrew Stromme, Danica J. Sutherland, Alexander Burka, Benjamin Lipton, Nicholas Felt, Rebecca Roelofs, Daniel-Elia Feist-Alexandrov, Steve Dini, and Allen Welkie, Swarthmore College

As system administrators who are also full-time students, we aim to minimize the time we spend approving and carrying out standard tasks that comprise much of our day-to-day work. The less time required for these repetitive tasks, the more time we have available to provide new and exciting services to our community.

To facilitate the automation of this process, we have created the Grand Unified Task System (GUTS), which consists of a small core (a web interface and task executor) that unites task request processing for a range of modular services. This design allows for enhanced security and makes the system easily understandable and extensible, especially to new administrators. The Python backend provides deep integration with standard UNIX tools; the Django-based frontend provides a web interface friendly to both users and administrators. These design decisions have proven successful: deploying GUTS in production allowed us to dramatically reduce our response time for approving tasks, reach a much larger portion of our potential user base, and more easily support a diverse array of new services.


Bayllocator: A Proactive System to Predict Server Utilization and Dynamically Allocate Memory Resources Using Bayesian Networks and Ballooning

Evangelos Tasoulas, University of Oslo; Hårek Haugerud, Oslo and Akershus University College; Kyrre Begnum, Norske Systemarkitekter AS

With the advent of virtualization and cloud computing, virtualized systems can be found everywhere from small companies to service providers and big data centers. All of them use this technology because of the many benefits it offers, such as greener ICT, cost reduction, improved profitability, uptime, flexibility in management, maintenance, disaster recovery, provisioning, and more. The main driver of these benefits is server consolidation, which can be improved even further through dynamic resource allocation techniques. Of the resources to be allocated, memory is one of the most difficult and requires proper planning, good predictions, and proactivity. Many attempts have been made to approach this problem, but most use traditional statistical methods. In this paper, the application of discrete Bayesian networks is evaluated to offer probabilistic predictions of system utilization, with a focus on memory. The tool Bayllocator provides proactive dynamic memory allocation based on the Bayesian predictions for a set of virtual machines running on a single hypervisor. The results show that, with proper tuning, Bayesian networks are capable of providing good predictions of system load and of increasing the performance and consolidation of a single hypervisor. The tool's modularity allows great freedom for experimentation, and it can even produce results that counteract the reactivity of the system. A survey of the current state of the art in dynamic memory allocation for virtual machines is included to provide an overview.
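As a much-simplified illustration of discretized probabilistic prediction: Bayllocator learns a full Bayesian network, whereas this sketch of ours learns a single conditional table, P(next state | current state), from invented discretized memory readings, and picks the most likely next state to size the balloon:

```python
from collections import Counter, defaultdict

# Learn a conditional frequency table from a history of discretized
# memory-utilization states, then predict the most likely next state.
def train(states):
    table = defaultdict(Counter)
    for cur, nxt in zip(states, states[1:]):
        table[cur][nxt] += 1
    return table

def predict(table, current):
    if current not in table:
        return current  # no evidence: leave the allocation as-is
    return table[current].most_common(1)[0][0]

# Memory use discretized into low / medium / high buckets (made-up data).
history = ["low", "low", "medium", "high", "medium",
           "high", "high", "medium", "high"]
table = train(history)
print(predict(table, "medium"))  # "high": the most common successor of "medium"
```

A proactive allocator would then grow the balloon target before the predicted "high" state arrives, rather than reacting after memory pressure is already visible.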


Plenary Session

GRANDE BALLROOM

Session Chair: 
Carolyn Rowland

Education vs. Training

Selena Deckelmann, PostgreSQL

Selena Deckelmann is a major contributor to PostgreSQL. She specializes in open source software development and product management. She's an internationally recognized speaker on open source, PostgreSQL, and developer communities. She founded Postgres Open, a conference dedicated to the business of PostgreSQL and disruption of the database industry. She founded and co-chaired Open Source Bridge, a developer conference for open source citizens. She founded the PostgreSQL Conference, a successful series of East Coast/West Coast conferences in the U.S. for PostgreSQL. She's helped run other conferences like WhereCampPDX, BarCampPDX, and PG Days. She is currently on the organizing committees for PgCon and OSCON. She's a contributing writer for the Google Summer of Code Mentor Manual and Student Guide.

It’s no secret that the field of system administration has struggled for respect among computer scientists. When I broke the news that I was taking a job as a system administrator directly out of college, a fellow graduate asked sarcastically, “Why would you want to be a janitor?”

System administration principles are typically not taught at universities, where “education” (concepts and frameworks) is valued over “training” (explicit instruction of tasks). It’s true that system administration courses and a couple degree programs exist—but they are the exception rather than the rule. The state of related curriculum in K–12 education is even more dire.

The belief that education and training are separate is harmful. It’s helped make computer science educators hostile to efforts to adopt system administration curriculum and a CS degree largely irrelevant when hiring Web developers.

Changing education will take a long time and we need to do it. But we can start making the change we want today. Teaching basic system administration is something that we can all learn to do, and should all start doing, right now.

Available Media

The Guru Is In

MARINA 3

Time Management for System Administrators

Tom Limoncelli,
Google NYC

Tom Limoncelli is an internationally recognized author, speaker, and system administrator. His best known books include Time Management for System Administrators (O'Reilly) and The Practice of System and Network Administration (Addison-Wesley). In 2005 he received the SAGE Outstanding Achievement Award. He works at Google in NYC on the Ganeti project. http://EverythingSysadmin.com is his blog.

10:30 a.m.–11:00 a.m. Thursday

Break

Grande Ballroom Foyer

11:00 a.m.–12:30 p.m. Thursday

Papers and Reports: Community and Teaching

SPINNAKER

Session Chair:
Andrew Hume, AT&T Labs—Research

A Sustainable Model for ICT Capacity Building in Developing Countries

Rudy Gevaert, Ghent University, Belgium

System administrators are often asked to apply their professional expertise in unusual situations, or under tight resource constraints. What happens, though, when the “situation” is a foreign country with only basic technical infrastructure, and the task is to build systems able to survive and grow in these over-constrained environments?

In this paper we report on our experiences in two very different countries – Cuba and Ethiopia – where we ran a number of ICT projects. In those projects we assisted local universities in upgrading their ICT infrastructure and services. This included skills and process building for local system administrators.

Based on our experiences we formulate a model for sustainable ICT capacity building. We hope this model will be useful for other organizations doing similar projects.

Available Media

Teaching System Administration

Steve VanDevender, University of Oregon

For the past twelve years I have taught a one-term college-level class introducing students to the discipline of system administration. I discuss how the class was created, the considerations that went into designing the class structure and assignments, student outcomes, how the class has evolved over time, and other observations on teaching. Links to detailed course materials and other resources are provided.

Available Media

Training and Professional Development in an IT Community

George William Herbert, Taos Mountain, Inc.

This paper describes training and professional development activities at a mid-sized IT consulting firm over roughly the last 15 years. These activities have successfully engaged many of the consultants and provided significant career benefits for them and advantages for the company. We present the types of activities, their effectiveness and success, and the evolution of professional development efforts over time. Many of these activities proved effective and valuable and are still in use, including annual skill reviews and development recommendations, regularly organized training and discussion events, reimbursement for training and materials, and escalation support. Challenges and failures with other training activities are described. Recommendations are made for other organizations’ own professional development programs.

Available Media

Invited Talks 1

GRANDE BALLROOM A

Session Chair:
Doug Hughes, D. E. Shaw Research, LLC

Performance Analysis Methodology

Brendan Gregg, Joyent

Brendan Gregg is the lead performance engineer at Joyent, where he analyzes the performance of small to large cloud computing environments, at any level of the software stack, down to the metal. He is a co-author of DTrace and Solaris Performance and Tools (Prentice Hall), and developed the DTraceToolkit and the ZFS L2ARC. Many of Brendan's performance tools are shipped by default in Mac OS X and Oracle Solaris 11.

Performance analysis methodologies provide guidance, save time, and can find issues that are otherwise overlooked. Example issues include hardware bus saturation, lock contention, recoverable device errors, kernel scheduling issues, and unnecessary workloads. The talk will focus on the USE Method: a simple strategy for all staff for performing a complete check of system performance health, identifying common bottlenecks and errors. Other analysis methods discussed include workload characterization, drill-down analysis, and latency analysis, with example applications from enterprise and cloud computing. Don’t just reach for tools—use a method!
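
The USE Method itself is simple enough to capture in a short checklist loop. In the sketch below, the resource names, metric values, and thresholds are illustrative stand-ins, not material from the talk; in practice the numbers would come from tools such as iostat, vmstat, or sar:

```python
# Minimal sketch of a USE-Method pass: for every resource, check
# Utilization, Saturation, and Errors against a threshold, and flag
# anything that fails for deeper investigation.
METRICS = {
    "cpu":    {"utilization": 0.92, "saturation": 3.0, "errors": 0},
    "memory": {"utilization": 0.40, "saturation": 0.0, "errors": 0},
    "disk":   {"utilization": 0.65, "saturation": 0.1, "errors": 2},
}
THRESHOLDS = {"utilization": 0.80, "saturation": 1.0, "errors": 0}

def use_check(metrics, thresholds):
    """Return (resource, metric, value) for every check that fails."""
    findings = []
    for resource, m in metrics.items():
        for name, limit in thresholds.items():
            if m[name] > limit:
                findings.append((resource, name, m[name]))
    return findings

for finding in use_check(METRICS, THRESHOLDS):
    print("investigate: %s %s=%s" % finding)
```

The value of the method is completeness: every resource gets all three questions asked of it, so saturated buses or quietly erroring disks are not skipped just because no tool happened to surface them.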

Available Media

Invited Talks 2

GRANDE BALLROOM B

Session Chair:
Narayan Desai, Argonne National Laboratory

Dude, Where’s My Data? Replicating and Migrating Data Across Data Centers and Clouds

Jeff Darcy, Red Hat

Jeff Darcy has worked on network and distributed storage problems for twenty years, including playing an instrumental role in developing MPFS (a precursor of modern pNFS) while at EMC and, more recently, leading the HekaFS project. He is currently a member of the GlusterFS architecture team at Red Hat, coordinating the integration of HekaFS's features and leading the asynchronous-replication development effort.

Large enterprises have long needed to deal with the problem of copying or migrating data between sites. This problem becomes more acute when those enterprises try to move work to/from public clouds, to avoid having computation arrive in the cloud with no data on which to work. This talk will cover methods for managing data location, including various tradeoffs of efficiency, consistency, and user-friendliness.

Available Media

Invited Talks 3

GRANDE BALLROOM C

Session Chair:
Kent Skaar, 
VMware, Inc.

Rolling the D2O: Choosing an Open Source HTTP Proxy Server

Leif Hedstrom, Cisco Systems

Leif Hedstrom is Chief Architect for the Cisco Cloud Services group, working on various cloud and edge related projects. Before joining Cisco, he worked on several CDN and edge solutions at Yahoo! as well as at Akamai. Prior to Yahoo, he worked at Netscape Communications and later Mozilla as a committer. Leif is actively involved with development and evangelism of the Apache Traffic Server Open Source project, where he also serves as chairperson. Leif is an avid dirt biker, alpine skier, dog person, scuba diver, and family man. And of course, he's a huge computer nerd.

With Web performance and scalability becoming more and more important, choosing advanced HTTP intermediaries is a vital skill. This presentation will give the audience a thorough walkthrough of the most popular and advanced solutions available today. The audience will gain a solid background to help them be able to make the right choices when it comes to HTTP intermediaries and proxy caches.

Available Media

Distributed Messaging: The Administrative Aspect

Martin Sustrik, 250bpm s.r.o.

Martin Sústrik is an expert in the field of messaging middleware. He participated in the creation and reference implementation of the AMQP standard. He has been involved in different messaging projects in the financial industry. He is a founder of the 0MQ project. Currently he's working on integration of messaging technology with operating systems and the Internet stack.

The talk introduces distributed messaging, explaining what it is and how it allows us to build Internet-scale distributed systems.

While focusing on the challenges of managing such distributed systems, I will explain why most of today’s messaging fabric is not really friendly to monitoring and management, and why most of the tasks encountered are painful to accomplish at best and impossible at worst. Finally, the talk shows how distributed messaging, by virtue of being simply an additional layer in the network stack, can easily solve problems such as monitoring load and connectivity, provisioning resources based on business criteria, or even complex DevOps tasks such as distributed debugging.

Available Media

The Guru Is In

MARINA 3

Lightning Talks

Organizer: Lee Damon, University of Washington

Lee Damon (S3) has a B.S. in Speech Communication from Oregon State University. He has been a UNIX system administrator since 1985 and has been active in SAGE (US) & LOPSA since their inceptions. He assisted in developing a mixed AIX/SunOS environment at IBM Watson Research and has developed mixed environments for Gulfstream Aerospace and QUALCOMM. He is currently leading the development effort for the Nikola project at the University of Washington Electrical Engineering department. Among other professional activities, he is a charter member of LOPSA and SAGE and past chair of the SAGE Ethics and Policies working groups. He chaired LISA '04, co-chaired CasITconf '11, and is co-chairing CasITconf '13.

Lightning talks are fast-paced and high-energy. These are back-to-back 5-minute presentations on just about anything. Talk about a recent success, energize people about a pressing issue, ask a question, start a conversation!

Lightning talks are an opportunity to get up and talk about what’s on your mind. You can give several lightning talks if you have more than one topic. 

To submit a lightning talk, complete the form here by December 12, 2012. 

12:30 p.m.–2:00 p.m. Thursday

Lunch, on your own

2:00 p.m.–3:30 p.m. Thursday

Papers and Reports: If You Can’t Monitor It, You Can’t Manage It

SPINNAKER

Session Chair:
Marc Chiarini, Harvard University

Extensible Monitoring with Nagios and Messaging Middleware

Jonathan Reams, CUIT Systems Engineering, Columbia University

Monitoring is a core function of systems administration, and is primarily a problem of communication – a good monitoring tool communicates with users about problems, and communicates with hosts and software to take remedial action. The better it communicates, the greater the confidence administrators will have in its view of their environment. Nagios has been a leading open-source monitoring solution for over a decade, but in that time, the way it gets data in and out of its scheduling engine hasn’t changed. As applications are written to extend Nagios, each one has to figure out its own way of getting data out of the Nagios core process. This paper explores the use of messaging middleware, in an open-source project called NagMQ, as a way to provide a common interface for Nagios that can be easily utilized by a variety of applications.
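
To give a feel for what a common messaging interface buys consumers, the sketch below filters monitoring events delivered as JSON messages. The event schema and the in-memory "queue" are made up for illustration; NagMQ's actual transport (ZeroMQ sockets) and payload format may differ:

```python
import json

# Hypothetical sketch of consuming monitoring events from a messaging
# layer in front of Nagios. Here the "queue" is just a list of JSON
# strings, and the event fields are illustrative, not NagMQ's schema.
incoming = [
    '{"type": "service_check", "host": "web01", "service": "http", "state": "OK"}',
    '{"type": "service_check", "host": "db01", "service": "mysql", "state": "CRITICAL"}',
    '{"type": "host_check", "host": "web02", "state": "UP"}',
]

def problem_events(messages):
    """Decode each message and keep only non-OK service checks."""
    problems = []
    for raw in messages:
        event = json.loads(raw)
        if event.get("type") == "service_check" and event["state"] != "OK":
            problems.append((event["host"], event["service"], event["state"]))
    return problems

print(problem_events(incoming))
```

The point is that an external application only needs to speak the message format; it never has to reach into the Nagios core process or parse its status files.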

Available Media

Efficient Multidimensional Aggregation for Large Scale Monitoring

Lautaro Dolberg, Jérôme François, and Thomas Engel, University of Luxembourg, SnT—Interdisciplinary Centre for Security, Reliability and Trust

Today, network monitoring is necessary at many levels: Internet service providers, large companies, and smaller entities alike. Since network monitoring supports many applications in various fields (security, service provisioning, etc.), it may draw on multiple sources of information, such as network traffic, user activity, and network events and logs. All of these produce voluminous amounts of data that need to be stored, visualized, and analyzed for administration purposes. Various techniques have been proposed to cope with scalability, such as sampling or aggregation.

In this paper, we introduce an aggregation technique that can handle multiple kinds of dimensions, i.e., features, such as traffic capture or host locations, without giving a priori preference to any particular feature for ordering the aggregation process among dimensions. Furthermore, feature-space granularity is determined on the fly depending on the events to be monitored. We propose optimizations to keep the computational overhead low.

In particular, the technique is applied to network related data involving multiple dimensions: source and destination IP addresses, services, geographical location of hosts, DNS names, etc. Thus, our approach is validated through multiple scenarios using different dimensions, measuring the impact of the aggregation process and the optimizations as well as by highlighting the ability to figure out important facts or changes in the network.
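
One flavor of such aggregation, reduced to a single dimension, can be sketched in a few lines. The threshold rule below is a simplification for illustration, not the paper's actual multi-dimensional algorithm: per-host traffic counters are kept only when individually significant, and the long tail is rolled up into /24 prefixes:

```python
from collections import defaultdict

# Simplified sketch of threshold-driven aggregation along the source-IP
# dimension: hosts that are individually significant are kept, and the
# rest are aggregated into their /24 prefix. (The paper generalizes
# this to many dimensions at once with granularity chosen on the fly.)
def aggregate_by_prefix(host_counts, threshold):
    kept, rollup = {}, defaultdict(int)
    for ip, count in host_counts.items():
        if count >= threshold:
            kept[ip] = count                      # significant on its own
        else:
            prefix = ip.rsplit(".", 1)[0] + ".0/24"
            rollup[prefix] += count               # aggregate the long tail
    kept.update(rollup)
    return kept

counts = {"10.0.0.1": 950, "10.0.0.2": 30, "10.0.0.3": 45, "10.1.2.9": 12}
print(aggregate_by_prefix(counts, threshold=100))
```

The output keeps the heavy hitter 10.0.0.1 at full resolution while the minor hosts collapse into prefix-level counters, which is the basic trade of resolution for scalability.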

Available Media

On the Accurate Identification of Network Service Dependencies in Distributed Systems

Barry Peddycord III and Peng Ning, North Carolina State University; Sushil Jajodia, George Mason University

The automated identification of network service dependencies remains a challenging problem in the administration of large distributed systems. Advances in developing solutions for this problem have immediate and tangible benefits to operators in the field. When the dependencies of the services in a network are better-understood, planning for and responding to system failures becomes more efficient, minimizing downtime and managing resources more effectively.

This paper introduces three novel techniques to assist in the automatic identification of network service dependencies through passively monitoring and analyzing network traffic, including a logarithm-based ranking scheme aimed at more accurate detection of network service dependencies with lower false positives, an inference technique for identifying the dependencies involving infrequently used network services, and an approach for automated discovery of clusters of network services configured for load balancing or backup purposes. This paper also presents the experimental evaluation of these techniques using real-world traffic collected from a production network. The experimental results demonstrate that these techniques advance the state of the art in automated detection and inference of network service dependencies.
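
The flavor of a logarithm-based ranking can be illustrated with a toy score; the formula and weighting below are assumptions for illustration, not the paper's actual scheme. The idea is to count how often traffic to a candidate service closely precedes a monitored service's activity, and to dampen raw counts logarithmically so that a handful of coincidental co-occurrences cannot dominate:

```python
import math
from collections import Counter

# Toy illustration of ranking candidate service dependencies from
# observed co-occurrences in passively captured traffic. The score
# (log-damped co-occurrence ratio) is a simplification for exposition.
def rank_dependencies(cooccurrences, totals):
    """cooccurrences[b] = times service b was contacted just before the
    monitored service responded; totals = total requests observed."""
    scores = {}
    for service, count in cooccurrences.items():
        # Log damping: a few coincidences score far less than a
        # consistent pattern, reducing false positives.
        scores[service] = math.log1p(count) / math.log1p(totals)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

obs = Counter({"dns": 480, "ldap": 450, "ntp": 3})
for service, score in rank_dependencies(obs, totals=500):
    print(f"{service}: {score:.2f}")
```

Here dns and ldap, which co-occur with nearly every request, rank near 1.0, while ntp's three coincidental hits score low and would be discarded as noise.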

Available Media

Invited Talks 1

GRANDE BALLROOM A

Session Chair:
Nicole Forsgren Velasquez, Utah State University

Advancing Women in Computing (Panel)

Moderator:
Rikki Endsley,
USENIX Association

Panelists: Jennifer Davis, Yahoo, Inc.; Elizabeth Krumbach, Ubuntu; Adele Shakal, Metacloud, Inc.; Nicole Forsgren Velasquez, Utah State University; Josephine Zhao, Prosperb Media and AsianAmericanVoters.org

Moderator Rikki Endsley started her career in IT as the Managing Editor of Sys Admin magazine. She moved on to become Managing Editor and then Associate Publisher of Linux Pro Magazine and ADMIN magazine. In addition to her role as the Community Manager and ;login: Managing Editor for USENIX, Rikki is the Editor of Ubuntu User magazine and contributes to NetworkWorld.com and Linux.com. In 2007, Rikki received a Master of Science degree in Journalism from the University of Kansas. For her thesis, she researched how to highlight women's contributions to open source technologies.

Available Media

Invited Talks 2

GRANDE BALLROOM B

Session Chair:
Tim Nelson, Worcester Polytechnic Institute

Carat: Collaborative Energy Debugging

Adam J. Oliner, AMP Lab, University of California, Berkeley

Adam J. Oliner is a postdoc in the EECS Department at UC Berkeley, working in the AMP Lab. Before starting at Berkeley, he earned a Ph.D. in computer science from Stanford University, where he was a DOE High Performance Computer Science Fellow and Honorary Stanford Graduate Fellow. Adam received a MEng in EECS from MIT, where he also earned undergraduate degrees in computer science and mathematics. His research focuses on understanding complex systems, most recently applied to diagnosing energy bugs in mobile devices.

We aim to detect and diagnose code misbehavior that wastes energy, which we call energy bugs. I will describe a method and implementation, called Carat, for performing such diagnosis on mobile devices. Carat takes a collaborative, black-box approach. A non-invasive client app sends intermittent, coarse-grained measurements to a server, which identifies correlations between higher expected energy use and client properties like the running apps, the device model, and the operating system. Carat has been deployed on more than a quarter of a million devices and has detected thousands of app instances exhibiting energy bugs in the wild.

Available Media

Invited Talks 3

GRANDE BALLROOM C

Session Chair: 
Kent Skaar, 
VMware, Inc.

OmniOS: Motivation and Design

Theo Schlossnagle, OmniTI

A widely respected industry thought leader, Theo Schlossnagle is the author of Scalable Internet Architectures (SAMS) and a frequent speaker at worldwide IT conferences. He was also the principal architect of the Momentum MTA, which is now the flagship product of OmniTI's sister company, Message Systems. Born from Theo's vision and technical wisdom, this innovation is transforming the email software spectrum. Theo is a computer scientist in every respect. After earning undergraduate and graduate degrees from Johns Hopkins University in computer science with a focus on graphics and randomized algorithms in distributed systems, he went on to research resource allocation techniques in distributed systems during four years of post-graduate work. Theo is a member of the IEEE and a senior member of the ACM. He serves on the editorial board of the ACM's Queue magazine.

In today's marketplace, operating systems are considered a commodity. What in the world would possess someone to roll a new distribution? In this talk, I'll walk through the constraints, the thought process, the plans, and the implementation of the open-source Illumos-based OmniOS operating system distribution. Beyond the motivation, I will talk about the hard numbers of what it cost and what we saved and share some subjective ideas about whether it was all worth it. OmniTI developed OmniOS after collecting 15 years of experience managing almost every production operating system available during its life for clients in vastly ranging industries.

Available Media

The Guru Is In

MARINA 3

IPv6

Owen DeLong, Hurricane Electric

Owen DeLong is an IPv6 Evangelist at Hurricane Electric and a member of the ARIN Advisory Council. Owen brings more than 25 years of industry experience. He is an active member of the systems administration, operations, and IP policy communities. In the past, Owen has worked at Tellme Networks (Senior Network Engineer); Exodus Communications (Senior Backbone Engineer), where he was part of the team that took Exodus from a pre-IPO startup with two datacenters to a major global provider of hosting services; Netcom Online (Network Engineer), where he worked on a team that moved the Internet from an expensive R&E tool to a widely available public access system accessible to anyone with a computer; Sun Microsystems (Senior Systems Administrator); and more. He can be reached as owend at he dot net.

3:30 p.m.–4:00 p.m. Thursday

Break

Grande Ballroom Foyer

4:00 p.m.–5:30 p.m. Thursday

Plenary Session

GRANDE BALLROOM

Session Chair:
Carolyn Rowland

NSA on the Cheap

Matt Blaze, University of Pennsylvania

Matt Blaze is a researcher in the areas of secure systems, cryptography, and trust management. He is currently an Associate Professor of Computer and Information Science at the University of Pennsylvania; he received his Ph.D. in Computer Science from Princeton University.

Last year, we discovered a number of protocol weaknesses in P25, a “secure” two-way radio system used by, among others, the federal government to manage surveillance and other sensitive law enforcement and intelligence operations. Although some of the problems are quite serious (efficient jamming, cryptographic failures, vulnerability to active tracking of idle radios, etc.), many of these vulnerabilities require an active attacker who is able and willing to risk transmitting. So we also examined passive attacks, where all the attacker needs to do is listen, exploiting usability and key management errors when they occur. And then we built a multi-city networked P25 interception infrastructure to see how badly the P25 security protocols do in practice (spoiler: badly).

This talk will describe the P25 protocols and how they failed, but will focus on the architecture and implementation of our interception network. We used off-the-shelf receivers (with some custom software) deployed around various US cities, capturing virtually every sensitive, but unintentionally clear, transmission (and associated metadata) sent by federal agents in those cities. And by systematically analyzing the captured data, we often found that the whole was much more revealing than the sum of the parts.

Available Media

Friday, December 14, 2012

9:00 a.m.–10:30 a.m. Friday

Papers and Reports: Content, Communication, and Collecting

HARBOR ISLAND 3

Session Chair: Brent Chapman, Great Circle Associates, Inc.

What Your CDN Won't Tell You: Optimizing a News Website for Speed and Stability

Julian Dunn, SecondMarket Holdings, Inc.; Blake Crosby, Canadian Broadcasting Corporation

In this paper, the authors discuss their experiences implementing, operating, and optimizing a content delivery network for Canada’s largest news website, that of the Canadian Broadcasting Corporation (CBC). The site receives over one million unique visitors per day. Although CBC uses the Akamai Aqua Platform (formerly EdgeSuite) as its content delivery network of choice, the lessons described here are generally applicable to any infrastructure fronted by a CDN.

Available Media

Building a 100K log/sec Logging Infrastructure

David Lang, Intuit

A look at the logging infrastructure built by one division of Intuit, which had to handle 100K lines of logs per second, with the logs delivered to several destinations (including proprietary appliances).

This paper will cover the options considered, the choices made, and the problems we ran into. The most unusual and interesting topic discussed is the method selected to distribute the logs to all the different destinations, which can deliver a log message to several different load-balanced farms of servers while sending only one copy over the wire.

Available Media

Building a Protocol Validator for Business to Business Communications

Rudi van Drunen, Competa IT B.V., The Netherlands; Rix Groenboom, Parasoft Netherlands

In this paper we describe the design and implementation of a system essential to enabling the deregulation of the energy market in the Netherlands. The system is used to test and validate secure communications using XML messages over the AS2 standard between the business partners in the market. The tool comprises an Enterprise Service Bus component, a service virtualization component, a database with business logic, and a user interface. Version 1.0 of the system was built in less than one month.

Available Media

Invited Talks 1

GRANDE BALLROOM A

Session Chair: Kent Skaar, VMware, Inc.

Surviving The Thundering Hordes: Keeping Engadget Alive During Apple Product Announcements

Valerie Detweiler and Chris Stolfi, AOL

Valerie Detweiler has worked in tech support and operations for over 17 years. Today Valerie is an SRE at AOL wrangling all things layer 3–7 and especially layer 8. Valerie was a Technical Support Engineer for Netscape and Inktomi before landing at AOL in 2002.

Chris Stolfi has been with AOL Technical Operations since 2000 and is currently a Principal Systems Administrator on AOL's Published Content Systems (www.engadget.com, www.moviefone.com, www.autoblog.com, etc.).

When Apple product announcements occur, consumers of the world pay rapt attention and one of the premier technology sites they rely upon for live event updates is Engadget.

The Engadget infrastructure must seamlessly sustain incredible traffic spikes during these Apple events. How do we get there?

  • Keep it simple: LAMP
  • Cache, cache, cache: CDN, load balancer, memcache
  • Manage the complexities: Sometimes you don’t want GSLB to automatically failover
  • Make no assumptions: Validate, re-validate, and then do it some more
  • Work well with others: Respect, trust, and communication are key

Available Media

Vitess: Scaling MySQL at YouTube Using Go

Sugu Sougoumarane works for the YouTube architecture team. He's worked on various scalability projects there. He currently works on the Vitess open source project. Prior to YouTube, Sugu worked for the architecture team at PayPal where he built many of PayPal's core features. He also has many years of experience in development environments, compilers, and computer graphics.

Mike Solomon works at YouTube focusing on distributed systems. He collaborated on the recently released Vitess project and is actively working on improvements to the efficiency of YouTube's MySQL infrastructure.

Vitess is an open source project that packages many of the ad-hoc processes and conventions that grew out of managing and scaling MySQL at YouTube.

It is now at the core of our MySQL serving infrastructure, and is primarily written in Go. In this session, we’ll cover our vision of where the project is headed as well as what we’ve achieved so far. We’ll go over some of the challenges and wins due to using Go as the language of choice. We’ll also share tips on how to write scalable servers using Go.

Available Media

Invited Talks 2

GRANDE BALLROOM B

Session Chair: Patrick Cable, MIT Lincoln Laboratory

Ganeti: Your Private Virtualization Cloud "the Way Google Does It"

Thomas A. Limoncelli, Google, Inc.

Thomas A. Limoncelli (M5, M10, T9) is an internationally recognized author, speaker, and system administrator. His best-known books include Time Management for System Administrators (O'Reilly) and The Practice of System and Network Administration (Addison-Wesley). He received the SAGE 2005 Outstanding Achievement Award. He works at Google in NYC.

Ganeti is a cluster virtual server management software tool built on top of existing virtualization technologies such as Xen or KVM and other Open Source software.  Ganeti takes care of disk creation, migration, OS installation, shutdown, startup, and can be used to preemptively move a virtual machine off a physical machine that is starting to get sick.  It doesn’t require a big expensive SAN, complicated networking, or a lot of money.  The project is used around the world by many organizations; it is sponsored by Google and hosted here.
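For a flavor of the workflow described above, here is a minimal sketch assuming a Ganeti 2.x-style command line; the instance name, disk size, and OS variant are made-up examples, and exact flags vary by Ganeti version. The helper only assembles the command string, so it can be inspected without a live cluster:

```shell
# Sketch of common Ganeti operations (hypothetical instance name and sizes).
# build_gnt_add assembles an instance-creation command line for review.
build_gnt_add() {
  instance=$1; disk_size=$2
  echo "gnt-instance add -t drbd -o debootstrap+default -s $disk_size $instance"
}

build_gnt_add web1.example.com 10g
# On a real cluster you would run the printed command, and later e.g.:
#   gnt-instance migrate web1.example.com   # move it off an ailing node
#   gnt-instance shutdown web1.example.com
```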

Available Media

Invited Talks 3

GRANDE BALLROOM C

Session Chair: Steve VanDevender, University of Oregon

DNSSEC: What Every Sysadmin Should be Doing to Keep Things Working

Roland van Rijswijk works as Technical Product Manager for several SURFnet services, including DNS and DNSSEC. He is responsible for innovation management in the area of Internet security. Roland obtained a Master of Science degree in Computer Science from the University of Twente (2001), after which he worked in software development for Philips, Advanced Encryption Technology (AET), and InTraffic. His expertise is in the application of high-end cryptography. Roland joined SURFnet in 2008.

Unless you've been sipping cold lemon daiquiris on a beach for the past five years, you will know that there's this thing called DNSSEC out there. But did you know that you may be using it without being aware of it? And that the firewall nobody dares touch on your network may be messing with DNSSEC and causing you problems? This talk will focus on what every sysadmin should know about DNSSEC and should be aware of when setting up a DNS server and a firewall based on real world problems and experiences.
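One quick self-check of the kind the talk describes: ask whether the resolver your machines use actually validates DNSSEC. A validating resolver sets the "ad" (authenticated data) flag on answers from signed zones. A minimal sketch, assuming BIND's dig is installed; the helper just inspects dig's header output:

```shell
# Pipe the output of `dig +dnssec <some-signed-zone>` into this helper;
# a validating resolver sets the "ad" flag in the answer's header line.
check_ad_flag() {
  if grep -q 'flags:.* ad[ ;]'; then
    echo "validated (ad flag set)"
  else
    echo "not validated (no ad flag)"
  fi
}

# Typical use (requires network access and a zone you know is signed):
#   dig +dnssec <some-signed-zone> | check_ad_flag
```

If the flag never appears even for zones you know are signed, either the resolver is not validating or, as the talk warns, a middlebox such as a firewall is stripping DNSSEC data.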

Available Media

DNSSEC Deployment in .gov: Progress and Lessons Learned

Practice and Experience Report
Scott Rose, National Institute of Standards and Technology (NIST)

Scott Rose works as a computer scientist at the National Institute of Standards and Technology (NIST) on Internet infrastructure protection research and development. He co-authored the core DNSSEC specification in the IETF as well as NIST Special Publication 800-81 on DNSSEC deployment. Scott was recently awarded the Department of Commerce Gold Medal for Leadership for his work on deploying and testing DNSSEC at the .gov top-level domain (TLD).

In 2008, the US Federal government mandated that all Federal-owned DNS zones must deploy DNSSEC. Initial deployments lagged and were often error prone.

This prompted the creation of a Tiger Team to assist agencies in deployment as well as a continuous monitoring program. These steps increased the number of signed zones in the .gov TLD and improved the response time in responding to errors and mistakes in deployment. This talk will cover the progress of DNSSEC in the Federal government in addition to lessons learned in setting up a system to monitor and maintain compliance across multiple administrative boundaries.

Available Media

The Guru Is In

NAUTILUS 3

CFEngine

Diego Zamboni, CFEngine AS

Diego Zamboni is a computer scientist, consultant, programmer, sysadmin, and overall geek. He works at CFEngine AS as Senior Security Advisor. His role is to advocate the power and usefulness of CFEngine, particularly in the security space, and also to help the CFEngine user community. After finishing his doctoral studies at Purdue University, he worked as researcher in computer security at the IBM Zurich Research Lab and as security consultant at HP Enterprise Services. He is very interested in topics related to computer security, virtualization, configuration management, and system automation. Diego is also the author of the O'Reilly book Learning CFEngine 3 and a frequent speaker in CFEngine webinars. This year he has presented on system administration topics at DevOps Chicago, DevOps Days Mountain View, CampIT, and PICC.

10:30 a.m.–11:00 a.m. Friday

Break

 Grande Ballroom Foyer
11:00 a.m.–12:30 p.m. Friday

Papers and Reports: If You Build It They Will Come

HARBOR ISLAND 3

Session Chair: Steve VanDevender, University of Oregon

Building the Network Infrastructure for the International Mathematics Olympiad

Rudi van Drunen, Competa IT; Karst Koymans, University of Amsterdam

In this paper we describe the network infrastructure we designed, built and operated for the International Mathematics Olympiad in the Netherlands in 2011. The infrastructure was pragmatically designed around OpenVPN tunnels in a star topology between the various venues. VLANs were extensively used to separate functional groups and networks. The actual construction of the event network took about 3 days and was needed for only 2 weeks. The architectural, setup, building and operational aspects of the network are described and we include some lessons learned.

Available Media

Lessons Learned When Building a Greenfield High Performance Computing Ecosystem

Andrew R. Keen, Dr. William F. Punch, and Greg Mason, Michigan State University
Awarded Best Practice and Experience Report!   

Faced with a fragmented research computing environment and growing needs for high performance computing resources, Michigan State University established the High Performance Computing Center in 2005 to serve as a central high performance computing resource for MSU’s research community. Like greenfield industrial development, the center was unconstrained by existing infrastructure. The lessons learned are useful when building or maintaining an effective HPC resource and may provide insight for developing other computational services.

Available Media

Building a Wireless Network for a High Density of Users

David Lang, Intuit

Why do conference and school wireless networks always work so poorly? As IT professionals we are used to the network 'just working' and to fixing things by changing configuration files. This mind-set, combined with obvious-but-wrong choices in laying out a wireless network, frequently leads to a network that seems to work when it's tested but then becomes unusable under load. This is at its worst at technical conferences, where many people, each carrying several devices, all try to use the network at the same time, and in schools, where students are packed close together and then all expected to use their computers at once.

Is this a fundamental limitation of wireless? While it is true that there are some issues that cannot be solved, there are a lot of things that the network administrator can do to make the network work better. The key issue is the obvious, but under-appreciated fact that wireless networking is radio communications first. If your radio link doesn't work well, you have no chance of fixing it with your configuration and software. This paper is intended to give you an appreciation of what the issues are, and enough information to know what sorts of things to look out for when planning a high density wireless network.

Available Media

Invited Talks 1

GRANDE BALLROOM A

Moderator: 
Narayan Desai, Argonne National Laboratory

Disruptive Tech Panel

Panelists:
Vish Ishaya, Rackspace; Jeff Darcy, Red Hat; Adam Oliner, University of California, Berkeley; Theo Schlossnagle, OmniTI

This panel will look into the changing landscape of system administration caused by disruptive innovations in emerging software, hardware, standards, and protocols. The panel will consist of experts working on the cutting edge of computing hardware, software stacks, and management approaches.

Available Media

Invited Talks 2

GRANDE BALLROOM B

Session Chair: Mario Obejas, Raytheon 

TTL of a Penetration

Branson Matheson is a 23-year veteran of UNIX and security. He started as a cryptologist for the U.S. Navy and has since worked on NASA shuttle projects, TSA security and monitoring systems, and Internet search engines; he continues to support many open source projects. He works at NASA as a Systems Architect; founded sandSecurity to provide policy and technical audits, support and training for IT Security, system administrators and developers; and he speaks at sysadmin and security conferences year-round. Branson has CEH, GSEC, GCIH, and several other credentials, but generally likes to spend time responding to the statement "I bet you can't…"

In the world of information security, it's not a matter of if anymore…it’s a matter of when. With the advent of penetration tools such as Metasploit, AutoPwn, etc.—plus the day-to-day use of insecure operating systems, applications, and Web sites—reactive systems have become more important than proactive systems. Discovering a penetration through out-of-band processes, and being able to determine when and how it happened in order to mitigate the particular attack, has become a stronger requirement than active defense. I will discuss the basic precepts of this idea and expand on various types of tools that help resolve the issue. Attendees should be able to walk away from this discussion and apply the knowledge immediately within their environment.

Available Media

Invited Talks 3

GRANDE BALLROOM C

Session Chair: Mark Roth, Google, Inc.

Near-Disasters: A Tale in 4 Parts

Doug Hughes, D. E. Shaw Research, LLC

Doug Hughes is the manager of all things infrastructure at D. E. Shaw Research, a bio-technology research firm located in Manhattan. He was intimately involved in the specifications, design, and implementation of the company's current built-from-scratch datacenter.

Occasionally one is the victim of a series of misadventures and failures so wrenching, potentially devastating, and yet completely unrelated that one is compelled to try to extract lessons from the fractured dust of mayhem. This is that tale. Where correlation is impossible, at least we can try to learn something. Murphy was an optimist.

Available Media

The Guru Is In

NAUTILUS 3

PostgreSQL

Joe Conway, credativ USA; Josh Berkus, PostgreSQL Experts Inc.; Selena Deckelmann, PostgreSQL

Joe Conway has been involved with PostgreSQL as a contributor since 2001. He is also the author and maintainer of a PostgreSQL procedural language handler for the R language, PL/R. Joe is President/CEO of credativ USA, which specializes in open source software with its "Open Source Support Center" and a comprehensive range of services, including consulting, architectural and technical advice, software development, training, and personalized support.

Josh Berkus is a member of the Core Team of PostgreSQL. He is also president of PostgreSQL Experts Inc., and consults on database performance, scalability architecture, data warehousing, and open source community management. In addition to PostgreSQL, he also uses CouchDB, Redis, Hadoop, Greenplum, and Vertica, and participates in many open source projects. Josh is also a potter and amateur chef.

Selena Deckelmann is a major contributor to PostgreSQL. She specializes in open source software development and product management. She's an internationally recognized speaker on open source, PostgreSQL, and developer community. She founded Postgres Open, a conference dedicated to the business of PostgreSQL and disruption of the database industry. She founded and co-chaired Open Source Bridge, a developer conference for open source citizens. She founded the PostgreSQL Conference, a successful series of east coast/west coast conferences in the U.S. for PostgreSQL. She's helped run other conferences like WhereCampPDX, BarCampPDX, and PG Days. She is currently on the organizing committees for PgCon and OSCON. She's a contributing writer for the Google Summer of Code Mentor Manual and Student Guide.

12:30 p.m.–2:00 p.m. Friday

Lunch, on your own

2:00 p.m.–3:30 p.m. Friday

Closing Plenary Session

GRANDE BALLROOM

Session Chairs:
Mike Ciavarella, Coffee Bean Software Pty Ltd; Carolyn Rowland

15 Years of DevOps

Geoff Halprin, The SysAdmin Group

Geoff Halprin (S6, S9, M12) has spent over 30 years as a software developer, system administrator, consultant, and troubleshooter. He has written software from system management tools to mission-critical billing systems, has built and run networks for enterprises of all sizes, and has been called upon to diagnose problems in every aspect of computing infrastructure and software.

He is the author of the System Administration Body of Knowledge (SA-BOK) and the SAGE Short Topics book A System Administrator's Guide to Auditing and was the recipient of the 2002 SAGE-AU award for outstanding contribution to the system administration profession.

Geoff has served on the boards of SAGE, SAGE-AU, USENIX, and LOPSA. He has spoken at over 20 conferences in Australia, New Zealand, Canada, Europe, and the US.

There has been a lot of hullabaloo over the past few years around a concept called “DevOps.” The idea is that we need to break down the barriers between development and operations teams, and treat infrastructure as code, in order to move towards better software, more reliable and scalable systems, and continuous deployment.

For some of us who have been around a while, this is just a new label for something we’ve always done.

They say those who don’t learn from history are destined to repeat it. In this talk, we will look back at how the DevOps movement evolved, what it advocates, what it doesn’t address, and what you should take away from the movement that will help you in your professional life. We will also use this opportunity to look back over the past decade or two of system administration and see how our challenges have changed, and how they have remained the same.

Available Media
3:30 p.m.–5:00 p.m. Friday

Ice Cream Social

GRANDE BALLROOM FOYER AND BAY VIEW LAWN

Wind down from a week of intense training and networking with your fellow attendees and indulge in some ice cream!