Tutorials in the LISA17 Training Program are half-day or full-day sessions taught by industry experts. These sessions offer a highly curated selection of instructor-led tutorials, all geared towards helping you create, maintain, and monitor efficient and secure systems.
LISA17 mini tutorials take place Wednesday through Friday as part of the main Conference Program and offer 90-minute overviews of new and emerging technologies. These sessions are included in the registration fee for the Conference Program.
A variety of topics are covered at LISA17. Use the icons listed below to focus on a key subject area:
Follow the icons throughout the training sessions below. You can combine Conference Program registration with training sessions to build the conference that meets your needs. Pick and choose the sessions that best fit your interests—focus on just one topic or mix and match.
Our Guarantee
If you're not happy, we're not happy. If you feel a tutorial does not meet the high standards you have come to expect from USENIX, let us know by the first break and we will seat you in any other available tutorial immediately.
Continuing Education Units (CEUs)
USENIX provides Continuing Education Units for a small additional administrative fee. The CEU is a nationally recognized standard unit of measure for continuing education and training and is used by thousands of organizations.
Each full-day tutorial qualifies for 0.6 CEUs. You can request CEU credit by completing the CEU section on the registration form. USENIX provides a certificate for each attendee taking a tutorial for CEU credit. CEUs are not the same as college credits. Consult your employer or school to determine their applicability.
Training Materials
USB Drives
Training materials will be provided to you on an 8GB USB drive. If you'd like to access them during your class, please remember to bring a laptop. There will not be any formally printed materials.
Half Day Morning
Kick off your journey to becoming a DevOps master by learning Kubernetes from the ground up. Get started with an introduction to distributed systems and the architecture behind Kubernetes; then learn about Kubernetes APIs and API object primitives. By the end of this workshop you’ll be deploying, scaling, and automating container-based solutions using open source tools for distributed computing.
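For a taste of the API object primitives covered, here is a minimal sketch of a Deployment manifest (the name and image are illustrative, not taken from the workshop materials):

    # deployment.yaml: ask Kubernetes to keep three nginx replicas running
    apiVersion: apps/v1beta1
    kind: Deployment
    metadata:
      name: web
    spec:
      replicas: 3
      template:
        metadata:
          labels:
            app: web
        spec:
          containers:
          - name: web
            image: nginx:1.13
            ports:
            - containerPort: 80

Something like "kubectl apply -f deployment.yaml" creates it, and "kubectl scale deployment web --replicas=5" scales it; the workshop builds toward automating exactly this kind of workflow.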
Slides: http://bit.ly/lisa17-k8s. Bring a laptop with the following materials: http://bit.ly/lisa17-k8s#/workshop-setup.
For developers, systems administrators, "DevOps" folks, architects, and those who are interested in learning about distributed systems via hands-on examples. Attendees should have some basic knowledge of Linux Containers (docker) and have an interest in using distributed architectures to develop web solutions.
Attendees will learn how to deploy, scale, update, and manage container-based solutions through hands-on examples and exercises.
Kubernetes, Distributed computing and solutions delivery, SRE, container operations
Ryan Jarvinen, Red Hat
Ryan Jarvinen is an Open Source Advocate at CoreOS, focusing on improving developer experience in the container community. He lives in Oakland, California, and is passionate about open source, open standards, open government, and digital rights. You can reach him as ryanj on Twitter, GitHub, and IRC.
In this talk, I outline the characteristics that define a "container host": an OS tuned to run software in containers. We will explore the benefits and peculiarities of a stripped-down, lightweight minimal OS image and the implications for configuration management and update strategies.
Then I explore the architecture of two common container hosts, CoreOS and Project Atomic. Each has characteristics that make it suitable for different environments. Users will install one of the two environments and follow along, probing and observing how a container host differs in operation from a conventional package-based host.
Finally, I will look at how a sysadmin's day-to-day tasks and operations differ when running infrastructure services and providing application runtime environments for developers and users on container hosts. We will establish base network services (DNS, NTP, authentication) on container hosts, as well as install and demonstrate utility containers that provide the standard admin tools stripped from lightweight hosts.
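As a hypothetical flavor of that last point, an admin "toolbox" on a host that ships no admin tools can be run from a container that shares the host's namespaces, so its tools can see real processes and interfaces:

    # Illustrative only; the image and flags will vary by site
    docker run -it --rm \
      --privileged --pid=host --net=host \
      -v /:/host \
      fedora /bin/bash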
Sysadmins and service designers interested in learning to use container hosts to reduce host management.
Participation requires access to a local or cloud VM service.
Attendees will understand the goals and basic design requirements for container hosts. They will get an overview of the design of both CoreOS and Atomic host, highlighting the differences in architecture and how these inform the choice of container host for an installation.
They will learn how to boot and integrate container hosts into their existing infrastructure. They will know how to install and use traditional host tools from containers and how to manage, update and customize container hosts.
They will create a sample cluster of either CoreOS or Atomic hosts in a demo environment.
- Container Hosts
- Large Scale Container Infrastructure
- Atomic Host and CoreOS architecture
Mark Lamourine, Red Hat
Mark Lamourine fell into system administration when the VAX shop he worked in as a student inherited a set of HP/UX boxes. He became the de facto admin because he was the only one in the group who had read a man(8) page. Since then he's done stints as a developer, a QA engineer, a lab infrastructure manager, and an infrastructure admin at a now-defunct worldwide ISP. These days he plays the Sysadmin Advocate to software developers who think software is done when they've installed it once in Vagrant.
When not computer geeking, Mark geeks out on road bicycles. He's been riding road fixed-gear for fun since before that was a thing.
This class will teach administrators how to get a project up and running with Azure Resource Manager (ARM) templates. These templates are an easy way to define, manage, and deploy instances into the Azure cloud. Additionally, I will go over some basic best practices for making your templates more manageable.
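As a sketch of what a template looks like (the parameter name is illustrative), here is a minimal ARM template that deploys a single storage account:

    {
      "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
      "contentVersion": "1.0.0.0",
      "parameters": {
        "storageAccountName": { "type": "string" }
      },
      "variables": {
        "location": "[resourceGroup().location]"
      },
      "resources": [
        {
          "type": "Microsoft.Storage/storageAccounts",
          "name": "[parameters('storageAccountName')]",
          "apiVersion": "2016-01-01",
          "location": "[variables('location')]",
          "sku": { "name": "Standard_LRS" },
          "kind": "Storage",
          "properties": {}
        }
      ]
    }

Deploying it is then a single CLI call, something like "az group deployment create --resource-group mygroup --template-file azuredeploy.json --parameters storageAccountName=mystorage1234".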
System Administrators who are new to Azure, or who have not worked with Resource Manager Templates in the past. Anyone interested in streamlining and automating his or her workflow in the Azure cloud.
Attendees will take back to work the basic skills to get started automating their Azure deployments, along with the baseline understanding and knowledge needed to work with ARM templates.
- Azure Resource Manager
- Azure PowerShell/Azure CLI
- Basics of the ARM Template layout
  - Metadata
  - Parameters
  - Variables
  - Template file
- Using parameters and variables to generalize your deployment
- Adjusting resource sizing on the fly
- Deploying Resources
  - Base resource
  - Sizing
  - Monitoring Configuration
  - Resource Dependencies
- Troubleshooting Templates
- Tips and tricks to help you configure templates
George Beech, Stack Exchange
George has been an SRE generalist at Stack Exchange since October 2011. Before that, he worked for a multinational CRM company running their IVR infrastructure. He has worked on every part of the stack, from Windows to Linux to the network infrastructure. He is currently serving his first term as a LOPSA Director. His experience working in the IT field over more than a decade has led him to love working with multiple technologies, and has allowed him to experience everything from running a small network as a consultant to being part of a large team running very large scale infrastructure.
In the past he has spoken at LISA, Velocity NYC, local user groups, and LOPSA-East, and he has written about his experience working on a high-volume web infrastructure on his personal blog as well as the Server Fault blog.
Full Day
Insufficient knowledge of operating system internals is my most common reason for passing on an interview candidate. Anyone can learn that you run tool X to fix problem Y. But what happens when there is no tool X, or when you can't even accurately pinpoint the root cause of "It's sometimes slow"?
This will be a no-holds-barred, fury-road-paced review of all major parts of modern operating systems with specific emphasis on what's important for system administrators. It will provide just enough of an academic focus to bridge the "whys" so you can make better use of fiddling with the "whats" on a day-to-day basis. As an added bonus, it will prime you for the following day's "Linux Performance Tuning" tutorial with Theodore Ts'o.
You will learn about process management, scheduling, file system architecture and internals, interrupt management, the mysteries of the MMU and TLB, Bélády's anomaly, page replacement algorithms, and hopefully a bit of networking. In a nutshell, we'll cover 16 weeks of college-level material in a few hours.
Buckle up.
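As one small taste of the page-replacement material, here is a minimal Python sketch of Bélády's anomaly: under FIFO replacement, the classic reference string below suffers more page faults with four frames than with three.

    from collections import deque

    def fifo_faults(refs, nframes):
        """Count page faults under FIFO replacement with nframes frames."""
        frames, faults = deque(), 0
        for page in refs:
            if page not in frames:
                faults += 1
                if len(frames) == nframes:
                    frames.popleft()  # evict the oldest resident page
                frames.append(page)
        return faults

    refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
    print(fifo_faults(refs, 3))  # 9 faults
    print(fifo_faults(refs, 4))  # 10 faults: more memory, worse result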
The Automation Tools Bootcamp is a tutorial for individuals looking for exposure to and usage of new IT automation tools. We will learn about and then use Vagrant, Chef, Packer, Docker, Terraform, and Artifactory to deploy a small application in local VMs.
We will cover a progression of tasks, leveraging information from previous sections to deploy a small app that runs identically on your local development machine or on a shared server. Get rid of the “it works for me” mentality when you know your local VM is identical to your co-workers' and your shared environments.
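For instance, the Vagrant stage starts from something as small as this sketch of a Vagrantfile (the box name is illustrative):

    # Vagrantfile: one identical VM for every engineer on the team
    Vagrant.configure("2") do |config|
      config.vm.box = "ubuntu/xenial64"   # base image, fetched once
      config.vm.provider "virtualbox" do |vb|
        vb.memory = 1024
      end
      # Later sections hand the VM to Chef for real provisioning
      config.vm.provision "shell", inline: "echo provisioned"
    end

"vagrant up" builds the VM, "vagrant ssh" logs you in, and "vagrant destroy" throws it away so you can break things safely.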
Operations, QA, those who choose to call themselves DevOps, and even managers can come learn.
These automation tools are freely available to engineers, enabling them to safely break local environments until the change in configuration has been perfected. Basic exposure to these tools will allow attendees to return to work with new ways to tackle the problems they face daily.
Vagrant, Chef, Packer, Docker, Terraform, and Artifactory
Tyler Fitch, Adobe
Tyler is a Site Reliability Engineer for the Adobe Stock site—working to automate all the things done to build and release changes to the Stock platforms. He recently finished three years of "post graduate work" in DevOps as an Architect in Chef's Customer Success Program where he helped Chef's largest enterprise customers have delightful experiences in IT Automation. He lives in Vancouver, Washington, and when he’s not programming enjoys lacrosse and using his passport.
Half Day Afternoon
Tasks like the management and maintenance of services that are critical to the business are on the daily to-do list of every system administrator. Containers and microservice-based architectures also mean that the number of services a sysadmin has to manage is ever growing. To successfully manage thousands of services, we need smart tools that can help us. In this session, we will look at systemd, the init system and service manager used by all major Linux distributions. The session will be a hands-on, interactive look at the architecture, capabilities, and administrative how-tos of systemd. Anyone who is new to systemd or looking to dig deeper into some of the advanced features should attend. Please bring a laptop with a virtual machine running a distribution of your choice that uses systemd.
Linux system administrators, package maintainers, and developers who are transitioning to systemd, or who are considering doing so.
Understanding of how systemd works, where to find the configuration files, and how to maintain them.
- The basic principles of systemd
- systemd's major components
- Anatomy of a systemd unit file (see the sketch after this list)
- Understanding and optimizing the boot sequence
- Improved system logging with the journal
- Resource management via systemd's cgroups interface
- Simple security management with systemd and the kernel's capabilities
- systemd, containers, and virtualization
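To set expectations for the unit-file material, here is a minimal sketch of a service unit (the service and binary are hypothetical):

    # /etc/systemd/system/myapp.service
    [Unit]
    Description=My example daemon
    After=network.target

    [Service]
    ExecStart=/usr/local/bin/myapp --foreground
    Restart=on-failure

    [Install]
    WantedBy=multi-user.target

After a "systemctl daemon-reload", the service can be started and enabled with "systemctl enable --now myapp.service", and its logs read with "journalctl -u myapp".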
Michal Sekletar, Red Hat
Michal Sekletar joined Red Hat in 2011 and currently works as a Senior Software Engineer on the "Plumbers" team. He spends his days working on and supporting init systems and other low-level user-space components. He holds a Master's degree from the Brno University of Technology. His other professional interests include programming languages, algorithms, and UNIX-like (other than Linux) operating systems.
Have you ever wondered how to find the "one metric that matters" (for your team)? Or how to magically communicate why your team is doing what you're doing so everyone can understand? Or, moving back several steps: how should you decide which work to focus on? This tutorial isn't a magic pill, but it's the closest thing to one, and it will get you to where you can answer all of those questions. And once you learn it, you'll be able to sketch it out on the back of a napkin.
I’ve used this simple framework with:
- Fortune 500 executives, to decide on the right metrics for their latest initiatives and communicate them throughout the organization
- Sysadmins, to communicate their latest improvement work across their own teams and to "the business"
- My own research, ranging from complex hardware studies to the State of DevOps Reports
The framework works for all types of measures: system, survey, technical, financial, etc.
Engineers, managers, anyone needing to plan or understand a system.
When you leave this tutorial, you’ll be able to:
- Communicate your measurement framework in a straightforward manner
- Identify key measures for your own improvement work, and share this easily with the data team (whether that’s you or another team)
- Chain your measurement frameworks, allowing you to link executive-level initiatives to middle management goals to practitioner workstreams
Metrics
Heidi Waterhouse, Consultant
Technical interviews can be intimidating, but it’s easier if you have confidence in yourself and your ability to answer complicated questions. The hardest questions are not about sorting algorithms, but how you’ll work in a team, how you’ll resolve conflicts, and what it will be like to manage and work with you. This workshop exists to address the skills and theories of presenting yourself as confident, capable, and coachable.
We envision the audience for this tutorial to be people interviewing for technical or technical-adjacent roles at technology companies who are early career (2-7 years). It is meant for beginners, but all are welcome if they want to brush up on their interviewing skills.
The audience will experience hands-on practice and can expect to learn tactics for preparing for and excelling at interviews. We will provide handouts for participants to use after the workshop and for practice. Participants will learn how to accomplish the checkpoints of a hiring workflow, including phone screens, phone interviews, in-person interviews, and how to accept or reject an offer. The take-home worksheets will provide types of interview questions, a job-search rubric, self-evaluation forms, and resources for further research.
Culture, Interviewing, Career, Early Career, Technology Industry
Carol Smith, Microsoft
Carol Smith has over 12 years of experience with programs, communities, and partnerships. She worked at GitHub managing education partnerships for the Student Developer Pack and at Google managing the Google Summer of Code program. She has a degree in Journalism from California State University, Northridge, and is a cook, cyclist, and horseback rider.
Heidi Waterhouse, Consultant
Heidi Waterhouse is a freelance technical writer, information architect, and active conference speaker. Her experience as an in-demand consultant has given her insight into the interview process across several industry segments and allows her to generate meaningful answers to a wide variety of weird interview questions. In her spare time, she considers the technical writing aspects of sewing patterns.
Full Day
Today's threats to the enterprise are manifested in many ways, but all share similar traits: the adversaries are highly intelligent, well-funded, and determined to gain access. In this class, we will explore the murky world of the black hats. We will examine your security footprint as they view it and discuss ways to minimize it, various vectors for attack, and how to detect and defend. We will spend time talking about current threats and how they can impact your company, and we will build upon the foundations of good security practice. This class has been updated with current events and topics relevant to environment profiling, social engineering, and new attack vectors. As with all my classes, this will be accompanied by a pinch of humor and a large dollop of common sense.
Participants should be beginning to mid-level system administrators of any stripe with an interest in IT Security and a desire to understand their potential adversaries. It is suggested that participants have experience with *nix command line and virtual hosts.
Tools, tips, tricks, and a working security toolkit which can be implemented to improve monitoring, detection, and defense in your organization. Experience working with (mostly) free security software tools.
Security, Risk Evaluation, Social Engineering
Branson Matheson, Cisco Systems, Inc.
Branson is a 29-year veteran of system architecture, administration, and security. He started as a cryptologist for the US Navy and has since worked on NASA shuttle and aerospace projects, TSA security and monitoring systems, secure mobile communications, and Internet search engines. He has also run his own company while continuing to support many open source projects. Branson speaks to and trains sysadmins and security personnel worldwide, and he is currently a senior technical lead for Cisco Cloud Services. Branson holds several credentials and generally likes to spend time responding to the statement "I bet you can't...."
Half Day Morning
Open source relational databases like MySQL and PostgreSQL power some of the world's largest websites, including Yelp. They can be used out of the box with few adjustments and rarely require a dedicated database administrator for the first few months or even years. This means that System Administrators and Site Reliability Engineers are usually the first to respond to some of the more "interesting" issues that can arise as you scale your databases. This tutorial will cover MySQL, but many of the concepts apply to PostgreSQL and other open source RDBMSs. We'll first go over a broad set of DBA basics to introduce MySQL database administration, and next cover the InnoDB storage engine, database defense, and monitoring. Finally, I'll cover the wide array of online resources, books, open source toolkits, and scripts from MySQL, Percona, and the Open Source community that will make the job easier.
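For a flavor of the query-plan material, a sketch in plain SQL (the table and index names are hypothetical):

    -- Ask MySQL how it plans to execute a query
    EXPLAIN SELECT id, total
    FROM   orders
    WHERE  customer_id = 42;

    -- An index on the filtered column often turns a full table scan
    -- into a fast lookup
    ALTER TABLE orders ADD INDEX idx_customer (customer_id);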
Sysadmins and SREs of all levels who want or need to learn MySQL or to support an open source relational database.
Sysadmins and SREs who join us for this tutorial will come away with a real-world, production-ready understanding of why and how MySQL works the way it does.
- MySQL Installation and Configuration
- Architecture and Filesystem Layout
- InnoDB Tuning and Optimization
- Transactions
- Replication and Scaling Out
- Schema/Query Basics, Indexes, and Query Plans
- Deciphering Common Errors
- Monitoring
- Backup and Restore
- Troubleshooting
- Online Communities
- Open Source Toolkits
Jenni Snyder, Yelp
The R programming language and ecosystem constitute a rich tool set for performing system analyses, for communicating the results and importance of those analyses, and for automating the process with reproducible and repeatable results. This brief introduction to R and its ecosystem will provide a walk along the mainline — coming up to speed on R, accessing data, and getting results.
This tutorial will
- motivate you to pick up R
- introduce the basics of the R language
- demonstrate useful techniques using R and RStudio
- illustrate ways to simplify your life by automating data analysis and reporting
In-class demonstrations will be complemented with hands-on opportunities during the workshop. Additional exercises and data sets that students can explore following the workshop will be provided.
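A minimal sketch of that workflow in base R, assuming a hypothetical CSV export of per-host disk usage:

    # Read operational data, summarize it, and save a quick chart
    usage <- read.csv("disk_usage.csv")    # columns: host, pct_used
    summary(usage$pct_used)                # descriptive statistics
    print(usage[usage$pct_used > 90, ])    # hosts needing attention
    png("disk_usage.png")
    hist(usage$pct_used, main = "Disk usage across hosts", xlab = "% used")
    dev.off()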
This tutorial is designed for system administrators who are awash in operational data and who want to do a more efficient job of understanding their data and communicating their findings to others. Some facility with programming and a knowledge of basic descriptive statistics are assumed. Prior knowledge of R is not required.
- Understanding where R fits into the system administrator’s tool set
- Acquaintance with R, R packages, and R Studio
- Familiarity with basic R data-manipulation techniques
- Motivation to learn or improve your R skills
- Next steps to take in mastering R
Analytics of System Data
Robert Ballance, Independent Computer Scientist
Dr. Robert Ballance recently completed a White House Presidential Innovation Fellowship, where he applied his skills with R to analyzing and delivering broadband deployment data to communities across the U.S.A. He first developed his R-programming skills while managing large-scale High-Performance Computing systems for Sandia National Laboratories. While at Sandia, he developed several R packages used internally for system analysis and reporting. Prior to joining Sandia in 2003, Dr. Ballance managed systems at the University of New Mexico High Performance Computing Center. He has consulted, taught, and developed software, including R packages, Perl applications, C and C++ compilers, programming tools, Internet software, and Unix device drivers. He is a member of USENIX, the ACM, the IEEE Computer Society, the Internet Society, and the American Association for the Advancement of Science. He was a co-founder of the Linux Clusters Institute and recently served as Secretary of the Cray Users Group. Bob received his Ph.D. in Computer Science from U.C. Berkeley in 1989.
Terraform is a tool for deploying and configuring cloud infrastructure in AWS, Google Compute Engine, Digital Ocean, Azure, and many, many other platforms. It is a consistent, robust, well-maintained alternative to clicking in a web interface or writing custom provisioning code against the cloud provider's API.
This tutorial will show code and runtime examples of deploying various types of cloud infrastructure in AWS, Google Compute Engine, and others. Interactivity is unfortunately not offered due to the logistics of billing for arbitrary cloud resources.
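For flavor, here is a minimal sketch of Terraform's configuration language (the AMI ID is a placeholder):

    # main.tf: one provider, one EC2 instance
    provider "aws" {
      region = "us-west-2"
    }

    resource "aws_instance" "web" {
      ami           = "ami-0123456789abcdef0"  # placeholder image ID
      instance_type = "t2.micro"
    }

"terraform plan" previews the change and "terraform apply" makes it so; the same declare-plan-apply workflow carries over to every provider.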
Novice- to intermediate-level sysadmins who want to learn what Terraform is and what it's good for, why you'd use it instead of your cloud provider's web interface or API, and how to implement common patterns across several different providers.
What is Terraform? What is it good for? How do we use it to build/manage infrastructure? How do we scale it to a team?
Terraform
Whether you are a sysadmin, dev, or web ops, time management can be more difficult than any technology issue. This class is for new and junior system admins who have found themselves in over their heads, overloaded, and looking for a better way to survive the tech world.
This tutorial presents fundamental techniques for eliminating interruptions and distractions so you have more time for projects, prioritization techniques so the projects you do work on have the most impact, plus "The Cycle System," which is the easiest and most effective way to juggle all your tasks without dropping any.
Sysadmins, devs, operations, and their managers
By the end of this class, you will be able to schedule and prioritize your work (rather than be interruption-driven), have perfect follow-through (never forget a request), and limit your work-time to 40 hours a week (have a life).
- How to manage all the work you have to do.
- How to prioritize and eliminate unnecessary tasks.
- How to manage interruptions: preventing them and handling the ones you get.
- The Cycle System for recording and processing to-do lists
- Task grouping: batching, sharding, and multitasking
Tom Limoncelli, Stack Overflow, Inc.
Tom is the SRE Manager at StackOverflow.com and author of Time Management for System Administrators (O'Reilly). He is co-author of The Practice of System and Network Administration (3rd edition just released) and The Practice of Cloud System Administration. He is an internationally recognized author, speaker, system administrator, and DevOps advocate. He's previously worked at small and large companies including Google, Bell Labs/Lucent, and AT&T. His blog is http://EverythingSysadmin.com and he tweets @YesThatTom.
Half Day Afternoon
All too often, technical teams spend so much time firefighting that they can’t stop to identify and eliminate the problems—the underlying causes—of incidents. Incident resolution is about taking care of the customer—restoring a service to normal levels of operation ASAP. Without a process in place to turn the problem into a known error, the root causes of the incident remain, resulting in recurrences of the incident.
The goals of the Problem Management Process are to prevent repeat incidents and to minimize the impact of incidents and problems that cannot be prevented. Most technical people already have experience in root cause analysis and problem resolution. This tutorial will help them to be measurably more consistent, mature, and effective in their practices. Using IT Infrastructure Library (ITIL) best practices, this tutorial will deliver step-by-step instructions on building and managing a problem process.
Technical people and managers responsible for the support of live production services. This is an operational support process that can be put in place from the bottom up. The more teams involved in the process—DBAs, system administrators, developers, helpdesk—the greater the scope of problems that can be addressed.
- a step-by-step guide for building and implementing a problem process and the reasons behind each step
- a process template with examples that can be easily adapted to fit your organization’s current and future needs
- instructions on setting up a Known Error Database and communicating workarounds to impacted support teams
- guidance for getting buy-in from peers and managers
- a complete kit for starting to use After Action Reviews to handle the human component of problems
- Incident response vs. problem resolution
- Root cause analysis techniques
- Making decisions that are aligned with business objectives
- Getting buy-in from teammates, colleagues and managers
- Proactive problem management
- After-action reviews
Jeanne Schock, Armstrong Flooring Inc.
Jeanne Schock has a background in Linux/FreeBSD/Windows system administration that includes working at a regional ISP, a large video hosting company and a Top Level Domain Registry services and DNS provider. About 7 years ago she transitioned to a role building and managing processes in support of IT operations, disaster recovery, and continual improvement. She is a certified Expert in the IT Infrastructure Library (ITIL) process framework with in-the-trenches experience in Change, Incident, and Problem Management. Jeanne also has a pre-IT academic and teaching career and is an experienced trainer and public presenter.
If you still haven't checked out that Docker thing, but need (or want) to get started with containers, this tutorial is for you!
After a short introduction explaining various usage scenarios for containers, we will roll up the sleeves of our T-shirts, and run a few simple containers through the Docker CLI. We will explain the difference between containers and images, and write a Dockerfile to build an image for a trivial application. Finally, we will present Compose, a tool to build, run, and manage stacks with multiple containers.
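As a preview of the Dockerfile step, a sketch for a hypothetical one-file Python application:

    # Dockerfile: package hello.py and its runtime into an image
    FROM python:3.6
    COPY hello.py /hello.py
    CMD ["python", "/hello.py"]

"docker build -t hello ." turns this into an image, and "docker run --rm hello" runs it as a container.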
No prior knowledge of Docker is needed. If you know how to interact with the UNIX command line, you're set! Some demos will feature code snippets in Python, Ruby, or even C; but you will be perfectly fine even if your language of choice is Bash.
Advanced topics like networks, volumes, plugins, multi-stage builds, health checks, etc. will be mentioned but not covered in depth.
The tutorial will be hands-on. You will be provided with a pre-configured Docker environment running on a cloud VM (you won't need to set up Docker or Vagrant or VirtualBox on your machine).
Devs and ops who have managed to avoid the container hype so far but now want to catch up on all that Docker jazz
The audience will learn about the basic principles of containers: what they are, what they're for, and why they have been trending for the last few years.
They will also learn how to use the Docker CLI to run simple containers; build container images with Dockerfiles; start multi-container applications with Docker Compose.
This will allow them to understand containers in general and Docker in particular; use them in simple scenarios; and have a reference point for more complex ones.
Docker, containers
Jérôme Petazzoni, Docker Inc.
Ansible is a fantastic starting point for automation—especially when the learning curve or the infrastructure overhead of Chef/Puppet is too high. New users can start writing useful automation playbooks with just an SSH connection and an hour (or two) reading the docs.
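As a sketch of what those first playbooks look like (the host group and package are illustrative):

    # site.yml: make sure NTP is installed and running on all web servers
    ---
    - hosts: webservers
      become: yes
      tasks:
        - name: install ntp
          package:
            name: ntp
            state: present
        - name: start and enable ntp
          service:
            name: ntp
            state: started
            enabled: yes

A single command, "ansible-playbook -i inventory site.yml", applies it over plain SSH with no agents to install.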
This tutorial will alternate between lecture and hands-on activities using (instructor-supplied) disposable cloud infrastructure.
Sysadmins with zero exposure to Ansible through intermediate-level users who want a guided tour of its potential.
Knowledge of what Ansible is, how it works, and how it compares with other configuration-management tools; hands-on experience using Ansible to solve real-world problems; and opinionated best-practices for saving blood, sweat, and/or tears.
Ansible
Your site’s back up, you’re back in business. Do you have a way to make sure that problem doesn’t happen again? And if you do, do you like how it works?
Heroku uses a blameless retrospective process to understand and learn from our operational incidents. We’ve recently released the templates and documentation we use in this process, but experience has taught us that facilitating a retrospective is a skill that’s best taught person to person.
This tutorial will take you through a retrospective based on the internal and external communications of a real Heroku operational incident. We’ve designed it to help you experience first-hand the relaxed, collaborative space that we achieve in our best retrospectives. We’ll practice tactics like active listening, redirecting blame, and reframing conversations. Along the way, we’ll discuss how we developed this process, what issues we were trying to solve, and how we’re still iterating on it.
Managers, tech leads, anyone interested in retrospective culture and iterating on processes.
Attendees will have the materials and firsthand experience to advocate for (or to begin) an incident retrospective process at their workplace, or to improve a process they might already be using.
- Why run a retrospective
- Goal of a retrospective
- Blameless retrospectives
- Facilitating: redirecting blame, reframing, drawing people out
- How to structure a retrospective
- Preparing for a retrospective
- Five "why"s / infinite "how"s
- Human error
- Achieving follow-through on remediation items
Courtney Eckhardt
Courtney Eckhardt first got into retrospectives when she signed up for comp.risks as an undergrad (and since then, not as much has changed as we’d like to think). Her perspectives on engineering process improvement are strongly informed by the work of Kathy Sierra and Don Norman (among others).
Full Day
Intermediate and advanced Linux system administrators who want to understand their systems better and get the most out of them.
The ability to hone your Linux systems for the specific tasks they need to perform.
- Strategies for performance tuning
- Characterizing your workload's requirements
- Finding bottlenecks (see the sketch after this list)
- Tools for measuring system performance
- Memory usage tuning
- Filesystem and storage tuning
- Network tuning
- Latency vs. throughput
- Capacity planning
- Profiling
- Memory cache and TLB tuning
- Application tuning strategies
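As a hypothetical first pass at the bottleneck hunt referenced above, the kind of quick survey these tools make possible:

    vmstat 1 5       # CPU run queue, memory pressure, swap at a glance
    iostat -x 1 5    # per-device utilization and I/O latency
    sar -n DEV 1 5   # per-interface network throughput
    perf top         # which kernel/user functions burn CPU right now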
Theodore Ts'o, Google
Theodore Ts'o is the first North American Linux Kernel Developer, and started working with Linux in September 1991. He previously served as CTO for the Linux Foundation, and is currently employed at Google. Theodore is a Debian developer, and is the maintainer of the ext4 file system in the Linux kernel. He is the maintainer and original author of the e2fsprogs userspace utilities for the ext2, ext3, and ext4 file systems.
Half Day Morning
There are many times when the daily grind pushes you out of your comfort zone. Sometimes you're in a bind, and the best way forward is fashioning a tool out of what's available. Sometimes those really are nails you see around you. This class looks at some of the normal, and some of the not-so-normal, uses for Golang in systems administration.
- New Golang programmers who want to get a better idea of using the language (should have some familiarity with Golang).
- Old dogs looking for new tricks.
- Several MacGyver tools that may come in handy.
- Techniques and approaches for some out-of-the-box thinking.
- Running a quick and dirty TLS-secured web server for file transfers (see the sketch after this list)
- Collecting and serving up system metrics
- Driving web applications from the command line
- Speaking HTTP/2
- Fanning out shell results from one system to many with SSH
- Rolling your own container system
- and more
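A minimal sketch of the first item, the quick-and-dirty TLS file server (you supply cert.pem and key.pem, e.g. self-signed via openssl):

    // filesrv.go: serve the current directory over HTTPS on port 8443
    package main

    import (
        "log"
        "net/http"
    )

    func main() {
        http.Handle("/", http.FileServer(http.Dir(".")))
        log.Fatal(http.ListenAndServeTLS(":8443", "cert.pem", "key.pem", nil))
    }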
Chris McEniry, Sony Interactive Entertainment
Chris "Mac" McEniry is a practicing sysadmin responsible for running a large ecommerce and gaming service. He's been working and developing in an operational capacity for 15 years. In his free time, he builds tools and thinks about efficiency.
eBPF (extended Berkeley Packet Filter) is a modern kernel technology that can be used to introduce dynamic tracing into a system that wasn't prepared or instrumented in any way. The tracing programs run in the kernel, are guaranteed to never crash or hang your system, and can probe every module and function—from the kernel to user-space frameworks such as Node and Ruby.
In this workshop, you will experiment with Linux dynamic tracing first-hand. First, you will explore BCC, the BPF Compiler Collection, which is a set of tools and libraries for dynamic tracing. Many of your tracing needs will be answered by BCC, and you will experiment with memory leak analysis, generic function tracing, kernel tracepoints, static tracepoints in user-space programs, and the "baked" tools for file I/O, network, and CPU analysis. You'll be able to choose between working on a set of hands-on labs prepared by the instructors, or trying the tools out on your own test system.
Next, you will hack on some of the bleeding edge tools in the BCC toolkit, and build a couple of simple tools of your own. You'll be able to pick from a curated list of GitHub issues for the BCC project, a set of hands-on labs with known "school solutions", and an open-ended list of problems that need tools for effective analysis. At the end of this workshop, you will be equipped with a toolbox for diagnosing issues in the field, as well as a framework for building your own tools when the generic ones do not suffice.
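For a flavor of BCC's Python front end, here is the classic hello-world-style probe (it assumes a recent kernel with the bcc package installed):

    from bcc import BPF

    # Compile and attach a tiny BPF program: log every clone() syscall
    b = BPF(text="""
    int kprobe__sys_clone(void *ctx) {
        bpf_trace_printk("new process via clone()\\n");
        return 0;
    }
    """)
    b.trace_print()  # stream the trace messages to your terminal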
Developers, SRE, ops engineers
Low-overhead, production-ready tools based on the BPF kernel technology for CPU sampling, memory leak analysis, I/O and file issues, and many other performance and troubleshooting scenarios.
Performance, Monitoring, Tracing, BPF, Kernel
Sasha Goldshtein, CTO, Sela Group
Sasha Goldshtein is the CTO of Sela Group, a Microsoft MVP, Pluralsight author, and international consultant and trainer. Sasha is the author of two books and multiple online courses, and a prolific blogger. He is also an active open source contributor to projects focused on system diagnostics, performance monitoring, and tracing—across multiple operating systems and runtimes. Sasha authored and delivered training courses on Linux performance optimization, event tracing, production debugging, mobile application development, and modern C++. Between his consulting engagements, Sasha speaks at international conferences world-wide.
Dozens of commands! Hundreds of options! Git has dumbfounded sysadmins and developers alike since its appearance in 2005.
And yet, this ingenious software is among the most fantastically useful ever developed.
Learn Git from the ground up and the inside out with Git Foundations Training!
This half-day class explores Git's internals in depth and includes unique practical exercises to gain familiarity and comfort in handling the nuts and bolts.
Bring with you:
- A laptop with a UNIX-like command-line environment on which "git --version" displays a version (any version).
- A willingness to learn.
No prior knowledge of Git is required. Basic Unix/Linux command line experience is assumed. Experienced users of Git have given rave reviews; the class is not aimed only at beginners, but at anyone wishing to thoroughly understand and use Git to the fullest.
- A thorough and practical understanding of the internals of Git
- The ability to easily and *confidently* manipulate Git repositories and their contents
- Readiness to pick up and *quickly* learn more exotic and advanced Git commands (and to read the man pages easily!)
Git internals are covered in depth, beginning from basic definitions and proceeding through the essentials of graph theory needed to appreciate Git's architecture. Expect plenty of audience Q&A, live demonstrations, and diagrams throughout. Following this complete theory portion comes the practical portion of the course, with hands-on exercises to ensure retention and application of all theory.
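A taste of those nuts and bolts, safe to try in a scratch repository:

    git init scratch && cd scratch
    echo 'hello' > greeting
    git hash-object -w greeting    # stores a blob; prints its SHA-1
    git cat-file -t <sha1>         # the object's type: blob
    git cat-file -p <sha1>         # the object's content: hello

(Substitute the SHA-1 printed by hash-object for <sha1>.)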
Mike Weilgart, Vertical Sysadmin, Inc.
Mike Weilgart has loved maths and computers all his life. Graduating high school at the age of 13, he thereafter worked in a variety of positions including software QA, calculus teacher, and graphic design, before resolving to put his love of computers to professional use as a Linux sysadmin and trainer. Mike currently consults at a Fortune 50 company as an automation specialist, and enjoys nothing more than training people to full mastery of their tools.
Speedy change control is not an oxymoron. This tutorial will provide practical, actionable steps to streamline and speed up change control at your organization without increasing risk. In The Visible Ops Handbook, authors Behr, Kim, and Spafford identify a culture of change management as common to high-performing IT groups: "change management does not slow things down in these organizations." This tutorial will help anyone wishing to implement phase one of the handbook: "Stabilize the Patient" and "Modify First Response." While I draw heavily on IT Infrastructure Library (ITIL) guidance, much of this is common-sense good practice based on lessons learned from past success and failure. No special ticketing system, tools, or ITIL knowledge is necessary. I am a certified ITIL Expert, and I have over five years of experience designing, improving, and managing a successful change management process at an audited technology company delivering public registry and DNS services running on complex technologies across international data centers.
Individuals and managers involved in preparing for and deploying changes and software builds in production environments.
- templates for change request types and procedures
- templates for creating standard operating procedures
- ITIL-aligned talking points for making your case for these process improvements
- Change management
- Process
- Different change types to help you speed up the process
- Assessing risks and potential impact
- Defining change authorities specific for each change type
- Metrics for measuring change process performance against goals
- Release and deployment management
- Continuous delivery
Jeanne Schock, Armstrong Flooring Inc.
Jeanne Schock has a background in Linux/FreeBSD/Windows system administration that includes working at a regional ISP, a large video hosting company and a Top Level Domain Registry services and DNS provider. About 7 years ago she transitioned to a role building and managing processes in support of IT operations, disaster recovery, and continual improvement. She is a certified Expert in the IT Infrastructure Library (ITIL) process framework with in-the-trenches experience in Change, Incident, and Problem Management. Jeanne also has a pre-IT academic and teaching career and is an experienced trainer and public presenter.
Half Day Afternoon
This tutorial will give you ways of diagnosing and preempting PostgreSQL performance issues, using a wide range of tools and techniques to measure and improve your database's performance. We will cover query optimisation, configuration, and OS settings for your database server, plus pooling, caching, replication, and partitioning strategies that can be used to ensure performance at scale.
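For a taste of the query-optimisation portion, a sketch in plain SQL (the table and column are hypothetical):

    -- See what the planner actually does with a slow query
    EXPLAIN (ANALYZE, BUFFERS)
    SELECT * FROM orders WHERE customer_id = 42;

    -- If that reveals a sequential scan, an index may be the cheapest fix
    CREATE INDEX CONCURRENTLY orders_customer_id_idx
        ON orders (customer_id);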
The target audience for this talk is server administrators and developers working with PostgreSQL, or considering using it. No specific knowledge of PostgreSQL is required but some background in RDBMS or SQL is recommended.
System administrators will benefit by learning about:
- what aspects of server and PostgreSQL configuration affect database performance and how to choose and tweak them
- how to monitor the database server to maintain high performance
Developers will benefit by learning about:
- detecting performance issues in their database usage
- optimising their queries
This tutorial breaks down the various potential causes of performance issues in PostgreSQL and shows how to diagnose, fix, and monitor them:
- Query performance issues
- Choosing the right PostgreSQL configuration within hardware and OS limitations
- Operating system and hardware tweaks that can affect performance
- Optimising database usage
- Monitoring your database and database server's performance
Camille Baldock, Salesforce
Camille Baldock is an infrastructure engineer with the Heroku Department of Data. She works on distributed systems monitoring, operations, automation, and tuning for Heroku Postgres.
All distributed systems make tradeoffs and compromises. Different designs behave very differently with respect to cost, performance, and how they behave under failure conditions.
It's important to understand the tradeoffs that the building blocks in your systems make, and the implications this has for your system as a whole. In this workshop we'll look at several examples of different real-world distributed systems and discuss their strengths and shortcomings.
This workshop will include some practical elements. Attendees will be given some system designs to read and to evaluate, and then we'll discuss the implications of each design together as a group.
People working with distributed systems who want to fill in the blanks as to what "distributed systems" are supposed to be.
They will know the basic building blocks of distributed systems and how to choose between different implementations as needed.
They will know the names and basic details of common distributed systems patterns, why they exist, and what happens when they are not applied correctly.
Distributed Systems Primer
John Looney, Intercom
John Looney is an SRE at Intercom, pretending to be a Product Engineer, improving infrastructure and reliability while pretending to also add features customers want.
Previously, he spent a decade in Google SRE running GFS, Borg, Colossus, Chubby, Datacenter Automation, Ads Quality pipelines, and Ads Serving systems.
He has been on the programme committee of SREcon Dublin for the last three years and presented a 'Large Scale Design' tutorial at LISA in 2012.
Attendees will learn how CI/CD pipelines can increase IT velocity (from Dev to Ops), increase code quality, and lower risk; they will also learn how to implement CI/CD pipelines in two popular tools, Jenkins and GitLab CI.
Infrastructure engineers, system administrators, or DevOps engineers familiar with Git who have to set up or support CI/CD pipelines.
Familiarity with CI/CD concepts; ability to implement CI/CD pipelines using popular tools such as Jenkins and GitLab CI.
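As a preview of the GitLab CI portion of the outline below, a minimal sketch of a .gitlab-ci.yml (the make targets are illustrative):

    # .gitlab-ci.yml: a three-stage pipeline run on every push
    stages:
      - build
      - test
      - deploy

    build_job:
      stage: build
      script:
        - make build

    test_job:
      stage: test
      script:
        - make test

    deploy_job:
      stage: deploy
      script:
        - make deploy
      only:
        - master    # deploy only from the master branch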
- Introduction and orientation
  - Origin of Continuous Integration (CI) at ThoughtWorks
  - Widespread adoption; how CI relates to DevOps
  - Basic tasks: Build, Test, Deploy
- Jenkins
  - Overview and Architecture
  - Definition of Key Terms
  - Building, Testing and Deploying (with hands-on lab)
  - Checking Pipeline status with Jenkins Blue Ocean UI
  - Troubleshooting
- GitLab CI
  - Architecture: GitLab, GitLab CI Multi Runner, ephemeral test environments
  - Definitions: pipeline, stage, job, build, runner, environment, artifact, cache
  - Setting up runners: adding job runners; host instance types (shell, Docker, ssh, etc.); runner/job tags
  - Building, Testing, and Deploying (with hands-on lab)
  - Troubleshooting: build logs; enabling verbose builds; increasing "loglevel"; interactive access to containers
Aleksey Tsalolikhin, Vertical Sysadmin, Inc.
Aleksey Tsalolikhin is a practitioner in the operation of information systems. Aleksey's mission is to improve the lives of fellow practitioners through effective training in excellent technologies. Aleksey is the principal at Vertical Sysadmin, which provides on-site training on UNIX shell basics, version control with Git, configuration management, Continuous Integration/Continuous Deployment, SQL basics, and more.
In this tutorial, you will set up your own Docker cluster using the native orchestration features provided by the SwarmKit library. (SwarmKit has been integrated with the Docker Engine since Docker 1.12.)
Then you will use that cluster to deploy and scale a sample application architected around microservices.
We will cover deployment tips, service discovery, and load balancing; we will show how to integrate Swarm and Compose to obtain a seamless, automated "dev-to-prod" workflow; and we will show how to collect logs and metrics on a containerized platform.
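A taste of the workflow (the service name and image are illustrative):

    # Turn one Docker Engine into a swarm manager...
    docker swarm init
    docker swarm join-token worker   # prints the join command for workers

    # ...then deploy and scale a replicated service across the cluster
    docker service create --name web --replicas 3 -p 80:80 nginx
    docker service scale web=10
    docker service ps web            # where are my containers running?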
To get the most out of this tutorial, you should already be familiar with Docker! If you plan to attend this right after the other tutorial, "Getting started with Docker and containers," you will definitely have to mind the gap.
The tutorial will be hands-on; each attendee will be provided with a cluster of Docker nodes running on cloud VMs. The only software required on your machine is an SSH client (and a web browser).
Folks who were excited by (or forced to deploy) Docker Swarm, but want to go beyond the trivial prototype, implement a seamless dev-to-prod workflow, and tackle logging, metrics, security, etc.
After this tutorial, the audience will know how to map their existing "ops knowledge" of traditional platforms to container platforms.
Docker, cluster, Swarm, orchestration, containers