Abstracts of the Refereed Papers to be Presented
Abstract Titan is a freely available host-based security tool that can be used to improve or audit the security of a UNIX system. It was written almost completely in Bourne shell, with a master script controlling the execution of many smaller programs. Each of the programs either fixes or detects a potential security problem, and the tool's simple and extremely modular design also makes it useful for checking or enforcing a system's adherence to its security policy. Finally, anyone who can write a shell script or program can easily create their own Titan modules. Titan does not replace other security tools, nor does it fix or patch security bugs; its primary purpose is to improve the security of the system it runs on by codifying as many OS-hardening techniques as the authors could think of. When used in combination with other security tools, it can make transforming an "out of the box" system into a firewall or security-conscious system significantly easier. NOTE: Due to time, resource, and expertise limitations, the first release of Titan is only known to run on Solaris, versions 1.x and 2.x. However, many of the small sub-programs within Titan work well on other UNIX variants, and apart from the time needed to create Titan modules for them, there is nothing Sun-specific about Titan that would prevent it from working on other UNIX systems.
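The master-script-plus-modules structure described above lends itself to a very small dispatcher. The sketch below is not Titan's actual code (Titan is written in Bourne shell, this is Python); it only illustrates the pattern, and the module directory and fix/verify flag names are assumptions.

    #!/usr/bin/env python3
    """Illustrative master script in the Titan style (Titan itself is Bourne
    shell): run every executable module in MODULE_DIR with a fix or verify
    flag.  The directory name and flag names are assumptions, not Titan's."""
    import os
    import subprocess
    import sys

    MODULE_DIR = "modules.d"        # hypothetical directory of module scripts

    def run_modules(mode):
        """Run each executable module with the chosen mode; collect failures."""
        failures = []
        for name in sorted(os.listdir(MODULE_DIR)):
            path = os.path.join(MODULE_DIR, name)
            if not os.access(path, os.X_OK):
                continue            # skip anything that is not executable
            if subprocess.run([path, "--" + mode]).returncode != 0:
                failures.append(name)
        return failures

    if __name__ == "__main__":
        mode = sys.argv[1] if len(sys.argv) > 1 else "verify"
        if not os.path.isdir(MODULE_DIR):
            sys.exit("no module directory: " + MODULE_DIR)
        failed = run_modules(mode)
        print("%d module(s) reported problems: %s" % (len(failed), ", ".join(failed)))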
Abstract The CERT Coordination Center is building an experimental information infrastructure management system, SAFARI, capable of supporting a variety of operating systems and applications. The motivation behind this prototype is to demonstrate the security benefits of a systematically managed infrastructure. SAFARI is an attempt to improve the scalability of managing an infrastructure composed of many hosts, where there are many more hosts than host types. SAFARI is designed with one overarching principle: it should impact user, developer, and administrator activities as little as possible. The CERT Coordination Center is actively seeking partners to further this or alternative approaches to improving the infrastructural fabric on which Internet sites operate. SAFARI is currently being used by the CERT/CC to manage over 900 collections of software on three different versions of UNIX on three hardware platforms in a repository (/afs/cert.org/software) that is over 20 GB in size.
SSU: Extending SSH for Secure Root Administration
Christopher Thorpe - Yahoo!, Inc.
Abstract SSU, "Secure su," is a mechanism that uses SSH [Ylonen] to provide the security for distributing access to privileged operations. Its features include shell and per-command access, a password for each user that is distinct from the login password and easily changed, and high portability. By installing SSU, administrators build a solid infrastructure for using SSH to improve security in other areas, such as file distribution and revision control.
System Management With NetScript
Apratim Purakayastha and Ajay Mohindra - IBM T. J. Watson Research Center
Abstract Cost and complexity of managing client machines is a major concern
for enterprises. This concern is compounded by emerging client
machines that are mobile and diverse. To address this concern,
management systems must be easy to configure and deploy, must handle
asynchrony and disconnection for mobile clients, and must be
customizable for diverse clients. In this paper, we first present
NetScript, an environment for scripting with network components. We
then propose a management system built with NetScript, where mobile
scripts invoke components to perform management operations. We
demonstrate that our approach results in a flexible, scalable
management system that can support mobile and diverse client machines.
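As a rough illustration of the idea of mobile management scripts invoking components, the sketch below (not NetScript itself, and written in Python rather than a NetScript language) shows a script that queues management operations while a mobile client is disconnected and invokes registered components once it reconnects; all component and operation names are hypothetical.

    """Illustrative sketch (not NetScript) of a management script that invokes
    named components and queues work for disconnected mobile clients."""
    import queue

    class ComponentRegistry:
        """Maps component names to callables that perform management work."""
        def __init__(self):
            self._components = {}

        def register(self, name, func):
            self._components[name] = func

        def invoke(self, name, **kwargs):
            return self._components[name](**kwargs)

    class ManagementScript:
        """Queues operations so they can run later if the client is offline."""
        def __init__(self, registry):
            self.registry = registry
            self.pending = queue.Queue()

        def schedule(self, component, **kwargs):
            self.pending.put((component, kwargs))

        def run_when_connected(self):
            while not self.pending.empty():
                component, kwargs = self.pending.get()
                self.registry.invoke(component, **kwargs)

    if __name__ == "__main__":
        registry = ComponentRegistry()
        registry.register("install_package", lambda pkg: print("installing", pkg))
        script = ManagementScript(registry)
        script.schedule("install_package", pkg="antivirus-1.2")   # while offline
        script.run_when_connected()                               # later, online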
Abstract Accountworks is a system which allows any employee at Sybase, Inc.
to use a web form to create accounts for new employees. Every new hire
gets a personal account in SQL, Notes, NT, and UNIX administrative
domains. Accountworks also creates initial stub entries in our SQL
personnel database. It allows the user to make a number of initial
choices for their new employee, including access to popular
applications and whether to use Notes or UNIX email. Typically all new
accounts are available within four hours after the web form is
submitted. The system operates 24 by 365 to support our worldwide
infrastructure. When the accounts are created, it guarantees a
consistent, unique login, UID (for UNIX), Firstname.Lastname record,
and password across all domains. It went into full production in July
1997, and has been used to create 1900 new accounts since then.
Because this paper is intended to help anyone tackling cross-domain
account management problems, it describes the architecture of
Accountworks, the process of building it, numerous design decisions,
and future directions of the project.
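A minimal sketch of how cross-domain consistency of logins and UIDs might be enforced is shown below; the domain names, UID range, and collision rules are assumptions for illustration, not the mechanism Accountworks actually uses.

    """Sketch of enforcing one consistent login and UID across several account
    domains, in the spirit of Accountworks.  Domains, UID range, and collision
    handling are assumptions, not the paper's implementation."""

    DOMAINS = ("unix", "nt", "notes", "sql")      # hypothetical domain names

    def propose_login(first, last, taken):
        """Derive a unique login from the employee's name."""
        base = (first[0] + last).lower()[:8]
        candidate, n = base, 1
        while candidate in taken:                 # append a digit on collision
            candidate = "%s%d" % (base[:7], n)
            n += 1
        return candidate

    def allocate_uid(used_uids, start=1000):
        """Pick the lowest free UID at or above `start`."""
        uid = start
        while uid in used_uids:
            uid += 1
        return uid

    def create_accounts(first, last, taken_logins, used_uids):
        """Return one record that every domain receives unchanged."""
        login = propose_login(first, last, taken_logins)
        uid = allocate_uid(used_uids)
        record = {"login": login, "uid": uid, "fullname": "%s.%s" % (first, last)}
        return {domain: record for domain in DOMAINS}

    if __name__ == "__main__":
        print(create_accounts("Ada", "Lovelace", {"alovelac"}, {1000, 1001}))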
Abstract Large organizations are increasingly shifting critical computing
operations from traditional host-based application platforms to
network-distributed, client-server platforms. The resulting
proliferation of disparate systems poses problems for end-users, who
must frequently track multiple electronic identities across different
systems, as well as for system administrators, who must manage
security and access for those systems. Single sign-on mechanisms have
become increasingly important in solving these problems. System
administrators who are not already being pressured to provide single
sign-on solutions can expect to be in the near future. Duke University
has recently embarked on an enterprise-wide single sign-on project.
This paper discusses the various factors involved in the decision to
deploy a single sign-on solution, reviews a variety of available
approaches to the problem of electronic identity proliferation, and
documents Duke's research and findings to date.
Abstract Imagine having to prove everything you believe at one time. That
is exactly what happened when I was asked to change the design of an
Enterprise Backup System to accommodate the backup and restore needs
of a new, very large system. To meet the challenge, I'd have to use
three brand-new pieces of technology and push my chosen backup
software to its limits. Would I be able to send that much data over
the network to a central location? Would I be forced to change my
design? This paper is the story of the Proof of Concept Test that
answered these questions.
Abstract This paper provides the system administrator with a fundamental
understanding of database architecture internals so that he can better
configure relational database systems. The topics of discussion
include buffer management, access methods, and lock management. To
illustrate the concepts in practice and to contrast the architectures of
the two market leaders, Oracle and Sybase implementations are
referenced throughout the paper. The paper describes different backup
strategies and when each strategy is appropriate. In conclusion, the
paper describes special hardware considerations for high availability
and performance of database systems.
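To make the buffer-management topic concrete, here is a toy buffer pool; the LRU eviction policy and fixed page capacity are illustrative assumptions, and real engines such as Oracle and Sybase use their own, more elaborate schemes.

    """Toy buffer pool illustrating buffer management.  LRU eviction and the
    fixed capacity are illustrative assumptions only."""
    from collections import OrderedDict

    class BufferPool:
        def __init__(self, capacity, read_page):
            self.capacity = capacity          # number of pages kept in memory
            self.read_page = read_page        # callback that fetches a page from disk
            self.pages = OrderedDict()        # page_id -> page data, in LRU order
            self.hits = self.misses = 0

        def get(self, page_id):
            if page_id in self.pages:
                self.pages.move_to_end(page_id)      # mark as most recently used
                self.hits += 1
            else:
                self.misses += 1
                if len(self.pages) >= self.capacity:
                    self.pages.popitem(last=False)   # evict least recently used page
                self.pages[page_id] = self.read_page(page_id)
            return self.pages[page_id]

    if __name__ == "__main__":
        pool = BufferPool(capacity=2, read_page=lambda pid: "data-for-%d" % pid)
        for pid in (1, 2, 1, 3, 1):
            pool.get(pid)
        print("hits=%d misses=%d" % (pool.hits, pool.misses))   # hits=2 misses=3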
Abstract This article presents a configuration distribution system that assists system administrators with the tasks of host and service installation, configuration, and crash recovery on large and heterogeneous networks. The objective of this article is twofold: first, to introduce the system's modular architecture; second, to describe the platform-independent protocol designed to support fast and reliable configuration propagation.
An NFS Configuration Management System and its Underlying Object-Oriented Model
Fabio Q. B. da Silva, Juliana Silva da Cunha, Danielle M. Franklin, Luciana S. Varejão, and Rosalie Belian - Federal University of Pernambuco
Abstract This paper describes an NFS configuration and management system
for large and heterogeneous computer environments. It also shows how
this system can be extended to address other services in the network.
The solution is composed of a process that describes the service
configuration and management life-cycle, a modular architecture, and an
object-oriented model. The system supports multiple features,
including automatic host and service installation, service dependency
inference and analysis, performance analysis, configuration
optimization, and service monitoring with problem correction.
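As an illustration of what service dependency inference and analysis can look like in code, the sketch below computes a start order and the set of services affected by a failure from a declared dependency graph; the service names and dependencies are hypothetical, and the paper's object-oriented model is not reproduced.

    """Sketch of dependency inference: compute a safe start order and the
    services impacted by a failure.  The graph below is invented."""
    from graphlib import TopologicalSorter   # Python 3.9+

    # service -> set of services it depends on (assumed, not from the paper)
    DEPENDS_ON = {
        "nfs-server": {"portmap"},
        "automounter": {"nfs-server", "nis"},
        "nis": {"portmap"},
        "portmap": set(),
    }

    def start_order(graph):
        """Order in which services can be started so dependencies come first."""
        return list(TopologicalSorter(graph).static_order())

    def affected_by(failed, graph):
        """All services that directly or transitively depend on `failed`."""
        affected, changed = {failed}, True
        while changed:
            changed = False
            for svc, deps in graph.items():
                if svc not in affected and deps & affected:
                    affected.add(svc)
                    changed = True
        return affected - {failed}

    if __name__ == "__main__":
        print("start order:", start_order(DEPENDS_ON))
        print("impacted if nfs-server fails:", affected_by("nfs-server", DEPENDS_ON))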
Abstract The explosive growth of the World Wide Web has raised great
concerns regarding many challenges - performance, scalability and
availability of the Web system. Consequently, Web site builders are
increasingly constructing their Web servers as distributed systems to
solve these problems, and this trend is likely to accelerate. In
such systems, a group of loosely-coupled hosts will work together to
serve as a single virtual server. Although the distributed server can
provide compelling performance and accommodate the growth of web
traffic, it inevitably increases the complexity of system
administration. In this paper, we exploit the advantages of Java to
design and implement an administration system for addressing this
challenging problem.
Abstract This paper describes the history and operation of the current
version of MRTG as well as the Round Robin Database Tool. The Round
Robin Database Tool is a program which logs and visualizes numerical
data in an efficient manner. The RRD Tool is a key component of the
next major release of the Multi Router Traffic Grapher (MRTG). It is
already fully implemented and working. Because of the massive
performance gain it makes possible, some sites have already
started to use the RRD Tool in production.
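The round-robin idea at the heart of the RRD Tool can be illustrated in a few lines: a fixed number of slots overwritten in rotation, so storage never grows. The sketch below is only that illustration; the slot count is arbitrary, and RRDtool's actual on-disk format and consolidation functions are not shown.

    """Toy round-robin archive: fixed slots overwritten in rotation."""

    class RoundRobinArchive:
        def __init__(self, slots):
            self.values = [None] * slots      # fixed-size storage, never grows
            self.index = 0                    # next slot to overwrite

        def update(self, value):
            self.values[self.index] = value
            self.index = (self.index + 1) % len(self.values)

        def series(self):
            """Return the stored values oldest-first."""
            return self.values[self.index:] + self.values[:self.index]

    if __name__ == "__main__":
        archive = RoundRobinArchive(slots=5)
        for sample in [10, 12, 11, 15, 14, 13, 16]:   # more samples than slots
            archive.update(sample)
        print(archive.series())   # oldest samples have been overwritten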
Abstract In an ideal world the need to provide data communications between
facilities separated by a large ocean would be filled simply. One
would estimate the bandwidth requirement, place an order with a global
telecommunications company, then just hook up routers on each end and
start using the link. Our experience was considerably more painful,
primarily due to three factors: 1) The behavior of some of our
applications, 2) problems with various WAN carrier networks, and 3)
increasing Internet traffic. "Network Ecology" describes the
management of these factors and others that affect network
performance.
Abstract The content of many popular ftp and web sites on the Internet is replicated at other sites, called "mirrors," typically to decrease the network load at the original site, to make information available closer to its users for higher availability, and to decrease the bandwidth requirements these sites place on long-haul network connections, such as international and backbone links. Even though the success of mirroring depends heavily on the selection of a good mirror, there are very few methods to pick one: i.e., a mirror "close" to its user based on network topology. This paper describes a method and two tools developed to locate a "close" mirror among replicated copies of a network service such as ftp, www, irc, or streaming audio by utilizing network topology information based on autonomous systems. Routing information from the Internet Routing Registry is combined with information about the location of mirrors to generate mirroring tables, similar to routing tables, which are used to identify a "close" mirror, where "close" is defined as traversing the minimum number of autonomous systems. The tools
are available via anonymous ftp from ftp.coubros.com.
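A minimal sketch of the "fewest autonomous systems traversed" selection idea follows; the AS adjacency data and the mirror-to-AS mapping are invented for illustration, whereas the tools described above derive them from the Internet Routing Registry.

    """Sketch of picking a "close" mirror by minimizing AS hops."""
    from collections import deque

    # hypothetical AS-level adjacency (who peers with whom)
    AS_NEIGHBORS = {
        64512: [64513, 64514],
        64513: [64512, 64515],
        64514: [64512, 64515],
        64515: [64513, 64514, 64516],
        64516: [64515],
    }
    # hypothetical mirrors and the AS each one sits in
    MIRRORS = {"ftp.eu.example.org": 64516, "ftp.us.example.org": 64514}

    def as_hops(src_as, dst_as):
        """Breadth-first search: number of AS-to-AS hops between two ASes."""
        seen, frontier = {src_as}, deque([(src_as, 0)])
        while frontier:
            asn, dist = frontier.popleft()
            if asn == dst_as:
                return dist
            for nbr in AS_NEIGHBORS.get(asn, []):
                if nbr not in seen:
                    seen.add(nbr)
                    frontier.append((nbr, dist + 1))
        return float("inf")

    def closest_mirror(client_as):
        return min(MIRRORS, key=lambda host: as_hops(client_as, MIRRORS[host]))

    if __name__ == "__main__":
        # the US mirror is 1 AS hop away, the EU mirror 3, so the US mirror wins
        print(closest_mirror(64512))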
Abstract Moving a division of approximately 200 employees from one building to another across town can be a daunting task. It involves coordination among teams from systems administration, networking, facilities, and security as well as support from management and cooperation of the employees being relocated. Contractors and subcontractors are frequently hired to handle physical relocation of goods from one location to another, construction of new server rooms, electrical rewiring, installation of new cooling systems, etc. This paper is the story of how we handled the move and reconfiguration of a network of approximately 1000 nodes over a long weekend in May 1998. Previously published work has discussed some of the issues that challenged us here. The reconfiguration of large numbers of machines has been discussed in [Manning93, Riddle94, Shaddock95]. "Forklift" upgrades of new hardware [Harrison92] share some but not all of the problems we faced in our move. Implementation of new networking topology without the problems or schedules imposed by physical relocation has been discussed in [Limoncelli97]. We believe our
work is unique in requiring all these tasks to happen on a large scale
in a relatively short time. We were allocated only one workday in
addition to a weekend to shut down and relocate our computing
environment. We were expected to have a fully functioning network at
our new location the following Monday. Ordinarily the complete
reconfiguration of a network this size would be a challenge in itself.
For our project, we had to account for the time required to disconnect
and pack machines, load them into trucks, transport them across town,
unload, and reconnect them at the new building. As we will detail, the
resulting window of time available to handle the reconfiguration of
all these machines was very small.
Abstract This paper presents work by many developers of the Athena Computing Environment, done over many years. We aim to show how the various components of the Athena system interact with reference to an individual workstation and its particular needs. We describe Hesiod, Kerberos, the locker system, electronic mail, and the software release process for workstation software, and show how they all interrelate to provide high-reliability computing in a very large network with fairly low staffing demands on programmers and systems administrators.
Bootstrapping an Infrastructure
Steve Traugott - Sterling Software and NASA Ames Research Center
Joel Huddleston - Level 3 Communications
Abstract When deploying and administering systems infrastructures, it is still common to think in terms of individual machines rather than viewing an entire infrastructure as a combined whole. This standard practice creates many problems, including labor-intensive administration, high cost of ownership, and limited generally available knowledge or code usable for administering large infrastructures. The model we describe treats an infrastructure as a single large distributed virtual machine. We found that this model allowed us to approach the problems of large infrastructures more effectively. This model was developed during the course of four years of mission-critical rollouts and administration of global financial trading floors. The typical infrastructure size was 300-1000 machines, but the principles apply equally well to much smaller environments. Added together, these infrastructures totaled about 15,000 hosts. Further refinements have been added since then, based on experiences at NASA Ames. The methodologies described here use UNIX and its variants as the example operating system. We have found that the principles apply equally well, and are as sorely needed, in managing infrastructures based on other operating systems. This paper is a living document:
Revisions and additions are expected and are available at
https://www.infrastructures.org. We also maintain a mailing list for
discussion of infrastructure design and implementation issues -
details are available on the web site.
Abstract In the fall of 1994, Applied Research Laboratories, The University of Texas at Austin (ARL:UT) presented a paper [1] at LISA VIII, describing work that we had performed designing and implementing a management framework for NIS and DNS, called GASH. In the years since that paper was presented, it has become clear that the design of GASH was insufficient to meet the complex, idiosyncratic, and rapidly changing needs of modern networking. GASH suffered from being too inflexible to be rapidly retooled for a changing network environment, from being limited to a single user at a time, and from being unable to provide management services to custom clients. In the face of
these issues, the Computer Science Division at ARL:UT went back to the
drawing board and developed a Java-based directory management
framework based on the design principles presented in our GASH
paper. Ganymede (which stands for the "GAsh Network Manager, Deluxe
Edition," of course) uses a distributed object design built on the
Java Remote Method Invocation [2] protocol and features a multi-threaded, multi-user
server and a graphical, explorer-style client. By supporting
customization through a graphical schema editor, plug-in Java classes,
and external build scripts, Ganymede is able to support a variety of
directory services, including NIS, DNS, LDAP, and even NT user and
group management.
Abstract Cisco Systems has chosen to internally develop an enterprise-wide print system that provides access to more than 2000 printers for both Unix hosts and PCs. The requirements for this print system were that it had to be very cheap to construct, highly scalable, easily maintained by a very small staff, fault tolerant, and reliable enough for mission-critical use. In other words, management essentially wanted everything for practically nothing. To meet our objectives we built our print system out of interchangeable low-cost PCs running Linux, LPD, and Samba, as well as other standard Unix applications. The low cost of PC hardware and the lack of licensing fees for Linux allowed us to deploy the print system very widely without having to go through all the managerial justifications necessary to authorize larger-scale purchases. By making each print server interchangeable we achieved scalability as well as a certain degree of fault tolerance. The flexibility of running a Unix-like operating system such as Linux, as opposed to another more restrictive operating system, allowed us to develop a worldwide printing application that can be managed very easily by only two or three people. And finally, the robustness of Linux made it possible for us to use our print system in mission-critical environments such as manufacturing production floors. This paper discusses the process by which the print system was implemented and the wisdom learned along the way. It covers topics such as how to gain and maintain control of the printing process, why and how to keep printers completely network-managed devices, how we learned to deal with large numbers of servers, the advantages and problems we ran into as the number of servers grew, and the many advantages and few disadvantages of basing the system entirely on free software. It also highlights some of the major processes that we automated and the success we had devolving power first to the local technical support people and then ultimately to the users. Finally, it discusses many of the problems that we are running into now that the print system is a few years old and the steps that we are taking to ensure that we do not become victims of our own success and that the whole system does not collapse due to data rot. Since the real key to managing thousands of printers effectively
is figuring out how to save time, the real-world experience we gained
and the time-saving tips we discovered while learning how to manage
thousands of printers should be valuable even to sysadmins who have
only a few printers to manage.
Abstract The paper describes a project to enhance the print service for CERN. The printer infrastructure consists of over 1000 printers serving more than 5000 Unix users running on workstations of various brands as well as PCs running Linux. In addition, the infrastructure must serve more than 3000 PCs running Windows/95 and NT 4. We support a large number of printer manufacturers, including HP, QMS, Tektronix, Xerox and Apple. Lightweight print clients are provided for all the supported platforms and transparently distributed using the ASIS software repository and the NICE application architecture. They may be used as "drop-in" replacements of the standard vendor clients. Compatibility with older CERN lightweight print clients is provided. Printing with standard vendor clients is also possible. Administrative tools are provided for the general management of print servers and in particular for replicating server configurations and monitoring spool file systems. The service
offers a high level of scalability and fault tolerance, since it has
no single point of failure in the server back-end.
Abstract mkpkg is a tool that helps software publishers create installation packages. Given software that is ready for distribution, mkpkg helps the publisher develop a description of the software package, including manifests, dependencies, and post-install customizations. mkpkg automates many of the painstaking tasks required of the publisher, such as determining the complete package manifest and the dependencies of the executables on shared libraries. Using mkpkg, a publisher can generate software packages for complex software such as TeX with only a few minutes' effort.
mkpkg has been implemented on HP-UX using Tcl/Tk and provides
both graphical and command line interfaces. It builds product-level
packages for Software Distributor (SD-UX).
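A rough sketch of two of the chores mentioned above, building a manifest from a staging tree and recording an executable's shared-library dependencies, appears below; it is not mkpkg itself, the staging path is hypothetical, and the ldd-based dependency check is an approximation rather than mkpkg's SD-UX-aware logic.

    """Sketch (not mkpkg) of manifest building and shared-library detection.
    The staging path is hypothetical and the `ldd` parsing is approximate."""
    import os
    import stat
    import subprocess

    def build_manifest(root):
        """List every file under `root` with its size and permission bits."""
        manifest = []
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                st = os.lstat(path)
                manifest.append((os.path.relpath(path, root),
                                 st.st_size, stat.filemode(st.st_mode)))
        return manifest

    def shared_libraries(executable):
        """Approximate dependency list by parsing `ldd` output (ELF systems)."""
        out = subprocess.run(["ldd", executable], capture_output=True, text=True)
        libs = set()
        for line in out.stdout.splitlines():
            parts = line.split()
            if parts and parts[0].startswith("lib"):
                libs.add(parts[0])
        return sorted(libs)

    if __name__ == "__main__":
        for entry in build_manifest("./staging"):       # hypothetical staging tree
            print(entry)
        print(shared_libraries("/bin/ls"))              # example executable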
Abstract SEPP is an application installation, sharing and packaging
solution for large, decentrally managed Unix environments. SEPP can be
used without making modifications to the organizational structure of
the participants' servers. It provides consistent application setup,
documentation, wrapper scripts and usage logging as well as version
concurrency and clean software removal. This paper first gives an
overview of products already available in this field and then goes on
to describe SEPP.
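The wrapper-script idea can be sketched briefly; the example below uses Python purely for illustration (SEPP's real wrappers are shell scripts), and the application path and log location are assumptions.

    """Illustrative wrapper in the spirit of SEPP's wrapper scripts: log who
    ran which application version, then hand control to the real binary."""
    import getpass
    import os
    import sys
    import time

    REAL_BINARY = "/usr/pack/gimp-1.0/bin/gimp"     # hypothetical installed version
    USAGE_LOG = "/var/log/app-usage.log"            # hypothetical log file

    def log_usage():
        try:
            with open(USAGE_LOG, "a") as log:
                log.write("%s %s %s\n" % (time.strftime("%Y-%m-%d %H:%M:%S"),
                                          getpass.getuser(), REAL_BINARY))
        except OSError:
            pass   # never block the user just because logging failed

    if __name__ == "__main__":
        log_usage()
        # replace this process with the real application, preserving arguments
        os.execv(REAL_BINARY, [REAL_BINARY] + sys.argv[1:])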
Abstract The combination of large networks, frequent operating system
security patches, and software updates can create a daunting task for
a systems administration team. This paper presents a system created to
address these challenges with system security and "uptime" as the
primary concerns. By using a file-form "database," the Synctree
system holds a full network's configuration in an understandable,
secure location. This paper also compares this system with
previously published works.
Abstract Rapid growth of a computing environment presents a recurring theme of running out of resources. Meeting the challenges of building and maintaining such a system requires adapting to the ever-changing needs brought on by rampant expansion. This paper discusses the evolution of our computer network from its origins in the startup company NexGen, Inc. to the current AMD California Microprocessor Division (CMD) network that we support today. We provide highlights of some of the problems we have encountered along the way, some of which were solved efficiently and others of which provided lessons to be learned. The
reengineering of computer networks and system environments has been
the subject of numerous papers including [Harrison92, Evard94b,
Limoncelli97]. Like the others, we discuss topics related to
modernization of our systems and the implementation of new
technologies. However, our focus here is on the problems caused by
rapid growth. With increasing requirements for more compute power and
the availability of less expensive and more powerful computers, we
believe that other environments are poised for rapid growth such as
ours. We hope that lessons learned from our experience will better
prepare other system administrators in similar situations.
Abstract Present-day computer systems are fragile and unreliable. Human
beings are involved in the care and repair of computer systems at
every stage in their operation. This level of human involvement will
be impossible to maintain in future. Biological and social systems of
comparable and greater complexity have self-healing processes which
are crucial to their survival. It will be necessary to mimic such
systems if our future computer systems are to prosper in a complex and
hostile environment. This paper describes strategies for future
research and summarizes concrete measures for the present, building
upon existing software systems.
Abstract Analyzing and monitoring logs that portray system, user, and network activity is essential to meet the requirements of high security and optimal resource availability. While most systems now possess satisfactory logging facilities, the tools to monitor and interpret such event logs are still in their infancy. This paper describes an approach to relieve system and network administrators from manually scanning sequences of log entries. An experimental system based on unsupervised neural networks and spring layouts to automatically classify events contained in logs is explained, and the use of complementary information visualization techniques to visually present and interactively analyze the results is then discussed.
The system we present can be used to analyze past activity as well as
to monitor real-time events. We illustrate the system's use for event
logs generated by a firewall; however, it can easily be coupled to any
source of sequential and structured event logs.
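The paper's classifier is an unsupervised neural network; as a much simpler stand-in, the sketch below only illustrates the underlying goal of collapsing many similar firewall log lines into a handful of event classes. The sample log lines and masking rules are invented.

    """Far simpler stand-in for the paper's neural classifier: group log lines
    by masking their variable parts, so similar events collapse together."""
    import re
    from collections import Counter

    def template(line):
        """Mask the variable parts (addresses, ports, counts) of a log line."""
        line = re.sub(r"\b\d{1,3}(\.\d{1,3}){3}\b", "<ip>", line)
        line = re.sub(r"\b\d+\b", "<n>", line)
        return line

    def classify(lines):
        """Count how many raw events fall into each masked template."""
        return Counter(template(line) for line in lines)

    if __name__ == "__main__":
        sample = [
            "deny tcp 10.0.0.5:4312 -> 192.168.1.1:23",
            "deny tcp 10.0.0.9:4313 -> 192.168.1.1:23",
            "allow udp 10.0.0.5:53 -> 192.168.1.7:53",
        ]
        for tmpl, count in classify(sample).most_common():
            print("%4d  %s" % (count, tmpl))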
Abstract Electronic mailing lists are ubiquitous community-forging tools that serve the important needs of Internet users, both experienced and novice. The most popular mailing list managers generally use textual mail-based interfaces for all list operations, from subscription management to list administration. Unfortunately, anecdotal evidence suggests that most mailing list users, and many list administrators and moderators, are novice to intermediate computer users; textual interfaces are often difficult to use effectively. This paper describes Mailman, the GNU mailing list manager, which offers a dramatic step forward in usability and integration over other mailing list management systems. Mailman brings to list management an integrated Web interface for nearly all aspects of mailing list interaction, including subscription requests and option settings by members, list configuration and Web page editing by list administrators, and post approvals by list moderators. Mailman offers a mix of robustness, functionality and ease of installation and use that is unsurpassed by other freely available mailing list managers. Thus, it offers great benefits to site administrators, list administrators and end users alike. Mailman is primarily implemented in Python, a free, object-oriented scripting language; there are a few C wrapper programs for security. Mailman's architecture is based on a centralized list-oriented database that contains configuration options for each list. This allows for several unique and flexible administrative mechanisms. In addition to Web access, traditional email-command based control and interactive manipulation via the Python interpreter are supported. Mailman also contains extensive bounce and anti-spam devices. While many of the features
discussed in this paper are generally improvements over other mailing
list packages, we will focus our comparisons on Majordomo, which is
almost certainly the most widely used freely available mailing list
manager at present.
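The centralized, list-oriented database can be pictured as one record of options per list, shared by every interface; the sketch below illustrates that idea only, and the option names and pickle-per-list layout are assumptions rather than Mailman's actual schema.

    """Sketch of a list-oriented configuration store: one record of options per
    list, read and modified by web, email, and interactive interfaces alike."""
    import os
    import pickle

    DB_DIR = "lists"   # hypothetical directory holding one file per list

    DEFAULTS = {
        "moderated": False,
        "max_message_size_kb": 40,
        "subscribe_policy": "confirm",
    }

    def load_list(name):
        path = os.path.join(DB_DIR, name + ".pck")
        if os.path.exists(path):
            with open(path, "rb") as fh:
                return pickle.load(fh)
        return dict(DEFAULTS)

    def save_list(name, config):
        os.makedirs(DB_DIR, exist_ok=True)
        with open(os.path.join(DB_DIR, name + ".pck"), "wb") as fh:
            pickle.dump(config, fh)

    if __name__ == "__main__":
        cfg = load_list("announce")
        cfg["moderated"] = True          # e.g., changed from the web admin page
        save_list("announce", cfg)
        print(load_list("announce"))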
Abstract This paper describes a set of tools and procedures which allow very large mailing lists to be managed with the freeware tool of the administrator's choice. With the right approach, scaling technology can be applied to a list management tool transparently. In recent years, many ingenious methods have been proposed for handling email deliveries to mailing lists of several thousand subscribers. Administration of a mailing list is not limited to message delivery, however. Tasks such as managing subscribers, dealing with mail bounces, and preventing list spamming also become more difficult when applied to very large lists. As a case study, this paper
describes the process of moving the well-known "Firewalls" mailing
list from its original home at GreatCircle Associates to a new
infrastructure at GNAC. The process was thought to be straightforward
and obvious, but it soon became apparent that it was neither. We trust
that our discoveries will benefit other systems administrators
undertaking similar projects, either concerning large mailing lists or
moving complex "legacy systems."
Abstract Tracking tasks remains one of the most difficult issues facing any working team of administrators. Even with the implementation of commercial tools available today, e-mail and hallway conversations remain the standard for task management in many organizations; however, these make it difficult and time-consuming to remain current on issues, and do nothing to summarize the long-term history of tasks and their completion. Many commercial tools are available to handle task management, and most work quite well for stereotypical models of their intended environments - development teams, help desks, etc. Unfortunately, these systems often have limitations which prevent their use (or simple deployment) in a pre-existing, working environment. Other systems are difficult or time-consuming to use, and remain ignored in favor of task accomplishment. Few freely available systems provide the facilities to analyze productivity, generate statistics, and otherwise please management. Request v3 was designed to provide the necessary essentials for modern task management: a selection of user interfaces, support for multiple database backends, flexible security controls, and extensive reporting capabilities. It runs cleanly in heterogeneous environments, including those that have a large installed base of Windows users. It includes command line, e-mail, and web interfaces, in addition to an Extension Interface which provides a simple way to access the Request system from other programs, scripts, or any custom interface one may create. The authentication, notification, data storage, and logging functions are processed within separate modules, allowing a variety of backend databases to be supported.