The following paper was originally published in the Proceedings of the 1997 USENIX Annual Technical Conference, Anaheim, California, January 6-10, 1997.

For more information about the USENIX Association contact:
1. Phone: 510 528-8649
2. FAX: 510 548-5738
3. Email: office@usenix.org
4. WWW URL: https://www.usenix.org

Protected Shared Libraries - A New Approach to Modularity and Sharing

Arindam Banerji, Hewlett-Packard Laboratories (axb@hpl.hp.com)
John Michael Tracey, IBM T.J. Watson Research Center (jtracey@watson.ibm.com)
David L. Cohn, University of Notre Dame (dlc@cse.nd.edu)

ABSTRACT

Protected Shared Libraries, or PSLs, are a new type of support for modularity that forms a basis for building flexible library-based operating system services. PSLs extend the familiar notion of shared libraries with protected state and data sharing across protection boundaries. Protected state information allows PSLs to be used to implement sensitive operating system services. Sharing of data across protection boundaries yields significant performance benefits. These features make PSLs a viable basis on which a complete operating system can be built largely as a set of dynamically loadable libraries without compromising protection or sacrificing performance. PSLs also allow highly flexible implementations of new functionality to be added to current commercial operating systems. A prototype PSL implementation has been built into AIX 3.2.5 and early performance results are encouraging.

1. Introduction

Software flexibility relies on modularity, the ability to modify or replace individual software components easily. Modularity in turn relies not only on the software's internal structure, but also on the degree to which modularity is supported by the underlying operating system and the efficacy of that support. It is not surprising, therefore, that traditional monolithic systems which lack comprehensive support for modularity are characteristically inflexible and difficult to develop and maintain. Production of highly adaptable and manageable systems relies on the development of modularity support which is flexible, efficient, and easy to use.

Attempts to produce modular operating systems have generally followed one of two approaches. The first is to separate an existing operating system kernel into a microkernel, which provides a basic set of fundamental constructs, and one or more user-level server tasks, which run on top of the microkernel and provide operating system services [Black et al. 92] [Rozier et al. 92]. This approach has been applied to a number of commercial operating systems [Batlivala et al. 92] [Borgendale et al. 94] [Golub et al. 90] [Golub et al. 93] [Malan et al. 90] [Phelan et al. 93] [Wiecek et al. 93]. The second approach is to design an entirely new operating system emphasizing flexibility using object-oriented technology, which generally includes language support. The second approach has primarily been relegated to academic and research environments [Bershad et al. 95] [Campbell et al. 93] [Yokote 92].

Neither of these approaches features adequate flexibility, efficiency, and ease of use. By itself the microkernel approach conveys separation of an operating system kernel along only a single line, the kernel-user boundary. Finer-grained decomposition of both the kernel-level and user-level portions remains an issue.
Also, decomposition of the user-level portion into multiple user-level server tasks may be inefficient due to overhead associated with task-based protection [Condict et al. 93] [Ford & Lepreau 94] [Lepreau et al. 93] [Maeda & Bershad 93]. The language-based object-oriented approach is generally applicable only to new systems.

An alternate approach to modularity, one which provides sufficient flexibility and efficiency and is easily applicable to operating systems, is needed. Protected Shared Libraries (PSLs) are just such an approach. PSLs extend the familiar notion of shared libraries by adding support for protected state and allowing data to be shared across protection boundaries. Protected Shared Libraries consist of two separate mechanisms: Protected Libraries and Context Specific Libraries.

Protected Libraries associate access to specific state information with each library entry point. Entrance to a library routine, which can be enacted only via a defined entry point, conveys access to data associated with the routine; access is revoked when the routine is exited. Similarly, access to a process's data segment is revoked upon entry to a Protected Library routine and restored upon returning from it. Thus, library and client data are protected from each other's code.

Context Specific Libraries, or CSLs, share a single copy of code at a single address as traditional shared libraries do, but offer significantly increased flexibility regarding shared data. They may be seen as a mechanism for encapsulating information that needs to be shared between protection domains. CSLs allow data to be shared between clients and a service in different ways. A CSL may share a single copy of data between multiple clients, such that all clients see the same data at identical locations.
Alternatively, a CSL may share data between a client and a service at a single location such that the actual contents of the shared region are associated with the calling (client) or the called (service) domain. Later sections present details of the various forms of sharing supported.

The PSL infrastructure is based on two distinct hypotheses. First, given the benefits of shared libraries, it may be easier to compose user-level system services as sets of cooperating shared libraries rather than as separate processes. Associating protection with shared libraries would allow modular library-based system services to be constructed to replace traditional servers and perhaps even some privileged mode components such as loadable virtual file systems. The ease with which shared libraries can be loaded, unloaded and dynamically relocated provides flexibility not easily attained with cooperating processes. Second, sharing between cooperating entities has to be an intrinsic part of the programming model and not retrofitted through facilities such as mmap. CSLs allow programmers to share information in libraries and control the exact nature of sharing. This opens up possibilities for easily creating UNIX u_block [Goodheart & Cox 92] implementations, sharing I/O buffers across protection domains [Khalidi & Nelson 93] and sharing closures and objects across protection boundaries [Banerji et al. 94a].

The remainder of this paper proceeds as follows. The next section presents the motivation for Protected Shared Libraries, both to improve modularity and to facilitate sharing. After that, PSL semantics are described. Implementation issues are discussed in Section 4. Performance results from a prototype PSL implementation are presented in Section 5, and finally, Section 6 presents a brief discussion of the contributions made by Protected Shared Libraries.

2. Design Motivation

Protected Shared Libraries are motivated by two factors. First, passive protection domains, particularly shared libraries, provide an excellent basis for software modularity. Second, efficiency requires that cross-domain interactions use shared data. Discussion of these observations continues in the next subsection. Following that, the overall PSL design approach is described.

2.1 Shared Libraries as Protection Domains

Enforced protection boundaries have been found to be a very effective software structuring tool, especially for large systems [Nelson 91] [Bogle 94] [Khalidi & Nelson 93] [Chase 94]. Protection can be enforced through a variety of means including separate address spaces [Accetta et al. 86], language support [Nelson 91], and post-processing of binary code [Wahbe 93]. Each of these approaches has been used to increase modularity and security and to facilitate debugging of large software systems. Protection has also been used to ease modification or replacement of software components [Pu 95] [Khalidi & Nelson 93] [Orr 92].

Most work regarding enforcement of protection boundaries in current operating system software, both user-level and kernel-level, has been focused on improving the efficiency of cross-domain invocations. Invocation times have been significantly reduced by handoff scheduling [Black 89] and thread migration [Bershad 90] [Lepreau et al. 94] [Hamilton & Kougiouris 93]. These protection domains, however, have typically been associated with processes. Counterexamples exist [Organick 72] [Scott et al. 90] [Wulf et al. 81], but have generally been limited to research efforts encompassing entirely new operating systems. The only system known to use passive protection domains effectively in a commercial operating system is Mach 4.0 [Lepreau et al. 94], but even that implementation is closely tied to the notion of processes.
Our focus on using passive abstractions to represent protection domains is based both on the experience of others [Carter et al. 93] and on our own [Banerji et al. 94a], which clearly demonstrates the advantages of passive modularity. Use of passive entities, as opposed to active processes, to represent protection domains has usually taken one of two forms: objects, as in Clouds and Psyche, or shared libraries, as in Multics. A strong case for the support of passive objects as the basic structuring mechanism has been made [Ford 93]. The most important advantages of passive protection domains are their ability to better represent the common case of synchronous communication, their documented ability to support optimized implementations [Druschel 92] [Chase 94] [Carter et al. 93], and the ease with which they can be managed in user-level client code. The last advantage is especially important in making shared libraries a good vehicle for passive protection domains.

2.1.1 Shared Library Limitations

Most commercial operating systems support shared libraries in one form or another, but semantics vary from system to system. On most UNIX systems, library code is shared, but each client process accessing a shared library gets its own copy of library data. This copy gets mapped into the process's private data segment and is, therefore, equally accessible by client and library code. Some systems such as OS/2 [Deitel & Kogan 92] and Windows [King 94] allow shared or dynamically linked libraries (DLLs) to contain shared data as well as code, but offer little or no protection. Each client task accessing a DLL has equal access to the DLL's data. Malicious or errant clients can, therefore, corrupt shared data and adversely impact other clients.

What is desired, then, is shared library support that features both per-client data as in UNIX and global data as in OS/2 and Windows, but with protection.
Specifically, we desire protection of library data, including per-client data, from client code, and of client data from library code.

2.2 Cross-Domain Sharing

Increasingly, system software complexity is being addressed through use of modular protection domains. Cross-domain interactions usually take the form of fast RPC mechanisms that circumvent much of the traditional in-kernel RPC code path [Bershad 90] [Hamilton & Kougiouris 93] [Condict et al. 93]. Efficiency of fast RPC mechanisms has been improved through use of shared message buffers [Bershad 90]. Some optimized implementations, such as the Fbufs approach [Druschel & Peterson 93], improve throughput by two orders of magnitude. Efficiency concerns, therefore, make a compelling case for sharing. Sharing is also indicated by structural considerations.

Cross-domain sharing has been used to improve structure in various parallel programming models [Scott et al. 90], and to support persistent databases [Bogle 94] [Chase 94] and shared object frameworks [Campbell et al. 93] [Banerji et al. 94]. Sharing enables cooperation between domains with limited trust [Chase 94]. Thus, sharing can be used to support a variety of interactions including producer-consumer, non-intrusive monitoring, asynchronous service providers [Bogle 94], shared pipes, stateless servers with client-maintained state, and shared objects exported by servers [IBM 93].

There is a third reason for sharing across protection domains. Most implementations of cross-domain object interactions include a fair amount of overhead for the locally distributed case [IBM 93] [Janssen 95]. Thus, sharing object instances [Banerji et al. 94] and passing closures between protection domains on the same machine are usually inefficient. Most of these problems may be solved by judicious sharing of data and addresses between interacting protection domains.
Such techniques can drastically reduce the cost of object interaction between protection domains on the same machine.

Capturing this rich yet diverse set of structuring possibilities in a uniform abstraction requires careful design. Adequate programming support is needed so the benefits of sharing may be fully and easily exploited.

2.2.1 Programming Support for Sharing

Two attributes make shared information attractive to programmers.

· First, the ability to share pointer-rich data across domains is attractive. This ability has been found to be useful for persistent stores [Chase 94], shared C++ objects [Banerji et al. 94], distributed shared data and system software. The obvious argument against uniform sharing is the need to reserve portions of an address space. This concern is decreasing with the increasing popularity of large effective address spaces, such as the 52-bit global address space and the 64-bit non-segmented address space in the POWER [Weiss 94] and Alpha architectures respectively. The efficiency advantages of avoiding pointer transformations in various applications and system software, as well as programmer convenience, are significant.

· Second, the ability to refer to shared data through symbolic names that maintain their meanings across domains is attractive. A good example of this is the ability to call the shared "libc" version of malloc uniformly from any process that links in the shared libc library. This facility can easily be extended to shared data by involving the linker or loader in the manipulation of shared information. This approach can be seen in the shared libraries of systems as diverse as Multics, OS/2 and Hemlock [Garrett et al. 93].

Clearly, with a little system support, uniform addressing and naming can be integrated into relocatable object modules, as has been done with shared libraries in Multics, Hemlock and OS/2.
In current implementations, however, the available sharing mechanisms provide limited flexibility. Context Specific Libraries are motivated by the observation that, with a few system extensions, different modes of sharing, along with uniform addressing and naming, can easily be integrated with the common notion of shared libraries. Although other efforts have provided improved forms of sharing, few have been integrated with commercial operating systems.

3. Protected Shared Library Semantics

Protected Shared Libraries extend traditional shared libraries in two ways, with protected state data and cross-domain data sharing. Protected Libraries protect library and client data from each other's code. Context Specific Libraries allow global, client-specific and domain-specific data to be shared across protection domains. The semantics of each of these mechanisms are now described in detail.

3.1 The Mechanisms

Protected Libraries improve modularity and security and facilitate debugging of large software systems by enforcing protection domains between client and library code. Previously, other approaches to protection based on active entities such as processes have been used to improve modularity. Protected Libraries investigate the alternative of using dynamically loadable passive shared libraries to enforce protection. This idea is based on efforts such as Psyche [Scott et al. 90] and Multics [Organick 72], both of which supported protected dynamically loadable object modules.

Context Specific Libraries represent modules of code and data that may be shared in various forms between different protection domains. They represent a communication channel between protection domains and thus augment traditional RPC mechanisms.
CSLs extend the notion of cross-domain data and address sharing as found in Fbufs [Druschel & Peterson 93], the zero-copy I/O framework and most implementations of the UNIX u-block [Leffler et al. 89]. Together, these mechanisms form a coherent set of structuring and communication abstractions which support construction of system software at user level.

3.2 Protected Libraries

Protection domain semantics are determined primarily by resource management. Traditional protection domains such as processes act as containers of resources such as memory, threads, file handles, and semaphores. With process-based protection, resource management during control transfer from one domain to another is fairly simple: resources belonging to the currently running process are accessible. Resource management is more complex with library-based protection because resources are not typically associated with libraries.

Resources may be categorized into memory resources, such as client and service data, and non-memory resources, such as file handles and semaphores. A programmer using Protected Libraries typically need be concerned only with the handling of memory resources. Handling of non-memory resources is more complicated and usually only relevant to advanced programmers and library authors. Prior to describing resource management issues in detail, we define several terms.

3.2.1 Definitions

Two threads executing concurrently within a protection domain always see the same set of memory resources. A UNIX process constitutes a primary or root protection domain in which execution is initiated. A Protected Library is viewed as a secondary protection domain which is always associated with one or more primary domains. Execution in a secondary domain is always initiated by a thread entering it from another primary or secondary domain.

A thread is the primary unit of execution and can traverse protection domains.
As in most multi-threaded process models, each thread owns a small set of per-thread resources such as scheduling and accounting information. These resources, usually encapsulated in a shuttle [Hamilton & Kougiouris 93], belong to the thread and remain associated with it as it traverses protection domains. Most resources a thread accesses belong to the domain in which it is currently executing. All threads within a domain have equal access to the domain's resources. Threads could conceivably be created in either primary or secondary domains, but thread creation is currently restricted to primary domains.

If Protected Library calls were the only form of cross-domain interaction, thread movement would be restricted to one primary domain and its associated secondary domains. Alternate mechanisms such as thread migration would allow a thread to move between multiple primary domains. The following discussion is simplified by limiting cross-domain control transfers to Protected Library invocations and eliminating consideration of other forms of IPC.

3.2.2 Memory Resources

A Protected Library is an enforced unit of modularity which contains code and data. Any stateless or stateful service that requires protection from either multiple users or from other software components can be implemented in a Protected Library. Protected Libraries can be viewed as regular shared libraries that export protected entry points. Data associated with a Protected Library is accessible only once the library has been entered via a defined entry point. To summarize, a Protected Library is a passive protection domain that can be entered only through defined entry points.

Figure 1 depicts a Protected Library that does not use any Context Specific Libraries. A client process cannot access any of the service data until it calls a defined entry point.
A thread starts executing in the client domain, where it can access only the process's own private code and data. This includes code and data defined in the client program and in any ordinary libraries linked with it. Upon calling a defined service entry point, the thread enters the service protection domain. In doing so, the thread loses access to its own private code and data and gains access to service code and data.

Figure 1 Protected Library

This description illustrates the similarity between Protected Libraries and encapsulated objects. However, Protected Libraries differ from encapsulated objects in three respects. First, Protected Libraries enforce protection boundaries; admittedly, some object implementations also do this. Second, Protected Libraries allow for direct information sharing between a client and a service, through Context Specific Libraries as described shortly. Finally, because Protected Libraries are actually protection domains, their semantics encompass operating system and user environment resource management.

3.2.3 Non-Memory Resources

Protected Libraries use the "sum of resources" model to specify accessibility of non-memory resources when executing in a secondary, or Protected Library, domain. A thread's resource set while executing in a Protected Library is the sum of a set of resources passed from the original primary domain, a set of resources associated with the library domain, and a subset of the per-thread resources associated with a thread no matter what domain it is executing in. The "sum of resources" model affects only Protected Shared Library (secondary) domains and not the root or primary domain, which is represented by a regular process context. This follows the Hydra [Wulf et al. 81] model of resource management, which allows certain sets of resources to be passed essentially as parameters into a domain along with a call to the domain.
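The "sum of resources" rule can be sketched with a toy capability-bitmask encoding. The encoding, the mask, and the function name are our illustration; the paper does not prescribe any particular representation of a resource set.

```c
#include <stdint.h>

/* Hypothetical encoding: each bit names one non-memory resource
 * (a file handle, a semaphore, a signal vector, ...). */
typedef uint32_t resource_set;

/* Resources a thread carries regardless of domain (its "shuttle"
 * state, e.g. scheduling and accounting), modeled as the low bits. */
#define PER_THREAD_MASK 0x0000000Fu

/* "Sum of resources": inside a secondary (Protected Library) domain,
 * a thread sees the resources explicitly passed in from the primary
 * domain, plus the library domain's own resources, plus the subset of
 * per-thread resources that always travels with the thread. */
resource_set effective_resources(resource_set passed_in,
                                 resource_set library_owned,
                                 resource_set per_thread) {
    return passed_in | library_owned | (per_thread & PER_THREAD_MASK);
}
```

The point of the sketch is that the passed-in set is a parameter of the cross-domain call, so a Protected Library's interface specification can name exactly which resources a client must hand over.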
The set of resources that must be explicitly passed when entering a secondary domain is part of a Protected Library's interface specification.

For the most part, programmers building Protected Libraries can safely ignore the notion of passing resources from one domain to another. Default specifications allow this detail to be disregarded, in general, without complications. A programmer might have to choose a particular set of resources to be passed into a domain in the case of exception vectors, such as UNIX signal handlers, for user-level threads. Even in the case of UNIX signals, however, determining which signals should be handled on a per-domain basis and which must be handled by the primary domain would generally be straightforward.

3.3 Context Specific Libraries

Context Specific Libraries (CSLs) add sharing primitives to the PSL abstraction. CSLs are units of modularity, but not protection, that may be accessible by multiple domains simultaneously. CSLs contain information that must be shared across multiple domains. The CSL is mapped into different protection domains depending on the type of sharing required. CSLs thus extend traditional cross-domain RPC, in which information is shared through parameters only, by adding communication via shared memory. CSLs are not protection domains but communication channels that may be associated with protection domains to share information. Depending on the type of sharing, as described below, a CSL may be viewed as a fixed piece of information accessible by all domains, as mapped information that moves with a thread across domains, or in other ways. A programmer need decide only what information resides in the CSL and what kind of sharing is required. Mapping and sharing of CSLs in multiple domains is handled by the PSL implementation.

3.3.1 Context Specific Library Properties

A CSL is uniformly shared and named in all domains in which it is visible.
This implies that symbol names and addresses seen by different domains are consistent for a given CSL. It does not imply that all domains that use a particular CSL necessarily see the same contents. However, all domains using a particular CSL see the same resolved names at the same addresses; they may or may not see the same contents depending on the type of CSL.

As discussed in the previous section, this approach to sharing information between domains has considerable benefits, such as the ability to exchange pointers between domains and to deal with shared data through symbolic names rather than pointers. This advantage results from encapsulating information in shared libraries. Use of shared libraries rather than a facility such as mmap or shared memory IPC implies that shared information is subject to relocation during loading. This allows the uniform relocation and address space allocation required for uniform naming and sharing to be implemented relatively easily. CSLs may, therefore, be called and used from PSL-based protection domains or the originating root domain with exactly the same effect. From the point of view of the calling protection domain, CSLs look like regular shared libraries with different data sharing characteristics. CSLs appear to execute in the context of the calling domain, with one exception.

A CSL module executes in the kernel context of the calling domain. The kernel resources available to CSL code are, therefore, those of the client domain calling the CSL. A CSL also maintains its own user-level context. The reason for this is based on programming experience with CSLs, which are frequently used to allocate and manipulate shared data. In such cases, it is extremely useful to depend on a memory allocation or malloc function that is specialized towards the kind of sharing supported by the particular CSL.
Thus, a programmer requiring a particular kind of data sharing encapsulates data and its manipulating code in a CSL and uses regular malloc calls. This frees the programmer from explicitly having to deal with multiple versions of malloc.

CSLs support three distinct types of sharing, each intended to support a different type of interaction between protection domains. In all three types, each protection domain accessing the CSL views the CSL at the same address. The three types of sharing differ in the contents seen by different domains.

3.3.2 Global Context Specific Libraries

Global CSLs are used to share data and addresses among multiple domains. A Global CSL is depicted in Figure 2. Global CSLs feature a single instance of their associated data. This instance is mapped at a single address into the address space of each primary or secondary protection domain accessing the CSL. The global sharing model relies on clients of the shared data using voluntary locking protocols to maintain coherency.

Figure 2 Global Context-Specific Library

One example use of a Global CSL is for a global memory allocation facility that can be used by multiple processes and PSLs to support globally shared data. In this scenario, both addresses and their associated data must be shared. Nearly the same effect could be achieved by mmaping files between different protection domains. However, the involvement of the linker in the creation of CSLs allows reference to data variables via symbolic names instead of through pointers. Global CSLs thus simplify use of shared data.

3.3.3 Client Context Specific Libraries

Client CSLs facilitate sharing of process-specific information between client and library domains. Shared information, encapsulated in a library, is mapped into each process's address space as shown in Figure 3. All processes get their own copy of the data, but the data is located at identical locations in all domains.
Upon calling a Protected Library entry point, the CSL data of the current primary (process) domain gets re-mapped into the service domain. Thus, Protected Library code sees the CSL data belonging to the current process. Figure 3 shows that when process one calls the service, its CSL data gets mapped into the library domain. With this form of sharing, the client CSL gets mapped and unmapped as a protected call traverses multiple domains. Synchronization mechanisms based on voluntary locks may be used to ensure no two threads of the same parent process simultaneously modify shared data.

Figure 3 Client Context-Specific Library

Client CSLs can be used to implement client-specific meta-data. A Protected Library which implements a file system, for example, might maintain process-specific meta-data at a certain fixed address. Whenever a thread of a particular process invokes the file system library, the library code automatically refers to the client-specific meta-data. A similar principle is used to implement u-blocks in most commercial UNIX implementations. Per-client meta-data may also be used to maintain per-client method/function tables. This allows client-process-specific behavior to be automatically encapsulated within Client CSLs. Any change to client-specific behavior is thus easily limited to a particular process.

3.3.4 Domain Context Specific Libraries

With a Domain Context Specific Library, depicted in Figure 4, addresses are shared, but data is not. With a Domain CSL, a distinct copy of the library data is maintained for each protection domain. The data is mapped at the same address in each domain. This form of address sharing includes both static and dynamic data. Dynamic allocation of memory in a Domain CSL may be viewed as acquisition of a shared resource, specifically the addresses shared by all clients of the Domain CSL.
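This address-sharing discipline can be sketched as follows. The names and the in-process simulation are ours, not the PSL interface: a single shared allocator hands out offsets that are valid in every domain, while each domain keeps its own private backing data at those offsets.

```c
#include <stddef.h>

#define CSL_SIZE 1024
#define NDOMAINS 2

/* Sketch of a Domain CSL: offsets into the CSL are allocated once,
 * globally, so no two domains ever reuse the same address; the
 * backing bytes, by contrast, exist once per domain. */
static size_t next_offset;               /* shared address allocator */
static char backing[NDOMAINS][CSL_SIZE]; /* one data copy per domain */

/* Allocate an offset valid in every domain.  In a real multi-domain
 * setting this step would take the allocator lock. */
ptrdiff_t domain_csl_alloc(size_t size) {
    if (next_offset + size > CSL_SIZE) return -1;
    size_t off = next_offset;
    next_offset += size;
    return (ptrdiff_t)off;
}

/* Resolve the shared offset to this domain's private copy.  No lock
 * is needed on access, since each domain owns its own data. */
char *domain_csl_at(int domain, ptrdiff_t off) {
    return &backing[domain][off];
}
```

The sketch shows why locking is needed only during address allocation: the address space of the CSL is the sole shared resource, while the data behind each address is domain-private.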
If a particular domain allocates memory in a Domain CSL, the allocated addresses may not be reused by any other domain. Consequently, in Domain CSLs locking is supported during address allocation, but no locks are necessary when data is accessed because every domain has its own data.

Figure 4 Domain Context-Specific Library

One possible use of Domain CSLs is for sharing of C++ objects that contain pointers to virtual function tables (vtbls). In this case, the vtbl must be located at the same address in every domain, but its contents must be unique per domain [Banerji et al. 94]. Thus, the address of a function table can be shared while ensuring the contents of the table are unique per domain.

3.4 Potential Uses of PSLs

Protected Shared Libraries are a valuable tool for structuring systems. PSLs provide obvious benefits including the efficiency associated with the use of shared data and passive as opposed to active protection domains. PSLs also provide more subtle benefits as described below.

3.4.1 Scope Management

Protected Libraries and CSLs may be used to efficiently implement scope management in systems that use meta-object protocols for extensibility. Meta-object protocol based implementations offer two sets of interfaces; one provides access to normal functionality, the other optionally allows manipulation of the service implementation. The two interfaces associated with meta-object protocols are depicted in Figure 5. A critical issue in meta-object protocol implementations is ensuring that changes made to a service implementation by a client affect only the client making the changes. This issue is referred to as scope management.

Figure 5 Meta-Object Protocol

Scope management is often implemented by maintaining per-client function or method tables.
Creating these per-client method tables in client CSLs as shown in Figure 6 ensures that the service protected library always "sees" the method tables of the current calling root domain. Most UNIX implementations use a similar facility to implement u_blocks. However, this requires that u_block addresses be hard-coded, and thus the functionality is limited to in-kernel co-location of u_blocks. Client CSLs eliminate the need for hard-coding addresses and open up the functionality to any protected library client.

Figure 6 Scope Management with PSLs

3.4.2 Locally Distributed Objects

Distributed object implementations that cross machine boundaries need marshalling/unmarshalling and method table pointer initializations because shared memory facilities often do not extend across machines. Most marshalling/unmarshalling and method table pointer manipulations are unnecessary in distributed object implementations that do not cross machine boundaries. However, most implementations do not use shared memory to implement distributed objects efficiently in the local case [Radia 95]. Client CSLs and domain CSLs can be used to avoid marshalling/unmarshalling and method table pointer initialization in the local case.

Figure 7 Locally Distributed Objects

Figure 7 shows instance data created in a client CSL with a method table in a domain CSL. The instance data gets mapped into the called domain when a locally distributed object is invoked. Use of domain CSLs ensures the appropriate method tables are found in each domain at identical addresses. As Figure 7 indicates, the caller and callee method tables are co-located since they are created in a domain CSL, but their contents are different. In the caller, the method table contains a pointer to a stub method, whereas in the callee the method table contains a pointer to the actual method.
Thus, judicious use of client CSLs and domain CSLs can eliminate extraneous overhead in locally distributed object implementations.

3.4.3 Per-Client Protection Schemes

Use of passive libraries implies per-client bindings, thus allowing for flexibility in composing protected library modules. Shared libraries, with or without protection, require binding of client relocations to service symbols. The nature of these bindings is such that they are always maintained on a per-client basis. Thus, when shared libraries are used as protection domains, the binding between a client and a protected service may be adjusted on a per-client basis. While most processes transfer to some trampoline code to access a service entry point, a trusted client may be linked to have direct access. This allows for highly efficient trusted processes. Similarly, well-tested and debugged clients may be linked to directly access a service, bypassing protection boundaries. Finally, using facilities for dynamic binding of symbols, protection boundaries may be removed and inserted at run-time, if necessary. All these facilities result from a single design choice: the use of shared libraries, or more specifically the use of passive modules subject to relocation.

4. PSL Implementation

One important aspect of the PSL research to date has been the construction of a prototype implementation. The main objectives of the prototype were to clarify the PSL semantics and provide an experimental testbed for a quantitative performance analysis. The prototype was built on AIX 3.2.5, and consists of a modified AIX kernel, C runtime libraries and a new linker. This prototype is only one possible implementation of PSL semantics. Because other operating systems, executable file formats and hardware architectures may imply different implementations, only the main implementation issues are described here.
We first present an overview that draws the connection between different semantic features and aspects of the implementation. The following subsection describes the components themselves in detail. Prior to delving into implementation issues, a brief description of the RS/6000 memory architecture is presented.

4.1 RS/6000 Memory Architecture

A given address space on an RS/6000 is defined by a set of sixteen segment registers, each of which contains a 24-bit segment ID. The RS/6000 uses 32-bit virtual addresses, four bits of which select a segment register; substituting the register's 24-bit segment ID for those four bits effectively extends the 32-bit virtual address to 52 bits. Of the remaining 28 bits from the original 32-bit virtual address, 16 bits identify a virtual page within the segment, and the remaining 12 bits identify a byte within the page.

By definition, addresses are not generally valid across address spaces. Regions can, however, be shared among multiple address spaces if each space loads a given segment register with the same 24-bit segment ID. As with most architectures, loading of a segment register is a privileged operation on the RS/6000.

4.2 Implementation Overview

PSL semantic features map rather directly to aspects of the PSL implementation. In each of the following paragraphs the relationship between a specific PSL semantic feature and the relevant aspects of the prototype implementation is described. The relationships between semantic features and implementation aspects are summarized in Table 1.

4.2.1 Shared Libraries as Protection Domains

The PSL implementation divides a process's address space into different regions. PSLs are mapped into these regions depending on the type of sharing and protection desired. Protection is provided by making different regions visible at any given instant.
The implementation utilizes various hardware capabilities, such as page tables, segment registers and supervisor calls, to control address space visibility. Division of the process-private address space into regions and mapping of libraries into these regions is the responsibility of the system loader.

4.2.2 Address Space Switch on Library Call

A partial address space switch is performed each time a thread enters a PSL via a protected entry point. The switch is performed by a small piece of privileged-mode code called the trampoline code. The PSL linker ensures that threads trap to the trampoline code when calling a protected library entry point and upon returning from the PSL routine.

4.2.3 Sharing between Protection Domains

Sharing information between protection domains requires that certain address space regions be deemed shareable. CSLs are mapped into these shareable regions, which in turn are then made visible to the appropriate domains. As a thread traverses domains, shared address space regions are mapped and unmapped as needed depending on the type of sharing. This mapping and unmapping is performed by the trampoline code.

4.2.4 Uniform Addressing and Naming

In order for shared data to appear at the same address in different domains, addresses which map shared data in one domain must be reserved in all domains. This reservation is ensured primarily by the system loader. However, because address allocation in UNIX is not encapsulated within the loader or any other single component, several other pieces of the kernel had to be modified to ensure reservation.

4.3 Implementation Components

Implementing PSLs on AIX involved modifying the AIX kernel, C run-time libraries and programming tools. We now describe the primary aspects of the PSL implementation.

4.3.1 Address Space Reservation

Uniform sharing requires a portion of each domain's address space be reserved.
Unfortunately, in AIX as with most UNIX implementations, address ranges are allocated independently by a number of kernel subsystems. The situation is further complicated in AIX by hard-coded starting addresses of code and data segments. To ensure address space reservation, almost all responsibility for address allocation was extracted from the various kernel subsystems and relocated to the system loader. In certain low-level assembly routines where this was not possible, address allocation logic was modified in place.

4.3.2 Partial Address Space Switches

The overhead associated with PSL protection is largely determined by the efficiency of the partial address space switches performed by the trampoline code. Typically an address space switch involves setting up page tables and flushing caches and translation lookaside buffers (TLBs). This process can be very costly depending on the number of pages involved and the size of the cache. Partial address space switches were implemented quite carefully in the prototype to minimize overhead and maximize performance.

Page table entries that must be switched during a domain transition are preallocated. These entries are maintained in software using a sparse representation technique [Acetta et al. 86], so a large number of pages can be represented using a small number of entries. The trampoline code only switches a couple of pointers to incorporate these new entries into the software-maintained page tables. As pages are referenced in the new domain, the hardware page tables are lazily evaluated by updating them from the software tables.

Certain sets of addresses are invalidated during a partial address space switch using architecture-specific techniques. Typically, cache and TLB contents are invalidated during an address space switch to prevent cached data and translations from being used erroneously.
To avoid the high cost of such invalidations, the PSL implementation performs a partial address space switch by changing segment register contents instead of modifying page tables. This prevents user-level threads from generating illegal addresses, while allowing for very fast switches. Variations of this architecture-specific technique have been used on other architectures as well [Liedtke 95].

4.3.3 Protected Shared Library Linker

The Protected Shared Library linker subsumes the functionality of /bin/ld, the normal AIX linker. For binaries that neither are nor use PSLs, the PSL linker simply calls /bin/ld. For binaries that are PSLs or use PSLs, the PSL linker has three main responsibilities. First, wherever it detects calls to a protected library entry point, the linker patches in a trap to trampoline code. Second, when creating a PSL, the linker adds descriptive information into an unused field of the resulting binary. This information describes the kind of PSL and type of sharing. It is used by the system loader to map the library into an appropriate address space region. Finally, the linker ensures PSL initialization and termination routines are called as needed. Like many other shared library implementations [Deitel & Kogan 92], PSLs support sub-system and per-client initialization and termination routines.

4.3.4 Protected Shared Library Loader

The PSL loader replaces the AIX 3.2.5 system loader. In short, the loader implements most static aspects of PSL semantics. It creates multiple address space regions within private address spaces, maps libraries to these address space regions and generates per-domain information made available to the trampoline code. The PSL loader differs from the original AIX loader in two main ways. First, data mapping, symbol resolution and relocation were modified to ensure PSL semantics.
Second, functionality was added to set up virtual memory data structures for each address space region a library is mapped into.

Typical UNIX loaders map object module data sections into a single address region, the process data segment. In contrast, the PSL loader may map object module data sections into multiple address space regions. The exact address space regions that a data section is loaded into depend on the type of PSL. This mapping and subsequent resolution and relocation creates multiple address space regions; one of these is the traditional data segment, while the rest map PSL libraries as shown in Figure 2, Figure 3, and Figure 4.

The PSL loader employs a number of data structures to ensure visibility of address space regions in each protection domain. These structures include both hardware and operating system dependencies and are designed to allow path lengths through the critical trampoline code to be minimized.

4.3.5 Stack Management

Passive protection domains simplify stack management during domain transitions. As in thread migration implementations [Bershad 90], there are two types of stacks. A system-maintained activation stack ensures protected library calls can be nested. As a thread enters a new domain, state information for the calling domain is pushed onto the activation stack. Upon returning to the caller, the state information is restored from the activation stack and the stack is popped. The second stack is the execution stack used by almost all run-time environments. Because all secondary protection domains are passive, the execution stack of the calling thread moves with it across domains. Specifically, a thread's execution stack gets unmapped from the calling domain and mapped into the called domain during a protection domain switch. This eliminates the complexity of dynamic stack allocation associated with most thread migration implementations [Bershad 90].
4.3.6 Resource Management

The current PSL prototype transfers almost all resources from the calling to the called domain during a protection domain switch. The only exceptions are signal-related resources, which are handled on the basis of signal type to allow for error recovery after exceptions. A complete implementation of the PSL resource handling semantics would require significant modifications of the UNIX kernel. This is a reflection on resource handling in UNIX kernels, not on PSL semantics.

Over the last decade or so, kernel code in most UNIX implementations has become quite structured. The vnode [Kleiman 86], HAT layer [Goodheart & Cox 93] and the emerging UDI interfaces [UDI 96] ensure the file system, low-level virtual memory management services and I/O system are accessed through well-defined interfaces. This provides some degree of encapsulation and isolates clients of these interfaces from implementation changes. Furthermore, indirections such as those postulated by the stackable file systems standard [Heidemann 95] can be easily implemented. This tends to make subsystem implementations more flexible and easier to maintain.

Unfortunately the same cannot be said for process and resource management. Typically this is done through the u_block and proc structures, and UNIX kernels are typically littered with direct access to these structures. Such direct access prevents any degree of encapsulation and makes it very difficult to build indirections or change UNIX resource handling. The need for encapsulation of resource handling in UNIX kernels has been previously recognized [Zajcew et al. 93]. Implementation of PSL-based protection domains and process migration clearly requires better encapsulation of resource handling, and standardization in this area is strongly encouraged.
4.3.7 Trampoline Code

The trampoline code performs the partial address space switches required when a client calls and subsequently returns from a protected shared library routine. The trampoline code performs six functions:

· Stack management
· Changing of address space visibility
· Handling of shared address space regions
· Modification of non-memory resource accessibility
· Passing of caller's resources to called domain
· Transfer of control to target entry point

The trampoline code is responsible for almost all dynamic aspects of PSL semantics. It is by far the most performance-critical part of the PSL implementation. Because of this, the code is written in assembly language and pinned in memory at run time. Furthermore, in the prototype AIX implementation, the trampoline code is accessed via a special trap handler that avoids much of the overhead of a typical UNIX system call.

5. Performance

This section sheds light on various aspects of PSL performance. In general, Protected Shared Libraries have been found to perform better than other popular forms of cross-domain cooperation. The next subsection begins with a comparison of null-RPC times. This is followed by a comparison of PSLs with other competitive protection schemes. Finally, the section ends with a breakdown of PSL call costs. A more thorough analysis of PSL performance can be found in [Banerji 96].

5.1 Null-RPC Benchmark

Table 2 shows the null RPC times for several RPC implementations in AIX 3.2.5 on an RS/6000 Model 530 with a relatively slow 25 MHz POWER processor. The first two numbers are for classic user-level IPC; the third is for hand-off scheduling [Black 89]. The fourth number indicates thread migration is a bit slower than a PSL call due to resource handling overhead. The final two lines indicate the PSL null-call time is comparable to the time required for a null system call.
5.2 Benchmarks

This section evaluates five different protection schemes with six different benchmark tests. In each case, the benchmark code is built as a service which is invoked by a client. The goal is to evaluate the cost of protecting the service from the client. Figure 8 shows how each invocation has to cross from client code to service code, and indicates where we start and stop our data gathering. The benchmarks used were:

Figure 8 Client/Service Relationship

· MD5, a secure one-way hash function developed to reliably identify long byte strings [Rivest 92]. The implementation used is based on code made available by RSA [RSA 93]. The input byte string is partitioned into fixed-length substrings, and the algorithm operates on the substrings in succession.

· Nsieve, a well-known benchmark that computes prime numbers. Problem size for Nsieve is the total number of primes to calculate; granularity is the number of numbers searched. The iterative portion of the Nsieve code was built as the service, so the number of invocations depends on the density of the primes.

· tdbm_i, tdbm_f, and tdbm_d, three benchmarks involving the tdbm database, a small in-memory database based on the Berkeley UNIX ndbm library. This is a slight modification of the sdbm library released by Ozan Yigit [Yigit 92], and is based on the 1978 dynamic hashing algorithm by Paul Larson [Enbody 88]. Changes were made to avoid unnecessary copying and remove file dependencies. The tests involve insertion of N words from an extended version of /usr/dict/words, random fetch of N/2 words, and deletion of N/2 words.

· Nullc, a custom benchmark that is essentially a null call. The service is passed a block of data. It touches every data page and returns the data to the client. This test measures the base cost of transferring variable-sized parameters between protection domains.
5.2.1 Protection Schemes

The protection schemes include both classic and new approaches. Three are hardware-based and depend upon the kernel protection boundary. The other two are software-based approaches. The protection schemes are null-protection, which is used as a baseline, traditional kernel-based system calls, process-based protection with thread migration, library-based PSLs, a language-based safe subset of Modula-3 and software fault isolation [Wahbe 93].

Figure 9 Execution Times for MD5

5.2.2 Methodology

Measurements were taken on an IBM RS/6000 Model 390 with a single 66-MHz POWER2 processor [Weiss 94]. For each benchmark, the granularity of the protection domain was varied, and the number of machine cycles needed to perform the service was recorded. Only plots for md5 and tdbm_d are shown here. For the md5 tests, the total message size was kept constant at 512 KBytes, and the number of bytes passed to the service with each invocation was varied. Figure 9 shows the resulting performance as a function of problem granularity. Increasing granularity results in fewer service invocations, which causes the execution time of all schemes to decrease. For tdbm, the number of elements in the database was varied. Each invocation deals with one entry, but for larger databases the service does more work. The results for tdbm, shown in Figure 10, differ significantly from the md5 curves in Figure 9.

Figure 10 Execution Times for tdbm_d

5.2.3 Analysis

A few points about the results shown in Figure 9 and Figure 10 bear mention. First, PSLs outperformed thread migration. This is primarily due to cross-domain sharing, simplified stack management and improved resource management. Second, Figure 9 indicates that at higher problem granularities PSLs outperformed kernel-based protection.
With md5 this happens when the extra kernel trap and return of PSL interactions is outweighed by the data copying costs of kernel interactions. Finally, in certain cases the PSL implementation may actually outperform the unprotected case, as shown in Figure 9. This is due to the PSL implementation of sharing, which allows data to remain in the cache between runs of different processes, whereas ordinary shared library data is always faulted into the caches on a per-process basis (at least in UNIX).

5.3 PSL Cost Breakdown

Figure 11 shows a breakdown of the costs associated with PSL-based protection compared to the case of no protection. Most of the overhead is due to aliasing, the resolution of multiple virtual addresses to the same physical address. The POWER2 architecture provides hardware support to resolve aliases, but the support is not exploited by AIX 3.2.5.

Figure 11 PSL Cost Breakdown

5.3.1 Aliasing Cost

Aliasing arises with hardware-based protection because client and service domains are different virtual address spaces. Consequently, virtual memory data structures must be updated when control is transferred between client and server. Also, references to shared data can cause TLB misses. The resulting alias faults dramatically impact transition overhead. These two costs, the in-kernel transition cost of updating aliasing data structures and the extra faults due to aliasing, significantly impact PSL performance.

Figure 11 indicates the in-kernel transition cost, which includes adjustments to aliasing data structures, is essentially the same for all benchmarks. For nullc and md5, only one domain actually touches the shared data, so there are no alias faults. For the other tests, however, both client and library code access the data, which results in considerable time spent handling aliasing faults.
Excluding time for alias faults, for five of the six tests the PSL overhead lies between 1300 and 2500 cycles. To assess the cost of updating aliasing data structures, the number of instructions in the transition code was counted for the simplest service, nullc, with a four-byte transfer. This code calls AIX routines, written in C, which adjust the virtual memory data structures. Thus, the difference between the transition code instruction count (210) and the total instructions used to make the transition (1250) provides an indication of the cost of aliasing (approximately 1040 instructions). Hence, if we ignore the costs due to aliasing, which can be safely done for most modern hardware, the cost of a PSL call is actually the cost of executing 210 instructions plus the kernel trap/return times. This cost is 275 cycles for the 66 MHz IBM POWER2.

5.3.2 Trap Costs

Given that the aliasing problem can be solved in hardware, it is important to look at the other costs. These are approximately 210 instructions per transition, or about 275 cycles including kernel trap and return. This cost approaches those for highly optimized cross-domain transfer mechanisms [Hamilton & Kougiouris 93]. Without aliasing, the hardware cost of the kernel trap and return is significant: 57 cycles for the POWER2. Library-based protection traps twice for every service invocation, thus 114 of the remaining 275 overhead cycles are due to the trap and return. Some modern processors, such as the UltraSPARC, have reduced trap overhead to as little as 11 cycles (that is, 22 cycles for a PSL-like double trap and return), and trap costs are expected to continue decreasing.

6. Discussion

Shared libraries form an excellent basis of modularity for structuring large systems. Protected Shared Libraries enhance the popular notion of shared libraries in two ways, by adding protection and allowing data to be shared across protection boundaries.
This enables PSLs to be used to securely implement sensitive services. Sharing reduces many of the costs of cross-domain interactions, thus making PSLs a viable alternative to language-based or process-based protection schemes. A prototype PSL implementation has demonstrated the efficiency of the PSL approach.

PSLs are well-suited for use in large user-level applications and for implementation of operating system services at user level, as is common in microkernels. There are also less obvious uses for PSLs. First, dynamically loadable kernel extensions suffer from the danger that they may corrupt kernel data. Extensions could be prevented from corrupting data to which they do not need write access by building them as dynamically loadable privileged-mode PSLs. Thus, PSLs may provide safe ways of extending existing operating system kernels. Second, one important research area in operating systems is the design of low-level nanokernels or exokernels [Engler 95]. These kernels provide low-level protected interfaces to the hardware, with most operating system functionality implemented in library routines. PSLs are an attractive approach to protecting such operating systems from application code.

References

[Acetta et al. 86] Acetta, M., Baron, R., Golub, D., Rashid, R., Tevanian, A. and Young, M. "Mach: A New Kernel Foundation for UNIX Development." In Proceedings of the Summer 1986 USENIX Conference (Atlanta, GA, July). The USENIX Association, Berkeley, CA, 1986, pp. 93-112.

[Banerji et al. 94] Banerji, A. et al. "Shared Objects and vtbl Placement - Revisited." Journal of C Language Translation, September 1994, pp. 31-46.

[Banerji et al. 94a] Banerji, A., Kulkarni, D. and Cohn, D. "A Framework for Building Extensible Class Libraries." In Proceedings of the 1994 USENIX C++ Conference, 1994, pp. 26-41.

[Banerji 96] Banerji, A. et al. "Quantitative Analysis of Protection Options."
Technical Report (unnumbered), University of Notre Dame, Notre Dame, IN, 1996.

[Batlivala et al. 92] Batlivala, N., Gleeson, B., Hamrick, J., Lurndal, S., Price, D., Soddy, J. and Abrossimov, V. "Experience with SVR4 Over CHORUS." In Proceedings of the USENIX Workshop on Micro-Kernels and Other Kernel Architectures (Seattle, WA, April 27-28). The USENIX Association, Berkeley, CA, 1992, pp. 223-241.

[Bershad 90] Bershad, B. "Lightweight Remote Procedure Call." ACM Transactions on Computer Systems, 8(1), February 1990.

[Bershad et al. 95] Bershad, B. N., Savage, S., Pardyak, P., Sirer, E. G., Fiuczynski, M. E., Becker, D., Chambers, C. and Eggers, S. "Extensibility, Safety and Performance in the SPIN Operating System." In Proceedings of the Fifteenth ACM Symposium on Operating System Principles (Copper Mountain Resort, CO, Dec. 3-6). ACM Press, NY, 1995, pp. 267-284. (https://www.cs.washington.edu/research/projects/spin)

[Black 89] Black, D. "Scheduling Support for Concurrency and Parallelism in the Mach Operating System." Unpublished.

[Black et al. 92] Black, D. et al. "Microkernel Operating System Architecture and Mach." In Proceedings of the USENIX Workshop on Micro-Kernels and Other Kernel Architectures (Seattle, WA, April 27-28). The USENIX Association, Berkeley, CA, 1992, pp. 11-30.

[Bogle 94] Bogle, P. and Liskov, B. "Reducing Cross Domain Call Overhead Using Batched Futures." In Proceedings of OOPSLA '94, ACM, 1994.

[Borgendale et al. 94] Borgendale, K., Bramnick, A. and Holland, I. M. "Workplace OS: What is the OS/2 Personality?" March 24, 1994.

[Campbell et al. 93] Campbell, R. et al. "Designing and Implementing Choices: An Object-Oriented System in C++." Communications of the ACM, 36(9), 1993, pp. 117-126.

[Carter et al. 93] Carter, J. et al. "FLEX: A Tool for Building Efficient and Flexible Systems." In Proceedings of the Fourth Workshop on Workstation Operating Systems (Napa, CA, Oct.
14-15). IEEE Computer Society Press, Los Alamitos, CA, 1993, pp. 198-202.

[Chase 94] Chase, J. et al. "Sharing and Protection in a Single Address Space Operating System." ACM Transactions on Computer Systems, 12(4), November 1994, pp. 271-307.

[Condict et al. 93] Condict, M., Mitchell, D. and Reynolds, F. "Optimizing Performance of Mach-based Systems By Server Co-Location: A Detailed Design." August 10, 1993.

[Deitel & Kogan 92] Deitel, H. M. and Kogan, M. S. The Design of OS/2. New York: Addison-Wesley, 1992.

[Druschel 92] Druschel, P. et al. "Beyond Microkernel Design: Decoupling Modularity and Protection in Lipto." In Proceedings of the 12th International Conference on Distributed Computing Systems, IEEE Computer Society Press, Los Alamitos, CA, pp. 512-520.

[Druschel & Peterson 93] Druschel, P. and Peterson, L. L. "Fbufs: A High-Bandwidth Cross-Domain Transfer Facility." In Proceedings of the Fourteenth ACM Symposium on Operating Systems Principles (Asheville, NC, December 5-8). ACM Press, New York, NY, 1993, pp. 189-202.

[Enbody 88] Enbody, R. and Du, H. "Dynamic Hashing Schemes." ACM Computing Surveys, 20(2), 1988, pp. 85-113.

[Engler 95] Engler, D. et al. "Exokernel: An Operating System Architecture for Application-Level Resource Management." In Proceedings of the Fifteenth ACM Symposium on Operating Systems Principles (Copper Mountain Resort, CO, Dec. 3-6), December 1995.

[Ford & Lepreau 94] Ford, B. and Lepreau, J. "Evolving Mach 3.0 to a Migrating Thread Model." In Proceedings of the Winter 1994 USENIX Technical Conference (San Francisco, CA, Jan. 17-21). The USENIX Association, Berkeley, CA, 1994, pp. 97-114.

[Garrett et al. 93] Garrett, W. E. et al. "Linking Shared Segments." In Proceedings of the Winter 1993 USENIX Conference (San Diego, CA, January 25-29). The USENIX Association, Berkeley, CA, 1993, pp. 13-27.

[Golub et al. 90] Golub, D., Dean, R., Florin, A. and Rashid, R.
"Unix as an Application Program." In = Proceedings of the Summer 1990 USENIX Conference = (Anaheim, CA. June 11-15). The USENIX Association, = Berkeley, CA, 1990, pp. 87-95. [Golub et al. 93] Golub, D. B., Manikundalam, R. = and Rawson, F. L. III. "MVM - An Environment for = Running Multiple Dos, Windows and DPMI Programs = on the Microkernel." In Proceedings of the Third = USENIX Mach Symposium (Santa Fe, NM. April 19- 21). USENIX Association, Berkeley, CA, 1993, pp. = 173-190. [Goodheart & Cox 93] Goodheart, B. and Cox, J. = The Magic Garden Explained. New York: Prentice Hall, = 1993. (ISBN 0-13-098138-9) [Hamilton & Kougiouris 93] Hamilton, G. and = Kougiouris, P. "The Spring Nucleus: A Microkernel for = Objects." In Proceedings of the Summer 1993 USENIX = Conference (Cincinnati, OH, June). The USENIX Asso- ciation, Berkeley, CA, 1993. [Heidemann 95] Heidemann, J. S. Stackable Design = of File Systems. Ph.D. Dissertation, University of Cali- fornia, Los Angeles, 1995. [IBM 93] SOMObjects Developer Toolkit User's = Guide, Version 2.0, June 1993, IBM, Austin, TX. [Janssen 95] Janssen, B., et al., ILU 1.7 Reference = Manual, Xerox Corporation, January 1995. [Khalidi & Nelson 93] Khalidi, Y. A., and Nelson, = M. N. "An Implementation of Unix on an Object-Ori- ented Operating System." In Proceedings of the Winter = 1993 USENIX Conference. The USENIX Association, = Berkeley, CA, 1993, pp. 469-479. [King 94] King, A. Inside Windows 95. Redmond, = WA: Microsoft Press, 1994. [Kleiman 86] Kleiman S. "Vnodes: An Architec- ture for Multiple File System Types in Sun UNIX." In = Proceedings of the Summer 1986 USENIX Conference, = June 1986, pp. 238-247. [Leffler at al. 89] Leffler, S., McKusick, M. K., = Karels, M. J. and Quarterman, J. S. The Design and = Implementation of the 4.3 BSD UNIX Operating System. = New York: Addison-Wesley Publishing Company, 1989. = (ISBN 0-201-06196-1) [Lepreau et al. 93] Lepreau, J. et al. 
"In_Kernel = Servers on Mach 3.0: Implementation and Perfor- mance." In Proceedings of the Third USENIX Mach = Symposium (Santa Fe, NM. April 19-21). USENIX = Association, Berkeley, CA, 1993, pp. 39-55. [Lepreau et al. 94] Lepreau, J., et. al. "The Flux = Operating System Project." https://www.cs.utah.edu/ projects/flexmach. [Liedtke 95] Liedtke, J. "On m-Kernel Construc- tion." In Proceedings of the Fifteenth ACM Symposium = on Operating System Principles (Copper Mountain = Resort, CO. Dec. 3-6). ACM Press, New York, NY, = 1995, pp. 237-250. [Maeda & Bershad 93] Maeda, C. and Bershad, B. = N. "Services without Servers." In Proceedings of the = Fourth Workshop on Workstation Operating Systems = (Napa, CA. Oct. 14, 15). IEEE Computer Society Press, = Los Alamitos, CA, 1994, pp. 170-176. [Malan et al. 90] Malan, G., Rashid, R., Golub, D., = and Baron, R. "DOS as a Mach 3.0 Application." In = Proceedings of the USENIX Mach Workshop (Burling- ton, VT. Oct.). The USENIX Association, Berkeley, CA, = 1990, pp. 27-40. [Nelson, 91] Nelson, G., Systems Programming = with Modula-3. Englewood Cliffs, NJ: Prentice Hall, = 1991. [Orr 92] Orr, D. and Mecklenburg, R. W. "OMOS - = An Object Server for Program Execution." In Proceed- ings of the International Workshop on Object Oriented = Operating Systems, IEEE Computer Society Press, Los = Alamitos, CA, 1992, pp. 200-209. [Organick 72] Organick E., The Multics System: An = Examination of its Structure, Cambridge: The MIT = Press, 1972. [Phelan et al. 93] Phelan, J. M., Arendt, J. W., and = Ormsby, G. R. "An OS/2 Personality on Mach." In Pro- ceedings of the Third USENIX Mach Symposium (Santa = Fe, NM. April 19-21). The USENIX Association, Ber- keley, CA, 1993, pp. 191-201. [Pu 95] Pu, C., et al. "Optimistic Incremental Spe- cialization: Streamlining a Commercial Operating Sys- tem." In Proceedings of the Fifteenth ACM Symposium = on Operating System Principles (Copper Mountain = Resort, CO. Dec. 3-6). 
ACM Press, New York, NY, = 1995. [Radia 95] Radia S., et al, The Spring Object = Model, Proceedings of the Conference on Object Tech- nologies and Systems, July 1995. [Rivest 92] Rivest, R. The MD5 Message-Digest = Algorithm, Network Working Group RCF 1321, 1992. [RSA 93] https://www.rsa.com/pub/md5.txt [Rosier et al. 92] Rosier, M., Abrossimov, F., = Armand, F., Boule, I., Gien, M., Guillemont, M., Her- rman, F., Kaiser, C., Langlois, S., L=E9onard, P., and Neu- hauser, W. "Overview of the Chorus Distributed = Operating System." In Proceedings of the USENIX = Workshop on Micro-Kernels and Other Kernel Architec- tures (Seattle, WA. April 27, 28). The USENIX Associ- ation, Berkeley, CA, 1992, pp. 39-69. [Scott et al. 90] Scott, M. L., LeBlanc, T. J., and = Marsh, B. D. "Multi-Model Parallel Programming in = Psyche" In Proceedings of the Second ACM Symposium = on Principles and Practice of Parallel Programming = (Seattle, WA, March 14-16), 1990, pp. 70-78. [UDI 96] Uniform Driver Interface, ftp://tel- ford.nsa.hp.com/pub/hp_stds/udi/home.html [Wahbe 93] Wahbe, R., et. al. "Efficient Software- based Fault Isolation." In Proceedings of the Fourteenth = ACM Symposium on Operating Systems Principles, = December 1993, pp. 203-216. [Weiss 94] Weiss, S., Smith, J., POWER and Pow- erPC, San Francisco: Morgan Kauffman Publishers, = Inc., 1994. [Wiecek et al. 93] Wiecek, C. A., Kaler, C. G., = Fiorelli, S., Davenport, W. C. Jr., and Chen, R. C. "A = Model and Prototype of VMS Using the Mach 3.0 Ker- nel." In Proceedings of the USENIX Symposium on = Microkernels and Other Kernel Architectures (Seattle, = WA. April 27, 28). The USENIX Association, Berkeley, = CA, 1992, pp. 187-203. [Wulf et al. 81] Wulf, W. A., Levin, R. and Harbi- son, S. P. Hydra/C.mmp: An Experimental Computer = System, McGraw-Hill, New York, 1981. [Yigit=A092] Yigit, O. ftp://ftp.x.org/contrib/util/sdbm [Yokote 92] Yokote, Y. "The Apertos Reflective = Operating System: The Concept and its Implementa- tion." 
In Proceedings of the Seventh Annual Conference = on Object-Oriented Programming Systems, Languages, = and Applications (OOPSLA `92), ACM Press, NY, = 1992, pp. 414-434. [Zajcew et al. 93] Zajcew, R. et al. "An OSF/1 = UNIX for Massively Parallel Multicomputers." In Pro- ceedings of the 1993 Winter USENIX Conference, The = USENIX Association, Berkeley, CA, 1993, pp. 449-468. --------------1F071C4274D3--