USENIX ;login: - Standards

A Programmer's Overview of the POSIX.1d Draft Standard: Additional Realtime Extensions

Karen Gordon <kgordon@vuse.vanderbilt.com>
Joe Gwinn <gwinn@raytheon.com>
Jim Oblinger <Oblinger.JT@nuwc.navy.mil>
Frank Prindle <prindle@nadc.navy.mil>

Introduction

The POSIX.1d draft standard is the third in a series of realtime POSIX operating system interface standards. The first standard, POSIX.1b-1993, specifies interfaces in nine areas: realtime signals, synchronized I/O, asynchronous I/O, semaphores, memory locking, memory mapped files and shared memory, priority scheduling, high resolution clocks and timers, and message passing. The second standard, POSIX.1c-1995, specifies a number of interfaces that together define a threads extension to the POSIX.1-1990 process model. The standards body responsible for the development and maintenance of all the POSIX standards is the IEEE Portable Applications Standards Committee (PASC). The particular PASC working group that focuses on the realtime POSIX standards is known as the Systems Services Realtime Working Group and is the successor to the fabled POSIX.4 Working Group.

POSIX.1d specifies additional interfaces in support of the realtime goals of predictability and high performance. The interfaces fall into seven areas:

Process creation via posix_spawn()
Sporadic server execution scheduling policy
Execution time monitoring of processes and threads
I/O advisory information
Timeouts for selected blocking functions
Device control
Interrupt control

In the POSIX.1d draft standard, these interfaces are specified as a number of options to POSIX.1, as amended by POSIX.1b and POSIX.1c. Most of the options can be implemented independently of other options. This means that operating system vendors can choose to implement some POSIX.1d interfaces and not others, just as they can choose to implement some POSIX.1b and POSIX.1c interfaces and not others. Likewise, users must decide which optional interfaces they require, and purchase operating systems that implement those required interfaces. Some options, however, depend on the existence of certain other options. The options and their dependencies are described in the following sections.

As the POSIX.1d draft standard matures, the Working Group will fold subsets of the options specified in the draft standard into existing and/or new application environment profiles. The existing profiles are documented in POSIX.13-1998.

Process Creation Via posix_spawn()

The POSIX.1d draft standard introduces interfaces that enable an application to create a new process (including loading a new process image containing new executable code into the new process's address space) in a single step. These interfaces can be used in place of the POSIX.1 process creation interfaces, which entail two steps: first a fork() to create a process whose process image is a copy of the parent's process image, and then an exec() to overlay the copy of the parent's process image with a new image, taken from a specified executable file. This two-step process creation mechanism can waste resources; one process image is loaded, and then it is immediately replaced by another process image. More significantly, the two-step mechanism places an undue hardship on systems lacking memory management units (MMUs) or other hardware support for dynamic address translation, because such systems cannot readily implement the fork() operation. The problem is that, without dynamic address translation, the addresses in a process image are actual physical addresses. Therefore, the child's process image, if it were indeed an exact copy of the parent's process image, would be competing for the same physical memory as the parent's process image.

The new process creation interfaces, which are available under the Spawn option, are posix_spawn() and posix_spawnp().

Both of these create a new process, whose process image is constructed from a specified executable file referred to as the new process image file. The posix_spawn() operation takes a fully defined pathname as an argument; the posix_spawnp() operation builds a pathname from a file argument and the PATH environment variable of the calling process. Both operations enable the creator to specify arguments and environment strings for the newly created process. In addition, both operations enable the creator to specify how file descriptors, process group IDs, signal masks, and default signal actions are handled at process creation.
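The single-step creation described above can be sketched as follows. This is a minimal illustration, assuming the posix_spawnp() signature as it was later published in POSIX.1-2001; the draft interface discussed here may differ in detail, notably in how file descriptor handling is specified. The helper names spawn_and_wait() and demo_spawn() are hypothetical, and the demo assumes that a shell named sh can be found through PATH.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;  /* environment strings of the calling process */

/* Spawn `file` (located via PATH), wait for it, and return its exit
 * status. Returns -1 if the spawn fails or the child ends abnormally. */
int spawn_and_wait(const char *file, char *const argv[])
{
    assert(file != NULL && argv != NULL);

    pid_t pid;
    /* NULL file actions and NULL attributes request default handling of
     * file descriptors, signal masks, and process group IDs. */
    int err = posix_spawnp(&pid, file, NULL, NULL, argv, environ);
    if (err != 0)
        return -1;  /* posix_spawnp() reports failure as an error number */

    int status;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}

/* Hypothetical demo: run `sh -c "exit 7"` and return the child's status. */
int demo_spawn(void)
{
    char *argv[] = { "sh", "-c", "exit 7", NULL };
    return spawn_and_wait("sh", argv);
}
```

No fork() appears anywhere in the caller: the new process starts directly in the new process image, which is the property that matters on systems without an MMU.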

POSIX.5 defines the Start_Process Ada language procedure that performs a function nearly identical to that of posix_spawn(). While preparing Draft 11 of POSIX.1d, the Working Group considered making posix_spawn() a C language binding to Start_Process, but realized that doing so would be detrimental to the realtime goals that motivated the development of posix_spawn() in the first place. Therefore, the Working Group decided to keep posix_spawn() simple and efficient, forgoing some of the power of Start_Process. The Start_Process procedure allows the caller to specify an arbitrarily long ordered script of file open, close, and duplicate operations to be performed for the new process before the new process image begins execution; the posix_spawn() function allows the caller to specify only how each file descriptor in the new process either is mapped to an open file descriptor in the calling process or remains closed. Other than this, posix_spawn() provides the same functionality as Start_Process and, in fact, Start_Process may be directly implemented using posix_spawn() in the case where the caller-specified script of file actions is empty.

Sporadic Server Execution Scheduling Policy

The POSIX.1d draft standard supplements the existing POSIX priority scheduling facilities with the sporadic server scheduling policy, a mechanism for scheduling aperiodic tasks in a primarily periodic environment [Sprunt et al. 89]. The sporadic server scheduling policy was developed as an extension to the rate monotonic scheduling policy. It is based on dynamic changes to the priority of a "sporadic server" set up by an application to service a stream of aperiodic task arrivals; the changes in the priorities are made by the operating system in accordance with parameters supplied by the application and a set of rules spelled out in POSIX.1d. In particular, the priority of the sporadic server is varied between two application-specified levels: a (normal) high foreground priority and a low background priority. The aim of the priority assignment rules is to allow the aperiodic tasks serviced by the sporadic server to execute at a high priority as long as their execution does not pose a threat to the hard deadlines of other tasks.

In short, the sporadic server scheduling policy allows aperiodic tasks to be served at a high priority for a bounded amount of time and at a low background priority at other times. The high-priority service provides the aperiodic tasks with a better average response time than they might otherwise receive. At the same time, the low-priority service that kicks in when they meet their bound on high-priority service prevents them from interfering with hard-deadline periodic tasks, because the aperiodic tasks are then forced to execute in the background and so cannot flood the processor.

The sporadic server scheduling policy is characterized by four parameters, in addition to the normal priority used in POSIX.1b and POSIX.1c:

Low priority: the priority level at which the sporadic server executes (i.e., services aperiodic task arrivals) while in the background. While in the foreground, the sporadic server executes at the normal (high) priority.
Replenishment period: the length of the time intervals over which the foreground CPU-time usage of the sporadic server is monitored and limited.
Initial budget: a bound on the amount of CPU-time that the sporadic server can consume while executing in the foreground during any time interval of length equal to the replenishment period.
Maximum number of pending replenishment operations: this value effectively limits the number of aperiodic tasks that can be serviced during any time interval of length equal to the replenishment period. The point of having this limit is to bound the amount of system overhead (storage and CPU time) required to implement the sporadic server scheduling policy.

Under the Process Sporadic Server and Thread Sporadic Server options, the POSIX.1d draft standard adds these four parameters to the process scheduling parameter structure of POSIX.1b and to the thread scheduling parameter attribute of POSIX.1c. It also amends the following POSIX.1b and POSIX.1c execution scheduling functions to take these parameters into account:

The POSIX.1b functions for assignment of the scheduling parameters (sched_setparam()) and the scheduling policy plus parameters (sched_setscheduler()).
The POSIX.1b functions sched_getparam() and sched_getscheduler() are implicitly amended. Because of the way the sched_getparam() and sched_getscheduler() functions are worded in POSIX.1b, it is not necessary to explicitly modify the functions. That is, the descriptions of the functions do not call out other scheduling policies and parameters by name, so it is not necessary to amend the functions to call out the new sporadic server policy and parameters by name. The wording was intentionally designed to make these interfaces extensible in this regard.
The POSIX.1c functions for assignment and retrieval of scheduling attributes in thread attributes objects for use in thread creation: pthread_attr_setschedpolicy(), pthread_attr_getschedpolicy(), pthread_attr_setschedparam(), and pthread_attr_getschedparam().
The POSIX.1c functions for dynamic assignment and retrieval of thread scheduling policy plus parameters: pthread_setschedparam() and pthread_getschedparam(). Note that it is implementation-defined whether an application can dynamically change its scheduling policy to the sporadic server scheduling policy.
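As a rough sketch of how the four parameters might be supplied through a thread attributes object, the following assumes the member names (sched_ss_low_priority, sched_ss_repl_period, sched_ss_init_budget, sched_ss_max_repl) and the SCHED_SPORADIC policy symbol as later published in POSIX.1-2001; the draft may name them differently. The struct ss_config type and both functions are hypothetical. On systems whose headers do not define SCHED_SPORADIC, the sketch simply reports ENOTSUP.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <sched.h>
#include <time.h>

/* Hypothetical container: normal priority plus the four new parameters. */
struct ss_config {
    int normal_priority;          /* foreground (high) priority */
    int low_priority;             /* background priority */
    struct timespec repl_period;  /* replenishment period */
    struct timespec init_budget;  /* foreground CPU budget per period */
    int max_repl;                 /* max pending replenishments */
};

/* Sanity checks. The budget must fit within the period and at least one
 * pending replenishment must be allowed. Requiring low <= normal is an
 * assumption of this sketch, taken from the roles of the two priorities. */
int ss_config_valid(const struct ss_config *c)
{
    assert(c != NULL);
    int budget_fits =
        c->init_budget.tv_sec < c->repl_period.tv_sec ||
        (c->init_budget.tv_sec == c->repl_period.tv_sec &&
         c->init_budget.tv_nsec <= c->repl_period.tv_nsec);
    return budget_fits && c->max_repl >= 1 &&
           c->low_priority <= c->normal_priority;
}

/* Store policy and parameters in `attr` for use at thread creation.
 * Returns 0, an error number, or ENOTSUP if the policy is unavailable. */
int ss_apply(pthread_attr_t *attr, const struct ss_config *c)
{
    assert(attr != NULL);
    if (!ss_config_valid(c))
        return EINVAL;
#ifdef SCHED_SPORADIC
    struct sched_param sp = { 0 };
    sp.sched_priority = c->normal_priority;
    sp.sched_ss_low_priority = c->low_priority;
    sp.sched_ss_repl_period = c->repl_period;
    sp.sched_ss_init_budget = c->init_budget;
    sp.sched_ss_max_repl = c->max_repl;
    int err = pthread_attr_setschedpolicy(attr, SCHED_SPORADIC);
    if (err == 0)
        err = pthread_attr_setschedparam(attr, &sp);
    return err;
#else
    return ENOTSUP;  /* headers do not offer the sporadic server policy */
#endif
}
```

A configuration with a 100 ms replenishment period and a 20 ms budget, for example, would let the server run in the foreground for at most 20 ms of CPU time in any 100 ms window.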

Execution Time Monitoring of Processes and Threads

The POSIX.1d draft standard supplements the clock and timer facilities of POSIX.1b through two options: the Process CPU-Time Clocks option and the Thread CPU-Time Clocks option. In particular, it defines two new types of clocks:

Process CPU-time clocks
Thread CPU-time clocks

In a POSIX.1b clock or timer function call, the CPU-time clock of the calling process is designated by the symbol CLOCK_PROCESS_CPUTIME_ID, and the CPU-time clock of the calling thread is designated by the symbol CLOCK_THREAD_CPUTIME_ID.

These clocks can be used to monitor the CPU usage of processes and threads, as well as to establish limits on such usage through the setting of timers. As stated in IEEE POSIX.1d, the mechanism used to measure CPU time is implementation-defined; thus, the resolution and accuracy of the CPU-time clocks are implementation-dependent. The software data structure used for representing time, however, is standardized by POSIX.1b, for the sake of application portability. Specifically, the POSIX.1b data structure provides for nanosecond resolution, although practical computer clocks are not nearly that good.

The functionality provided by CPU-time clocks can be used during the development and operation of many realtime applications, as in the following examples:

During development, the CPU-time monitoring capability facilitates the collection of information that is vital to system engineers in analyzing the ability of a realtime application to meet its performance specifications.
During operation, the capability of setting CPU-time limits can be used to enhance the robustness of an application, by enabling the application to prevent a potentially faulty process from capturing the CPU.
During operation, the capability of setting CPU-time limits can be used to facilitate the scheduling of certain realtime applications, such as those having iterative components whose results become more precise with each iteration. For example, an iterative component might be allowed to execute at a high priority until it reaches its CPU-time limit, at which point it is considered to have achieved an "acceptable" but imprecise result (i.e., a result – with bounded imprecision – that has been shown through a priori analysis to be acceptable, although not optimal, to the application). Then the iterative component might be allowed to execute at a low priority, improving the precision of its result, up until its deadline. At its deadline, the iterative component provides its (possibly still imprecise) result to the application. The idea behind returning an imprecise result at the deadline is that a timely result of bounded imprecision is better than a precise, but late, result (see [Chung et al. 90] for an overview of how to schedule "imprecise computations").

The measurement of thread execution time may incur excessive overhead in some systems and some applications. Therefore, a new thread creation attribute is introduced by the POSIX.1d execution-time monitoring facility. This attribute allows/disallows thread access to CPU-time clocks; it is set at thread creation-time and is unchangeable thereafter, making it possible for an operating system to optimize the implementation of a given thread.

The specific functions introduced by the POSIX.1d draft standard include the following:

Getting the clock ID of the CPU-time clock of a specified process (clock_getcpuclockid()).
Getting the clock ID of the CPU-time clock of a specified thread (pthread_getcpuclockid()).
Setting or getting the value of the CPU-time clock thread creation attribute (pthread_attr_setcpuclkallow(), pthread_attr_getcpuclkallow()).
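The monitoring use case can be sketched with the ordinary POSIX.1b clock_gettime() call applied to the new clock IDs. This is a minimal illustration, assuming the clock_getcpuclockid() signature as later published in POSIX.1-2001; the helper names are hypothetical, and the thread creation attribute functions are left out.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <time.h>

/* Read a clock and convert it to nanoseconds; returns -1 on failure. */
static long long read_clock_ns(clockid_t id)
{
    struct timespec ts;
    if (clock_gettime(id, &ts) != 0)
        return -1;
    return (long long)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* CPU time consumed so far by the calling thread, in nanoseconds. */
long long thread_cpu_ns(void)
{
    return read_clock_ns(CLOCK_THREAD_CPUTIME_ID);
}

/* CPU time consumed by process `pid` (0 means the calling process). */
long long process_cpu_ns(pid_t pid)
{
    clockid_t id;
    if (clock_getcpuclockid(pid, &id) != 0)  /* returns an error number */
        return -1;
    return read_clock_ns(id);
}

/* Hypothetical demo: CPU time charged to a busy loop, or -1 on failure. */
long long demo_measure_busy_loop(unsigned long iterations)
{
    assert(iterations > 0);
    long long before = thread_cpu_ns();
    volatile unsigned long sink = 0;  /* volatile keeps the loop alive */
    for (unsigned long i = 0; i < iterations; i++)
        sink += i;
    long long after = thread_cpu_ns();
    if (before < 0 || after < 0)
        return -1;
    return after - before;
}
```

Unlike a wall-clock measurement, the difference returned by demo_measure_busy_loop() excludes time during which the thread was preempted or blocked, which is what makes it useful for checking a task against its CPU budget.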

I/O Advisory Information

The capability of performing I/O operations with deterministic high performance is crucial in realtime systems. To this end, the POSIX.1d draft standard proposes interfaces for enabling an application to give the operating system advisory information on how the application expects to use specified file and memory space. Notably, the application does not tell the operating system how to manage file and memory access; it just offers "hints" relating to characteristics of the application, which the operating system can take into account in making its resource management decisions. The specific interfaces, available under the Advisory Information option, are as follows:

Providing advisory information on how the application expects to use a specified range of a specified file (posix_fadvise()). The information is conveyed through an advice argument that has a number of standard values.
{POSIX_FADV_NORMAL} No further special treatment
{POSIX_FADV_SEQUENTIAL} Expect sequential references
{POSIX_FADV_RANDOM} Expect random references
{POSIX_FADV_WILLNEED} Will need the specified range soon
{POSIX_FADV_DONTNEED} Don't need the specified range anymore
{POSIX_FADV_NOREUSE} Expect data will not be reused once accessed
Providing advisory information on how the application expects to use a specified range of memory (posix_madvise()). This function is available if the Memory Mapped Files option or the Shared Memory Objects option, in addition to the Advisory Information option, is supported. Like the posix_fadvise() function, this function conveys information through an advice argument having a number of standard values. The values are labeled {POSIX_MADV_NORMAL}, {POSIX_MADV_SEQUENTIAL}, {POSIX_MADV_RANDOM}, {POSIX_MADV_WILLNEED}, and {POSIX_MADV_DONTNEED}, and have the same meanings as the corresponding values used with posix_fadvise().

Note that there is no posix_madvise() advice argument value corresponding to the posix_fadvise() advice argument value {POSIX_FADV_NOREUSE}. This is because "reuse" of data is accomplished through explicit application-specified sharing of memory in the case of memory mapped files and shared memory objects, whereas reuse is accomplished through operating system buffering in the case of nonmapped files.
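A minimal sketch of offering such hints around a sequential read follows, assuming the posix_fadvise() signature as later published in POSIX.1-2001 (file descriptor, offset, length, advice). The function name read_all_sequential() is hypothetical. Because the calls are advisory, the sketch deliberately ignores their result.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Read a file from start to end after hinting sequential access.
 * Returns the number of bytes read, or -1 on a seek or read error. */
long read_all_sequential(int fd)
{
    assert(fd >= 0);

    /* Offset 0 with length 0 covers the whole file. The call returns an
     * error number, but a rejected hint is not a reason to stop. */
    (void)posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL);

    if (lseek(fd, 0, SEEK_SET) == (off_t)-1)
        return -1;

    char buf[4096];
    long total = 0;
    ssize_t n;
    while ((n = read(fd, buf, sizeof buf)) > 0)
        total += n;
    if (n < 0)
        return -1;

    /* Finished with the data: the cached range is no longer needed. */
    (void)posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
    return total;
}
```

The read loop is identical with or without the hints; only the operating system's buffering decisions may change, which is the sense in which the information is advisory.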

Also in the interest of enabling an operating system implementation and an application to work together to optimize performance, the POSIX.1d draft standard introduces the following new pathname variables which provide the indicated information on files:

Name:{ALLOC_SIZE_MIN}
Description: Minimum number of bytes of storage actually allocated for any portion of a file. For direct (unbuffered) I/O, the number of bytes transferred in an I/O operation should be a multiple of {ALLOC_SIZE_MIN}. The file offset should also be a multiple of {ALLOC_SIZE_MIN}.

Name:{POSIX_REC_INCR_XFER_SIZE}
Description: Valid increments for file transfer sizes between the {POSIX_REC_MIN_XFER_SIZE} and {POSIX_REC_MAX_XFER_SIZE} values. Note that {POSIX_REC_INCR_XFER_SIZE} should be a multiple of {ALLOC_SIZE_MIN}.

Name:{POSIX_REC_MIN_XFER_SIZE}
Description: Minimum recommended file transfer size.

Name:{POSIX_REC_MAX_XFER_SIZE}
Description: Maximum recommended file transfer size.

Name:{POSIX_REC_XFER_ALIGN}
Description: Recommended file transfer buffer alignment.

An application can use the POSIX.1 functions pathconf() and fpathconf() to determine the current value of each of these variables for any given pathname. The application can then use the values to optimally set up its transfers of data between files and memory.
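One way an application might use these values is sketched below. It assumes the pathconf() name constants _PC_ALLOC_SIZE_MIN and _PC_REC_MIN_XFER_SIZE as later published in POSIX.1-2001; the helper names and the 512-byte fallback are assumptions of this sketch, not values taken from the draft.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <unistd.h>

/* Query a pathname variable; use `fallback` when the value is
 * unsupported, unlimited, or otherwise unavailable. */
long path_var_or(const char *path, int name, long fallback)
{
    assert(path != NULL);
    long v = pathconf(path, name);
    return (v > 0) ? v : fallback;
}

/* Round `size` up to the next multiple of `increment`. */
long round_up(long size, long increment)
{
    assert(size >= 0 && increment > 0);
    return ((size + increment - 1) / increment) * increment;
}

/* Choose a transfer size for files under `path`: at least `wanted`, at
 * least the recommended minimum, and a multiple of the allocation size. */
long choose_xfer_size(const char *path, long wanted)
{
    assert(wanted > 0);
    long unit = path_var_or(path, _PC_ALLOC_SIZE_MIN, 512);  /* assumed fallback */
    long min_xfer = path_var_or(path, _PC_REC_MIN_XFER_SIZE, unit);
    long size = (wanted > min_xfer) ? wanted : min_xfer;
    return round_up(size, unit);
}
```

With an allocation size of 512 bytes, for instance, a requested size of 1000 bytes is rounded up to 1024.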

The POSIX.1d draft standard proposes the following additional interfaces, available under the Advisory Information option, that further assist applications in optimizing their performance:

Pre-allocating or releasing a specified amount of storage space for a specified file (posix_fallocate(), posix_ffree()). The file size is not affected by these functions. Thus, space can be pre-allocated beyond the current end of the file. This enables append-mode writes to take advantage of the pre-allocation offered by posix_fallocate().
Allocating a block of memory of a specified size on a specified alignment (posix_memalign()). The block can be freed with the C Standard free() function [ISO/IEC C].
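The aligned allocation can be sketched as follows, assuming the posix_memalign() signature as later published in POSIX.1-2001. The wrapper names are hypothetical. The file-space functions are left out of this sketch because their draft semantics, as described above, may not match later implementations.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Allocate `size` bytes aligned to `alignment`, or return NULL.
 * The alignment must be a power of two and a multiple of sizeof(void *). */
void *alloc_aligned(size_t alignment, size_t size)
{
    void *p = NULL;
    /* posix_memalign() returns an error number; it does not set errno. */
    if (posix_memalign(&p, alignment, size) != 0)
        return NULL;
    return p;
}

/* Nonzero if `p` is aligned to `alignment` bytes. */
int is_aligned(const void *p, size_t alignment)
{
    assert(alignment != 0);
    return ((uintptr_t)p % alignment) == 0;
}
```

A buffer obtained this way, with the alignment taken from {POSIX_REC_XFER_ALIGN}, is released with free() like any other allocation.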

Timeouts for Selected Blocking Functions

Prior to the development of the POSIX.1d draft standard, some POSIX.1b and POSIX.1c blocking functions had timed versions (i.e., versions that would block subject to a specified timeout period), while others did not. The timed versions included the following:

The POSIX.1b function sigtimedwait(), a timed alternative to sigwaitinfo().
The POSIX.1c function pthread_cond_timedwait(), a timed alternative to pthread_cond_wait().
The POSIX.1b function aio_suspend() with a non-NULL timeout argument, a timed alternative to aio_suspend() with a NULL timeout argument.

The Working Group came to view the lack of timed versions of blocking functions as a serious shortcoming, especially in the context of life-critical or mission-critical embedded systems. In these systems, an application must ensure that unbounded blocking can never result from a service request, since unbounded blocking causes the application to lose control of the system. An application's loss of control is intolerable in realtime systems, for it is the application that is supposed to be directing the system in support of the mission. Therefore, an application must be able to detect and to escape from what it considers to be an unreasonable delay in receiving service, and hence to regain control of the system. Upon regaining control, the application may invoke some system-specific fault diagnosis and fault recovery procedures.

Therefore, all the blocking functions of IEEE Stds POSIX.1, POSIX.1b, and POSIX.1c were reviewed. The Working Group decided that it was unnecessary to supplement blocking I/O services with timed services, because asynchronous (nonblocking) services had already been added to the standard under the IEEE Std POSIX.1b Asynchronous I/O option. In the end, only the following timed functions were added to POSIX.1 (as amended by IEEE Stds POSIX.1b and POSIX.1c), under the Timeouts option:

The function sem_timedwait(), as a timed alternative to the POSIX.1b function sem_wait() (POSIX.1d, Section 11.2.6).
The function pthread_mutex_timedlock(), as a timed alternative to the POSIX.1c function pthread_mutex_lock() (POSIX.1d, Section 11.3.3). The pthread_mutex_timedlock() function can be used only with mutexes whose timeout-allowed attribute is set to PTHREAD_TIMEOUT_ALLOWED. The timeout-allowed attribute was introduced under the Timeouts option as a performance-preserving mechanism. An application programmer can set the timeout-allowed attributes of selected mutexes to PTHREAD_TIMEOUT_DISALLOWED to avoid the overhead associated with the mutexes being set up to handle timed locks.
The functions mq_timedsend() and mq_timedreceive(), as timed alternatives to the POSIX.1b functions mq_send() and mq_receive() (POSIX.1d, Sections 15.2.4-15.2.5).
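The bounded-blocking pattern these functions make possible can be sketched with sem_timedwait(). The sketch assumes the interface as later published in POSIX.1-2001, where the timeout is an absolute time measured against CLOCK_REALTIME; the draft may differ. The helper names are hypothetical, and the mutex timeout-allowed attribute is not shown.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <semaphore.h>
#include <time.h>

/* Wait on `sem` for at most `timeout_ms` milliseconds. Returns 0 when
 * the semaphore was acquired, ETIMEDOUT on timeout, or another errno. */
int sem_wait_ms(sem_t *sem, long timeout_ms)
{
    assert(sem != NULL && timeout_ms >= 0);

    /* Convert the relative timeout into an absolute deadline. */
    struct timespec deadline;
    if (clock_gettime(CLOCK_REALTIME, &deadline) != 0)
        return errno;
    deadline.tv_sec += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec += 1;
        deadline.tv_nsec -= 1000000000L;
    }

    /* Retry if a signal interrupts the wait; the deadline stays fixed. */
    while (sem_timedwait(sem, &deadline) != 0) {
        if (errno != EINTR)
            return errno;
    }
    return 0;
}

/* Hypothetical demo: first wait times out, second succeeds after a post.
 * Returns 1 if both behave as expected, 0 otherwise, -1 on setup error. */
int demo_sem_timeout(void)
{
    sem_t sem;
    if (sem_init(&sem, 0, 0) != 0)
        return -1;
    int first = sem_wait_ms(&sem, 50);   /* nothing posted yet */
    sem_post(&sem);
    int second = sem_wait_ms(&sem, 50);  /* one unit now available */
    sem_destroy(&sem);
    return first == ETIMEDOUT && second == 0;
}
```

On ETIMEDOUT the application is back in control and can start whatever fault diagnosis or recovery procedure it considers appropriate, rather than remaining blocked indefinitely.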

Device Control

The POSIX.1d draft standard addresses device control in Annex I. This annex, which is informative only (i.e., not a normative part of the draft standard), suggests standardizing the functionality of the traditional UNIX function ioctl() in the form of a new function called posix_devctl(), which would be available under a Device Control option. In earlier drafts of POSIX.1d, the posix_devctl() function was specified in normative text in the main body of the document. However, some members of the balloting group opposed the inclusion of a device control function in the draft standard, even as an optional interface, and so, in the interest of consensus building, the technical reviewers decided to move the specification of posix_devctl() into an informative annex. This section describes the suggested posix_devctl() function, the motivation behind it, and some of the objections to it.

The UNIX function ioctl(), although proven in practice to provide essential functionality, was recognized as having some room for improvement in its specification. Thus, in designing the posix_devctl() function, the Working Group was driven by two somewhat conflicting goals: (1) maintaining compatibility with current ioctl() implementations, and (2) establishing sound definitions of the posix_devctl() arguments and return value.

The motivation behind the posix_devctl() function lies in the fact that the I/O capabilities of POSIX.1 fall short of meeting the I/O needs of realtime applications, as well as other applications. The problem is that many applications need to interact with I/O devices not contemplated by POSIX.1. The applications can choose to interact with such devices in one of two ways: (1) through device drivers or (2) through application code, directly using the POSIX.1d interrupt control facility. In Draft 11 of the POSIX.1d draft standard, Annex I, "Device Control Considerations," addresses the first approach; Annex J, "Interrupt Control Considerations," addresses the second approach.

POSIX.1 specifies general-purpose I/O functions, including open(), close(), read(), write(), and lseek(). These I/O functions are designed to capture the functionality of random-access mass-storage devices such as disks. Devices whose functionality is not completely captured by the general-purpose I/O functions are considered to be "special devices." POSIX.1 addresses only one type of "special device": terminals. It defines device-specific functions for terminals that enable an application to specify the number of bits per character, the type of parity, the baud rate, etc., for an asynchronous serial communication port.

Realtime systems typically encompass special devices other than terminals. Some of the special devices are common commercially available devices, while others are unique application-specific devices. For common commercially available devices (e.g., magnetic tape drives and printers), it would be theoretically possible, although not necessarily practical, to define a full set of device-specific I/O functions such as those defined for terminals. However, for unique application-specific devices (e.g., specific actuators, sensors, or other controlled devices), it would be impossible to define a full set of device-specific I/O functions, because new devices are continually being developed for new applications. The functions needed by these yet-to-be-invented devices cannot be anticipated and thus cannot be defined or standardized.

The posix_devctl() function that is suggested in Annex I of the POSIX.1d draft standard does not attempt to standardize individual device-specific functions. Instead, it serves as a "standard" mechanism for transmitting any "nonstandard" (i.e., device-specific) I/O commands to any special devices. The posix_devctl() function is, in practice, a general application program interface to the device drivers for "special devices." That is, in the posix_devctl() model of communication between application software and device drivers, application programs funnel all device-specific I/O commands through the posix_devctl() interface.

The posix_devctl() function provides a layer of standardization that has proven to be useful, as evidenced in the widespread use of the ioctl() function. The posix_devctl() function benefits two groups:

Users of device drivers for special devices. Users are given a uniform model of communication between application software and device drivers. The model isolates device-specific application code into readily recognizable locations (i.e., at posix_devctl() function calls). Thus, application portability is improved. For example, application software that interacts with a specific analog-to-digital converter can be "ported" relatively easily to interact with another analog-to-digital converter, if the device drivers for both analog-to-digital converters implement the posix_devctl() function.
Writers of device drivers for special devices. (As previously noted, these writers may be application developers, as in the case of unique application-specific devices.) The model that the posix_devctl() function imposes on communication between application software and device drivers serves as a guide to device driver writers. In this way, the posix_devctl() function tends to simplify the development of device drivers. In the same way, the posix_devctl() function tends to simplify the porting of device drivers when devices need to be moved to different systems.

Like the common ioctl() function, the posix_devctl() function has the following arguments: (1) a file descriptor of an open device, (2) a driver-specific command requesting the designated device to perform some action, and (3) a pointer to a buffer whose content is command-dependent and therefore also driver-specific. Data is passed between the device driver and the buffer in a command-dependent direction.

The posix_devctl() function has two additional arguments: (1) a byte count of the data to be passed between the device driver and the buffer and (2) a pointer to a word (specifically, a data object of type int) of driver-specific device information that may be returned by the function in addition to the usual success/failure indication. In the interest of compatibility with the ioctl() function, a byte count of zero can be used to indicate that the amount of data to be passed between the device driver and the buffer is unspecified. Also in the interest of compatibility, the device information word can be used to report information that, in the case of the ioctl() function, would be reported via the function return value.
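The relationship between the two argument lists can be illustrated with a wrapper layered on ioctl(). This is only a sketch: my_devctl() is a hypothetical function, not the interface suggested in Annex I, and the demo relies on the FIONREAD command, which is a common ioctl() request on UNIX-like systems but is not defined by POSIX.1.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Hypothetical wrapper with the five-argument shape described above.
 * Returns 0 on success or an error number on failure. */
int my_devctl(int fd, unsigned long cmd, void *buf, size_t nbyte,
              int *dev_info)
{
    assert(fd >= 0);
    (void)nbyte;  /* ioctl() has no byte count; the command implies it */

    int rc = ioctl(fd, cmd, buf);
    if (rc == -1)
        return errno;
    if (dev_info != NULL)
        *dev_info = rc;  /* what ioctl() would have returned directly */
    return 0;
}

/* Hypothetical demo: ask how many bytes are waiting in a pipe.
 * Returns the byte count reported by the driver, or -1 on failure. */
int demo_pending_bytes(void)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    int pending = -1;
    int info = 0;
    if (write(fds[1], "hello", 5) != 5 ||
        my_devctl(fds[0], FIONREAD, &pending, sizeof pending, &info) != 0)
        pending = -1;

    close(fds[0]);
    close(fds[1]);
    return pending;
}
```

The wrapper separates the success/failure indication (the return value) from driver-specific information (the word pointed to by dev_info), which ioctl() folds into a single return value.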

The fact that the posix_devctl() function has driver-specific arguments (i.e., the command, the buffer, and the device information word) is the source of the most serious objections to the function. Some balloters do not see the value of the level of standardization provided by the posix_devctl() function. In their minds, a function which is by its very nature so implementation-specific (and, moreover, driver-specific) simply is not a candidate for standardization in POSIX.1d or in any other amendment to POSIX.1. On the other side of the controversy, the proponents of the posix_devctl() function point to the ubiquity of the UNIX function ioctl() as proof of the usefulness of standardizing device control at the "template" level, where the template is standard and the elements of the template are driver-specific.

Interrupt Control

Annex J of the POSIX.1d draft standard proposes an optional interrupt control facility that would make interrupts visible to the application. This facility represents a departure from traditional UNIX practice but is in keeping with realtime kernel practice.

The interrupt control facility, like the device control facility, was at one time specified in normative text in the main body of POSIX.1d, but was moved to an informative annex in response to opposition from some balloters. The opposition stemmed from the same fundamental issue facing device control: interrupt control is inherently implementation-specific. The remainder of this section describes the proposed interrupt control facility and explains how it could be used to improve application portability.

The interrupt control facility put forward in Annex J offers two primary capabilities: (1) An application can associate user-written interrupt service routines (ISRs) with specified interrupts. (2) An application can request to be notified of the occurrence of a specified interrupt. These interfaces are aimed at enabling "connection of nonstandard interrupt-generating hardware in a standard way" [POSIX.1d, Section J.5.1.1]. Here, "nonstandard hardware" means special devices not supported by the operating system vendor. As noted in the "Device Control" section above, special devices are common in realtime systems. Application developers could use the POSIX.1d interrupt control facilities to manage special devices in user-level application code. The alternative would be to manage special devices in full-fledged device drivers, installed in the operating system and executed in kernel mode.

The motivation behind the interrupt control facility is given in the POSIX.1d draft standard [POSIX.1d, Section J.5.2]:

Although interrupt handling isn't entirely portable, there is still profit in standardizing the interrupt control interface. First is the implicit standardization of core functionality. Second is programmer portability. Third is that interrupt handling code can follow the hardware device for which it was written. . . . The resulting modularization and isolation of nonportable code also aids portability.

The specific interfaces that would be available under the Interrupt Control option are as follows:

Associating a specified user-written ISR with a specified interrupt, or disassociating the ISR and the interrupt (posix_intr_associate(), posix_intr_disassociate()). The process of associating a user-written ISR with an interrupt is referred to as "registering" the ISR with the operating system.
The target interrupt is specified through an argument of type intr_t, whose value identifies an interrupt in an implementation-defined manner.
When registering the ISR, the application specifies the address and size of a "communication region," an area of memory through which the ISR and the application can exchange data. The communication region is simply an area in the address space of the registering process that the application chooses to make accessible to the ISR. Upon invocation of the ISR, the operating system passes the address of the communication region to the ISR as its first argument. In this way, the ISR can gain access to a region of the registering process's address space, even if the ISR executes in a context different from that of the registering process.
In Annex J of POSIX.1d, the execution context of ISRs is declared to be implementation-defined. In many cases, the ISR will execute as a thread in the context of the operating system or kernel; in other words, it will execute as a kernel thread. In such cases, the address spaces of the registering thread and the ISR are different. This is why a pointer to a "communication region" must be explicitly passed to the ISR upon invocation.
Specifying to the operating system whether or not the thread that registered a given ISR should be notified when an interrupt is handled by the ISR. This is accomplished through the ISR return value, which is used as a code for indicating (1) whether or not the interrupt was handled and (2) if it was, whether or not the registering thread should also be notified.
The model envisioned in Annex J of the POSIX.1d draft standard is as follows. Multiple devices are typically mapped onto the same interrupt. For each device, the user may write a separate ISR. The multiple ISRs are then registered for the interrupt. Upon an occurrence of the interrupt, the registered ISRs are invoked in last-registered-first-invoked order. When invoked, an ISR must poll its device to determine whether or not that device was the source of the interrupt. If its device was not the source, the ISR sets its return code to indicate "not handled" and returns immediately. If its device was the source, the ISR handles the interrupt, sets the return code to indicate "handled" together with either "notify" or "do not notify," and returns.
Designating a specified segment of application code as a critical section whose execution must not overlap the execution of a specified ISR (or ISRs). Typically, the protected application code and the protected ISR(s) require mutually exclusive access to shared data, in particular, the communication region (or some part thereof) that is identified at the time of ISR registration.
The "protected application code" is the code that falls between posix_intr_lock() and posix_intr_unlock() function calls. In other words, a thread signifies its intention to enter a critical section by calling the posix_intr_lock() function. In the posix_intr_lock() function call, the thread specifies an interrupt as the single argument. The "protected ISRs" are the user-written ISRs registered by the calling thread for the specified interrupt.
The protected ISRs cannot begin executing while the protected application code is executing; likewise, the protected application code cannot begin executing until any protected ISR active at the time of the posix_intr_lock() call has completed. The mechanisms used to ensure mutual exclusion between protected application code and protected ISRs are implementation-defined. Possible mechanisms include operating system (i.e., kernel) mutexes and hardware disabling of interrupts. Several details of these interfaces are likewise implementation-defined, such as whether or not locking a given interrupt for a given thread causes other interrupts (e.g., the same interrupt for other threads, or lower-priority interrupts for the same thread or other threads) to be locked as well.
Waiting for notification of an (unspecified) interrupt (posix_intr_timedwait()). The duration of the wait can be bounded through specification of a timeout argument.
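
The Annex J dispatch model described above can be illustrated with a small user-space simulation. The sketch below does not call the posix_intr_*() interfaces, and every name in it (sim_associate(), sim_raise_interrupt(), sim_notifications(), device_isr(), struct dev_region, and the ISR_* return codes) is hypothetical, invented here for illustration; the draft's actual types and constants are not given in this overview. The sketch also assumes that the chain stops at the first ISR that reports "handled," and it models the device's status register as a field in the communication region.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical return codes standing in for the draft's ISR return values. */
enum isr_result { ISR_NOT_HANDLED, ISR_HANDLED_NOTIFY, ISR_HANDLED_NO_NOTIFY };

/* An ISR receives the address of its communication region as its first argument. */
typedef enum isr_result (*isr_fn)(void *comm_region);

#define MAX_ISRS 8

struct isr_entry {
    isr_fn fn;      /* user-written ISR */
    void *region;   /* communication region given at registration */
};

static struct isr_entry chain[MAX_ISRS];  /* ISRs registered for one interrupt */
static size_t chain_len = 0;
static int notifications = 0;             /* notifications owed to registering threads */

/* Simulated registration: append the ISR to the chain for the interrupt. */
void sim_associate(isr_fn fn, void *region)
{
    assert(chain_len < MAX_ISRS);
    chain[chain_len].fn = fn;
    chain[chain_len].region = region;
    chain_len++;
}

/* Simulated interrupt: invoke ISRs in last-registered-first-invoked order.
 * Assumption: dispatch stops at the first ISR that reports "handled".
 * Returns 1 if some ISR handled the interrupt, 0 otherwise. */
int sim_raise_interrupt(void)
{
    for (size_t i = chain_len; i > 0; i--) {
        enum isr_result r = chain[i - 1].fn(chain[i - 1].region);
        if (r == ISR_NOT_HANDLED)
            continue;                     /* not this device; try the next ISR */
        if (r == ISR_HANDLED_NOTIFY)
            notifications++;              /* registering thread would be woken */
        return 1;
    }
    return 0;
}

int sim_notifications(void)
{
    return notifications;
}

/* Hypothetical communication region; 'pending' models the device status. */
struct dev_region {
    int pending;   /* 1 if this device raised the interrupt */
    int notify;    /* 1 if the ISR should request notification */
    int serviced;  /* interrupts handled by the ISR */
    int polled;    /* times the ISR was invoked */
};

/* A user-written ISR: poll the device, then report the outcome. */
enum isr_result device_isr(void *comm_region)
{
    struct dev_region *r = comm_region;
    r->polled++;
    if (!r->pending)
        return ISR_NOT_HANDLED;
    r->pending = 0;                       /* service the device */
    r->serviced++;
    return r->notify ? ISR_HANDLED_NOTIFY : ISR_HANDLED_NO_NOTIFY;
}
```

With two regions registered in the order a, then b, an interrupt from device a causes b's ISR to be polled first and to return "not handled" before a's ISR services the device. In a real implementation, any application code that touches the same region would additionally be bracketed by posix_intr_lock() and posix_intr_unlock(), which this simulation omits.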

Status of the POSIX.1d Draft Standard

The POSIX.1d draft standard was first balloted at Draft 8 in December 1993. It was recirculated at Draft 10 in March 1997. Due in part to the time that lapsed between the first ballot and the recirculation, many balloters failed to respond to the recirculation. Faced with a nonresponsive ballot group, the Working Group decided the only way to make progress was to re-form the ballot group. This step was completed in July 1998.

The POSIX.1d draft standard is being reballoted, with the new ballot group, at this time. The results should be available in October 1998. Then ballot resolution will begin. Since the ballot resolution process may lead to changes in the draft standard, the reader should regard this paper as a snapshot of the draft standard as it stands at Draft 11.

References

[Chung et al. 90] J.Y. Chung, J.W.S. Liu, and K.J. Lin, "Scheduling Periodic Tasks that Allow Imprecise Results," IEEE Transactions on Computers 39, 9, 1156-1174.

[IEEE POSIX.13] POSIX.13-1998 (to be published). For now, available as IEEE Draft Standard POSIX.13/D9 (September 1997), Draft Standard for Information Technology-Standardized Application Environment Profile-POSIX Realtime Application Support (AEP).

[POSIX.1d] IEEE Draft Standard POSIX.1d/D11 (May 1998), Draft Standard for Information Technology-Portable Operating System Interface (POSIX)-Part 1: System Application Program Interface (API)-Amendment d: Additional Realtime Extensions [C Language].

[IEEE POSIX.5] POSIX.5-1992, Information Technology-POSIX Ada Language Interfaces-Part 1: Binding for System Application Program Interface (API).

[ISO/IEC 9945-1] International Standard ISO/IEC 9945-1:1996, Information Technology-Portable Operating System Interface (POSIX)-Part 1: System Application Program Interface (API) [C Language]. Includes IEEE Stds POSIX.1-1990, POSIX.1b-1993, POSIX.1c-1995, and POSIX.1i-1995.

[ISO/IEC C] International Standard ISO/IEC 9899:1990, Information Processing Systems-Programming Languages-C.

[Liu and Layland 73] C.L. Liu and J.W. Layland, "Scheduling Algorithms for Multiprogramming in a Hard-Real-Time Environment," Journal of the ACM 20, 1, 46-61.

[NGCR 94] U.S. Navy Next Generation Computer Resources (NGCR) Program, "NGCR Operating Systems Standards Group (OSSWG) Advisory: A Programmer's Overview of the IEEE POSIX Realtime Standards" (prepared by K. Gordon, J. Gwinn, J. Oblinger, and F. Prindle), NGCR Document No. OSS-A 001, U.S. Navy Space and Naval Warfare Systems Command, May 1994.

[Sha and Goodenough 90] L. Sha and J.G. Goodenough, "Real-Time Scheduling Theory and Ada," IEEE Computer 23, 4, 53-62.

[Sha and Sathaye 93] L. Sha and S.S. Sathaye, "A Systematic Approach to Designing Distributed Real-Time Systems," IEEE Computer 26, 9, 68-78.

[Sprunt et al. 89] B. Sprunt, L. Sha, and J. Lehoczky, "Aperiodic Task Scheduling for Hard-Real-Time Systems," Real-Time Systems 1, 1, 27-60.

[TOG 97] The Open Group, Go Solo 2: The Authorized Guide to Version 2 of the Single UNIX Specification, Andrew Josey (editor), Reading, UK, May 1997. Includes "POSIX Realtime" and "POSIX Threads," by K. Gordon, J. Gwinn, J. Oblinger, and F. Prindle, as Chapters 9 and 10.