Robert Thau rst@ai.mit.edu
I think it a fair guess that most people submitting position papers to this workshop believe in the importance of server APIs generally, and that many will propose one specific API for discussion. This note instead discusses an aspect of server functionality which many extant server APIs do not serve particularly well. (The Apache server API, for which the author was largely responsible, handles it fairly well but, as discussed below, is still less than ideal.)
The position advocated here is that if a server extension is implemented through a server API, it should be possible for that server extension to be configured in a flexible way, and that provisions for doing so ought to be part of any proposed API standard.
In particular, suppose a server supports multiple administrative domains, as many do. (These may be multiple research groups, multiple product groups, or different clients of a multihoming ISP.) Then it ought to be possible for the maintainer of the server to allow those organizations to control the behavior of server extensions themselves, without interfering with each other --- and ideally, to be able to enforce administrative boundaries so that they can't affect each other even if some malicious subgroup might want to.
(Note that this note does
The remainder of this note is divided up into two parts. In the first, I explain situations in which this requirement comes up. In the second, I discuss some of the ways it might be dealt with, in an appropriate way, for each of a few different styles of server APIs --- in other words, first why it is a good idea to provide for flexible delegation of authority over a server extension, and then how it may be provided. We begin with the whys.
The most basic reason to provide a server API is to allow webmasters to provide services which were not provided for in the base design of the server itself, by supplying code to implement those features. (After all, if the server can do what you want out of the box, why fuss with an API?)
However, not every server extension is applicable to every situation. If a webmaster supplies a scripting engine, for example, they may wish to control what, exactly, the scripts are allowed to do. This does come up in practice --- many servers supply small (or not so small) scripting languages as a set of "server-side include" tags which may be embedded in HTML, and which cause the server to modify the data stream sent to the client. Many webmasters want to be able to control whether these inclusion directives are allowed to cause external programs or CGI scripts to be invoked, and the NCSA and Apache servers do provide options which allow for that control.
This need to be able to configure extensions is not confined to those which are overtly scripting languages, of course. For instance, consider a server extension which provides an integrated text-search facility. Typically, good search engines work off precomputed indices, to avoid the expense of scanning the entire text base at run time, and they need to be told where those indices are.
To make the search-engine example more interesting, let us suppose that our search engine is on a server which has three searchable text databases --- two separate databases containing end-user documentation on different releases of the same product, and a third (maintained by an employee as an after-hours project) on fly-fishing. By the standards of a lot of corporate web sites, this is pretty modest, but configuring it already poses some interesting problems.
To begin with, it is obvious that we can't use the same index files to handle searches on all three databases. To do so would certainly cause people who were trying to search the product documentation to get information on both releases (which would be confusing and difficult to handle); to make matters worse, they might get a few fish stories as well.
In other words, to support this search engine, it would be best if the API made some sort of provision for directing a server extension to behave in different ways when employed in different contexts.
Now, one way to provide this facility would be to have a single configuration file, which contained parameters for how the extension (i.e., our search engine) was to behave in all contexts. This is in many respects the simplest thing for the server writer to implement. Indeed, if things are done this way, then the extension can simply read in a configuration file of its own from a standard location, without involving the rest of the server at all.
However, if there are multiple administrative domains on a single physical web server, and different parameters are needed for each (as in our text-indexing example), a single configuration file quickly becomes unmanageable. Typical situations of this sort are multihomed ISPs supporting multiple clients (in which upwards of a hundred administrative domains on a single server are not uncommon), and university servers supporting a sizable undergraduate population (which can go up to thousands).
For instance, let us suppose that the owner of the fly-fishing database wishes to update it from time to time, and that other people may want to supply indices for information relating to their own hobbies. In the single-configuration-file scenario, this means that each of them would have to edit the single configuration file. By Murphy's law, it is inevitable that sooner or later, someone is going to slip up, and alter the configuration of a database which is not their own.
Indeed, it may be the case that some of the text-database providers are not trusted to alter the configuration of databases which they don't own, and shouldn't be permitted to do so. Ideally, the server would provide for this, and would also provide for disallowing certain individuals from configuring a given server extension at all. (The latter is not likely to be an issue with text databases, but is an issue in practice with server extensions which provide new scripting mechanisms --- witness the features which the NCSA and Apache servers provide for controlling their inclusion machinery, or for that matter where CGI scripts may reside.)
So much for the need for configurability. We now consider how a server might provide it. The details depend, of course, on how the server interfaces to the extension code in the first place. While most widespread server APIs (aside from CGI) work by dynamically linking code into the server (or just compiling it into the server binary), that is not the only option, and alternatives are worth discussing. But we begin with the common case.
The most typical server APIs (Apache, NSAPI, ISAPI, Spyglass, etc.) work by dynamically linking code which implements the extension into the server. The APIs are typically C-based.
Most servers implementing these APIs are configured by means of a configuration file (or a set of config files), though some provide graphic front ends which hide the underlying config files from the end user. In such a server, to supply parameters to a server extension (e.g., to tell a search engine where its index files reside), there has to be some command which allows the server maintainer to pass parameters (ideally, multiple sets of parameters) on to the extension code.
So, for instance, the Netscape servers have a set of commands which cause a particular piece of code (a "server function", in NSAPI parlance) to be invoked to handle a particular phase of transaction processing (access control, producing the response, whatever). These commands allow text parameters to be given to the extension; these are passed to the code, when it runs, as a table of name-value pairs of strings.
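The name-value style of parameter passing described above can be sketched as follows. This is only an illustration in Python, not NSAPI itself: the directive line, the `text-search` function name, the `index` parameter, and the shape of the `request` dictionary are all hypothetical.

```python
import shlex

def parse_directive(line):
    """Split 'Phase name="value" ...' into the phase and a dict of string parameters."""
    phase, *pairs = shlex.split(line)
    params = {}
    for pair in pairs:
        name, sep, value = pair.partition("=")
        if not sep:
            raise ValueError(f"expected name=value, got {pair!r}")
        params[name] = value
    return phase, params

def text_search(params, request):
    # Hypothetical extension code: it sees its parameters only as strings.
    return f"search {params['index']} for {request['query']}"

# Hypothetical registry mapping function names to extension code.
SERVER_FUNCTIONS = {"text-search": text_search}

def dispatch(line, request):
    """Look up the function named by 'fn' and hand it the parameter table."""
    _phase, params = parse_directive(line)
    return SERVER_FUNCTIONS[params["fn"]](params, request)
```

Because the parameters travel with the directive, the same function can be invoked with a different `index` value in each context where the directive appears.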
Apache is somewhat more flexible, in that it allows user extension code to supply a command table, which gives any number of commands. These, in turn, can modify a data structure whose exact layout is up to the external module --- this data structure is built when the configuration file is read, and handed back to the user code at run time. (See the Apache documentation, or the WWW5 paper on Apache API design considerations, for more details.)
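A minimal sketch of the command-table idea, written in Python rather than the C of the real API. The command names (`SearchIndex`, `SearchMaxResults`), the `SearchConfig` layout, and the helper functions are all assumptions made for illustration; they are not taken from Apache.

```python
from dataclasses import dataclass, field

@dataclass
class SearchConfig:
    """Module-private configuration; its layout is entirely up to the module."""
    index_paths: list = field(default_factory=list)
    max_results: int = 10

def cmd_search_index(cfg, arg):
    cfg.index_paths.append(arg)

def cmd_search_max_results(cfg, arg):
    cfg.max_results = int(arg)

# The command table the (hypothetical) module hands to the server.
COMMAND_TABLE = {
    "SearchIndex": cmd_search_index,
    "SearchMaxResults": cmd_search_max_results,
}

def read_config(lines, command_table, cfg):
    """Server side: run each command's handler against the module's structure."""
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        name, _, arg = line.partition(" ")
        handler = command_table.get(name)
        if handler is None:
            raise ValueError(f"unknown command: {name}")
        handler(cfg, arg.strip())
    return cfg

def handle_request(query, cfg):
    """Module side: the structure built at config time comes back at run time."""
    return f"searching {cfg.index_paths} for {query!r} (limit {cfg.max_results})"
```

The server never needs to understand `SearchConfig`; it only routes commands to handlers and later returns the finished structure to the module.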
Continuing with "common-case" scenarios, the way delegation is most commonly implemented, by servers that implement it at all, is by providing for per-directory configuration files. These are files named .htaccess, .nsconfig, or something similar, which contain commands which are applied to requests for files in that directory, or subdirectories.
So long as commands to configure server extensions may appear in these files, they provide for a basic delegation capability. In fact, they address some of the security issues as well --- if each administrative domain corresponds to a directory and its subdirectories (as is almost always the case in practice), then the normal access control mechanisms of the underlying file system can generally be used to restrict who is allowed to set controls for server operation in a given area.
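The lookup this implies can be sketched as below, assuming a simple "Name value" line format and a rule that deeper directories override shallower ones; both are assumptions of this sketch, and real servers merge settings in more elaborate ways. The request path is assumed to be already normalized (no ".." components).

```python
from pathlib import Path

CONFIG_NAME = ".htaccess"

def parse_config(text):
    """Parse 'Name value' lines into a dict, skipping blanks and comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        name, _, value = line.partition(" ")
        settings[name] = value.strip()
    return settings

def effective_config(doc_root, request_path):
    """Merge per-directory files from doc_root down to the requested file's directory."""
    doc_root = Path(doc_root)
    parts = Path(request_path.lstrip("/")).parent.parts
    directories = [doc_root]
    for part in parts:
        directories.append(directories[-1] / part)
    settings = {}
    for directory in directories:
        config_file = directory / CONFIG_NAME
        if config_file.is_file():
            # Settings from deeper directories replace inherited ones.
            settings.update(parse_config(config_file.read_text()))
    return settings
```

Since each file lives inside the directory it governs, ordinary file permissions decide who may change the settings for that part of the tree.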
Unfortunately, the cost of providing this facility is not trivial. In many servers which implement a per-directory configuration file facility, each directory which might contain a per-directory config file is scanned for one on each request to which it might be relevant. The overhead associated with these operations is substantial. It could be reduced by caching per-directory configuration information, but it can only be eliminated completely by turning the feature off (in Apache, via AllowOverride None).
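One possible form of the caching mentioned above is sketched here: parsed configuration is kept per file and reused while the file's modification time is unchanged. The `ConfigCache` class and its `parses` counter are hypothetical names for this illustration. Note that a `stat` call per lookup remains, which is why caching reduces the cost rather than eliminating it.

```python
import os

class ConfigCache:
    """Cache parsed per-directory config files, keyed by path and modification time."""

    def __init__(self, parse):
        self._parse = parse          # function: file text -> parsed settings
        self._entries = {}           # path -> (mtime_ns, parsed settings)
        self.parses = 0              # how many times a file was actually parsed

    def get(self, path):
        try:
            mtime = os.stat(path).st_mtime_ns   # still one system call per lookup
        except FileNotFoundError:
            self._entries.pop(path, None)
            return None
        cached = self._entries.get(path)
        if cached is not None and cached[0] == mtime:
            return cached[1]
        with open(path) as f:
            parsed = self._parse(f.read())
        self.parses += 1
        self._entries[path] = (mtime, parsed)
        return parsed
```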
What per-directory config files do not provide for, in their raw form, is a way for the centralized server maintainer to restrict what options are available to an administrative subdomain. That is, in (say) university environments, it is often useful to be able to specify that some users may not turn on the "run CGI scripts here" parameter in their personal web space, even if they are allowed to configure server behavior for files within their directory in other ways.
Apache provides for this in a limited way. Briefly, each command in a command table has a bitmask associated with it, which declares, in effect, what privileges are required for that command to be valid. The AllowOverride command allows the server maintainer to specify which privileges .htaccess files in a given directory are granted --- in particular AllowOverride None says that no commands are valid, and (as indicated above) per-directory configuration files there may be ignored.
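The bitmask check can be sketched like this. The bit values, the `Privilege` names, and the mapping of commands to privileges are illustrative assumptions in Python, not a transcription of Apache's C headers.

```python
from enum import IntFlag

class Privilege(IntFlag):
    NONE = 0
    LIMIT = 1
    OPTIONS = 2
    FILEINFO = 4
    AUTHCONFIG = 8
    INDEXES = 16

ALL_PRIVILEGES = (Privilege.LIMIT | Privilege.OPTIONS | Privilege.FILEINFO
                  | Privilege.AUTHCONFIG | Privilege.INDEXES)

KEYWORDS = {p.name.lower(): p for p in (Privilege.LIMIT, Privilege.OPTIONS,
                                        Privilege.FILEINFO, Privilege.AUTHCONFIG,
                                        Privilege.INDEXES)}

# Each command declares which privilege it requires (illustrative entries).
COMMAND_PRIVILEGES = {
    "Options": Privilege.OPTIONS,
    "AddType": Privilege.FILEINFO,
    "AuthName": Privilege.AUTHCONFIG,
}

def parse_allow_override(arg):
    """Turn an AllowOverride-style argument into a mask of granted privileges."""
    granted = Privilege.NONE
    for word in arg.split():
        key = word.lower()
        if key == "none":
            return Privilege.NONE
        if key == "all":
            return ALL_PRIVILEGES
        granted |= KEYWORDS[key]
    return granted

def command_allowed(command, granted):
    """A command is valid only if its required bits overlap the granted mask."""
    return bool(COMMAND_PRIVILEGES[command] & granted)
```

With a fixed set of bits like this, every new extension has to reuse one of the existing privileges for its commands.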
Unfortunately, Apache does not provide a way for a module to declare a new privilege. So, for instance, if a server extension provides a "safe" scripting mechanism which one might want to enable even where CGI is disallowed, you cannot create a separate privilege to control it. It would be easy to extend the API to allow for this, and the failure to do so yet is one of the current Apache API's flaws.
Another technique which might be used to provide control over what server extensions could and could not do would be to allow a "security manager" to audit, and selectively allow or deny, their access to the facilities of the underlying operating system. Controlling the behavior of extensions (including extensions invoked or supplied by users) would then be as simple, or as complicated, as writing (or configuring) an appropriate security manager.
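As a rough illustration of the pattern, the sketch below routes an extension's file and process access through a manager that can veto each operation. All names here (`SecurityManager`, `Facilities`, `SecurityError`) are hypothetical, and Python cannot actually stop hostile code from bypassing the wrapper; the sketch shows only the auditing structure, which needs a runtime such as the JVM to be enforceable.

```python
import os
import subprocess

class SecurityError(Exception):
    """Raised when the manager denies an operation."""

class SecurityManager:
    def __init__(self, readable_roots, allow_exec=False):
        self.readable_roots = [os.path.realpath(r) for r in readable_roots]
        self.allow_exec = allow_exec

    def check_read(self, path):
        real = os.path.realpath(path)
        for root in self.readable_roots:
            if os.path.commonpath([root, real]) == root:
                return
        raise SecurityError(f"read denied: {path}")

    def check_exec(self, program):
        if not self.allow_exec:
            raise SecurityError(f"exec denied: {program}")

class Facilities:
    """The only route by which an extension reaches the operating system."""

    def __init__(self, manager):
        self._manager = manager

    def read_text(self, path):
        self._manager.check_read(path)
        with open(path) as f:
            return f.read()

    def run(self, argv):
        self._manager.check_exec(argv[0])
        result = subprocess.run(argv, capture_output=True, text=True, check=True)
        return result.stdout
```

A server maintainer could then give each administrative domain its own manager, for example one whose readable roots cover only that domain's directory and which disallows running external programs.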
This technique has been most widely deployed, to date, in the libraries implementing the applet API for the Java Virtual Machine. However, the basic idea is not confined to the JVM context --- it is applicable anywhere that a malicious extension is unable to trick up its own machine code to invoke system services directly; indeed, the JVM itself is not inextricably tied to the Java language, per se, as Intermetrics' Ada-to-JVM compiler demonstrates.
An alternative way of providing for server extensions is, instead of linking the code into the server directly, to run it in external processes with which the web server communicates via some form of remote procedure call (RPC). A special case of considerable interest is when the RPC protocol in question is invocation of methods in a CORBA-style structure of distributed objects. The Digicool ILU Requester is an example of such an approach.
This style of interface obviously decouples the extension code from the server (as compared to a dynamic link) --- in particular, it means that buggy extension code can only crash itself and not the rest of the server, a bit of fail-safety which may be handy in some applications. On the other hand, it entails a bit more overhead. An entire position paper could be written on the relative merits of the two schemes (I hope to see one), but a detailed elaboration of the arguments would not be relevant here.
What is relevant is the question of how to provide for flexible configuration within the framework of an RPC-based API. What this means, in the RPC framework, is that users need to be able to start their own RPC services, and need to be able to instruct the server to direct certain requests within their domain to be handled by their particular RPC services.
Where this becomes awkward, again, is in allowing different services, or different forms of the same service, to be active in different administrative domains. If, for instance, there is a single registry listing all of the various extension modules (or objects, in CORBA parlance) which a server may contact, we instantly have all the same problems in administering that database which we had in administering a single, centralized configuration file.
If the distributed object system, or the underlying OS, provides a suitably flexible registry, one can, of course, just use that. If not, one approach might be to provide per-directory configuration files which designate the addresses of the services which apply to that directory and subdirectories, and which also designate which requests will be passed along to those services.
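The per-directory approach to locating services might look something like the following sketch. The registry layout, the file-name patterns, and the `tcp://` address strings are all assumptions made for illustration; in practice the entries would be read from the per-directory files rather than held in one dictionary.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath

def resolve_service(registry, request_path):
    """Return the address of the service handling request_path, or None.

    registry maps a directory (as a string) to a list of
    (file-name pattern, service address) pairs declared for that directory.
    The nearest enclosing directory with a matching pattern wins.
    """
    path = PurePosixPath(request_path)
    for directory in path.parents:          # nearest directory first, "/" last
        for pattern, address in registry.get(str(directory), []):
            if fnmatch(path.name, pattern):
                return address
    return None

# Hypothetical declarations: a site-wide search service, overridden for /fishing.
EXAMPLE_REGISTRY = {
    "/": [("*.srch", "tcp://localhost:9000")],
    "/fishing": [("*.srch", "tcp://localhost:9001")],
}
```

Here the owner of the fly-fishing pages can point search requests at a service of their own without touching the site-wide entry.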