As an example, we will illustrate how to deploy the SGML-related
components used to parse and display SGML-encoded
documents. The SGMLviewer is based on SP, James Clark's
conforming SGML parser.
We will start by illustrating how to extend the default HTML document type definition. The document instance given below is specified in HTML but additionally employs two extensions for active documents. The first extension is a video tag, which is used to display inline video fragments. The other is an applet tag, used to embed small applications written in a scripting language; we use Tcl in the examples. In the example below, the applet displays some notes that are played when the user clicks on the image.
The first line of the example defines the type of the document (demo). This type is specified in a separate document type definition (DTD), which we describe below. The parser automatically retrieves the DTD from the Web if its location is specified by a URL. The next line illustrates the use of an entity declaration, an SGML mechanism used here as a primitive macro facility that defines the title of the document. The title entity is used twice, in the title and h1 tags, and is expanded by the parser. The third line specifies the style sheet that is needed to display the contents of the document; the use of style sheets is explained later. The video element requires a src attribute defining the location of the video file. Finally, the applet tag is used to embed inline script code defining a small musical applet. The image and the note files are located in the directory specified in the data attribute of the applet tag.
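The use of an entity as a primitive macro can be sketched in a few lines. The sketch is written in Python rather than taken from the SP parser; the function name expand_entities, the entity value, and the sample markup are assumptions made for illustration, and a conforming SGML parser handles many more cases (parameter entities, external entities, nested references).

```python
import re

def expand_entities(text, entities):
    """Replace each &name; reference with its declared replacement text."""
    def substitute(match):
        name = match.group(1)
        # Leave undeclared references untouched rather than guessing a value.
        return entities.get(name, match.group(0))
    return re.sub(r"&([A-Za-z][A-Za-z0-9.-]*);", substitute, text)

# Hypothetical entity value and markup, for illustration only.
entities = {"title": "Active Documents"}
source = "<title>&title;</title><h1>&title;</h1>"
expanded = expand_entities(source, entities)
print(expanded)
```

Because the title text is declared once and referenced twice, changing the declaration changes both the title and the h1 heading.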
Whether active document elements are to be defined by a new, specific tag or by the more general applet mechanism is a matter of taste. A new tag requires modification of the DTD and style sheet but describes the element in a more declarative way, which gives the application more freedom in displaying the contents. For example, a browser may decide to display a text alternative if the local platform does not support video.
Recall that a document type definition defines the structure of a
document by describing the elements and attributes
that can be interspersed as tags with the document content. The
following DTD extends the (draft) HTML 3.0 DTD.
The information contained in the DTD is used by the parser to
generate a complete and validated document instance. Note that
this task could also be performed by an HTTP server, which
would significantly simplify the design and implementation of web
clients. There are therefore strong arguments for adding SGML
functionality to servers as well.
Style sheets define how the various elements should be processed. The hush browser (see figure fig:browser) defines a default style for HTML elements. However, these styles can be redefined and extended by a document instance using a special processing instruction, written as <?stylesheet url>. The browser retrieves the style sheet from the URL specified in the processing instruction and uses it to display the contents of the document. The example specifies the URL of a style sheet that describes how to process the new video tag. Recall that processing instructions are application dependent, so the parser passes the text of a processing instruction directly to the application.
At the moment, we use an experimental style sheet language based on Tcl. An example of a style sheet fragment that specifies how the title tag should be processed is given below. While the style sheet mechanism needs some refinement, our approach supports the extension of existing document types and allows for extensive experimentation with the (many) new tags proposed for the HTML 3.0 standard.
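The way a document-supplied style sheet redefines and extends the default styles can be pictured as a table of handlers keyed by element name. The following is only a sketch in Python, not the Tcl-based style sheet language of the hush browser; all names (DEFAULT_STYLES, apply_style_sheet, render, video_style) and the rendered strings are hypothetical.

```python
def default_title(content):
    return "[window title] " + content

def default_paragraph(content):
    return content

# Default styles a browser might ship for HTML elements (hypothetical).
DEFAULT_STYLES = {"title": default_title, "p": default_paragraph}

def video_style(content):
    # A real handler would start a video player; here we only describe it.
    return "[video] " + content

def apply_style_sheet(defaults, overrides):
    """Return a new style table in which the style sheet's entries win."""
    styles = dict(defaults)
    styles.update(overrides)
    return styles

def render(element, content, styles):
    """Dispatch on the element name; unknown elements fall back to plain text."""
    handler = styles.get(element, lambda text: text)
    return handler(content)

# The style sheet named in the processing instruction adds a video handler.
styles = apply_style_sheet(DEFAULT_STYLES, {"video": video_style})
print(render("video", "demo.mpg", styles))
print(render("title", "Demo", styles))
```

Copying the default table before updating it keeps the browser defaults intact, so the next document starts again from the unmodified styles.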
The document instance of the previous example was very similar to a plain HTML document, which made it worthwhile to reuse the original HTML DTD and style sheet. However, for some applications HTML is not suited at all and a completely new document structure is needed. The next example shows a document instance of a simple musical application. The first line specifies (the filename of) a new document type definition. The next line contains an entity definition describing a G7 chord. An SGML processing instruction is used to specify the filename of the style sheet. The root of the document hierarchy is the score element, consisting of several chords. Chords are built from notes, which are described by single characters. The first two chords use the entity defined before, specifying the notes of a G7 chord. The third occurrence of chord describes a C major chord.
The DTD corresponding to the simple musical document given above defines the structural elements and their attributes. When the application processes the document, the parser fills in the default duration for all notes, expands the entity references and adds the missing end tags for the notes and chords.
The DTD defines three structural elements: a score containing several chords, a chord containing several notes and a note consisting of data. Both chord and note have three attributes: id, name and duration. The first two are optional and the last has a default value of 4. Note the difference between the id attribute, which has to be a unique identifier, and the name attribute, which can be an arbitrary string.
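The effect of these attribute declarations can be illustrated with a small sketch. It is written in Python and does not reproduce the DTD or the parser; the table layout and the names ATTRIBUTE_DEFAULTS, apply_defaults and check_unique_ids are assumptions. Only the facts stated above are used: id and name are optional, duration defaults to 4, and id values must be unique while name values need not be.

```python
# Declared defaults per element: only duration has one, id and name are optional.
ATTRIBUTE_DEFAULTS = {
    "chord": {"duration": "4"},
    "note": {"duration": "4"},
}

def apply_defaults(element, attributes):
    """Return the attributes with declared defaults filled in where omitted."""
    completed = dict(ATTRIBUTE_DEFAULTS.get(element, {}))
    completed.update(attributes)  # explicitly specified values take precedence
    return completed

def check_unique_ids(elements):
    """Raise ValueError if two elements share an id; names may repeat freely."""
    seen = set()
    for element, attributes in elements:
        identifier = attributes.get("id")
        if identifier is None:
            continue
        if identifier in seen:
            raise ValueError("duplicate id: " + identifier)
        seen.add(identifier)

# Hypothetical elements: the first omits duration, the second sets it.
first = apply_defaults("chord", {"id": "c1", "name": "G7"})
second = apply_defaults("chord", {"id": "c2", "name": "G7", "duration": "2"})
check_unique_ids([("chord", first), ("chord", second)])
print(first["duration"], second["duration"])
```

The two chords may share the name G7, but giving both the id c1 would be rejected.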
Playing the document in a web browser does not
necessarily involve displaying the data visually.
The simple style sheet shown below
collects notes and chords and plays
them using the play command.
Note that most of the timing relations are implicit
in the document. For example, the notes within a single chord are
to be played in parallel, and the chords themselves are to be
played sequentially. However, this is not explicitly defined by
the document instance or DTD and can only be intuitively derived
from the element names. Even in the style sheet below, these
timing relations remain implicit.
The procedures corresponding to the open and close tags build a
string representation of the score. At the opening of the score
element the string is initialized with a command defining the tempo
at 120 beats per minute. During the parsing process the string is
extended with the parsed notes. After the last chord has been
parsed, the resulting string is:
t120 (g<b<f< r)(g<b<f< r)(c<e<g< r)
This string is played after the score end tag has
been encountered. The Tcl command play used to play the notes is
provided by the hymne extension of hush.
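How the open and close procedures accumulate this string can be sketched as follows. The sketch is in Python rather than the Tcl of the actual style sheet, and the class and method names are hypothetical. The string fragments are copied from the result quoted above; the meaning of the individual symbols in the hymne notation is not assumed here.

```python
class ScoreBuilder:
    """Collects the string representation of a score from parser events."""

    def __init__(self):
        self.parts = []

    def open_score(self):
        # The score start tag initializes the string with the tempo command.
        self.parts.append("t120 ")

    def open_chord(self):
        self.parts.append("(")

    def note(self, data):
        # Each parsed note is appended together with its "<" suffix.
        self.parts.append(data + "<")

    def close_chord(self):
        self.parts.append(" r)")

    def close_score(self):
        # The real style sheet would pass this string to the play command.
        return "".join(self.parts)

# Notes of the G7 entity as they appear in the string quoted in the text.
G7 = ["g", "b", "f"]
C_MAJOR = ["c", "e", "g"]

builder = ScoreBuilder()
builder.open_score()
for chord in (G7, G7, C_MAJOR):
    builder.open_chord()
    for pitch in chord:
        builder.note(pitch)
    builder.close_chord()
play_string = builder.close_score()
print(play_string)
```

Deferring the play call until the score end tag means the complete score is handed over in one piece, which matches the behaviour described above.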