Browsable grammars

We offer the following grammars:

VS COBOL II (fully tested)
COBOL with OO/recursion/intrinsic functions etc. (not yet corrected/completed; preliminary release)
OS PL/I V2R3 (not yet corrected/completed; preliminary release)

Further grammars will be released soon.

How to use browsable grammars ...

From our grammars we generate a browsable version. Such a "grammar" browser is very useful, for example, to understand the syntax of the language described by the underlying grammar. The actual grammar rules are connected using hyperlinks so that it possible to go from the use site of a sort to its definition. In addition to the grammar rules, also a graphical representation in the form of so-called syntax diagrams is provided. The syntax diagrams are also connected using hyperlinks. There are special cross-reference tables for the sorts and the keywords used in the grammar. By the way, we extensively use grammar browsers for completing and correcting grammars in our (re-) engineering cycle.

A browsable grammar consists of the following sections:

Summary: The number of rules, sorts, top sorts, bottom sorts, lexical sorts and keywords are listed. This information is useful to get an impression on the size and the completeness of the grammar. The terms sorts, top sorts, bottom sorts are explained in some detail below.
Context-free syntax: The actual context-free grammar rules are provided in this section. We use a variant of EBNF (Extended Backus Naur Form). Rules are of the form s = e, where s is the sort to be defined and e is the expression built from some EBNF constructs as follows. Alternatives are separated by "|". Concatenation coincides with juxtaposition. Repetition is indicated by the braces, where "*" and "+" are used as postfix operators in order to express whether 0 or more resp. 1 or more repetitions are required. Keywords are enclosed in double quotes. There is one non-standard construct for permutation phrases. ( a b | b a ) can be abbreviated as ( a || b ).
Lexical syntax: Lexical definitions are written using some standard notation for regular expressions in the spirit of LEX. Character sets are enclosed in "[" and "]". All characters but letters and digits have to be escaped by "\". Enumerations can be abbreviated using "-" as an infix operator. The complementary character set of some set is obtained by using "~" as a prefix operator. There are the standard postfix operators for regular expressions: "?" for optional parts, "*" for zero or more repetitions and "+" for 1 or more repetitions of certain parts. Alternatives are separated by "|" which binds weaker than concatenation. A lexical definition consists of rules of the form s = e, where s is the lexical sort defined by the regular expression e. Auxiliary lexical sorts might be introduced in the scope of a rule by enclosing the rules defining the auxiliary sorts by "auxiliaries begin" and "end". The notation "..." denotes the undefined regular expression. We use "..." as the right-hand side of a rule in order to indicate that we do not want to provide a lexical definition for the corresponding sort for some reason.
All sorts: This table enumerates all sorts in alphabetical order. Sorts are usually called nonterminals. Since we also deal with incomplete grammars, we prefer the distinguishable term sort. For each sort all the RHS (right-hand side) occurrences or use sites are listed in the table, i.e., the sorts using the sort in their definition. All sorts in the table are connected with their definition (if there is any) via hyperlinks. If you browse the grammar rules, you can get to the table by clicking on the LHS (left-hand side) sort in a given rule.
Top sorts: This section just lists all top sorts. A top sort is a sort which is defined but not used, i.e., a sort with a LHS occurrence in the grammar rules, but with no RHS occurrence. A complete grammar should ideally contain one top sort which is usually called start symbol or axiom.
Bottom sorts: This section enumerates all bottom sorts. A bottom sort is a sort with one or more RHS occurrence, but with no LHS occurrence. In a complete grammar there should be no bottom sorts. Before providing a lexical definition, all potential lexical sorts will be counted as bottom sorts.
Keywords: All the keywords occurring in the grammar are listed in alphabetical order. For each keyword all the use sites are listed as well.
Diagrams: We use a graphical notation very similar to IBM's syntax diagrams used in several language documents. For more information on IBM's syntax diagrams refer to some IBM language document, e.g., the corresponding section How to Read the Syntax Diagrams from IBM's VS COBOL II Reference Summary. The most important deviation in our notation is that we have a special form of stacking using "||" to express permutation phrases. Stacking blocks using the ordinary "|" is a way of expressing alternatives, whereas stacking blocks using "||" is a way to express that the sequential order of the blocks is immaterial. A minor deviation is that we do not qualify the occurrences of sort names in diagrams by naturals.

Page by Ralf Lämmel (CWI, Amsterdam, Email: ralf@cwi.nl)