Browsable grammars
Copyright (C) 1999
Ralf Lämmel
(CWI, Amsterdam)
&
Chris Verhoef
(WINS, Universiteit van Amsterdam)
We offer the following grammars:
- VS COBOL II (fully tested)
- COBOL with OO/recursion/intrinsic functions etc. (not yet corrected/completed; preliminary release)
- OS PL/I V2R3 (not yet corrected/completed; preliminary release)
Further grammars will be released soon.
How to use browsable grammars ...
From our grammars we generate a browsable version.
Such a "grammar" browser is very useful, for example, to understand
the syntax of the language described by the underlying grammar. The
actual grammar rules are connected using hyperlinks so that it possible
to go from the use site of a sort to its definition. In addition to the
grammar rules, also a graphical representation in the form of so-called
syntax diagrams is provided. The syntax diagrams are also connected
using hyperlinks. There are special cross-reference tables for the
sorts and the keywords used in the grammar. By the way, we extensively
use grammar browsers for completing and correcting grammars in our
(re-) engineering cycle.
A browsable grammar consists of the following sections:
- Summary
- The number of rules, sorts, top sorts, bottom sorts, lexical sorts
and keywords are listed. This information is useful to get an impression
on the size and the completeness of the grammar. The terms sorts,
top sorts, bottom sorts are explained in some detail below.
- Context-free syntax
- The actual context-free grammar rules are provided in this section.
We use a variant of EBNF (Extended Backus Naur Form).
Rules are of the form s = e, where s is the
sort to be defined and e is the expression
built from some EBNF constructs as follows. Alternatives are
separated by "|". Concatenation coincides with juxtaposition.
Repetition is indicated by the braces, where "*" and "+" are
used as postfix operators in order to express whether 0 or more
resp. 1 or more repetitions are required. Keywords are enclosed
in double quotes. There is one non-standard construct for permutation
phrases. ( a b | b a ) can be abbreviated as ( a || b ).
- Lexical syntax
- Lexical definitions are written using some standard notation for
regular expressions in the spirit of LEX. Character sets are enclosed in
"[" and "]". All characters but letters and digits have to be escaped by "\".
Enumerations can be abbreviated using "-" as an infix operator. The
complementary character set of some set is obtained by using "~" as a
prefix operator. There are the standard postfix operators for regular
expressions: "?" for optional parts, "*" for zero or more repetitions and
"+" for 1 or more repetitions of certain parts. Alternatives are separated
by "|" which binds weaker than concatenation. A lexical definition consists
of rules of the form s = e, where s is the lexical sort defined
by the regular expression e. Auxiliary lexical sorts might be introduced
in the scope of a rule by enclosing the rules defining the auxiliary sorts by
"auxiliaries begin" and "end". The notation "..." denotes the undefined regular
expression. We use "..." as the right-hand side of a rule in order to indicate
that we do not want to provide a lexical definition for the corresponding sort
for some reason.
- All sorts
- This table enumerates all sorts in alphabetical order.
Sorts are usually called nonterminals. Since we also deal with
incomplete grammars, we prefer the distinguishable term sort.
For each sort all the RHS (right-hand side) occurrences or use sites
are listed in the table, i.e., the sorts using the sort in their
definition. All sorts in the table are connected with their
definition (if there is any) via hyperlinks. If you browse the
grammar rules, you can get to the table by clicking on the LHS
(left-hand side) sort in a given rule.
- Top sorts
- This section just lists all top sorts. A top sort is a sort which
is defined but not used, i.e., a sort with a LHS occurrence in the grammar
rules, but with no RHS occurrence. A complete grammar should ideally contain
one top sort which is usually called start symbol or axiom.
- Bottom sorts
- This section enumerates all bottom sorts. A bottom sort is
a sort with one or more RHS occurrence, but with no LHS occurrence.
In a complete grammar there should be no bottom sorts. Before providing
a lexical definition, all potential lexical sorts will be counted as
bottom sorts.
- Keywords
- All the keywords occurring in the grammar are listed in alphabetical
order. For each keyword all the use sites are listed as well.
- Diagrams
- We use a graphical notation very similar to IBM's syntax diagrams used
in several language documents. For more information on IBM's syntax diagrams
refer to some IBM language document, e.g., the corresponding section
How to Read the Syntax Diagrams from IBM's VS COBOL II Reference Summary.
The most important deviation in our notation is that we have a special
form of stacking using "||" to express permutation phrases.
Stacking blocks using the ordinary "|" is a way of expressing alternatives,
whereas stacking blocks using "||" is a way to express that the sequential order of
the blocks is immaterial. A minor deviation is that we do not qualify the occurrences
of sort names in diagrams by naturals.
Page by Ralf Lämmel
(CWI, Amsterdam,
Email: ralf@cwi.nl)