Algebraic specification

Algebraic specification techniques have been developed as a means to specify the design of complex software systems in a formal way. The algebraic approach has been motivated by the notion of information hiding put forward in [Parnas72a] and the ideas concerning abstraction expressed in [Ho72]. Historically, the ADJ-group (see Goguen et al., 1978) provided a significant impetus to the algebraic approach by showing that abstract data types may be interpreted as (many-sorted) algebras. (In the context of algebraic specifications the notion of sort has the same meaning as type. We will, however, generally speak of types.) As an example of an algebraic specification, consider the module defining the data type Bool, as given in slide 8-Bool.

Algebraic specification -- ADT

Bool
  adt bool is
  functions
    true : bool
    false : bool
    and, or : bool * bool -> bool
    not : bool -> bool
  axioms
    [B1]  and(true,x) = x
    [B2]  and(false,x) = false
    [B3]  not(true) = false
    [B4]  not(false) = true
    [B5]  or(x,y) = not(and(not(x),not(y)))
  end
  

slide: The ADT Bool

This specification introduces two constants (the zero-ary functions true and false) and three functions (and, or and not). The or function is defined in terms of not and and, according to a well-known logical law (De Morgan's law). These functions may all be considered to be (strictly) related to the type bool. Equations are used to specify the desired characteristics of elements of type bool. Mathematically, this specification may be interpreted as (simply) a boolean algebra.
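To make the role of the equations concrete, the following minimal sketch reads axioms B1 to B5 as left-to-right rewrite rules on closed terms. The tuple encoding of terms and the names `TRUE`, `FALSE` and `rewrite` are assumptions made for this illustration; they are not part of the specification language used above.

```python
# Closed Bool terms as nested tuples, e.g. ("and", TRUE, ("not", FALSE)).
# This encoding is an assumption of the sketch, not part of the ADT notation.
TRUE = ("true",)
FALSE = ("false",)

def rewrite(term):
    """Normalise a closed Bool term, using B1-B5 as left-to-right rules."""
    op, *args = term
    args = [rewrite(a) for a in args]          # normalise the subterms first
    if op == "and":
        x, y = args
        return y if x == TRUE else FALSE       # [B1] and [B2]
    if op == "not":
        (x,) = args
        return FALSE if x == TRUE else TRUE    # [B3] and [B4]
    if op == "or":
        x, y = args                            # [B5]
        return rewrite(("not", ("and", ("not", x), ("not", y))))
    return term                                # the constants true and false

print(rewrite(("or", FALSE, ("and", TRUE, TRUE))))   # prints ('true',)
```

Every closed term reduces to one of the two constants, which is why true and false alone suffice to describe the values of the type.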

Mathematical models

The mathematical framework of algebras allows for a direct characterization of the behavioral aspects of abstract data types by means of equations, provided the specification is consistent. Operationally, this allows for the execution of such specifications by means of term rewriting, provided that some (technical) constraints are met. The model-theoretic semantics of algebraic specifications centers around the notion of initial algebras, which gives us the preferred model of a specification. To characterize the behavior of objects (that may modify their state) in an algebraic way, we need to extend the basic framework of initial algebra models either by allowing so-called multiple world semantics or by making a distinction between hidden and observable sorts (resulting in the notion of an object as an abstract machine). As a remark, in our treatment we obviously cannot avoid the use of some logico-mathematical formalism. If needed, the concepts introduced will be explained on the fly. Where this does not suffice, the interested reader is referred to any standard textbook on mathematical logic for further details.

Signatures -- generators and observers

Abstract data types may be considered as modules specifying the values and functions belonging to the type. In [Dahl92], a type T is characterized as a tuple specifying the set of elements constituting the type T and the collection of functions related to the type T. Since constants may be regarded as zero-ary functions (having no arguments), we will speak of a signature Σ or Σ_T defining a particular type T. Also, in accord with common parlance, we will speak of the sorts s ∈ Σ, which are the sorts (or types) occurring in the declaration of the functions in Σ. See slide 8-signature.
slide: Algebraic specification

A signature specifies the names and (function) profiles of the constants and functions of a data type. In general, the profile of a function is specified as f : s1 × … × sn → s. When n = 0 the function f may be regarded as a constant. More generally, when s1, …, sn are all unrelated to the type T being defined, we may regard f as a relative constant. Relative constants are values that are assumed to be defined in the context where the specification is being employed.

The functions related to a data type T may be discriminated according to their role in defining T. We distinguish between producers g ∈ P_T, which deliver a result of type T, and observers f ∈ O_T, which have T as their argument type and deliver a result of a type different from T. In other words, producer functions define how elements of T may be constructed. (In the literature one often speaks of constructors, but we avoid this term because it already has a precisely defined meaning in the object-oriented programming language C++.) In contrast, observer functions do not produce values of T, but give instead information on some particular aspect of T. The signature Σ_T of a type T is uniquely defined by the union of the producer functions P_T and the observer functions O_T. Constants of type T are regarded as a subset of the producer functions P_T. The two collections are disjoint, that is, P_T ∩ O_T = ∅.

[…] the generator universe GU_Bool consists only of the values t and f. As another example, consider the data type Nat (representing the natural numbers) with generator basis G_Nat = { 0, S }, consisting of the constant 0 and the function S : Nat → Nat (that delivers the successor of its argument). The terms that may be constructed by G_Nat form the generator universe GU_Nat = { 0, S 0, SS 0, … }, which uniquely corresponds to the natural numbers { 0, 1, 2, … }. (More precisely, the natural numbers are isomorphic with GU_Nat.)

[…] Set_A result in a universe that contains terms such as add(∅, a) and add(add(∅, a), a) […]

[…] Σ defining T with equations defining the (behavioral) properties of T, we will look at another example illustrating how the choice of a generator basis may affect the structure of the value domain of a data type. In the example presented in slide […], the profiles are given of the functions that may occur in the signature specifying sequences. (The notation _ […] G') or many-to-one (as for G''). Since we require our specification to be first-order and finite, infinite generator bases (such as G''') […]

[…] GU/=, that is, GU factored with respect to equivalence, may be regarded as the abstract elements constituting the type T, and from each subset we may choose a concrete element acting as a representative for the subset which is the equivalence class of the element.

Operationally, equations may be regarded as rewrite rules (oriented from left to right) that allow us to transform a term in which a term t1 occurs into a term in which t1 is replaced by t2, if t1 = t2. […] S(y) […] S^n 0 (where n corresponds to the magnitude of the natural number denoted by the term). The opportunity of symbolic evaluation by term rewriting is exactly what has made the algebraic approach so popular for the specification of software, since it allows (under some restrictions) for executable specifications.

Since they do not reappear in what may be considered the normal forms of terms denoting the naturals (that are obtained by applying the evaluations induced by the equality theory), the functions plus and mul may be regarded as secondary producers. They are not part of the generator basis of the type Nat. Since we may consider mul and plus as secondary producers at best, we can easily see that when we define mul and plus for the cases 0 and S x, for arbitrary x, we have covered all possible (generator) cases. Technically, this allows us to prove properties of these functions by using structural induction on the possible generator cases. The proof obligation (in the case of the naturals) then is to prove that the property holds for the function applied to 0 and that, assuming that the property holds for applying the function to x, it also holds for S x.

As our next example, consider the algebraic specification of the type Set_A […]
An algebra is a structure 𝒜 = (A, Σ) where A is a set of values, and Σ specifies the signature of the functions operating on A. A multi-sorted algebra is a structure 𝒜 = ( {A_s}_{s ∈ S}, Σ ) where S is a set of sort names and A_s the set of values belonging to the sort s. The set S may be ordered (in which case the ordering indicates the subtyping relationships between the sorts). We call the (multi-sorted) structure 𝒜 a Σ-algebra.
  
  • Σ-algebra - 𝒜 = ( {A_s}_{s ∈ S}, Σ )

  • interpretation - eval : T_Σ → 𝒜

  • adequacy - 𝒜 ⊨ t1 = t2 ⟺ E ⊢ t1 = t2


slide: Interpretations and models

Having a notion of algebras, we need to have a way in which to relate an algebraic specification to such a structure. To this end we define an interpretation eval : T_Σ → 𝒜 which maps closed terms formed by following the rules given in the specification to elements of the structure 𝒜. We may extend the interpretation eval to include variables as well (which we write as eval : T_Σ(X) → 𝒜), but then we also need to assume that an assignment θ is given, such that when applying θ to a term t the result is free of variables, otherwise no interpretation in 𝒜 exists. See slide 8-algebra.
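As an illustration of an interpretation, the sketch below evaluates Bool terms in a structure whose carrier is the pair of Python truth values. The dictionary `BOOL_ALGEBRA`, the function `eval_term` and the convention that plain strings denote variables are all assumptions made for this example. The assignment θ is modelled as a dictionary from variable names to closed terms.

```python
# A hypothetical algebra for the Bool signature: one Python function per
# function symbol, operating on the carrier {True, False}.
BOOL_ALGEBRA = {
    "true": lambda: True,
    "false": lambda: False,
    "not": lambda x: not x,
    "and": lambda x, y: x and y,
    "or": lambda x, y: x or y,
}

def eval_term(term, algebra, theta=None):
    """Interpret a term in `algebra`; strings are variables resolved via theta."""
    theta = theta or {}
    if isinstance(term, str):                        # a variable
        if term not in theta:
            raise ValueError(f"no assignment for variable {term!r}")
        return eval_term(theta[term], algebra)       # theta must yield closed terms
    op, *args = term
    return algebra[op](*(eval_term(a, algebra, theta) for a in args))

# and(x, not(false)) under the assignment x := true
print(eval_term(("and", "x", ("not", ("false",))), BOOL_ALGEBRA, {"x": ("true",)}))  # prints True
```

A term with a variable that θ does not cover raises an error, mirroring the remark that no interpretation exists unless the assignment makes the term variable-free.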

Interpretations

As an example, consider the interpretations of the specification of Bool and the specification of Nat, given in slide 8-B-N.
  
  • ℬ = ( {tt, ff}, {¬, ∧, ∨} )

  • eval_ℬ : T_Bool → ℬ = {or → ∨, and → ∧, not → ¬}

  • 𝒩 = ( ℕ, {++, +, ∗} )

  • eval_𝒩 : T_Nat → 𝒩 = {S → ++, mul → ∗, plus → +}


slide: Interpretations of Bool and Nat

The structure ℬ given above is simply a boolean algebra, with the operators ¬, ∧ and ∨. The functions not, and and or naturally map to their semantic counterparts. In addition, we assume that the constants true and false map to the elements tt and ff. As another example, look at the structure 𝒩 and the interpretation eval_𝒩, which maps the functions S, mul and plus specified in Nat in a natural way. However, since we have also given equations for Nat (specifying how to eliminate the functions mul and plus) we must take precautions such that the requirement

     
  
  
   
   𝒩 ⊨ eval_𝒩(t1) =_𝒩 eval_𝒩(t2) ⟺ E_Nat ⊢ t1 = t2
  
is satisfied if the structure 𝒩 is to count as an adequate model of Nat. The requirement above states that whenever equality holds for two interpreted terms (in 𝒩) then these terms must also be provably equal (by using the equations given in the specification of Nat), and vice versa. As we will see illustrated later, many models may exist for a single specification, all satisfying the requirement of adequacy. The question is: do we have a means to select one of these models as (in a certain sense) the best model? The answer is yes. These are the models called initial models.
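The adequacy requirement can be spot-checked on a handful of terms. The sketch below is only an illustration under stated assumptions: it assumes the usual equations for Nat, namely plus(x, 0) = x, plus(x, S y) = S(plus(x, y)), mul(x, 0) = 0 and mul(x, S y) = plus(mul(x, y), x), and it uses equality of normal forms as a stand-in for provable equality. The names `normal_form` and `eval_nat` are hypothetical.

```python
from itertools import product

ZERO = ("0",)

def S(t):
    return ("S", t)

def normal_form(t):
    """Rewrite a closed Nat term to the form S^n 0 (assumed equations, see above)."""
    op, *args = t
    args = [normal_form(a) for a in args]
    if op == "plus":
        x, y = args
        if y == ZERO:
            return x                                        # plus(x, 0) = x
        return S(normal_form(("plus", x, y[1])))            # plus(x, S y) = S(plus(x, y))
    if op == "mul":
        x, y = args
        if y == ZERO:
            return ZERO                                     # mul(x, 0) = 0
        return normal_form(("plus", ("mul", x, y[1]), x))   # mul(x, S y) = plus(mul(x, y), x)
    return (op, *args)                                      # 0 or S(t)

def eval_nat(t):
    """Interpret a closed Nat term in the structure with carrier 0, 1, 2, ..."""
    op, *args = t
    vals = [eval_nat(a) for a in args]
    if op == "0":
        return 0
    if op == "S":
        return vals[0] + 1
    return vals[0] + vals[1] if op == "plus" else vals[0] * vals[1]

two, three = S(S(ZERO)), S(S(S(ZERO)))
terms = [ZERO, two, three, ("plus", two, three), ("mul", two, three), ("mul", three, two)]
for t1, t2 in product(terms, repeat=2):
    # semantic equality and equality of normal forms must agree on every pair
    assert (eval_nat(t1) == eval_nat(t2)) == (normal_form(t1) == normal_form(t2))
```

A finite check of this kind proves nothing in general; it merely shows what the two sides of the requirement look like when both can be computed.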

Initial models

A model (in a mathematical sense) represents the meaning of a specification in a precise way. A model may be regarded as stating a commitment with respect to the interpretation of the specification. An initial model is intuitively the least committing model, least committing in the sense that it imposes only identifications made necessary by the equational theory of a specification. Technically, an initial model is a model from which every other model can be derived by an algebraic mapping which is a homomorphism.
  
  • Σ_E-algebra - ℳ = ( T_Σ / ∼, Σ / ∼ )

  • no junk - ∀ a : T_Σ / ∼ ∃ t • eval_ℳ(t) = a

  • no confusion - ℳ ⊨ t1 = t2 ⟺ E ⊢ t1 = t2


slide: Initial models

The starting point for the construction of an initial model for a given specification with signature Σ is to construct a term algebra T_Σ with the terms that may be generated from the signature Σ as elements. The next step is then to factor the universe of generated terms into equivalence classes, such that two terms belong to the same class if they can be proven equivalent with respect to the equational theory of the specification. We will denote the representative of the equivalence class to which a term t belongs by [t]. Hence [t1] = [t2] (in the model) iff t1 = t2 can be proven. So assume that we have constructed a structure ℳ = (T_Σ / ∼, Σ); then, finally, we must define an interpretation, say eval_ℳ : T_Σ → ℳ, that assigns closed terms to appropriate terms in the term model (namely the representatives of the equivalence class of that term). Hence, the interpretation of a function f in the structure ℳ is such that

  
  
  
   
   f_ℳ([t1],…,[tn]) = [ f(t1,…,tn) ]
  
where f_ℳ is the interpretation of f in ℳ. In other words, the result of applying f to terms t1,…,tn belongs to the same equivalence class as the result of applying f_ℳ to the representatives of the equivalence classes of t1,…,tn. See slide 8-initial.

An initial algebra model has two important properties, known respectively as the no junk and no confusion properties. The no junk property states that for each element of the model there is some term for which the interpretation in ℳ is equal to that element. (For the model ℳ this is simply a representative of the equivalence class corresponding with the element.) The no confusion property states that if equality of two terms can be proven in the equational theory of the specification, then the equality also holds (semantically) in the model, and vice versa. The no confusion property means, in other words, that sufficiently many identifications are made (namely those that may be proven to hold), but no more than that (that is, no other than those for which a proof exists). The latter property is why we may speak of an initial model as the least committing model; it simply gives no more meaning than is strictly needed.

The initial model constructed from the term algebra of a signature Σ is intuitively a very natural model since it corresponds directly with (a subset of) the generator universe of Σ. Given such a model, other models may be derived from it simply by specifying an appropriate interpretation. For example, when we construct a model for the natural numbers (as specified by Nat) consisting of the generator universe {0, S 0, SS 0, …} and the operators {++, +, ∗} (which are defined as S^n ++ = S^(n+1), S^n + S^m = S^(n+m) and S^n ∗ S^m = S^(n∗m)) we may simply derive from this model the structure ({0,1,2,…}, {++, +, ∗}) for which the operations have their standard arithmetical meaning.
Actually, this structure is also an initial model for Nat, since we may also make the inverse transformation.
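The transformation between the two models of Nat can be sketched as a pair of mutually inverse mappings that respect the operators. In the code below, `to_int` plays the role of the mapping from the generator universe to the numbers and `from_int` that of its inverse. These names, the tuple encoding of S^n 0, and the structural definitions of `plus_gu` and `mul_gu` are assumptions made for this illustration.

```python
ZERO = ("0",)

def S(t):
    return ("S", t)

def to_int(t):
    """Map the term S^n 0 to the number n."""
    n = 0
    while t != ZERO:
        t = t[1]           # strip one application of S
        n += 1
    return n

def from_int(n):
    """Inverse mapping: build the term S^n 0 from the number n."""
    t = ZERO
    for _ in range(n):
        t = S(t)
    return t

def plus_gu(a, b):
    """Addition on the generator universe, by recursion on the second argument."""
    return a if b == ZERO else S(plus_gu(a, b[1]))

def mul_gu(a, b):
    """Multiplication on the generator universe, by recursion on the second argument."""
    return ZERO if b == ZERO else plus_gu(mul_gu(a, b[1]), a)

for n in range(6):
    a = from_int(n)
    assert to_int(a) == n and from_int(to_int(a)) == a      # the mappings are inverses
    assert to_int(S(a)) == n + 1                            # S corresponds to ++
    for m in range(6):
        b = from_int(m)
        assert to_int(plus_gu(a, b)) == n + m               # + is preserved
        assert to_int(mul_gu(a, b)) == n * m                # * is preserved
```

Because the mapping preserves the operators in both directions, neither structure carries more information than the other, which is the sense in which both count as initial.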
More generally, when defining an initial model only the structural aspects (characterizing the behavior of the operators) are important, not the actual contents. Technically, this means that initial models are defined up to isomorphism, that is a mapping to equivalent models with perhaps different contents but an identical structure. Not in all cases is a structure derived from an initial model itself also an initial model, as shown in the example below.
Consider the specification of Bool as given before. For this specification we have given the structure ℬ and the interpretation eval_ℬ