Computing is a relatively young discipline.
Despite its short history, a number of styles
and schools promoting a particular style have emerged.
However, in contrast to other disciplines
such as the fine arts (including architecture)
and musical composition, there is no
well-established tradition of what is to be considered
as good taste with respect to software design.
There is an ongoing and somewhat pointless debate as to
whether software design
must be regarded as an art or promoted
into a science. See, for example, [Knuth92] and [Gries].
The debate has certainly resulted in new technology
but has not, I am afraid, resulted in
universally valid design guidelines.
The notion of good design in the other
disciplines is usually implicitly defined by
a collection of examples of good design,
as preserved in museums or in histories of art and music.
For software design, we are still a long way from
anything like a museum, setting the standards of good design.
Nevertheless, a compendium of examples of object-oriented
applications such as [Pinson90] and [Harmon93],
if perhaps not setting the standards
for good design, may certainly be instructive.
The software engineering literature abounds
with advice and tools for measuring the quality of
a design.
In slide 3-design-criteria, a number of the criteria commonly found
in software engineering texts are listed.
In software design, we evidently
strive for a high level of abstraction
(as enabled by a notion of types and a corresponding
notion of contracts),
a modular structure with strongly cohesive units
(as supported by the class construct),
with units interrelated in a precisely
defined way (for instance by a client/server or subtype
relation).
Other desirable properties are
a high degree of information hiding (that is, narrow
yet complete interfaces)
and a low level of complexity (which may be achieved
with units that have only weak coupling, as supported
by the client/server model).
An impressive list, indeed.
Design is a human process, in which cognitive factors
play a critical role.
The role of cognitive factors is reflected
in the so-called fractal design process model
introduced in [JF88], which describes object-oriented
development as a triangle with bases labeled by
the phrases model, realize and refine.
This triangle may be iterated at each of the bases,
and so on.
The iterative view of software development
does justice to the importance of human understanding,
since it allows for a simultaneous understanding
of the problem domain and the mechanisms needed
to model the domain and the system architecture.
Good design involves taste.
My personal definition of good design would
certainly also involve cognitive factors
(is the design understandable?), including
subjective criteria such as whether the design
is pleasant to read or study.
In contrast to the arts, however,
software can be subjected to metrics
measuring the cohesiveness and complexity of the system.
In this section, we will look at a number of metrics
which may, if well-established and supported by
empirical evidence, be employed for
managing software development projects.
Also we will look at the Law of Demeter,
which is actually not a law but which may act as a guideline
for developing class interfaces.
And finally, we will have a look at some
guidelines for individual class design.
Metrics for object-oriented design
Object-oriented software development is a relatively
new technology,
still lacking empirical guidance and quantitative
methods to measure progress and productivity.
In [ChidK], a suite of metrics is proposed that
may aid in managing object-oriented software
development projects,
and, as the authors suggest, may be used also
to establish to what extent an object-oriented
approach has indeed been followed.
See slide 4-suite.
In general, quantitative measures of the size and
complexity of software may aid in project planning
and project evaluation, and may be instrumental
in establishing the productivity of tools
and techniques and in estimating the cost
of both development and maintenance of a system.
The metrics proposed in [ChidK] pertain to three distinct
elements of object-oriented design,
namely the definition of objects and their relation to
other objects,
the attributes and/or properties of objects,
and the (potential) communication between objects.
The authors motivate their proposal by remarking that
existing metrics do not do justice to the
notions of classes, inheritance, encapsulation and
message passing,
since they were developed primarily
from a function-oriented view,
separating data and procedures.
Definitions
To perform measurements on a program or design,
we need to be able to describe the structure of a program
or design in language-independent terms.
As indicated below, the identifiers x, y and z
will be used to name objects.
Occasionally, we will use the term $\tau(x)$ to
refer to the class of which object x is an instance.
The term $V(x)$ will be used to refer to
the set of instance variables of the object x,
and likewise $M(x)$ will be used to refer to
the set of methods that exist for x.
Combined, the instance variables and methods
of an object x are regarded as the properties of x.
See slide 4-definitions.
An important property of an instance variable is
whether it is read or written by a method.
The set of instance variables read or written
by a particular method m will be referred to
by the term $V(m)$.
Likewise, the set of methods that either read
or write a particular instance variable v is referred
to by the term $M(v)$.
A number of metrics are defined by taking the cardinality
of some set.
The cardinality of a set is simply the number
of elements it contains.
To refer to the cardinality of a set S,
the notation $|S|$ will be used.
In addition, we need predicates to
characterize the inheritance structure of a program or
design.
The term $root(x)$ will be used to refer to
the root of the inheritance hierarchy of which x
is a member.
The term $children(x)$ will be used to refer to
the set of classes of which x is a direct
ancestor,
and the term $dist(x,y)$ will be used to indicate
the distance between x and y in
the inheritance hierarchy.
The distance will be one if y is an immediate descendant of x
and undefined if x and y are not related by inheritance.
To describe the potential communication between objects
the term $uses(x,y)$ will be used to state that
object x calls some method of y.
The term $calls(x,m)$ is used to specify more
precisely that x calls the method m of y.
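For concreteness, these language-independent terms can be rendered directly as data. The following Python sketch is our own illustration (none of its identifiers come from the proposals discussed here); it represents an object by its class, its set of instance variables, its set of methods and its direct ancestor:

```python
class ObjectDescription:
    """Describes one object x: its class, instance variables and methods."""
    def __init__(self, cls, variables, methods, parent=None):
        self.cls = cls                    # the class of which x is an instance
        self.variables = set(variables)   # the set of instance variables of x
        self.methods = set(methods)       # the set of methods that exist for x
        self.parent = parent              # direct ancestor, None for a root

    def properties(self):
        # the properties of x: instance variables and methods combined
        return self.variables | self.methods

def cardinality(s):
    # the cardinality of a set: the number of elements it contains
    return len(s)

def root(x):
    # the root of the inheritance hierarchy of which x is a member
    while x.parent is not None:
        x = x.parent
    return x
```

Each metric below can then be phrased as the cardinality of some set derived from such descriptions.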
Evaluation criteria
Before discussing the individual metrics,
we need to know by what criteria we may establish
that a proposed metric is a valid instrument for
measuring properties of a program.
One means of validating a metric is gathering
empirical evidence to determine whether a metric
has predictive value, for instance with respect
to the cost of maintaining software.
Lacking empirical evidence, [ChidK]
establish the validity of their metrics with
reference to criteria adapted from [Weyuker].
See slide 4-evaluation.
The criteria proposed by [Weyuker] concern
well-known complexity
measures such as cyclomatic number,
programming effort,
statement count and
data flow complexity.
Evaluation criteria
(i) discriminating power -- $\exists x \exists y \cdot \mu(x) \neq \mu(y)$
(ii) non-uniqueness -- $\exists x \exists y \cdot \mu(x) = \mu(y)$
(iii) permutation -- $\exists x \exists y \cdot$ y is a permutation of x $\wedge\ \mu(x) \neq \mu(y)$
(iv) implementation -- $\exists x \exists y \cdot x \equiv y \wedge \mu(x) \neq \mu(y)$
(v) monotonicity -- $\forall x \forall y \cdot \mu(x) \leq \mu(x+y) \wedge \mu(y) \leq \mu(x+y)$
(vi) interaction -- $\exists x \exists y \exists z \cdot \mu(x) = \mu(y) \wedge \mu(x+z) \neq \mu(y+z)$
(vii) combination -- $\exists x \exists y \cdot \mu(x+y) > \mu(x) + \mu(y)$
where $\mu$ stands for the metric under consideration and $x+y$ for the combination of x and y
As a first criterion (i), it may be required that a metric
has discriminating power,
which means that there
are at least two
objects which give a different result.
Another criterion (ii) is that the metric in question
imposes some notion of equivalence, meaning
that two distinct objects may deliver the same result
for that particular metric.
As a third criterion (iii), one may require that
a permutation (that is a different ordering of the elements)
of an object gives a different result.
None of the proposed metrics, however, satisfy this criterion.
This may not be very surprising,
considering that the method interface of an object
embodies what [Meyer88] calls a shopping list,
which means that it contains all the services needed
in an intrinsically unordered fashion.
The next criterion (iv) is that the actual
implementation is of importance for the outcome of
the metric.
In other words, even though two objects perform
the same function, the details of the implementation
matter when determining the complexity of a system.
Another property that a metric must satisfy (v)
is the property of monotonicity,
which implies that a single object is
always less complex than when it is in some way combined
with another object.
This seems to be a reasonable requirement,
however for objects located in distinct branches
of the inheritance graph this need not always
be the case. \nop{See section DIN.}
Another requirement that may be imposed on a metric (vi)
is that it shows that two equivalent objects
may behave differently when placed in a particular
context.
This requirement is not satisfied by one of the metrics (RFC),
which may be an indication that the metric must
be refined. \nop{See section RFC.}
Finally, the last property (vii)
requires that a metric must reflect that
decomposition may reduce the complexity of
design.
Interestingly, none of the proposed metrics satisfies
this requirement.
According to [ChidK],
this raises the issue "that complexity
could increase, not reduce as a design is broken into more
objects".
To conclude, evidently more research, including
empirical validation, is required before adopting
any metric as a reliable measure for the complexity
of a design.
Nevertheless, the metrics discussed below
provide an invaluable starting point for such an effort.
In the following sections, the individual metrics
(WMC, DIN, NOC, CBO, RFC, LCO) will be characterized.
For each metric, a formal definition will be given, and
the notions underlying the definition characterized.
Further, for each metric we will look at its
implications for the practice of software development
and establish (or disprove) the properties
related to the evaluation criteria discussed previously.
Weighted methods per class
The first metric we look at provides a measure
for the complexity of a single object.
The assumption underlying the metric is that
both the number of methods as well as the
complexity of each method (expressed by its weight)
determines the total complexity of the object.
See slide 4-WMC.
Weighted methods per class
WMC
Measure -- complexity of an object:
$WMC = \sum_{i=1}^{n} c_i$,
where $c_1, \ldots, c_n$ are the complexities (weights) of the n methods of the class
Viewpoint --
the number of methods and the complexity of methods is
an indicator of how much time and effort is required
to develop and maintain objects
The WMC measure pertains to the definition of an object.
From a software engineering perspective,
we may regard the measure as an indicator of how much time
and effort is required to develop and maintain
the object (class).
In general, objects having many (complex) methods
are not likely to be reusable, but must be assumed
to be tied to a particular application.
To illustrate that property (vii) indeed does
not hold for this metric,
consider objects x and y with respectively $n_x$ and $n_y$
methods, taking for simplicity all weights equal to one.
Assume that x and y have $k \geq 0$ methods in common.
Then $WMC(x+y) = n_x + n_y - k$, and hence
$WMC(x+y) \leq WMC(x) + WMC(y)$,
where $x+y$ denotes the combination of objects x
and y.
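Taking all weights equal to one, the measure and the argument above can be illustrated by a small Python sketch (the method names and weights are invented for illustration):

```python
def wmc(weights):
    """Weighted Methods per Class: the sum of the complexity
    weights of the methods of a class (weight 1 per method
    reduces this to a simple method count)."""
    return sum(weights.values())

def combine(x, y):
    """Combine two classes: methods they share occur only once
    (assuming a shared method carries the same weight in both)."""
    merged = dict(x)
    merged.update(y)
    return merged
```

With k shared methods the combination loses k terms from the sum, so the combination is never more complex than the sum of its parts, as claimed.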
Depth of inheritance
The second metric (DIN) is a measure for the depth of the (class of the)
object in the inheritance hierarchy.
The measure is directly related to the scope of properties,
since it indicates the number of classes from which
the class inherits its functionality.
For design, the greater the depth of the class in the inheritance hierarchy
the greater will be its expected complexity,
since apart from the methods defined for the class itself
the methods inherited from classes higher in the hierarchy are also involved.
The metric may also be used as an indication for reuse,
that is reuse by inheritance.
See slide 4-DIN.
Depth of inheritance
DIN
Measure -- scope of properties
Viewpoint --
the deeper a class is in the hierarchy, the greater the number
of methods that is likely to be inherited, making the object more complex
Satisfaction of criteria (i), (ii) and (iv) is easily established.
With respect to property (v), the monotonicity property, three cases
must be distinguished.
Recall that the property states that for any objects x and y it holds
that $DIN(x) \leq DIN(x+y)$, where $x+y$ denotes the combination of x and y.
Now assume that y is a child of x and $DIN(x) = n$,
then $DIN(y) = n+1$.
But combining x and y will give $DIN(x+y) = n$
and $n < n+1 = DIN(y)$, hence property (v) is not satisfied.
When x and y are siblings, then $DIN(x+y) = DIN(x) = DIN(y)$,
hence property (v) is satisfied.
Finally, assume that x and y are not directly connected by inheritance
and x and y are not siblings.
Now if x and y are collapsed to the class lowest in the hierarchy,
property (v) is satisfied.
However, this need not be the case.
Just imagine that a class deep in one hierarchy is collapsed with
the root of another, the combination taking the position of that root.
Then, obviously, the monotonicity property is not satisfied.
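The DIN measure itself amounts to a walk up the inheritance hierarchy, which can be sketched in Python as follows (the parent map and class names are invented for illustration):

```python
def din(cls, parent):
    """Depth of Inheritance: the number of ancestors of cls,
    obtained by following the parent map up to the root
    (the root itself has depth 0)."""
    depth = 0
    while parent.get(cls) is not None:
        cls = parent[cls]
        depth += 1
    return depth
```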
Number of children
The third metric (NOC) gives the number of immediate subclasses of x in
the class hierarchy.
Like the previous metric, it is related to the scope of properties.
It is also a measure of reuse, since it indicates how many
subclasses inherit the methods of x.
According to [ChidK], it is generally better to have depth than breadth
in the class hierarchy, since depth promotes reuse
through inheritance.
Anyway, the number of descendants may be an indication
of the influence of the class on the design.
Consequently, a class scoring high on this metric may require
more extensive testing.
See slide 4-NOC.
Number of children
NOC
Measure
-- scope of properties
Viewpoint --
generally, it is better to have depth than breadth in the class
hierarchy, since it promotes the reuse of methods through
inheritance
The reader is invited to check that properties (i), (ii), (iv) and (v)
are satisfied.
Recall that property (vi) states that for some objects y and z
with $NOC(y) = NOC(z)$, an object x might behave differently when combined
with y than with z, that is $NOC(x+y) \neq NOC(x+z)$.
Assume that y and z both have n children,
that is $NOC(y) = NOC(z) = n$, and let x be a child of y,
and assume that x has r children.
Then combining x and y will result in a class
with $n+r-1$ children,
whereas combining x and z will result in a class
with $n+r$ children,
which means that $NOC(x+y) \neq NOC(x+z)$ and hence that property (vi)
is satisfied.
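The NOC measure admits an equally small sketch over the same kind of parent map (Python; the class names are again invented):

```python
def noc(cls, parent):
    """Number Of Children: the number of classes whose
    direct ancestor is cls."""
    return sum(1 for p in parent.values() if p == cls)
```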
Coupling between objects
The next metric (CBO) measures non-inheritance related connections
with other classes.
It is based on the notion that two objects are related if either one
acts on the other,
and as such is a measure of coupling,
that is the degree of interdependence between objects.
As phrased in [ChidK], {\em excessive coupling between objects
outside of the inheritance hierarchy is detrimental to modular design
and prevents reuse}.
In other words, objects with a low degree of interdependence
are generally more easily reused.
Note that coupling, as expressed by the metric, is not transitive,
that is, if x uses y and y uses z, then it is not necessarily the case
that x also uses z.
In fact, a famous style guideline discussed in section demeter
is based on the intuition underlying this metric.
A high degree of coupling may indicate that testing the object may require
a lot of effort,
since other parts of the design are likely to be involved as well.
As a general rule, a low degree of inter-object coupling
should be strived for. \nop{whenever possible}
See slide 4-CBO.
Coupling between objects
CBO
Measure
-- degree of dependence
Viewpoint --
excessive coupling between objects outside of the inheritance
hierarchy is detrimental to modular design and prevents reuse
Establishing properties (i), (ii), (iv), (v) and (vi) is left to
the (diligent) reader.
However, we will prove property (vii) to be invalid.
Recall that property (vii) states that
there exist objects x and y for which
$CBO(x+y) > CBO(x) + CBO(y)$,
meaning that for those objects the complexity of
x combined with y is higher than the total complexity
of x and y in isolation.
Just pick arbitrary objects x and y, and assume that x and y have
$k \geq 0$ couplings in common, for example both use an object z.
Now $CBO(x+y) = CBO(x) + CBO(y) - k$,
and hence $CBO(x+y) \leq CBO(x) + CBO(y)$, contradicting property (vii).
The inequality is strict when $k > 0$.
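Both the measure and the argument above can be illustrated by a small sketch (Python; the coupled class names are invented):

```python
def cbo(couplings):
    """Coupling Between Objects: the number of distinct other
    classes a class uses or is used by (inheritance excluded)."""
    return len(set(couplings))
```

Since shared couplings are counted only once in the combination, the combined coupling cannot exceed the sum of the individual couplings.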
Response for a class
Our fifth metric (RFC) is based on the notion of response set.
The response set of an object may be characterized as
the set of methods it has available,
consisting of the methods of its class and the methods of other
objects that may be invoked by any of its own methods.
This metric may be regarded as a measure of the communication
that may occur between the object and other objects.
If primarily (potential) extraneous method invocations are responsible
for the size of the response set,
it may be expected that testing the object will be difficult
and will require a lot of knowledge of other parts of the design.
See slide 4-RFC.
Response for a class
RFC
Measure
-- complexity of communication
Viewpoint --
if a large number of methods can be invoked in response
to a message, the testing and debugging of the object
becomes more complex
Establishing properties (i) and (iii) is left to the reader.
To establish property (iv), stating that not only function but also
implementation is important, it suffices to see that the
actual implementation determines which and how many (extraneous)
methods will be called.
Property (v), monotonicity, follows from the observation
that for any object y the response set of x is contained
in the response set of the combination $x+y$,
and hence $RFC(x) \leq RFC(x+y)$.
According to [ChidK],
property (vi) is not satisfied.
To disprove property (vi) it must be shown that, given an
object x and an object y for which $RFC(x) = RFC(y)$,
there is no object z that provides a context discriminating
between x and y, in other words for which $RFC(x+z) \neq RFC(y+z)$.
The proof given in [ChidK] relies on the assumption
that $RFC(x+z) = RFC(x) + RFC(z)$, whereas, since response
sets may overlap, one would
expect $RFC(x+z) \leq RFC(x) + RFC(z)$.
However, assuming the latter, property (vi) indeed holds.
Property (vii), nevertheless, may again be proven to be invalid.
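The response set, and with it the RFC measure, can be sketched as follows (Python; the method names are invented, and the calls map records which methods each method may invoke):

```python
def rfc(own_methods, calls):
    """Response For a Class: the size of the response set,
    i.e. the class's own methods together with all methods
    any of them may invoke."""
    response_set = set(own_methods)
    for m in own_methods:
        response_set |= set(calls.get(m, ()))
    return len(response_set)
```

Because the response set is a set, overlapping invocations are counted once, which is exactly why one would expect $RFC(x+z) \leq RFC(x) + RFC(z)$ rather than equality.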
Lack of cohesion
The last metric (LCO) we will look at is based on the
notion of degree of similarity of methods.
If methods have no instance variables in common, their degree
of similarity is zero.
A low degree of similarity may indicate a lack of cohesion.
As a measure for the lack of cohesion the number of disjoint
sets partitioning the instance variables is taken.
Cohesiveness of methods within a class is desirable, since
it promotes encapsulation of objects.
For design, lack of cohesion may indicate that
the class is better split up into two or more distinct classes.
See slide 4-LCO.
Lack of cohesion
LCO
Measure -- degree of similarity between methods:
$LCO =$ the number of disjoint sets partitioning the instance variables,
where variables accessed by the same method belong to the same set
Viewpoint --
cohesiveness of methods within a class is desirable
since it promotes the encapsulation of objects
Establishing properties (i), (ii) and (iv) is left to the reader.
To establish the monotonicity property (v), that is
$LCO(x) \leq LCO(x+y)$ for arbitrary y, consider that combining objects
may actually reduce the number of different sets,
that is $LCO(x+y) = LCO(x) + LCO(y) - k$ for some $k \geq 0$.
The reduction k, however, cannot be greater than the
number of original sets, hence $k \leq LCO(x)$ and $k \leq LCO(y)$.
Therefore, $LCO(x+y) \geq LCO(y)$ and $LCO(x+y) \geq LCO(x)$,
establishing property (v).
To establish property (vi), the interaction property,
assume $LCO(x) = LCO(y) = n$ for some object y and let z
be another object with $LCO(z) = m$.
Now, $LCO(x+z) = n + m - k_x$ and $LCO(y+z) = n + m - k_y$,
where $k_x$ and $k_y$ are the reductions for, respectively, $x+z$ and $y+z$.
Since neither $k_x$ nor $k_y$ is dependent on n,
they need not be equal,
hence in general $LCO(x+z) \neq LCO(y+z)$,
establishing property (vi).
To disprove property (vii),
consider that $LCO(x+y) = LCO(x) + LCO(y) - k$
for some $k \geq 0$, and hence $LCO(x+y) \leq LCO(x) + LCO(y)$.
The violation of property (vii) seems to indicate that
it may indeed be better sometimes to
have a single non-cohesive object
than multiple cohesive ones, implementing the same functionality.
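The LCO measure can be sketched by counting the disjoint sets of instance variables with a small union-find (Python; the method and variable names are invented for illustration):

```python
def lco(uses):
    """Lack of Cohesion: the number of disjoint sets that partition
    the instance variables, where variables accessed by the same
    method fall into the same set (uses maps each method to the
    variables it reads or writes)."""
    parent = {}

    def find(v):
        # find the representative of v's set, with path halving
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    variables = {v for vs in uses.values() for v in vs}
    for v in variables:
        parent[v] = v
    for vs in uses.values():
        vs = list(vs)
        # merge all variables used by the same method into one set
        for a, b in zip(vs, vs[1:]):
            parent[find(a)] = find(b)
    return len({find(v) for v in variables})
```

A result greater than one may indicate that the class is better split up along the disjoint sets.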
An objective sense of style
The metrics discussed in the previous section clearly
suggest principles for the design of object-oriented systems,
but do not lead
immediately to explicit guidelines for design.
In contrast, [LH89] present such guidelines,
but they are less explicit in their formal approach.
The guidelines they presented were
among the first, and they still provide good advice
with respect to designing class interfaces.
Client
-- A method m is a client of C if m calls a method of C
Supplier
-- If m is a client of C then C is a supplier of m
Acquaintance
-- C is an acquaintance of m if C is a supplier of m
but not (the type of) an argument of m or (of) an instance variable
of the object of m
C is a preferred acquaintance of m if an object of C is created in
m or C is the type of a global variable
C is a preferred supplier of m if C is a supplier and C
is (the type of) an instance variable, an argument or a preferred acquaintance
In slide 4-good, an explicit definition of the dual notions
of client and supplier has been given.
It is important to note that not all
of the potential suppliers for a class may
be considered safe.
Potentially unsafe suppliers are distinguished
as acquaintances, of which those that are either
created during a method call or stored in a global
variable are to be preferred.
Although this may not be immediately obvious,
this excludes suppliers that are
accessed in some indirect way, for instance
as the result of a method call to
some safe supplier.
As an example of using an unsafe supplier,
consider the call
screen->cursor()->move();
which instructs the cursor associated with the screen
to move to its home position.
Although screen may be assumed to be a safe supplier,
the object delivered by the call screen->cursor() need
not necessarily be a safe supplier.
In contrast, the call
screen->move_cursor();
does not make use of an indirection
introducing a potentially unsafe supplier.
The guideline concerning the use of safe suppliers is known
as the Law of Demeter, of which the underlying
intuition is that the programmer should not be bothered
by knowledge that is not immediately apparent
from the program text (that is the class interface)
or founded in well-established conventions
(as in the case of using special global variables).
See slide 4-demeter.
Law of Demeter
-- ignorance is bliss
Do not refer to a class C in a method m unless C is (the type of)
1. an instance variable
2. an argument of m
3. an object created in m
4. a global variable
To remedy the use of unsafe suppliers,
two kinds of program transformation are suggested
by [LH89].
First, the structure of a class should be made invisible
to clients, to prohibit the use of a component
as (an unsafe) supplier.
This may require the lifting of primitive actions
to the encompassing object, in order to make these
primitives available to the client in a safe way.
Secondly, the client should not be given
the responsibility of performing
(a sequence of) low-level actions.
For example, moving the cursor should not
be the responsibility of the client of the screen,
but rather of the object representing the screen.
In principle, the client need not be
burdened with detailed knowledge
of the cursor class.
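The suggested transformation can be sketched as follows (a Python rendering of the C++-style fragments above; the lifted method follows the screen/cursor example, while the coordinate details are our own illustration):

```python
class Cursor:
    def __init__(self):
        self.x, self.y = 0, 0

    def move(self, x=0, y=0):
        # move to the given position; the home position by default
        self.x, self.y = x, y

class Screen:
    def __init__(self):
        self._cursor = Cursor()   # component hidden from clients

    def move_cursor(self, x=0, y=0):
        # lifted primitive: the client never touches the cursor object,
        # avoiding the unsafe indirection screen.cursor().move()
        self._cursor.move(x, y)
```

A client now writes screen.move_cursor(), conforming to the Law of Demeter, and need not know that a cursor class exists at all.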
The software engineering principles underlying
the Law of Demeter may be characterized
as representing a compositional approach,
since the law enforces the use of immediate
parts only.
As additional benefits, conformance
to the law results in hiding the component
structure of classes, reduces the coupling of control
and, moreover, promotes reuse by
enforcing the use of localized (type)
information.
Individual class design
\c{
We have nearly completed a first tour around
the various landmarks of object-oriented design.
Identifying objects, expressing the interaction
between objects by means of client/server contracts
and describing the collaboration between objects
in terms of behavioral compositions belong
to a craft that will only be learned in the
practice of developing real systems.
}
\nop{
We will conclude this chapter by looking at some
informal, pragmatic guidelines for individual class design.
}
\c{
A class should represent a faithful model of a single concept,
and be a reusable, plug-compatible component
that is robust, well-designed and extensible.
In slide 3-individual, we list a number of suggestions
put forward by [McGregor92].
}
\slide{3-individual}{Individual class design}{
Class design {\em -- guidelines}
only methods public -- information hiding
do not expose implementation details
public members available to all classes -- strong cohesion
as few dependencies as possible -- weak coupling
explicit information passing
root class should be abstract model -- abstraction
}
\c{
The first two guidelines enforce the principle of
information hiding,
advising that only methods be public and
all implementation details hidden.
The third guideline states a principle
of strong cohesion by requiring that
classes implement a single protocol
that is valid for all potential clients.
A principle of weak coupling is enforced by
requiring a class to have as few dependencies as possible,
and to employ explicit information passing
using messages instead of inheritance
(except when inheritance may be used in a type
consistent fashion).
When using inheritance, the root class should be
an abstract model of its derived classes,
whether inheritance is used to realize
a partial type or to define a specialization
in a conceptual hierarchy.
}
\nop{
The list given above can be used as a checklist
to verify whether a class is well-designed.
In section 2-metrics we will explore
metrics that capture the guidelines given in
a more quantitative manner.
Such metrics may be an aid in the software engineering
of object-oriented systems and may possibly
also be used to measure the productivity of
object-oriented programmers.
}
\c{
The properties of classes, including their interfaces
and relations with other classes, must be laid
down in the design document.
Ideally, the design document should present
a complete and formal description of the
structural, functional and dynamic aspects of the system,
including an argument showing that the various models
are consistent.
However, in practice this will seldom be realized,
partly because object-oriented design techniques
are not yet sufficiently mature to allow
a completely formal treatment, and partly because
most designers will be satisfied with a non-formal
rendering of the architecture of their system.
Admittedly, the task of designing is already
sufficiently complex, even without the additional
complexity of a completely formal treatment.
Nevertheless, studying the formal underpinnings
of object-oriented modeling based on types and polymorphism
is still worthwhile, since it will sharpen the
intuition with respect to the notion of behavioral
conformance and the refinement of contracts,
which are both essential for
developing reliable object models.
And reliability is the key to reuse!
}