Metrics for object-oriented design

Object-oriented software development is a relatively new technology, still lacking empirical guidance and quantitative methods to measure progress and productivity. In [ChidK], a suite of metrics is proposed that may aid in managing object-oriented software development projects and that, as the authors suggest, may also be used to establish to what extent an object-oriented approach has indeed been followed. See slide 4-suite.

A metric suite

Object-oriented design

  • WMC -- weighted methods per class
  • DIN -- depth of inheritance
  • NOC -- number of children
  • CBO -- coupling between objects
  • RFC -- response for a class
  • LCO -- lack of cohesion

slide: A metric suite

In general, quantitative measures of the size and complexity of software may aid in project planning and project evaluation, and may be instrumental in establishing the productivity of tools and techniques and in estimating the cost of both development and maintenance of a system. The metrics proposed in  [ChidK] pertain to three distinct elements of object-oriented design, namely the definition of objects and their relation to other objects, the attributes and/or properties of objects, and the (potential) communication between objects. The authors motivate their proposal by remarking that existing metrics do no justice to the notions of classes, inheritance, encapsulation and message passing, since they were developed primarily from a function-oriented view, separating data and procedures.

Definitions

To perform measurements on a program or design, we need to be able to describe the structure of a program or design in language-independent terms. As indicated below, the identifiers x, y and z will be used to name objects. Occasionally, we will use the term class(x) to refer to the class of which object x is an instance. The term iv(x) will be used to refer to the set of instance variables of the object x, and likewise methods(x) will be used to refer to the set of methods that exist for x. Combined, the instance variables and methods of an object x are regarded as the properties of x. See slide 4-definitions.

Definitions

  • class(x) -- the class of which object x is an instance
  • iv(x) -- the set of instance variables of x
  • methods(x) -- the set of methods of x

Read/write properties

  • iv(m_x) -- the instance variables read or written by method m_x
  • methods(i_x) -- the methods that read or write instance variable i_x

Cardinality

  • | S | -- the number of elements of the set S

slide: Definitions

An important property of an instance variable is whether it is read or written by a method. The set of instance variables read or written by a particular method m_x will be referred to by the term iv(m_x). Likewise, the set of methods that either read or write a particular instance variable i_x is referred to by the term methods(i_x). A number of metrics are defined by taking the cardinality of some set, that is, the number of elements it contains. To refer to the cardinality of a set S, the notation | S | will be used.

In addition, we need predicates to characterize the inheritance structure of a program or design. The term root(x) will be used to refer to the root of the inheritance hierarchy of which class(x) is a member, the term descendants(x) to refer to the set of classes of which class(x) is a direct ancestor, and the term distance(x,y) to indicate the distance between class(x) and class(y) in the inheritance hierarchy. The distance will be one if class(y) is a direct descendant of class(x), and it is undefined if x and y are not related by inheritance. Finally, to describe the potential communication between objects, the term x uses y will be used to state that object x calls some method of y, and the term x calls m_y to specify more precisely that x calls the method m_y.
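
To make these definitions concrete, the following sketch (in Python, with hypothetical names such as Class and iv_of_method) shows one way to record the structural information the metrics rely on; it merely illustrates the terminology introduced above.

    from dataclasses import dataclass, field
    from typing import Optional

    @dataclass
    class Class:
        """Hypothetical record of the design information used by the metrics."""
        name: str
        parent: Optional["Class"] = None                  # inheritance link (None for a root)
        iv_of_method: dict = field(default_factory=dict)  # iv(m_x): method name -> instance variables read/written
        calls: set = field(default_factory=set)           # { m_y | x calls m_y }: methods of other objects

        def methods(self):                                # methods(x)
            return set(self.iv_of_method)

        def iv(self):                                     # iv(x)
            return set().union(*self.iv_of_method.values()) if self.iv_of_method else set()

        def methods_accessing(self, variable):            # methods(i_x)
            return {m for m, ivs in self.iv_of_method.items() if variable in ivs}

        def root(self):                                   # root(x)
            return self if self.parent is None else self.parent.root()

        def depth(self):                                  # distance(root(x), class(x))
            return 0 if self.parent is None else 1 + self.parent.depth()

    account = Class("Account", iv_of_method={"deposit": {"balance"}, "withdraw": {"balance"}})
    savings = Class("SavingsAccount", parent=account, iv_of_method={"add_interest": {"balance", "rate"}})
    print(savings.depth(), sorted(savings.iv()))          # 1 ['balance', 'rate']

The cardinality | S | of any of these sets is then simply the number of elements, len(S) in the sketch.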

Evaluation criteria

Before discussing the individual metrics, we need to know by what criteria we may establish that a proposed metric is a valid instrument for measuring properties of a program. One means of validating a metric is gathering empirical evidence to determine whether a metric has predictive value, for instance with respect to the cost of maintaining software. Lacking empirical evidence,  [ChidK] establish the validity of their metrics with reference to criteria adapted from  [Weyuker]. See slide 4-evaluation. The criteria proposed by  [Weyuker] concern well-known complexity measures such as cyclomatic number, programming effort, statement count and data flow complexity.

Evaluation criteria

  • discrimination (i) -- ∃x ∃y : μ(x) ≠ μ(y)
  • non-uniqueness (ii) -- ∃x ∃y : μ(x) = μ(y)
  • permutation (iii) -- ∃x ∃y : y is a permutation of x ∧ μ(x) ≠ μ(y)
  • implementation (iv) -- ∀x ∀y : fun(x) = fun(y) ⇏ μ(x) = μ(y)
  • monotonicity (v) -- ∀x ∀y : μ(x) ≤ μ(x + y) ∧ μ(y) ≤ μ(x + y)
  • interaction (vi) -- ∀x ∀y ∃z : μ(x) = μ(y) ∧ μ(x + z) ≠ μ(y + z)
  • combination (vii) -- ∃x ∃y : μ(x) + μ(y) < μ(x + y)
slide: Criteria for the evaluation of metrics

As a first criterion (i), it may be required that a metric has discriminating power, which means that there are at least two objects for which it yields different results. Another criterion (ii) is that the metric imposes some notion of equivalence, meaning that two distinct objects may yield the same result for that particular metric. As a third criterion (iii), one may require that a permutation (that is, a different ordering of the elements) of an object may give a different result. None of the proposed metrics, however, satisfies this criterion. This is not very surprising, considering that the method interface of an object embodies what [Meyer88] calls a shopping list, which means that it contains all the services needed in an intrinsically unordered fashion. The next criterion (iv) is that the actual implementation matters for the outcome of the metric: even though two objects perform the same function, the details of the implementation are relevant when determining the complexity of a system.

Another property that a metric must satisfy (v) is monotonicity, which implies that a single object is never more complex than any combination of that object with another object. This seems a reasonable requirement; for objects located in distinct branches of the inheritance graph, however, it need not always hold. A further requirement that may be imposed on a metric (vi) is that it allows two equivalent objects to behave differently when placed in a particular context. This requirement is not satisfied by one of the metrics (RFC), which may be an indication that the metric must be refined. Finally, the last property (vii) requires that a metric reflects that decomposition may reduce the complexity of a design. Interestingly, none of the proposed metrics satisfies this requirement. According to [ChidK], this raises the issue "that complexity could increase, not reduce as a design is broken into more objects".

To conclude, evidently more research, including empirical validation, is required before adopting any of these metrics as a reliable measure of the complexity of a design. Nevertheless, the metrics discussed below provide an invaluable starting point for such an effort. In the following sections, the individual metrics (WMC, DIN, NOC, CBO, RFC, LCO) will be characterized. For each metric, a formal definition will be given and the notions underlying it explained. Further, for each metric we will look at its implications for the practice of software development and establish (or disprove) the properties related to the evaluation criteria discussed previously.
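
Before turning to the individual metrics, the following small sketch may help make the criteria concrete. It checks criteria (i), (ii) and (vii) for a deliberately naive metric that merely counts methods; the metric and the example objects are hypothetical and serve only to make the quantifiers tangible.

    def mu(methods):
        """A deliberately naive metric: the number of methods of an object."""
        return len(methods)

    x = {"deposit", "withdraw"}      # hypothetical objects, represented by
    y = {"print_statement"}          # their sets of method names

    # (i) discrimination: some pair of objects yields different values
    assert mu(x) != mu(y)
    # (ii) non-uniqueness: some pair of distinct objects yields the same value
    assert mu({"deposit"}) == mu({"print_statement"})
    # (vii) fails for this metric: the combination x + y, modelled here as the
    # union of the method sets, never exceeds the sum of the parts
    assert mu(x | y) <= mu(x) + mu(y)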

Weighted methods per class

The first metric we look at provides a measure for the complexity of a single object. The assumption underlying the metric is that both the number of methods and the complexity of each method (expressed by its weight) determine the total complexity of the object. See slide 4-WMC.

Weighted methods per class

WMC

Measure

-- complexity of an object
  • WMC(x) = Σ_{m ∈ methods(x)} complexity(m)

Viewpoint --

the number of methods and the complexity of methods is an indicator of how much time and effort is required to develop and maintain objects
slide: Weighted methods per class

The WMC measure pertains to the definition of an object. From a software engineering perspective, we may regard the measure as an indicator of how much time and effort is required to develop and maintain the object (class). In general, objects having many (complex) methods are not likely to be reusable, but must be assumed to be tied to a particular application. To illustrate that property (vii) indeed does not hold for this metric, consider objects x and y with respectively n_x and n_y methods, and assume for simplicity that each method has unit complexity. Assume further that x and y have δ methods in common. Then μ(x + y) = n_x + n_y - δ ≤ n_x + n_y = μ(x) + μ(y), where x + y denotes the combination of the objects x and y.
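
A minimal sketch of the computation, assuming the per-method complexities are given (here as a hypothetical dictionary of weights; in the simplest variant every method counts as 1):

    def wmc(method_complexity):
        """Weighted Methods per Class: the sum of the complexities of all methods."""
        return sum(method_complexity.values())

    # Hypothetical Account class with three methods and assumed weights.
    account = {"deposit": 2, "withdraw": 3, "balance": 1}
    print(wmc(account))   # 6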

Depth of inheritance

The second metric (DIN) is a measure for the depth of the (class of the) object in the inheritance hierarchy. The measure is directly related to the scope of properties, since it indicates the number of classes from which the class inherits its functionality. For design, the greater the depth of a class in the inheritance hierarchy, the greater its expected complexity, since apart from the methods defined for the class itself, the methods inherited from classes higher in the hierarchy are also involved. The metric may also be used as an indication of reuse, that is, reuse by inheritance. See slide 4-DIN.

Depth of inheritance

DIN

Measure -- scope of properties

  • DIN(x) = distance( root(x), class(x) )

Viewpoint --

the deeper a class is in the hierarchy, the greater the number of methods it is likely to inherit, making the object more complex
slide: Depth of inheritance

Satisfaction of criteria (i), (ii) and (iv) is easily established. With respect to property (v), the monotonicity property, three cases must be distinguished. Recall that the property states that for any objects x and y it holds that μ(x) ≤ μ(x + y). First, assume that y is a child of x and that μ(x) = n; then μ(y) = n + 1. Combining x and y, however, gives μ(x + y) = n, and hence μ(x + y) < μ(y), so property (v) is not satisfied. Second, when x and y are siblings, combining them yields a class at the same depth, so μ(x) = μ(y) = μ(x + y), and property (v) is satisfied. Finally, assume that x and y are neither directly connected by inheritance nor siblings. If x and y are collapsed to the class lowest in the hierarchy, property (v) is satisfied. However, this need not be the case: just imagine that class(x) is collapsed with root(x). Then, obviously, the monotonicity property is not satisfied.
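
A sketch of the measure, assuming the inheritance hierarchy is recorded as a hypothetical parent mapping in which None marks the root:

    def din(cls, parent):
        """Depth of inheritance: the distance from a class to the root of its hierarchy."""
        depth = 0
        while parent[cls] is not None:
            cls = parent[cls]
            depth += 1
        return depth

    # Hypothetical hierarchy: SavingsAccount -> Account -> Object
    parent = {"Object": None, "Account": "Object", "SavingsAccount": "Account"}
    print(din("SavingsAccount", parent))   # 2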

Number of children

The third metric (NOC) gives the number of immediate subclasses of class(x) in the class hierarchy. Like the previous metric, it is related to the scope of properties. It is also a measure of reuse, since it indicates how many subclasses inherit the methods of class(x). According to [ChidK], it is generally better to have depth than breadth in the class hierarchy, since depth promotes reuse through inheritance. In any case, the number of descendants may be an indication of the influence of the class on the design. Consequently, a class scoring high on this metric may require more extensive testing. See slide 4-NOC.

Number of children

NOC

Measure

-- scope of properties
  • NOC(x) = | descendants(x) |

Viewpoint --

generally, it is better to have depth than breadth in the class hierarchy, since it promotes the reuse of methods through inheritance
slide: Number of children

The reader is invited to check that properties (i), (ii), (iv) and (v) are satisfied. Recall that property (vi) states that for some objects y and z, if μ(x) = μ(y) then x may behave differently from y when combined with z, that is, μ(x + z) ≠ μ(y + z). Assume that class(x) and class(y) both have n children, that is, μ(x) = μ(y) = n, let class(z) be a child of class(x), and assume that class(z) has r children. Then combining class(x) and class(z) results in a class with n - 1 + r children, whereas combining class(y) and class(z) results in a class with n + r children. Hence μ(x + z) ≠ μ(y + z), and property (vi) is satisfied.
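
Under the same hypothetical parent-mapping representation used for the depth of inheritance, the number of children is obtained by counting the classes that name the given class as their parent:

    def noc(cls, parent):
        """Number of children: the number of immediate subclasses of the given class."""
        return sum(1 for p in parent.values() if p == cls)

    parent = {"Object": None, "Account": "Object",
              "SavingsAccount": "Account", "CheckingAccount": "Account"}
    print(noc("Account", parent))   # 2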

Coupling between objects

The next metric (CBO) measures the non-inheritance-related connections with other classes. It is based on the notion that two objects are related if either one acts on the other, and as such it is a measure of coupling, that is, the degree of interdependence between objects. As phrased in [ChidK], "excessive coupling between objects outside of the inheritance hierarchy is detrimental to modular design and prevents reuse". In other words, objects with a low degree of interdependence are generally more easily reused. Note that coupling, as expressed by the metric, is not transitive: if x uses y and y uses z, it is not necessarily the case that x also uses z. In fact, a well-known style guideline, the Law of Demeter, discussed elsewhere in this book, is based on the intuition underlying this metric. A high degree of coupling may indicate that testing the object will require a lot of effort, since other parts of the design are likely to be involved as well. As a general rule, a low degree of inter-object coupling should be striven for. See slide 4-CBO.

Coupling between objects

CBO

Measure

-- degree of dependence
  • CBO(x) = | { y | x uses y ∨ y uses x } |

Viewpoint --

excessive coupling between objects outside of the inheritance hierarchy is detrimental to modular design and prevents reuse
slide: Coupling between objects

Establishing properties (i), (ii), (iv), (v) and (vi) is left to the (diligent) reader. However, we will show that property (vii) does not hold. Recall that property (vii) states that there exist objects x and y for which μ(x) + μ(y) < μ(x + y), meaning that for those objects the complexity of x combined with y is higher than the total complexity of x and y in isolation. Just pick arbitrary objects x and y, and assume that they have δ ≥ 0 couplings in common, for example because both use an object z. Then μ(x + y) = μ(x) + μ(y) - δ, and hence μ(x + y) ≤ μ(x) + μ(y), with strict inequality when δ > 0. This contradicts property (vii).
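
A sketch of the measure, assuming the non-inheritance uses relation is recorded as a hypothetical mapping from each class to the set of classes it acts on:

    def cbo(cls, uses):
        """Coupling between objects: the classes that cls uses or that use cls."""
        coupled = set(uses.get(cls, set()))                        # classes cls uses
        coupled |= {c for c, used in uses.items() if cls in used}  # classes using cls
        coupled.discard(cls)
        return len(coupled)

    # Hypothetical design in which Account and Ledger both use Currency.
    uses = {"Account": {"Ledger", "Currency"}, "Ledger": {"Currency"}, "Currency": set()}
    print(cbo("Account", uses))    # 2 (uses Ledger and Currency)
    print(cbo("Currency", uses))   # 2 (used by Account and Ledger)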

Response for a class

Our fifth metric (RFC) is based on the notion of response set. The response set of an object may be characterized as the set of methods it has available, consisting of the methods of its class and the methods of other objects that may be invoked by any of its own methods. This metric may be regarded as a measure of the communication that may occur between the object and other objects. If primarily (potential) extraneous method invocations are responsible for the size of the response set, it may be expected that testing the object will be difficult and will require a lot of knowledge of other parts of the design. See slide 4-RFC.

Response for a class

RFC

Measure

-- complexity of communication
  • RFC(x) = | methods(x) ∪ { m_y | x calls m_y } |

Viewpoint --

if a large number of methods can be invoked in response to a message, the testing and debugging of the object becomes more complex
slide: Response for a class

Establishing properties (i) and (ii) is left to the reader. To establish property (iv), which states that not only function but also implementation is important, it suffices to see that the actual implementation determines which, and how many, extraneous methods will be called. Property (v), monotonicity, follows from the observation that for any object y it holds that μ(x + y) ≥ max(μ(x), μ(y)), and hence μ(x + y) ≥ μ(x). According to [ChidK], property (vi) is not satisfied. To disprove property (vi), it must be shown that given objects x and y for which μ(x) = μ(y), there is no object z that provides a context discriminating between x and y, in other words no z for which μ(x + z) ≠ μ(y + z). The proof given in [ChidK] relies on the assumption that μ(x + y) = max(μ(x), μ(y)), whereas one would expect μ(x + y) ≥ max(μ(x), μ(y)). Assuming the latter, however, property (vi) does hold. Property (vii), nevertheless, may again be shown not to hold.
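
A sketch of the response set, assuming a hypothetical mapping from each of the class's own methods to the methods of other objects it may invoke:

    def rfc(own_methods, calls):
        """Response for a class: the class's own methods plus all methods they may invoke."""
        response_set = set(own_methods)
        for m in own_methods:
            response_set |= calls.get(m, set())
        return len(response_set)

    # Hypothetical Account class calling out to Ledger and Policy objects.
    own_methods = {"deposit", "withdraw"}
    calls = {"deposit": {"Ledger.record"},
             "withdraw": {"Ledger.record", "Policy.check"}}
    print(rfc(own_methods, calls))   # 4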

Lack of cohesion

The last metric (LCO) we will look at is based on the notion of the degree of similarity of methods. If methods have no instance variables in common, their degree of similarity is zero. A low degree of similarity may indicate a lack of cohesion. As a measure of the lack of cohesion, the number of disjoint sets partitioning the instance variables is taken. Cohesiveness of methods within a class is desirable, since it promotes the encapsulation of objects. For design, a lack of cohesion may indicate that the class is better split up into two or more distinct classes. See slide 4-LCO.

Lack of cohesion

LCO

Measure

-- degree of similarity between methods
  • LCO(x) = | partitions( methods(x), iv(x) ) |
where
  • partitions(M, I) = { J ⊆ I | methods(J) ∩ methods(I \ J) = ∅ }

Viewpoint --

cohesiveness of methods within a class is desirable since it promotes the encapsulation of objects
slide: Lack of cohesion

Establishing properties (i), (ii) and (iv) is left to the reader. To establish the monotonicity property (v), that is, μ(x) ≤ μ(x + y) and μ(y) ≤ μ(x + y) for arbitrary y, consider that combining objects may actually reduce the number of disjoint sets, that is, μ(x + y) = μ(x) + μ(y) - δ for some δ ≥ 0. The reduction δ, however, cannot be greater than the number of original sets, hence δ ≤ μ(x) and δ ≤ μ(y). Therefore, μ(x) + μ(y) - δ ≥ μ(x) and μ(x) + μ(y) - δ ≥ μ(y), establishing property (v). To establish property (vi), the interaction property, assume that μ(x) = μ(y) = n for some object y, and let z be another object with μ(z) = r. Now μ(x + z) = n + r - δ and μ(y + z) = n + r - ρ, where δ and ρ are the reductions for, respectively, x + z and y + z. Since neither δ nor ρ depends on n, they need not be equal, hence in general μ(x + z) ≠ μ(y + z), establishing property (vi). To disprove property (vii), consider that μ(x + y) = μ(x) + μ(y) - δ for some δ ≥ 0, and hence μ(x + y) ≤ μ(x) + μ(y). The violation of property (vii) seems to indicate that it may indeed sometimes be better to have a single non-cohesive object than multiple cohesive ones implementing the same functionality.
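
One way to read the measure is as the number of disjoint groups of instance variables, where two variables fall into the same group whenever some method reads or writes both. The following sketch computes that count from a hypothetical mapping of methods to the instance variables they access:

    def lco(iv_of_method):
        """Lack of cohesion: the number of disjoint groups of instance variables."""
        groups = []                        # disjoint sets of instance variables
        for ivs in iv_of_method.values():
            ivs = set(ivs)
            if not ivs:
                continue                   # a method touching no variables adds nothing
            touching = [g for g in groups if g & ivs]
            untouched = [g for g in groups if not (g & ivs)]
            groups = untouched + [ivs.union(*touching)]
        return len(groups)

    # Hypothetical class: deposit and withdraw share 'balance'; set_owner is separate.
    example = {"deposit": {"balance"},
               "withdraw": {"balance", "limit"},
               "set_owner": {"owner"}}
    print(lco(example))   # 2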