Encapsulation and inheritance in C++

Instructor's Guide


Operationally, the basic features of object-oriented programming may be characterized as encapsulation and inheritance. Encapsulation means primarily support for the realization of abstract data types, and inheritance provides a mechanism for sharing code which ultimately is a means of defining polymorphic (sub)types. Additional requirements were mentioned: support for type conversion and protection (both for clients and derived classes). In this section we will introduce the features of C++ supporting OOP, and we will (try to) establish to what extent these features satisfy the requirements stated previously. A complementary introduction to C++ is given in appendix C. This section only intends to highlight the more specifically object-oriented features of C++.

Encapsulation

The C++ language is a direct descendant of the popular systems programming language C. The principal difference from C, apart from the introduction of classes and the mechanism of inheritance, is that C++ fully supports static type checking. This allows a programmer to write type safe programs. An important distinction here is between type safe programs and type secure programs. Type secure means that no runtime error can occur due to type errors. By incorporating the mechanism of explicit type conversions (casts), C++ allows the programmer to explicitly deviate from the strict typing scheme. In some cases this may be necessary. Personally, I find that this is not a disadvantage, but I am sure not everyone will agree on that. Ultimately, good programming requires a disciplined use of the constructs provided by a language, including the low level constructs which may be necessary for special purposes.

Classes

Classes are the primary construct for realizing abstract data types in C++. As stated before, the ultimate goal in realizing a data type is to provide the elements of that type with behavior such that they conform to our expectations and (equally important) such that they cooperate fluently with objects that already exist, including the objects (of types) predefined by the language. In the following, we will take a simple example (of a counter) to illustrate the various features available for defining the behavior of a class of objects. See slide 2-simple.
  class counter {
  int n;
  public:
  
  counter() { n = 0; }
  
  void operator++() { n = n + 1; }
  int value() { return n; }
  };
  

slide: A simple counter

The counter as defined above maintains an integer n to record its state. The variable n is a private data member of the class counter. In more traditional object-oriented terminology, we may call n an instance variable. In the public section of the counter we encounter a number of methods, or function members as they are called for C++. First, we have a function member with the same name as the class, namely counter. This member is called the constructor and is invoked when creating a counter object. The definition of the function body of the constructor immediately follows the declaration of the constructor. This is done primarily because it is convenient for the exposition. However, in practice this may also be done for reasons of efficiency, since function bodies that are directly defined are, whenever possible, expanded inline in the program text during compilation. When looking at the use of a counter object, as in the example below,
  counter c; ++c;
  cout << c.value();
  
  
the first thing to note is that a counter is created by declaring the variable c to be of type counter. Next, we see that the counter c is incremented (by one), and finally, the value of the counter is written to standard output.
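In standard C++, a member operator++() without parameters defines only the prefix form, written ++c. To support the postfix form c++ as well, a second overload with an unused int parameter is needed. The following is a minimal sketch of a counter that offers both forms; the names simple_counter and example_increment are introduced here for illustration only.

```cpp
#include <cassert>

// Hypothetical variant of the simple counter with both increment forms.
class simple_counter {
    int n;                           // private by default
public:
    simple_counter() : n(0) { }

    // prefix form: ++c, returns the incremented object
    simple_counter& operator++() { n = n + 1; return *this; }

    // postfix form: c++, the int parameter is a dummy that selects this overload
    simple_counter operator++(int) {
        simple_counter old = *this;  // remember the state before incrementing
        ++*this;
        return old;
    }

    int value() const { return n; }
};

// Illustrative usage, checked with assert.
inline void example_increment() {
    simple_counter c;
    ++c;                             // c now holds 1
    simple_counter before = c++;     // before holds 1, c holds 2
    assert(before.value() == 1);
    assert(c.value() == 2);
}
```

The postfix form has to copy the object, which is one reason why the prefix form is usually preferred when the old value is not needed.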

Constructors and destructors

Many errors in programming occur due to improperly initialized data values. The proper initialization of elements of a data type often requires conformance with some informally stated protocol. To take care of the creation and initialization of objects, C++ classes support constructors as special member functions. The constructor of a class is called when an object (instance) of that class is created.
  class counter {
  public:
  
  counter(int v = 0 ) : n(v) { init("default"); }
  counter(const char* s, int v=0) : n(v) { init(s); } 
  ~counter() { delete[] id; }
  	
  char* name() { return id; }
  
  void operator++() { n = n + 1; }
  int value() { return n; }
  private:
  int n; char* id;
  void init(const char* s) {   // needs <cstring> for strlen and strcpy
  	id = new char[strlen(s)+1];
  	strcpy(id, s);
  	}
  };
  
  

slide: A named counter

To illustrate the use of constructors, the simple counter of the previous section is extended to contain a string (that is a char* pointer to a sequence of characters) as an additional data member. See slide 2-named. This pointer, called id, contains the name of the counter. Instead of one constructor, as in the previous version, the current counter class contains two constructors. The first constructor, that is defined as
  	counter(int v = 0 ) : n(v) { init("default"); } 
  
functions as the default constructor. First note that the integer parameter of the constructor is given a default value of zero. Before evaluating the function body, which initializes the id string to a default value, the initialization stated after the colon is performed. This results in initializing the data member n to its appropriate value. The initialization of data members (which may be objects of user-defined classes) directly after the colon is often more efficient than an initialization by explicit assignment. The function init, which takes care of allocating resources for the name of the counter, is a private member function. Here it is defined inside the class definition; when such a function is only declared in the class and defined outside it, as in the later versions of the counter, the scoping operator (written as two colons, as in counter::init) indicates that the function belongs to the counter. Constructors may be overloaded. The second constructor is chosen when a string (char*) is given as an argument at creation time. This constructor also declares a default value of zero for its integer parameter. In its function body init is called to create a new (char*) string containing the contents of the argument string. An example of its use is:
  counter c("ctr-1"); ++c; 
  cout << c.name() << " = " << c.value();
  
  
Apart from a member function name which returns the id of the counter, we also encounter a destructor for a counter object, defined as ~counter() { delete[] id; }. This destructor deletes the (char*) string id. The destructor is called either when an object of type counter ends its lifetime by a change of scope or, in the case of a pointer to a counter object, when the programmer explicitly disposes of the object. Both new and delete are keywords of C++, introduced to manage dynamic memory allocation. In the absence of garbage collection, these are the only means the programmer has of dynamically creating and destroying objects. In C++ it is possible to bypass the memory allocation scheme provided by the language, but this requires much programmer expertise.
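The two ways in which a destructor comes to be called can be made visible with a class that counts its live objects. The sketch below is an illustration under stated assumptions: the class traced and the function example_lifetime do not occur in the text.

```cpp
#include <cassert>

// Hypothetical class that counts how many of its objects are alive.
class traced {
public:
    traced()  { ++alive; }             // constructor runs on creation
    ~traced() { --alive; }             // destructor runs at end of lifetime
    static int alive;                  // number of live objects
private:
    traced(const traced&);             // copying disabled to keep the count simple
    traced& operator=(const traced&);
};

int traced::alive = 0;

inline void example_lifetime() {
    assert(traced::alive == 0);
    {
        traced a;                      // automatic object: constructed here
        assert(traced::alive == 1);
    }                                  // scope ends: destructor called implicitly
    assert(traced::alive == 0);

    traced* p = new traced;            // dynamic object: lives until deleted
    assert(traced::alive == 1);
    delete p;                          // explicit disposal calls the destructor
    assert(traced::alive == 0);
}
```

If the delete were omitted, the dynamic object would never be destroyed, since there is no garbage collector to reclaim it.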

Protection

Classes provide encapsulation in the sense of data hiding by allowing both private and public sections. Usually, the private section contains data members that may not be directly accessed by clients of the object, and the public section contains methods or function members to inspect or modify the (hidden) data members. However, despite the explicit definition of a private section, illegal access may still be possible. The type modifier const may be employed to indicate that a particular data item is constant or that some operation does not modify an object. See slide 2-const.
  class counter {
  public:
  
  counter(int v = 0 );
  counter( const char* s, int v = 0 ); 
  
  ~counter() { delete[] id; }
  	
  const char* name() const { return id; }
  
  void operator++() { n = n + 1; }
  int value() const { return n; }
  
  private:
  int n; char* id;
  void init(const char* s);
  };
  

slide: Using const to protect access

First look at the use of const in const char* name() const { return id; }. Without the first const, the (char*) string that is returned by the function counter::name may be modified, as in the example below. This is allowed since the function just returns a pointer to the data member id. When declaring the result of the function to be const, the (char*) string returned may no longer be modified since it is considered to be a constant. The example below then results in a compile-time error:
  counter c("ctr-2"); ++c; 
  c.name()[4]='1';   // error: the result of name() is const
  cout << c.name() << " = " << c.value();
  
The other use of const is illustrated by the second occurrence of const in the definition of the member function name. In this way, the programmer may state that the member function does not modify any of the data members of the object. This use of const may be regarded as a means of documenting the role of a member function, namely as one that merely inspects the object instead of modifying its state. In a similar way, the member function value may be documented as being a safe operation.
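The effect of const on member functions is more than documentation: only const member functions may be called on an object that is accessed through a const reference. The following sketch shows this with a reduced counter; const_counter, twice and example_const are hypothetical names used only for this illustration.

```cpp
#include <cassert>

// Hypothetical reduced counter: only the parts needed to show const.
class const_counter {
public:
    const_counter(int v = 0) : n(v) { }
    void operator++() { n = n + 1; }     // modifies the object
    int value() const { return n; }      // promises not to modify it
private:
    int n;
};

// A function that only inspects its argument can take it by const reference.
inline int twice(const const_counter& c) {
    // ++c;            // would not compile: operator++ is not a const member
    return 2 * c.value();                // allowed: value() is const
}

inline void example_const() {
    const_counter c(3);
    ++c;                                 // c now holds 4
    assert(twice(c) == 8);
}
```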

Type conversion

When defining a class as the realization of some (abstract) type, the programmer must be well aware of the relation of an object to elements of other data types, including built-in types. For example, when defining the constructors for a counter we have allowed for constructing a counter from an integer as well as a (char*) string. The C++ language provides facilities for automatic type conversion based on the one-argument constructors defined for a class. Complementary to type conversion based on constructors, C++ allows the programmer to define type conversion operators that map the object to the specified type by means of a user-defined mapping function. See slide 2-conversions.
  class counter {
  public:
  
  counter(int v = 0 ) : n(v), id(0) { init("default"); }
  counter( const char* s, int v = 0 );
  	
  ~counter() { delete[] id; }
  	
  const char* name() { return id; }
  
  void operator++() { n = n + 1; }
  
  operator int() { return n; }
  operator char*() { return id; }
  
  private:
  int n; char* id;
  void init(const char* s);   // copies s into id, so that delete[] id is valid
  };
  

slide: Widening and narrowing conversions

In the example below, the function fun is defined with a counter as a parameter. The parameter is passed by value, since standard C++ does not allow the temporary counter that results from a conversion to be bound to a non-const reference. When calling this function with a (char*) string, the compiler automatically takes care of calling the appropriate constructor in order to convert the (char*) string to a counter object. For example, when defining
  void fun( counter c ) {
  
  cout << (char*) c << " = " << (int) c;
  
  }
  
then the call
  fun("ctr-3");
  
results in the creation of a new counter (initialized to zero) and the display of the name of the counter and its value. Conversely, as shown in the example, it is possible to make use of the opposite conversions, from a counter to either a (char*) string or integer. In the example, the name and the value of the counter are printed not by calling the appropriate member functions, but by an explicit conversion of the counter to, respectively, a (char*) string and an integer. Conversion to a (char*) string is defined by the member function operator char*() { return id; } and conversion to an int is defined similarly. The effect of these conversions may also be achieved by explicit field selection, which is to be preferred whenever the value of the conversion is not immediately obvious. Indeed, returning a (char*) from a counter is dubious, whereas it would be perfectly natural for a token consisting of an integer identifier and the string representation of the token. Potentially, type conversions are dangerous operations since, inevitably, information will be lost during the conversion. Moreover, the use of both a constructor for (char*) and a type conversion operator to (char*) may lead to ambiguities when the compiler tries to resolve an overloaded function call.
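The token mentioned above can be sketched as follows. This is only an illustration: the class token, its members, and the helper functions id_of and length_of are assumptions, and std::string is used to avoid manual memory management.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical token class: an integer identifier plus its textual form.
class token {
public:
    token(int id, const std::string& text) : _id(id), _text(text) { }

    operator int() const { return _id; }                    // token -> int
    operator const char*() const { return _text.c_str(); }  // token -> string

private:
    int _id;
    std::string _text;
};

inline int id_of(int id) { return id; }                     // expects an int
inline std::size_t length_of(const char* s) { return std::strlen(s); }

inline void example_token() {
    token t(42, "while");
    assert(id_of(t) == 42);              // implicit conversion through operator int
    assert(length_of(t) == 5u);          // implicit conversion through operator const char*
    assert(static_cast<int>(t) == 42);   // the same conversion, written explicitly
}
```

Both conversions lose information (the text or the identifier, respectively), which is why they are best reserved for classes where the meaning of the converted value is obvious.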

Overloading and friends

Type conversions enable the programmer to overload a function implicitly. Explicit overloading of functions and operators is also possible. Both overloading and type conversions contribute to the polymorphic behavior of objects. Widening and narrowing conversions (also called promotions and coercions) are defined class-wise. Explicit function overloading, in contrast, is of a more global nature, since it may define an arbitrary functional relation between user-defined and/or built-in types.
  class counter {
  friend int operator<(counter&, int);
  public:
  
  counter(int v = 0 );
  counter( const char* s, int v = 0 );
  	
  ~counter() { delete[] id; }
  	
  const char* name() { return id; }
  int value() const { return n; }
  
  void operator++() { n = n + 1; }
  
  private: int n; char* id;
  void init(const char* s);
  };
  
  int operator<(counter& c, int i) { return c.n < i; }
  

slide: Overloading and friends

In the counter defined in slide 2-overloading, one of the familiar comparison operators has been overloaded to allow for a comparison of the value of a counter object with an integer value. In the example, the comparison operator has been declared to be a friend of the class counter. Declaring a function (or a class) to be a friend grants that function (or the member functions of the class) access to the private parts of the object. For the example given, it would not have been necessary to declare the operator as a friend, but, as for instance in the case of matrix multiplications, reasons of efficiency often will cause the programmer to break encapsulation of the class by means of a friend declaration. The use of the operator is illustrated by the following code fragment:
  counter c("ctr-4"); ++c;
  if ( c < 2 )
  	cout << c.name() << " = " << c.value();
  
Overloading and type conversion exemplify the flexibility of a polymorphic type system. However, both techniques may be considered to provide an ad hoc solution to the problem of incorporating polymorphism in the language when compared with the polymorphism introduced by inheritance and generic (template) types. These will be studied in the following sections.
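As remarked above, the comparison operator need not be a friend, since the value of a counter is available through its public interface. A minimal sketch of that alternative is given below; plain_counter and example_compare are hypothetical names, and the second overload (with the arguments reversed) is an addition that the text does not discuss.

```cpp
#include <cassert>

// Hypothetical reduced counter without any friend declaration.
class plain_counter {
public:
    plain_counter(int v = 0) : n(v) { }
    void operator++() { n = n + 1; }
    int value() const { return n; }
private:
    int n;
};

// Non-member, non-friend comparison: uses only the public interface.
inline bool operator<(const plain_counter& c, int i) { return c.value() < i; }

// Overload for the opposite argument order.
inline bool operator<(int i, const plain_counter& c) { return i < c.value(); }

inline void example_compare() {
    plain_counter c;
    ++c;                   // c now holds 1
    assert(c < 2);         // calls operator<(const plain_counter&, int)
    assert(0 < c);         // calls operator<(int, const plain_counter&)
    assert(!(c < 1));
}
```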

Inheritance

Inheritance is perhaps the most distinctive feature of object-oriented programming. Pragmatically, from a software engineering perspective, inheritance provides a mechanism for code sharing and code reuse. From a type theoretical point of view, inheritance is one of the mechanisms supporting polymorphism. Operationally, the power of inheritance in C++ comes from the use of virtual functions and dynamic binding.

Abstract classes

The classical example to demonstrate the use of inheritance and the virtues of dynamic binding is a hierarchy of shapes. The hierarchy of shapes consists of an abstract shape from which concrete shapes, such as a circle and a rectangle, may be derived. When deriving concrete shapes, the programmer merely has to provide the appropriate constructors and define the actual method for displaying the shape. An abstract shape is defined as in slide 2-shape.
  class shape { 
  public:
  shape(int x = 0, int y = 0) : _x(x), _y(y) { }
  void move(int x, int y ) { _x += x; _y += y; }
  virtual void draw() = 0;   // pure virtual
  protected:
  int _x, _y;
  };

slide: Abstract shape

A shape, viewed as an abstract entity, contains data members for its origin, and further must provide, apart from a constructor, the methods for moving and drawing a shape. The abstract class shape defines a constructor which sets the origin to (0,0), unless other values have been provided. The member function move may be implemented for all shapes as simply changing the origin in an appropriate way. On the other hand, drawing a shape is undefined for an abstract shape. For this reason the member function draw is declared as pure virtual, meaning that it must be redefined by a class derived from the class shape. A class with pure virtual functions is an abstract class. An abstract class can have no instances. For example
  shape s;   // error: abstract class

would result in a compiler error. Having an abstract class shape available, we may define concrete shapes, such as circle and rectangle, as in slide 2-concrete.
  class circle : public shape {  
  public:
  circle( int x, int y, int r) : shape(x,y), _radius(r) { }
  void draw() { cout << "C:" << _x << _y << _radius; }
  protected:
  int _radius;
  };
  
  class rectangle : public shape {
  public:
  rectangle( int x, int y, int l, int r ) : shape(x,y), _l(l), _r(r) { }
  void draw() { cout << "R:" << _x << _y << _l << _r; }
  protected:
  int _l,_r;
  };

slide: Concrete shapes

For a circle we need to define, apart from its origin, a radius. And, similarly, for a rectangle we need to define the length of the sides. Both circle and rectangle inherit the origin and the member function move from the shape class. Instantiating the inherited part takes place, as indicated after the colon, before evaluating the function body of the constructor. Unlike the initialization of instance variables, which may be assigned a value in the body of the constructor, the initialization of the inherited parts must be done in this way. An explicit initializer is required unless a default constructor is available. The difference between the initialization of a data member immediately after the colon or in the function body of the constructor is quite subtle. In the latter case, a default constructor will be applied to create the data member and the subsequent assignment in the function body may lead to the creation of another instance. Generally, it is safer and more efficient to initialize data members immediately after the colon. Unfortunately, it is not always possible to initialize data in the colon-list. Also, there is no way in which to communicate between the initializers, which may result in repeated computations when there is a dependency between the initial values of the data members. A concrete shape class must necessarily (re)define the member function draw, since an abstract shape cannot possibly know how to draw itself.
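The difference between initializing a data member after the colon and assigning to it in the constructor body can be observed by counting constructor calls and assignments. The sketch below does this for a hypothetical member type part; none of the names in it occur in the text.

```cpp
#include <cassert>

// Hypothetical member type that records how it was set up.
class part {
public:
    part() : value(0) { ++default_constructions; }
    part(int v) : value(v) { }
    part& operator=(const part& other) {
        value = other.value;
        ++assignments;
        return *this;
    }
    int value;
    static int default_constructions;
    static int assignments;
};

int part::default_constructions = 0;
int part::assignments = 0;

// Member initialized in the colon-list: a single constructor call.
class by_initializer {
public:
    by_initializer(int v) : p(v) { }
    part p;
};

// Member set in the body: default construction, a temporary, then assignment.
class by_assignment {
public:
    by_assignment(int v) { p = part(v); }
    part p;
};

inline void example_initialization() {
    by_initializer a(1);
    assert(part::default_constructions == 0 && part::assignments == 0);

    by_assignment b(2);
    assert(part::default_constructions == 1 && part::assignments == 1);
    assert(a.p.value == 1 && b.p.value == 2);
}
```

For a member of a built-in type such as int the two forms cost about the same; the difference matters for members that are themselves objects of user-defined classes.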

A code fragment illustrating the use of concrete shapes looks as follows:


  circle c(1,1,2); rectangle r(2,2,1,1);
  
  c.draw(); r.draw();
  
Note that the call to draw looks the same for both kinds of shapes. The difference between the two shapes becomes visible only in the result of the call: the function draw specified for circle overrides the specification given for the abstract shape, and similarly for rectangle.

Dynamic binding

The reuse of code is one of the most important aspects of inheritance. The principle underlying the efficient reuse of code (by employing inheritance) may be characterized as "programming by stating the difference," which means that one has to (re)define the features of the derived class that are added to or different from what is provided by the base class. To fully exploit this principle we need virtual functions, that is functions for which dynamic binding applies. Operationally, dynamic binding may be regarded as a dispatching mechanism that acts like a case statement to select (dynamically) the appropriate procedure in response to a message. In many procedural programs, such a case statement often occurs (explicitly) when a kind of polymorphism is introduced by means of an explicit tag (as, for example, in combination with a union or a variant-record). The use of such tags may become a nightmare when modifying the informal type system, since each case statement then needs to be updated. Using inheritance with dynamic binding, such case statements are, so to speak, implicitly inserted by the compiler or interpreter. The obvious advantage of such a feature, apart from reducing the amount of code that must be written, is that maintenance is greatly facilitated. A possible disadvantage, however, might be that program understanding becomes more difficult since many of the choices are now implicitly made by the dispatching mechanism instead of being written out explicitly.
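The contrast between an explicit tag with a case statement and dynamic binding may be sketched as follows. All names in this fragment (kind, tagged_shape, name_of, named_shape and its subclasses) are hypothetical and serve only to put the two styles side by side.

```cpp
#include <cassert>
#include <string>

// Procedural style: an explicit tag and a case statement.
enum kind { CIRCLE, RECTANGLE };
struct tagged_shape { kind tag; };

inline std::string name_of(const tagged_shape& s) {
    switch (s.tag) {             // must be extended for every new kind
        case CIRCLE:    return "circle";
        case RECTANGLE: return "rectangle";
    }
    return "unknown";
}

// Object-oriented style: the dispatch is done by the virtual function call.
class named_shape {
public:
    virtual ~named_shape() { }
    virtual std::string name() const = 0;
};
class named_circle : public named_shape {
public:
    std::string name() const { return "circle"; }
};
class named_rectangle : public named_shape {
public:
    std::string name() const { return "rectangle"; }
};

inline void example_dispatch() {
    tagged_shape t = { RECTANGLE };
    assert(name_of(t) == "rectangle");

    named_circle c;
    named_rectangle r;
    const named_shape* shapes[2] = { &c, &r };
    assert(shapes[0]->name() == "circle");     // selected at run time
    assert(shapes[1]->name() == "rectangle");
}
```

Adding a third kind of shape requires a change to every switch in the first style, but only a new derived class in the second.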

To illustrate the power of virtual functions (and dynamic binding) we will add a compound shape to our hierarchy of shapes. See slide 2-compound.


  
  
  
  class compound : public shape { 
  public:
  compound( shape* s = 0 ) : fig(s) { next = 0; }
  void add( shape* s ) {
  	if (next) next->add(s);
  	else next = new compound(s);
  	}
  void move(int x, int y) {
  	if (fig) fig->move(x,y);
  	if (next) next->move(x,y);
  	}
  void draw() {
  	if (fig) fig->draw();
  	if (next) next->draw();
  	}
  private:
  shape* fig; compound* next;
  };

slide: Compound shapes

A compound shape is actually a linked list of shapes. To add shapes to the list, the class compound extends the class shape with a member function add. Both the member functions move and draw are redefined in order to manipulate the list of shapes in the appropriate way. The list is traversed by recursively invoking the function for the objects stored in the next pointer unless next is empty, which indicates the end of the list. The class compound is made a subclass of shape to allow a compound shape to be treated as a shape.

As an example of the use of a compound shape, consider the following fragment:

  compound s;
  s.add( new circle(1,1,2) );
  s.add( new rectangle(2,2,3,5) );
  s.draw(); s.move(7,7); s.draw();
  
After creating an empty compound shape, two shapes, respectively a circle and a rectangle, are added. The compound shape is asked to draw itself, it is moved, and then asked to draw itself again. The compound shape object, when moving and drawing the list of shapes, has no knowledge of what actual shapes are contained in the list, which may be compound shapes themselves. This illustrates how we may achieve polymorphic behavior by using inheritance.

A more explicit example of the polymorphic behavior of shapes is given by the following code fragment.

  shape* fig[3];
  fig[0] = &s;   // the compound shape
  fig[1] = new circle(3,3,5);
  fig[2] = new rectangle(4,4,5,5);
  for( int i = 0; i < 3; i++ ) fig[i]->draw();
After storing some actual shapes, including a compound shape, in an array of (pointers to) shapes, a simple loop with a uniform request for drawing is sufficient to display all the shapes contained in the array, independent of their actual type.

This example is often used to demonstrate that when adopting an object-oriented approach the programmer no longer needs to include lengthy case statements to choose between the various drawing operations on the basis of an explicit type tag.

The careful reader may have noted that the absence of the declaration virtual for the member function move may lead to problems. Indeed, when a compound shape is moved through a pointer or reference of type shape, the statically bound function shape::move is called instead of compound::move, and moving only the origin of the compound shape will not do. In our slightly wasteful implementation of a compound shape, the member variables inherited from shape play no role. Instead, each shape in the list must be moved. This could be repaired either by declaring the function shape::move as virtual or by redefining compound::draw and eliminating compound::move. This illustrates that it takes careful consideration to decide whether or not to make a member function virtual. Some even suggest making member functions virtual by default, unless it is clear that they may be declared non-virtual.
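The difference between a virtual and a non-virtual member function can be isolated in a few lines. In the sketch below, which uses the hypothetical classes base and derived, the function plain plays the role of the non-virtual move and the function dynamic plays the role of the virtual draw.

```cpp
#include <cassert>

// Hypothetical pair of classes: one function is virtual, the other is not.
class base {
public:
    virtual ~base() { }
    int plain() const { return 1; }            // statically bound
    virtual int dynamic() const { return 1; }  // dynamically bound
};

class derived : public base {
public:
    int plain() const { return 2; }            // hides base::plain
    int dynamic() const { return 2; }          // overrides base::dynamic
};

inline void example_binding() {
    derived d;
    base* p = &d;                  // static type base*, dynamic type derived

    assert(d.plain() == 2);        // called on the object itself
    assert(p->plain() == 1);       // non-virtual: chosen by the pointer type
    assert(p->dynamic() == 2);     // virtual: chosen by the actual object
}
```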

Multiple inheritance

Graphical shapes are a typical example of objects allowing for a tree-shaped taxonomy. Sometimes, however, we wish to define a class not from a single base class, but by deriving it from multiple base classes, by employing multiple inheritance.
  class student { ... };
  class assistant { ... };
  
  class student_assistant
  		: public student, public assistant {
  public:
  student_assistant( int id, int sal ) 
  		: student(id), assistant(sal) {}
  };
  

slide: Multiple inheritance

In slide 2-multi-1, one of the classical examples of multiple inheritance is depicted, defining a student_assistant by inheriting from student and assistant.

Dynamic binding for instances of a class derived by multiple inheritance works in the same way as in the case of single inheritance. However, ambiguities between member function names must be resolved by the programmer.
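How such an ambiguity arises and how it may be resolved is sketched below. The class bodies are assumptions made for this illustration (the text leaves them open); in particular, the member function code is hypothetical and is given to both base classes only to create a name clash.

```cpp
#include <cassert>

// Hypothetical bodies for the two base classes; both define code().
class student {
public:
    student(int id) : _id(id) { }
    int code() const { return _id; }       // student number
private:
    int _id;
};

class assistant {
public:
    assistant(int sal) : _sal(sal) { }
    int code() const { return _sal; }      // salary scale
private:
    int _sal;
};

class student_assistant : public student, public assistant {
public:
    student_assistant(int id, int sal) : student(id), assistant(sal) { }

    // Resolve the clash once, for all clients, by choosing explicitly.
    int code() const { return student::code(); }
};

inline void example_ambiguity() {
    student_assistant sa(20, 300);
    // Without the redefinition above, sa.code() would be ambiguous.
    assert(sa.code() == 20);
    assert(sa.student::code() == 20);      // qualified call selects a base
    assert(sa.assistant::code() == 300);
}
```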


  class person { };
  class student : virtual public person { ... };
  class assistant : virtual public person { ... };
  
  class student_assistant
  	: public student, public assistant { ... };
  

slide: Virtual base classes

When using multiple inheritance, one may encounter situations where the classes involved are derived from a common base class, as illustrated in slide 2-multi-2.

To ensure that student_assistant contains only one copy of the person class, both the student and assistant classes must indicate that the person is inherited in a virtual manner. Otherwise, we may not have a declaration of the form

  person* p = new student_assistant(20,6777,300);
  
since the compiler would not know which person was meant (that is, how to apply the conversion from student_assistant to person).
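A complete version of this hierarchy might look as follows. The class bodies and the meaning of the three constructor arguments (taken here to be an age, a student number and a salary) are assumptions of this sketch, since the text does not specify them.

```cpp
#include <cassert>

// Hypothetical bodies; the original leaves these classes open.
class person {
public:
    person(int age = 0) : _age(age) { }
    int age() const { return _age; }
private:
    int _age;
};

class student : public virtual person {
public:
    student(int id) : _id(id) { }
    int id() const { return _id; }
private:
    int _id;
};

class assistant : public virtual person {
public:
    assistant(int sal) : _sal(sal) { }
    int salary() const { return _sal; }
private:
    int _sal;
};

class student_assistant : public student, public assistant {
public:
    // The most derived class initializes the shared virtual base itself.
    student_assistant(int age, int id, int sal)
        : person(age), student(id), assistant(sal) { }
};

inline void example_virtual_base() {
    student_assistant sa(20, 6777, 300);
    person* p = &sa;               // unambiguous: there is one person part
    assert(p->age() == 20);
    assert(sa.id() == 6777 && sa.salary() == 300);

    // Both inheritance paths lead to the same person subobject.
    person* via_student = static_cast<student*>(&sa);
    person* via_assistant = static_cast<assistant*>(&sa);
    assert(via_student == via_assistant);
}
```

Without the keyword virtual in the two derivations, the conversion to person* in the example would be rejected as ambiguous.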

Using assertions

Whatever support a language may offer, reliable software is to a large extent the result of a disciplined approach to programming. The use of assertions has long since been recognized as a powerful way in which to check whether the functional behavior of a program corresponds with its intended behavior. In effect, many programming language environments support the use of assertions in some way. For example, both C and C++ define a macro assert which checks for the result of a boolean expression and stops the execution if the expression is false.

In the example in slide 2-assertions, assertions are used to check for the satisfaction of both the pre- and post-conditions of a function that computes the square root of its argument, employing a method known as Newton iteration.


  double sqrt( double arg ) { 
  require ( arg >= 0 );
  double r=arg, x=1, eps=0.0001;
  while( fabs(r - x) > eps ) {
  	r=x; x=r-((r*r-arg)/(2*r));
  	}
  promise ( fabs(x*x - arg) <= eps * (1 + arg) );   // x is the latest approximation
  return x;
  }

slide: Using assertions in C++

In the example, the macro assert has been renamed require and promise to indicate whether the assertion serves as, respectively, a pre- or post-condition. As the example in slide 2-assertions shows, assertions provide a powerful means by which to characterize the behavior of functions, especially in those cases where the algorithmic structure itself does not give a good clue as to what the function is meant to do.
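A self-contained variant of this example is sketched below. The definitions of require and promise in terms of assert are assumptions (the text only states that assert was renamed), the function is called newton_sqrt to avoid a clash with the library function sqrt, and the post-condition checks the square of the latest approximation against the argument with a tolerance scaled by arg, which is a choice made for this sketch.

```cpp
#include <cassert>
#include <cmath>

// Assumed definitions: the text only says that assert was renamed.
#define require(cond) assert(cond)   // pre-condition
#define promise(cond) assert(cond)   // post-condition

// Named newton_sqrt to avoid a clash with the library function sqrt.
inline double newton_sqrt(double arg) {
    require(arg >= 0);
    const double eps = 0.0001;
    double r = arg, x = 1;
    while (std::fabs(r - x) > eps) {
        r = x;                                // previous approximation
        x = r - ((r * r - arg) / (2 * r));    // Newton step for f(r) = r*r - arg
    }
    // The square of the result must be close to the argument;
    // the tolerance is scaled with arg (an assumption of this sketch).
    promise(std::fabs(x * x - arg) <= eps * (1 + arg));
    return x;
}

inline void example_sqrt() {
    assert(std::fabs(newton_sqrt(4.0) - 2.0) < 0.001);
    assert(std::fabs(newton_sqrt(2.0) - 1.41421356) < 0.001);
    assert(newton_sqrt(1.0) == 1.0);          // loop body is never entered
}
```

Since assert is a macro that expands to nothing when NDEBUG is defined, the checks can be switched off for production code without touching the function body.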

Object design

The use of assertions has been promoted in  [Meyer88] as a design method for object-oriented programming in Eiffel. The idea is to define the functionality of the various methods by means of pre- and post-conditions stating in a precise manner the requirements that clients of an object must meet and the obligations an object has when executing a method. Together, the collection of methods annotated with pre- and post-conditions may be regarded as a contract between the object and its potential clients. See section contracts.

Whereas Eiffel directly supports the use of assertions by allowing access to the value of an instance variable before the execution of a method through the keyword old, the C++ programmer must rely on explicit programming to be able to compare the state before an operation with the state after the operation.


  class counter { 
  public:
  counter(int n = 0) : _n(n) {
  	require( n >= 0 );
  	promise( invariant() );   // check initial state
  	}
  virtual void operator++() {
  	require( true );   // empty pre-condition
  	hold();   // save the previous state
  	_n += 1;
  	promise( _n == old_n + 1 && invariant() );
  	}
  int value() const { return _n; }   // no side effects
  virtual bool invariant() { return value() >= 0; }
  protected:
  int _n; int old_n;
  virtual void hold() { old_n = _n; }
  };

slide: The counter contract

As an example, the annotated counter in slide 2-ass-1 includes a member function hold to store the value of its instance variable. It is used in the operator++ function to check whether the new value of the counter is indeed the result of incrementing the old value.

Assertions may also be used to check whether the object is correctly initialized. The pre-condition stated in the constructor requires that the counter must start with a value not less than zero. In addition, the constructor checks whether the class invariant, stated in the (virtual) member function invariant, is satisfied. Similarly, after checking whether the post-condition of the operator++ function is true, the invariant is checked as well.


  class bounded : public counter { 
  public:
  bounded(int b = INT_MAX) : counter(0), max(b) {}   // INT_MAX from <climits>
  void operator++() {
  	require( value() < max );   // to prevent overflow
  	counter::operator++();
  	}
  bool invariant() { return value() <= max && counter::invariant(); }
  private:
  int max;
  };

slide: Refining the counter contract

When employing inheritance, care must be taken that the invariance requirements of the base class are not violated.

The class bounded, defined in slide 2-ass-2, refines the class counter by imposing an additional constraint that the value of the (bounded) counter must not exceed some user-defined maximum. This constraint is checked in the invariant function, together with the original counter::invariant(), which was declared virtual to allow for overriding by inheritance.

In addition, the increment operator++ function contains an extra pre-condition to check whether the state of the (bounded) counter allows it to perform the operation.
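The two classes can be combined into a small program that exercises the contract. In the sketch below, require and promise are written as functions that throw an exception, so that a violation can be observed without aborting the program; this is an assumption of the sketch and differs from the assert-based macros used in the slides.

```cpp
#include <cassert>
#include <climits>
#include <stdexcept>

// Assumption of this sketch: violations throw instead of aborting.
inline void require(bool ok) { if (!ok) throw std::logic_error("pre-condition"); }
inline void promise(bool ok) { if (!ok) throw std::logic_error("post-condition"); }

class counter {
public:
    counter(int n = 0) : _n(n), old_n(n) {
        require(n >= 0);
        promise(invariant());        // here counter::invariant is called
    }
    virtual ~counter() { }
    virtual void operator++() {
        require(true);               // empty pre-condition
        hold();                      // save the previous state
        _n += 1;
        promise(_n == old_n + 1 && invariant());
    }
    int value() const { return _n; }
    virtual bool invariant() { return value() >= 0; }
protected:
    int _n; int old_n;
    virtual void hold() { old_n = _n; }
};

class bounded : public counter {
public:
    bounded(int b = INT_MAX) : counter(0), max(b) { }
    void operator++() {
        require(value() < max);      // extra pre-condition of the bounded counter
        counter::operator++();
    }
    bool invariant() { return value() <= max && counter::invariant(); }
private:
    int max;
};

inline void example_contract() {
    bounded b(2);
    ++b; ++b;                        // allowed: value stays within the bound
    assert(b.value() == 2);

    bool rejected = false;
    try { ++b; }                     // pre-condition value() < max fails
    catch (const std::logic_error&) { rejected = true; }
    assert(rejected && b.value() == 2);
}
```

Note that the call to invariant inside counter::operator++ is dynamically bound, so that for a bounded counter the refined invariant is the one that is checked.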

From a formal perspective, the use of assertions may be regarded as a way of augmenting the type system supported by object-oriented languages. More importantly, from a software engineering perspective, the use of assertions provides a guideline for the design of classes and the use of inheritance. In the next chapter we will discuss the use of assertions and the notion of contracts in more detail.