Concurrent extensions of C++

Extending C++ with facilities for concurrency and distribution has become a very active research area. The common motivation, learned from the reports describing this research, is that C++ was chosen because it is a very popular object-oriented language (which, very importantly, supports static type checking). The advantage of extending an existing language is that full use can be made of the tools and expertise already developed. Moreover, in many cases the transition to using the constructs introduced for distribution and concurrency can be done gradually, without adopting a completely different programming style. In this section, we will look at a number of extensions of C++. These extensions include Concurrent C++ (which introduces process data types, see [GR88]), ACT++ (which is based on the actor model, see [Kafura90]), μC++ (which introduces a number of primitives for concurrent shared memory computing, see [Buhr92]), Compositional C++ (which uses write-once variables to effect synchronization, see [Chandy]), Mentat (which utilizes data flow graphs for parallel computing, see [Grimshaw87]) and P++ (which introduces primitives for remote object invocations, see [Nolte91]). This list is certainly not complete, but each of the various languages discussed covers some of the important issues in the design of a parallel object-oriented language.

Design issues

In slide 6-issues we have listed the design issues that occur when developing a parallel object-oriented language, either from scratch or by extending a given language such as C++.

Design issues

-- parallel object-oriented languages
slide: Design issues

An important choice is whether to distinguish between active and passive objects, or whether to support only one kind of object. Another important issue is how the activity of objects is to be defined. The choice we have is between providing an explicit process data type (as in Concurrent C++) or some merger between the process definition of an object and its class specification (as in ACT++). There is a wide choice of communication primitives, ranging from explicit constructs such as monitors and queues (as in μC++) to global write-once variables that allow for the suspension of processes reading uninitialized variables (as in Compositional C++). Global write-once variables also provide, in addition to a mechanism for synchronization, an excellent mechanism for sharing common resources. Other means to implement shared resources are based on the client/server model, and require support for remote procedure call or synchronous rendezvous (as in Mentat). Transport across a network, moreover, requires suitable primitives (augmenting the type system of C++) to control the packaging of the data involved in the communication (as in P++). Other issues in the design of a parallel object-oriented language have to do with the combination of concurrency and inheritance (which we will study in section conc-inheritance) and the allocation of objects on a network of processes (which may be done statically or dynamically). The latter issue will be briefly dealt with in section distribution.

Encapsulating processes and transactions

Concurrent extensions of C++ may in principle be provided without disrupting the semantics of the original language constructs by employing libraries and the class inheritance mechanism. This approach has, for instance, been followed in  [Bersh88]. However, such an approach leaves the responsibility for process creation, process destruction, communication and synchronization entirely to the user. This is clearly not an optimal solution. The alternative approach (which is viable only when the designers of an extension have access to the compiler or are willing to undertake the effort of developing a pre-processor) allows for more drastic modifications to be realized, modifications that may provide support for implicit synchronization protocols and high level concurrency abstractions. Adopting this approach, we do again have a choice. We may leave the original semantics of the base language intact and provide an orthogonal extension (orthogonal in the sense that only new types are added) or we may introduce modifications that more deeply affect the original language by giving a (semantic) reinterpretation of constructs already available. The latter approach will be illustrated by the language ACT++, which extends C++ with constructs inspired by the actor model. The former approach is exemplified by Concurrent C++.

An orthogonal approach -- Concurrent C++

The Concurrent C++ language has a history that is worth mentioning, since it illustrates some of the difficulties involved in programming language design. At about the same time that C++ was developed as an object-oriented extension of C, another extension of C was proposed, namely Concurrent C, which was intended for parallel distributed computing. Both C++ and Concurrent C may be regarded as orthogonal extensions of C, respectively as C with classes and C with processes. By combining these extensions, the developers of Concurrent C++ hoped to arrive at an orthogonal extension of both C++ and Concurrent C, which was also an orthogonal extension of C. See slide 6-orthogonal. The Concurrent C language introduces the additional data type process in C, and supports a notion of transaction, which is a function belonging to a process type. (Actually, the phrase transaction is a misnomer, since it suggests the functionality of database transactions including properties such as atomicity. Transactions in the sense of Concurrent C are more like entries in Ada.) Process types are declared by means of a process specification that lists the transactions allowed and describes what parameters are needed for the creation of a process. An example of a process specification, declaring a consumer and producer process, is given in slide 6-cc-specification.

Specification

Concurrent C
  process spec consumer() {
  trans void put(int);
  };
  
  process spec producer(process consumer);
  

slide: Process specification in Concurrent C

To define the actual meaning of a process, a process body must be defined. In addition to the primitives already mentioned, the language Concurrent C has an accept statement that may be employed to define the acceptance conditions of a particular transaction, as illustrated below. The transaction put (which in the case of a multiple element buffer would have to be delayed if the buffer is full) results in assigning the parameter value to a local variable of the process. Following the acceptance of the transaction, the value is properly consumed and the consumer is again willing to engage in another transaction.

Implementation

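Concurrent C
  process body consumer() {
  int c;
  for(;;) {
  	accept put(a) c = a;    /* wait for a put transaction and copy its argument */
  	if ( c == EOF ) break;  /* the producer marks the end of the input with EOF */
  	putchar( islower(c)?toupper(c):c );
  	}
  }
  
  process body producer(process cons) {
  int c;
  do {
  	cons.put( c = getchar() );  /* transaction call on the consumer process */
     } while ( c != EOF );
  }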
  

slide: Process implementation in Concurrent C

In comparison, the body of the producer process is very simple. It just sends the consumer the data taken from some external source. Note that the actual consumer must be given as a parameter to the producer at creation time, as illustrated in the code below.
  void main() {
  	create producer( create consumer() );
  }
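
For readers without access to a Concurrent C compiler, the producer and consumer above can be approximated in standard C++. The following is a minimal sketch, not a rendering of how Concurrent C is implemented: the class Consumer, the functions produce and run_pipeline, and the ticket counters are all hypothetical names introduced here, and the sketch assumes that a transaction call blocks the caller until the process body has accepted it. The output is collected in a string rather than written with putchar, so that the result can be inspected.

```cpp
#include <cctype>
#include <condition_variable>
#include <cstdio>
#include <functional>
#include <mutex>
#include <string>
#include <thread>

// Hypothetical stand-in for the consumer process: one thread, one transaction.
class Consumer {
public:
    Consumer() : worker_([this] { body(); }) {}
    ~Consumer() { if (worker_.joinable()) worker_.join(); }  // blocks until EOF was sent

    // Plays the role of the transaction put: the caller blocks until the
    // process body has accepted the value.
    void put(int value) {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return !pending_; });  // one offer at a time
        slot_ = value;
        pending_ = true;
        const unsigned long ticket = ++offered_;
        cv_.notify_all();
        cv_.wait(lock, [this, ticket] { return accepted_ >= ticket; });
    }

    // Waits for the body to finish (after EOF) and returns what it produced.
    std::string result() {
        if (worker_.joinable()) worker_.join();
        return out_;
    }

private:
    void body() {
        for (;;) {
            int c;
            {   // corresponds to: accept put(a) c = a;
                std::unique_lock<std::mutex> lock(m_);
                cv_.wait(lock, [this] { return pending_; });
                c = slot_;
                pending_ = false;
                ++accepted_;
                cv_.notify_all();
            }
            if (c == EOF) break;
            out_.push_back(static_cast<char>(std::islower(c) ? std::toupper(c) : c));
        }
    }

    std::mutex m_;
    std::condition_variable cv_;
    int slot_ = 0;
    bool pending_ = false;
    unsigned long offered_ = 0, accepted_ = 0;
    std::string out_;
    std::thread worker_;  // declared last, so the state above exists before the thread starts
};

// Hypothetical producer body: sends every character, then EOF.
void produce(Consumer& cons, const std::string& input) {
    for (unsigned char ch : input) cons.put(ch);
    cons.put(EOF);
}

// Mirrors: create producer( create consumer() );
std::string run_pipeline(const std::string& input) {
    Consumer cons;
    std::thread producer(produce, std::ref(cons), std::cref(input));
    producer.join();
    return cons.result();
}
```

Calling run_pipeline("abc Def") yields "ABC DEF". Note how much bookkeeping the library-based version needs for what the accept statement expresses in a single line.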
  
In Concurrent C, process types have a special status. They allow for checking whether the transactions invoked for a particular object are legal, as in the body of producer where the consumer transaction put is called. Also, they act as generic types that are used to pass process entities around, as in the creation of the producer process that takes a consumer as a parameter. Obviously, the notion of process in Concurrent C provides interface description facilities analogous to the facilities provided by the class construct in C++. Yet, since encapsulation by processes leaves something to be desired, the designers of Concurrent C++ (justifiably) argue that proper encapsulation can only be offered by classes. Moreover, by encapsulating processes in classes, mechanisms such as inheritance may be employed as well. In slide 6-cc-encapsulation, an example is given of how a process may be encapsulated in a class.

Encapsulating processes

Concurrent C++
  process spec diskDriver() {
  trans int request( int op, long blkaddr, char* buf );
  trans int wait(int ticket);
  trans void done();
  };
  
  class disk {
  process diskDriver dd;
  int nwaiting, nbadrag, tickets [ MAX_PENDING ], ...;
  public:
   ...
  };
  

slide: Encapsulating processes in Concurrent C++

The example specifies a process diskDriver that is made a private data member of the class disk. Each transaction with the diskDriver process must now go through a member function of the class disk. The class disk can now offer high level disk access functions and hide the low level interactions with the actual diskDriver process from the user. For further details see [GR88]. Concurrent C++ provides an example of a straightforward approach to combining objects and processes. From a pragmatic point of view, supporting separate notions of classes and process types may lead to doubling the effort of defining suitable interfaces. However, some may consider that an advantage. From a theoretical perspective, the solution chosen for Concurrent C++ is simply inadequate, as recognized in [GR88]. Given that a notion of abstract data types underlies both the notion of process types and classes, a unifying abstraction must be found that incorporates both notions in a common type system, while (preferably) retaining the possibility of defining passive (and also multi-threaded) object types. (To give the developers of Concurrent C++ credit, however, such a unifying abstraction is hard to arrive at, as we will see in the sections to follow.)

The actor model -- ACT++

The actor model, as originally introduced in  [He77], is one of the earliest proposals to unify the notions of object and process. See slide 6-act.

Actor model

-- create & become
  • create -- behavioral description, acquaintances
  • send-to -- asynchronous communication via mailboxes
  • become -- history sensitive (replacement) behavior

slide: The actor model

An actor is an object whose functionality is characterized by a behavioral description, the actor's script. An actor may in addition have a number of acquaintances, which are other actors to which it may send messages. Each actor object has a mailbox, which so to speak is its address, and by which it is known to other actors. Communication, in the original actor model, is asynchronous. Actor objects repeatedly check their mailbox for incoming messages. If there is a message, it is handled and the actor continues its own behavior. At any time, an actor may decide to change its functionality by means of a become statement. The become statement results in changing the script of an actor (its behavioral description), and is a means to provide for history sensitive behavior. Although an actor may change its behavior, it cannot change its address. In other words, an actor may change its role but not its identity. In the actor model, an actor's mailbox may be regarded as the lifetime identity of the actor. The become statement, since it may affect the willingness of an actor to answer certain messages, may effectively be employed as a serialization technique to structure the computation. See  [He77]. The ACT++ language has been developed to explore whether the actor model would fit in a statically typed language such as C++. See  [Kafura90]. The original idea underlying ACT++ is to employ the polymorphic type structure of class inheritance for specifying behavioral descriptions. In other words, classes are used both to create actor objects as well as to specify the change in behavior effected by a become statement.

Actors in C++

ACT++
  class bounded_buffer : actor {
   int buf[MAX];
   int in, out;
  public:
  
  bounded_buffer() { in=0; out=0; }
  
  int get() {
   reply buf[out++];
   out %= MAX;
   if (in == out)
    become( empty_buffer,in,out);
   else
    become( partial_buffer,in,out);
  }
  
  void put( int item ) {
   buf[in++] = item;
   in %= MAX;
   if (in == out )
    become( full_buffer,in,out);
   else
    become( partial_buffer,in,out);
  }
  };
  

slide: Bounded buffer in ACT++

For the user there is no difference in invoking a member function of an ordinary object or sending a message to an actor object. Actor objects, in ACT++, may immediately return a result to the user (by means of a reply statement) and continue to execute the function body in their own thread, invisible to the user. Both for actor objects and ordinary objects, the public interface of the corresponding class determines what calls are legal. However, in addition to the static interface description given in the public section of the class, actor objects may restrict their interface dynamically by (explicitly) becoming an instance of a subtype of the original actor. For the example in slide 6-act-bb, specifying a bounded buffer actor, this means either a full buffer, an empty buffer or a partial buffer, of which the specifications are given in slide 6-act-refinement.

Behavior refinement

ACT++
  class empty_buffer:bounded_buffer {
  public:
  	bounded_buffer::put;
  };
  
  class full_buffer:bounded_buffer {
  public:
  	bounded_buffer::get;
  };
  
  class partial_buffer:bounded_buffer {
  public:
  	bounded_buffer::put;
  	bounded_buffer::get;
  };
  

slide: Refinement in ACT++

The various subtypes of bounded buffer merely restrict the functionality of bounded buffer by offering a subset of the methods publicly available for the bounded buffer. With reference to our discussion of behavioral compatibility in section contracts, this notion of subtype seems to be in conflict with the requirement of extending the range of behavior for subtypes. However, the semantics of message passing for actor objects (in ACT++) is such that requests that are legal with respect to static type checking are postponed until they are allowed by the dynamic interface specification as given by a become statement. Operationally, an actor object maintains a buffer of incoming messages from which it may select according to its state, which is determined by the successive become statements. The language ACT++ realizes a close integration of the ordinary notion of a class and the notion of actors. However, this integration comes at the price of using class names to change the type of actor objects dynamically (which is hard to deal with, semantically, without reverting to higher order types). In section conc-inheritance, we will discuss a different interpretation of the become statement in ACT++ which avoids using class names, and allows a more modular approach with respect to the inheritance of acceptance conditions by employing behavioral abstractions. As a criticism, we may note that the concurrency model employed by ACT++ is quite limited. Actor objects derive their functionality by inheriting from a system defined actor class, and have only one (inherited) thread. This approach precludes the use of multiple inheritance to define the functionality of an actor as the combination of multiple actors. As we will discuss in section conc-inheritance, multi-threaded active objects (composed by multiple inheritance) may be needed to fully employ an object-oriented approach in modeling concurrent systems.
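
The operational reading given above (a buffer of incoming messages from which the actor selects according to its current behavior) can be sketched in plain C++. This is a minimal single-threaded sketch under stated assumptions, not the ACT++ runtime: the names BufferActor, Message, Kind and Behavior are hypothetical, an enumeration stands in for the class names passed to become, and the capacity is assumed to be at least one.

```cpp
#include <cstddef>
#include <deque>
#include <vector>

enum class Kind { Put, Get };
struct Message { Kind kind; int item; };  // item is ignored for Get

// Hypothetical actor: requests that the current behavior does not accept
// stay in the mailbox until a later become makes them acceptable.
class BufferActor {
public:
    explicit BufferActor(std::size_t capacity) : capacity_(capacity) {}

    void send(Message m) { mailbox_.push_back(m); }

    // Handle messages until none in the mailbox is acceptable.
    void run() {
        for (;;) {
            auto it = mailbox_.begin();
            while (it != mailbox_.end() && !accepts(it->kind)) ++it;
            if (it == mailbox_.end()) return;
            Message m = *it;
            mailbox_.erase(it);
            handle(m);
        }
    }

    const std::vector<int>& replies() const { return replies_; }
    std::size_t postponed() const { return mailbox_.size(); }

private:
    enum class Behavior { Empty, Partial, Full };

    // The dynamic interface: empty accepts put, full accepts get,
    // partial accepts both.
    bool accepts(Kind k) const {
        if (k == Kind::Put) return behavior_ != Behavior::Full;
        return behavior_ != Behavior::Empty;
    }

    void handle(const Message& m) {
        if (m.kind == Kind::Put) {
            items_.push_back(m.item);
        } else {
            replies_.push_back(items_.front());  // stands in for reply
            items_.pop_front();
        }
        if (items_.empty()) become(Behavior::Empty);
        else if (items_.size() == capacity_) become(Behavior::Full);
        else become(Behavior::Partial);
    }

    void become(Behavior b) { behavior_ = b; }

    std::size_t capacity_;
    Behavior behavior_ = Behavior::Empty;
    std::deque<Message> mailbox_;
    std::deque<int> items_;
    std::vector<int> replies_;
};
```

A get sent to an empty buffer is statically legal and is therefore not rejected. It simply remains in the mailbox (postponed() reports how many requests are waiting) and is served once a put has changed the behavior.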

Synchronization and communication

Concurrent programs tend to be expensive, due to the overhead involved in process creation and communication. In μC++ a number of constructs are offered that combine execution properties of concurrent processes in different ways, to allow the programmer to choose the most suitable one. A rather different approach is exemplified by Compositional C++, which provides constructs to write concurrent programs in a compositional manner, thus facilitating proofs of the correctness of a particular parallel solution.

Execution properties -- μC++

The design goal of μC++ may be characterized as the intention to provide features that allow maximum flexibility in accepting or subsequently postponing the servicing of requests. The constructs offered in μC++ are motivated by an analysis of the elementary execution properties that are needed to extend the object model to obtain concurrency. See slide 6-mu.

Execution properties

μC++
  • thread
  • execution state
  • mutual exclusion

slide: Execution properties

For concurrency, first we need a notion of thread. A thread may be considered as a virtual processor, in the sense that it may independently advance the computation. Secondly, we need a notion of execution state. An execution state contains the information needed to allow concurrent computation (such as a local state, the current routine invocations, etc.). An execution state may be independent of a thread, in the sense that a thread may continue with a particular execution state after an appropriate context switch. Finally, to guarantee the safety of a concurrent computation, we need constructs for mutual exclusion. A mechanism of mutual exclusion is needed to allow an action to be performed without being interrupted by another operation on the same resource. See  [Buhr92]. The execution properties mentioned above may be combined in a number of ways. For example, an ordinary class (in the traditional object model) may be characterized negatively as having no thread and no execution state of its own, since ordinary objects use the thread and execution state of their clients. Coroutines, which provide a simple way in which to exploit (pseudo) concurrency, do have an execution state of their own, but (like classes) do not possess their own thread. Coroutines may be called as ordinary functions, but may be interrupted by another call until an explicit resumption of the interrupted call is ordered, either by the client of the coroutine or a function of the coroutine itself. Monitors provide a mechanism to effect synchronization. They do not possess a thread of their own, nor do they have an (independent) execution state. They may be used to queue processes that have to wait for a certain condition to hold. Storing processes in the monitor queue and releasing them must be explicitly done, using wait and signal statements respectively. 
The most general construct to allow concurrency is the task, which provides not only a thread and an execution state but also implicit mutual exclusion by means of an accept statement. With respect to synchronization, tasks in μC++ behave like a generalized monitor. This analogy led the designers of μC++ to allow for postponing an already accepted call by means of a monitor-like wait statement (and to subsequently resume the call when an appropriate state change takes place). The constructs developed in μC++ are restricted to shared memory parallelism. The assumption underlying the facilities offered for communication is, according to [Buhr92], that communication takes place by means of ordinary (member) function call. Consequently, except for tasks (which allow an interpretation of function calls as a remote procedure call or rendezvous), additional synchronization primitives, such as a monitor construct, must be provided to support safe concurrent programming. The μC++ extension is implemented as a pre-processor that defines additional class-keys, to indicate coroutine and task classes, and a special type specifier mutex to indicate that the concurrency class specified offers mutual exclusion. [Buhr92] observe that the runtime efficiency of a concurrent program may significantly benefit from choosing a coroutine instead of a task. The overhead involved in communicating with a task is almost double the overhead involved in the use of a coroutine. An obvious drawback of the approach embodied by μC++ is that it offers the programmer perhaps too rich a choice.
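
To make the characterization of a monitor concrete (mutual exclusion, but no thread and no independent execution state), the following sketch uses only the standard C++ library rather than μC++ syntax. The names BoundedQueue and transfer_sum are hypothetical, and the condition variables play the role of the wait and signal statements mentioned above.

```cpp
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <thread>

// A monitor in the sense used above: every member function runs under
// mutual exclusion, in the thread of whoever calls it.
class BoundedQueue {
public:
    explicit BoundedQueue(std::size_t capacity) : capacity_(capacity) {}

    void put(int item) {
        std::unique_lock<std::mutex> lock(m_);
        // wait: the caller is queued until there is room
        not_full_.wait(lock, [this] { return items_.size() < capacity_; });
        items_.push(item);
        not_empty_.notify_one();  // signal a waiting consumer
    }

    int get() {
        std::unique_lock<std::mutex> lock(m_);
        // wait: the caller is queued until an item is available
        not_empty_.wait(lock, [this] { return !items_.empty(); });
        int item = items_.front();
        items_.pop();
        not_full_.notify_one();   // signal a waiting producer
        return item;
    }

private:
    std::size_t capacity_;
    std::mutex m_;
    std::condition_variable not_full_, not_empty_;
    std::queue<int> items_;
};

// Hypothetical driver: one producer thread sends 1..n through a queue of
// capacity 2, while the calling thread consumes and sums the items.
int transfer_sum(int n) {
    BoundedQueue q(2);
    std::thread producer([&q, n] { for (int i = 1; i <= n; ++i) q.put(i); });
    int sum = 0;
    for (int i = 0; i < n; ++i) sum += q.get();
    producer.join();
    return sum;
}
```

The threads belong to the clients of the monitor. A task, by contrast, would add a thread of its own to such a class and decide itself, by means of accept, which of put and get to serve next.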

Write-once variables -- Compositional C++

Among the concurrent extensions proposed for C++, Compositional C++ is exceptional in the sense that it is motivated by explicit proof-theoretical considerations with respect to the correctness of concurrent programs (see Chandy and Kesselman, 1992). Due to the non-deterministic nature of concurrent programs, the proof of the correctness of a component usually involves an explicit invariant stating the properties of all other components that are needed to ascertain the independence of that component from its environment. Compositional C++ intends to provide constructs that allow the programmer to develop components that are, to a high degree, independent, and may consequently be combined without danger of disrupting the integrity of the whole. See slide 6-compositional.

Write-once variables

Compositional C++
  • compositional processes
  • sync type modifier

slide: Synchronization in Compositional C++

As stated in  [Chandy], the idea in obtaining compositionality is that a process has private variables that cannot be referenced by other processes and shared variables that can be assigned values at most once. The original contribution of Compositional C++ is the introduction of the sync type modifier. This modifier resembles the const type modifier (which may be used to indicate that a particular variable holds a constant value during its lifetime). In contrast to the const modifier (which may only be used for the initialization of an object at creation time), the sync modifier allows for an object to be initialized at an arbitrary point in the computation. However, a sync variable may be initialized only once. The sync variable provides a convenient mechanism both for synchronization and communication between concurrently executing threads. Moreover, it is an efficient mechanism as well. After being initialized, data pointed to by sync variables may be freely copied, since these data will not be changed thereafter. If a process tries to access data stored in a sync variable before initialization has taken place, it will be made to wait until the initialization is completed, that is until the sync variable is assigned a value by one of the processes sharing the variable. As for the correctness of concurrent programs, according to  [Chandy], a parallel composition is proper if all the variables shared by the processes being composed are sync variables. The concurrency model adopted in Compositional C++ is rather limited, being restricted to processes that communicate by means of (global) shared variables. The advantage of such a restricted concurrency model, however, is that it allows a rigorous approach to proving the correctness of a concurrent program in C++. The notion of a sync (write-once) variable is certainly appealing, and may possibly be generalized to a truly distributed setting.
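
The behavior of a sync variable can be imitated, as a library class, in standard C++. The sketch below is only an approximation under stated assumptions: SyncVar, assign, read and sync_demo are hypothetical names, the value type is assumed to be default-constructible and copyable, and a second assignment is reported by returning false, which is a choice made here and not something the description above prescribes.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical write-once variable: readers block until the single
// assignment has taken place.
template <typename T>
class SyncVar {
public:
    // Returns false, leaving the stored value untouched, if the variable
    // has already been assigned.
    bool assign(const T& value) {
        std::lock_guard<std::mutex> lock(m_);
        if (defined_) return false;
        value_ = value;
        defined_ = true;
        cv_.notify_all();  // release every suspended reader
        return true;
    }

    // Suspends the caller until the variable is defined, then returns a copy.
    T read() const {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return defined_; });
        return value_;
    }

private:
    mutable std::mutex m_;
    mutable std::condition_variable cv_;
    T value_{};
    bool defined_ = false;
};

// Hypothetical demonstration: the reader may start before the assignment,
// in which case it simply waits for it.
int sync_demo() {
    SyncVar<int> x;
    int doubled = 0;
    std::thread reader([&] { doubled = 2 * x.read(); });
    x.assign(21);
    reader.join();
    return doubled;
}
```

Whichever thread runs first, sync_demo returns 42: the only shared variable is written once, so the outcome does not depend on the interleaving. This is the property that the compositionality argument relies on.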

Data flow -- Mentat/C++

A data flow approach (as exemplified by Petri nets, for example) is often used for the design of concurrent and distributed systems. A data flow diagram is a directed graph in which nodes represent computations and arcs represent the data dependence between nodes (see Grimshaw and Liu, 1987). Tokens carrying data and control information flow along the arcs from one node to another. See slide 6-mentat.

Data flow

Mentat/C++
  • persistent actors -- to share data
  • futures -- like a continuation

slide: Continuations in Mentat/C++

In [Grimshaw87], an extension of C++ called Mentat is proposed that combines an object-oriented approach to concurrency with a data flow approach. In the data flow model described in [Grimshaw87], nodes are called actors, which are not to be confused with actors in the actor model. In contrast to common data flow models, Mentat supports persistent actors that may carry a state. Persistent actors are active objects that behave like monitors. They may be used as resource managers, to share data. In addition, the Mentat language offers so-called futures that may be used to determine the computation following a communication. Futures (which may be regarded as continuations, that is computations that are stored in a function that will be invoked somewhere in the future) correspond to subgraphs in the data flow model. The Mentat compiler provides support for implicitly generated futures, taking into account the persistent behavior of actor nodes. This allows for significant optimizations based on an analysis of the data flow between processes. Mentat is implemented as a pre-compiler that generates actual actor objects inheriting from a system-defined class, defining a single threaded active object. Despite the diversity between the extensions of C++ studied thus far, a common notion of active objects seems to emerge. In ACT++, μC++ and Mentat, active objects are single threaded objects which inherit their functionality from a system-defined base class. As we will see in the next section, this approach limits the opportunities for employing inheritance to characterize the functionality of active object classes in an incremental way. As another issue we may note that none of the extensions discussed so far pays any (explicit) attention to the problems involved in (geographically) distributed objects. This will be the topic of section distribution.
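
The idea of a data flow graph, in which a node fires once the data on its incoming arcs is available, can be illustrated with the futures of the standard C++ library. This is a loose analogy only: a std::future is a placeholder for a result and not a continuation in the Mentat sense, and nothing below is generated by a compiler. The names square and sum_of_squares are hypothetical.

```cpp
#include <future>
#include <utility>

// A node of the graph: a computation without state.
int square(int x) { return x * x; }

// Three nodes: two squares that may run in parallel, and an addition
// that depends on both of them.
int sum_of_squares(int a, int b) {
    std::future<int> left  = std::async(std::launch::async, square, a);
    std::future<int> right = std::async(std::launch::async, square, b);

    // The arcs of the graph: the adder receives the two futures and
    // blocks in get() until the corresponding token has arrived.
    std::future<int> sum = std::async(
        std::launch::async,
        [](std::future<int> l, std::future<int> r) { return l.get() + r.get(); },
        std::move(left), std::move(right));

    return sum.get();
}
```

Here the dependencies are spelled out by hand. The point made above about Mentat is that its compiler derives such dependencies from the program, which is what makes optimizations based on the data flow between processes possible.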