Requirements for distribution

Object-oriented systems may be regarded as logically distributed systems. When properly designed, objects may be regarded as servers conforming to a locality principle that enforces strict encapsulation and forbids access to global variables. However, as observed in [Wegner92], physically distributed systems add another dimension of abstraction concerns. According to [Wegner92], physically distributed systems supplement the logical autonomy requirements of object-oriented systems with requirements for failure transparency (robustness against hardware failures), migration transparency (to allow objects to be moved), replication transparency (to allow objects to be replicated) and location transparency (to ensure that objects function as they should irrespective of their location). See slide 6-transparency.

Transparency -- distributed abstraction


slide: Distribution and transparency

Transparency is a notion very similar to the abstraction concerns of object-oriented (component-based) software technology. According to  [Wegner92], both notions are likely to converge in future computing systems. In addition to the transparency concerns mentioned, distributed computing must (ideally) support access transparency (to allow access to both local and remote resources in an identical way), concurrency transparency (to abstract from the possible concurrent behavior of a server object), performance transparency (to allow for load balancing by means of dynamic reconfiguration) and scaling transparency (so as to be able to embed a system into another, larger system). Distribution transparency (encompassing the requirements listed above) is needed to ensure physical robustness, but significantly increases the design and implementation effort. In this section we will study the facilities needed to achieve distribution transparency in an object-oriented setting. These facilities include support for distributed objects, object identity, object migration, remote object method invocations and protocols for transport across a network.

Distributed object computing

Remote object method invocation closely resembles remote procedure call (RPC), which is available on a variety of platforms. In  [SMDCE] an extension of the Open Software Foundation's Distributed Computing Environment (see OSF DCE, 1992) is described, offering support for fine-grained distributed objects, system-wide object identity, location-independent method call and dynamic object migration. See slide 6-dce.

Distributed Object Computing

-- DCE++

Distributed Computing Environment

-- OSF DCE
slide: Distributed Object Computing

The unit of distribution supported by DCE++ is the individual C++ object, whereas for DCE the unit of distribution is the heavy-weight process. Another improvement over DCE is the symmetry in communication between objects, which means in effect that no distinction is necessary (as it is for DCE) between client and server processes. Communication between objects, whether remote or local, is by member call, which in the case of remote objects is automatically mapped onto the underlying DCE RPC mechanism. To establish universally unique identifiers for objects, DCE++ relies on the name management services provided by DCE. This has the additional advantage that applications may be integrated with any system conforming to the DCE standards. Location-independence is implemented by providing a proxy for each distributed object. Requests to an object that is potentially located elsewhere are automatically addressed to the proxy, which in turn identifies the current location of the object and forwards the request. In addition to the distributed name management and communication services, DCE offers support for light-weight processes, security and the synchronization of distributed clocks. DCE defines a standard for which a number of implementations exist and to which, for example, the Microsoft object linking and embedding facilities adhere. In contrast, DCE++ has only an experimental status. It offers support for employing DCE facilities more conveniently in an object-oriented setting.
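The proxy mechanism described above may be pictured with a small, single-process sketch. It is not DCE++ code: the names Locator, Network, CounterProxy and migrate are hypothetical, the name service is replaced by an in-memory map, and an ordinary member call stands in for the DCE RPC. The only point is that the caller keeps using the same member call while the object moves.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for a name service: object identifier -> node number.
class Locator {
public:
    void bind(const std::string& id, int node) { where_[id] = node; }
    int lookup(const std::string& id) const { return where_.at(id); }
private:
    std::map<std::string, int> where_;
};

// Interface shared by the real object and its proxy, so that callers
// use the same member call for local and remote objects.
class Counter {
public:
    virtual ~Counter() = default;
    virtual int increment() = 0;
};

// The real object, hosted on some node.
class CounterImpl : public Counter {
public:
    int increment() override { return ++value_; }
private:
    int value_ = 0;
};

// Simulated network (an assumption of this sketch): node number -> hosted object.
using Network = std::map<int, CounterImpl*>;

// The proxy identifies the current location on every call and forwards the request.
class CounterProxy : public Counter {
public:
    CounterProxy(std::string id, const Locator& loc, const Network& net)
        : id_(std::move(id)), loc_(loc), net_(net) {}
    int increment() override {
        last_node_ = loc_.lookup(id_);             // where is the object now?
        return net_.at(last_node_)->increment();   // stands in for the remote call
    }
    int last_node() const { return last_node_; }
private:
    std::string id_;
    const Locator& loc_;
    const Network& net_;
    int last_node_ = -1;
};

// Simulated migration: move the object to another node and rebind its name.
void migrate(const std::string& id, int from, int to, Locator& loc, Network& net) {
    net[to] = net.at(from);
    net.erase(from);
    loc.bind(id, to);
}

// Usage: the caller is unaffected by the migration.
void proxy_demo() {
    CounterImpl object;
    Locator loc;
    Network net;
    net[1] = &object;
    loc.bind("counter", 1);

    CounterProxy proxy("counter", loc, net);
    Counter& c = proxy;
    assert(c.increment() == 1);
    assert(proxy.last_node() == 1);

    migrate("counter", 1, 2, loc, net);
    assert(c.increment() == 2);        // same object, same state
    assert(proxy.last_node() == 2);    // reached at its new location
}
```

Calling proxy_demo() from main exercises the sketch. Looking up the location on every call is the simplest choice; a real system would presumably cache the location and fall back to the name service only when a forwarded request fails.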

Object replication

Due to the communication overhead, remote object method invocations are considerably more expensive than local object invocations. As observed in [Nolte91], even fast invocation protocols may not be sufficient when remote objects have to be accessed frequently. Therefore, replication techniques have to be considered to keep invocation overhead low. In [Nolte91], a novel object model for C++ (supporting so-called dual objects) is introduced to improve the efficiency of remote object invocations. (A similar model was originally introduced for the language Orca in 1987; see Bal, 1991.) Dual objects consist of a prototype (which is the original object) and an arbitrary number of extracts (which are partial replicas, containing the public section of the original object). The model described in [Nolte91] allows for temporary inconsistencies between the prototype and its extracts. However, vertical inconsistencies (that is, differences between the prototype and its replicas) are resolved whenever the prototype is accessed directly. In a similar way, horizontal inconsistencies (that is, inconsistencies between various replicas) may be repaired by explicitly accessing the prototype.
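The relation between a prototype and its extracts may be illustrated by the following sketch. It is not the mechanism of [Nolte91]: the names PublicSection, Prototype, Extract and refresh are hypothetical, everything runs in one address space, and a counter merely records the accesses that would cross the network. Repairing an inconsistency is modeled here as an explicit refresh from the prototype.

```cpp
#include <cassert>
#include <string>
#include <utility>

// The public section of the object: the part that extracts replicate.
struct PublicSection {
    std::string title;
    int stock;
};

// Prototype: the original object, the single place where updates happen.
class Prototype {
public:
    explicit Prototype(PublicSection s) : pub_(std::move(s)) {}
    // Every snapshot counts as one access that would cross the network.
    const PublicSection& snapshot() const { ++remote_reads_; return pub_; }
    void set_stock(int n) { pub_.stock = n; }
    int remote_reads() const { return remote_reads_; }
private:
    PublicSection pub_;
    mutable int remote_reads_ = 0;
};

// Extract: a partial replica that serves reads locally.
class Extract {
public:
    explicit Extract(const Prototype& p) : proto_(p), copy_(p.snapshot()) {}
    int stock() const { return copy_.stock; }       // local read, possibly stale
    void refresh() { copy_ = proto_.snapshot(); }   // repair by accessing the prototype
private:
    const Prototype& proto_;
    PublicSection copy_;
};

// Usage: cheap local reads, temporary inconsistency, explicit repair.
void dual_object_demo() {
    Prototype p(PublicSection{"book", 5});
    Extract a(p), b(p);
    assert(p.remote_reads() == 2);   // one access per extract made

    assert(a.stock() == 5);
    assert(a.stock() == 5);
    assert(p.remote_reads() == 2);   // repeated reads stay local

    p.set_stock(3);
    assert(a.stock() == 5);          // vertical inconsistency: extract lags the prototype
    a.refresh();
    assert(a.stock() == 3);
    assert(b.stock() == 5);          // horizontal inconsistency: a and b differ
    b.refresh();
    assert(b.stock() == 3);
}
```

Calling dual_object_demo() from main runs the example. The sketch shows why the model pays off for read-only access: after an extract is made, reads no longer involve the prototype at all.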

Object replication

-- dual objects
  • dual objects -- prototypes and extracts
  • annotations -- invocation, parameter passing
  • placement -- automatic, symbolic, co-located

Remote invocation

-- annotations
  • in -- new extract before invocation
  • out -- new extract after invocation
  • inout -- combination of in and out
  • global -- acts as prototype, no extract is made

slide: The dual object model

The dual object model may significantly improve the efficiency of remote object invocations for read-only access. However, for write access the full price of remote invocation has to be paid if inconsistencies are to be prevented. To avoid simultaneous updates, communication with the prototype is governed by a so-called clerk, a light-weight process that controls access to the prototype.

Support for remote object invocations may be provided by means of annotations. In [Nolte91], annotations (which are embedded in special C++ comments) are introduced both for the invocation of methods of remote objects and for determining the actual parameter passing mechanism to be used. See slide 6-dual. Invocation annotations are placed in the class defining the remote (prototype) object, in much the same way as const may be used in C++ to indicate that a method has no effect on the state of an object. Methods may be annotated as in (to indicate that a new extract must be made before invocation), as out (to indicate that a new extract must be made after invocation), as inout (which is a combination of in and out) and as global (to indicate that communication is to take place directly with the prototype, so that no extract is made).

A second category of annotations is needed to determine how parameter passing is to be handled for remote invocations. In [Nolte91], annotations are provided that support parameter passing mechanisms such as call-by-value, call-by-result and call-by-value-result, which determine whether a copy is made before, after, or both before and after execution of the method. As for method invocations, the annotations for these are in, out and inout, respectively. We may note that, to support the full collection of data types offered by C++, these annotations need to be extended with type information to support the efficient marshaling (that is, packing and unpacking) of the actual parameters of a remote method call. See also section OMG, in which a more general approach to these problems is described.
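The difference between call-by-value, call-by-result and call-by-value-result may be made concrete with a small simulation. The sketch below does not use the comment-based annotation syntax of [Nolte91]; the names Mode and remote_call are hypothetical, and the remote address space is simulated by a separate callee-side variable. It assumes that the parameter type is default-constructible and copyable.

```cpp
#include <cassert>

// Parameter passing modes corresponding to the in, out and inout annotations.
enum class Mode { In, Out, InOut };

// Simulates a remote call with one parameter. The method runs on its own
// copy (remote_arg); the mode decides which copies are made.
template <typename T, typename F>
void remote_call(Mode mode, T& arg, F method) {
    T remote_arg{};                                   // callee-side storage
    if (mode == Mode::In || mode == Mode::InOut)
        remote_arg = arg;                             // copy before execution
    method(remote_arg);
    if (mode == Mode::Out || mode == Mode::InOut)
        arg = remote_arg;                             // copy after execution
}

// Usage: the same method body, observed under each mode.
void parameter_mode_demo() {
    auto add_one = [](int& v) { v += 1; };

    int x = 10;
    remote_call(Mode::In, x, add_one);
    assert(x == 10);   // call-by-value: the caller's variable is untouched

    int y = 10;
    remote_call(Mode::Out, y, add_one);
    assert(y == 1);    // call-by-result: the callee started from a fresh 0

    int z = 10;
    remote_call(Mode::InOut, z, add_one);
    assert(z == 11);   // call-by-value-result: copied in and copied back
}
```

Calling parameter_mode_demo() from main runs the example. For a plain int the two copies are trivial; for structured C++ types each copy becomes a marshaling step, which is why the annotations need type information.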
In addition to these familiar mechanisms, a number of alternative parameter passing mechanisms may be thought of that provide support for object migration and (partial) replication. These include call-by-share (which is employed for the language Orca, see [Orca]), call-by-unification (which is the mechanism employed in DLP), and call-by-move and call-by-visit (which are supported by the language Emerald, [Black]). These mechanisms, however, require a close interaction between the language runtime system and the underlying operating system. In particular, call-by-share is quite expensive, as it offers the choice between full copying (and maintaining consistency between the copies and the original) or passing by reference (which may result in expensive communication traffic). As an intermediate between full copying and relying on remote references, [Nolte91] proposes a call-by-likeness mechanism that results in making extracts from the public section of the prototype.

Finally, a third category of annotations discussed in [Nolte91] concerns the placement of objects on a node in a network of processors, in a separate address space on the same node, or in a different execution context in the same address space. Alternatives that have been mentioned include automatic placement (by some default strategy), symbolic placement (by using names for which a binding is provided separately), direct placement (that uniquely identifies the location), scoped placement (that determines placement dependent on a particular context) and co-located placement (that may be used to place an object near another object). Note that all placement directives may be interpreted either dynamically or statically. In addition, dynamic reconfiguration may be employed to improve load balancing. However, annotations such as these help us only halfway.
As observed in  [Lea93], special primitives are needed to support multi-processor object configuration and remote object invocation in an efficient manner. The interested reader is referred to  [Lea93] for further details.