Native objects -- crossing the language boundary

Embedding (script) language interpreters is becoming standard practice, as testified by the existence of embeddable interpreters for Tcl, Perl, Python, JavaScript, Java, and Prolog. Each of these languages also supports calling native code, that is, code written in C or C++, either to access system resources or simply for reasons of efficiency.

Native bindings for these languages are available only at the level of functions. Even for Java, native methods of an object are defined as functions that receive a handle to the invoking object. Given a language with objects, possibly obtained by adopting an object extension for a language without objects, the problem is to find a proper correspondence between objects defined in the high-level (script) language and the native objects defined in C/C++.

In this (sub)section we will first study an extension of Prolog with objects, and then indicate a solution to establish a close correspondence between the (Prolog) objects and their native counterparts. In the next (sub)section, we will apply this approach to establish a correspondence between Java and C++ objects.


Objects in Prolog


slide: Objects in Prolog

In slide Objects, our proposed object extension for Prolog (in particular SWI-Prolog, [SWI]) is presented. There are many object extensions of Prolog around, for example the well-known SICStus Objects. Our extension is motivated by the following considerations:

requirements


In our solution, objects are represented by dynamic fact clauses, containing a Handler, indicating how native calls are to be dealt with, a Class, an object identity ID, possibly a reference REF to a native C/C++ object, and a list of Ancestors.

Objects (or classes of objects, if you prefer) are defined by a collection of clauses with a head predicate of the form class_method(This,...), specifying the class, the method and the object identity parameter. The actual invocation of the method takes the form self(This):method(...), where the colon acts as the familiar dot object access operator. Note that the identity parameter (This) does not occur among the method parameters, but is instead contained in the object specifier. Instead of the keyword self, we may also use a class name to enforce a cast to a specific object type when invoking the method. In the actual object extension, we also support object state in the form of instance variables, which are however not relevant for our discussion here.

Object methods may be defined as native by including a goal of the form native(Handler, This, Method, Result), where Handler specifies the (native) handler to be invoked, This the object identity, Method the actual request, and Result a variable to store the possible outcome of the request. When the Handler parameter is left unspecified, the handler defined for the object is used to effect the native call.
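The object representation and the handler default described above can be pictured with a small data structure. The following is only an illustrative C++ rendering, not part of the Prolog extension itself: object_record, resolve_handler, is_a and record_example are hypothetical names, and an empty string is assumed to stand for an unspecified handler or a missing native reference.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical mirror of the Prolog object fact:
// Handler, Class, ID, REF and Ancestors.
struct object_record {
    std::string handler;                 // empty: no native handler declared
    std::string cls;                     // class name
    int id;                              // object identity
    std::string ref;                     // empty: no native object attached
    std::vector<std::string> ancestors;  // names of ancestor classes
};

// Choose the handler for a native call: the explicitly requested one,
// or the handler of the object when none is requested.
std::string resolve_handler(const std::string& requested,
                            const object_record& obj) {
    return requested.empty() ? obj.handler : requested;
}

// True when the object is of the given class or inherits from it.
bool is_a(const object_record& obj, const std::string& cls) {
    return obj.cls == cls ||
           std::find(obj.ancestors.begin(), obj.ancestors.end(), cls)
               != obj.ancestors.end();
}

// Usage: a midi object whose handler comes from its class declaration.
void record_example() {
    object_record m{"midi_handler", "midi", 1, "", {"object"}};
    assert(resolve_handler("", m) == "midi_handler");  // default handler
    assert(resolve_handler("other", m) == "other");    // explicit handler
    assert(is_a(m, "object"));
}
```
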

Let's look at some examples first, to augment this admittedly concise description.


  midi(This):midi,                    % create midi object
  Self = self(This),
  Self:open('a.mid'),
  Self:header(0,1,480),
  Self:track(start),
  Self:melody([48,50,51,53,55]),      % c d es f g, minor indeed
  Self:track(end),                    % end track

In the fragment above we see how a midi object is created and how a simple melody is written to a file. Note that we use a variable Self for indicating the object specifier self(This). Below, the actual definition of the midi object (class) is given.

  
  % midi class
  :- use(library(midi:[midi,lily,music,process])).

  :- declare(midi:object,class(midi),[handler]).

  midi_midi(This) :-                  % constructor
          midi(This):handler(H),      % gets Handler from class
          declare(H,new(midi(This)),[],[],_).

The constructor for the midi object, for which the method name is equal to the class name, asks whether there is a Handler for midi objects. This handler, which is specified in the declare command above, is then passed to the declare command for the object. Since there is a handler, the constructor for the native midi object (defined in C++) is automatically invoked.

  
  % native methods
  midi_read(This,F) :- native(_,This,read(F),_).
  midi_analyse(This,I,O) :- native(_,This,analyse(I,O),_).
  midi_open(This,F) :- native(_,This,open(F),_).
  midi_header(This,M) :- native(_,This,header(M,0,480),_).
  midi_track(This,X) :- native(_,This,track(X),_).
  midi_tempo(This,X) :- native(_,This,tempo(X),_).
  midi_event(This,D,C,M,T,V) :- native(_,This,event(D,C,M,T,V),_).

All the methods listed above are implemented using the native midi C++ object. Note that both the Handler and the Result parameter are left unspecified. The handler is by default taken from the class declaration for the midi object class. There is no result when invoking these native methods.

  midi_note(This,D,C,T,V) :-
          Self = midi(This),          % cast to midi
          Self:event(D,C,note_on,T,V),
          Self:event(D,C,note_off,T,V).

  midi_melody(This,L) :- self(This):melody(480,1,L,64).

  midi_melody(_This,_,_,[],_).
  midi_melody(This,D,C,[X|R],V) :-
          Self = self(This),
          Self:note(D,C,X,V),
          midi_melody(This,D,C,R,V).  % direct invocation

The midi object clauses given above augment the native methods by defining additional predicates, such as note and melody. These clauses also illustrate the liberty we have in casting the object specifier to a specific class or bypassing dynamic method invocation. Clearly, a native binding for the midi object is necessary, since Prolog is highly inappropriate for reading or writing MIDI files directly. It is, however, very appropriate for specifying rules for analyzing MIDI files!

C++ bindings

To redirect native method calls for our (Prolog) objects to their native C++ counterparts we need some additional machinery. First of all, we have to translate a (Prolog) method call to a format that can be passed to a C++ handler, so that the C++ handler may decide which method to invoke for what object. To get a direct correspondence between objects in Prolog and objects in C++, we store a reference to the C++ object in the REF variable of the Prolog object. When a native method is called, this reference is converted into an object handler or pointer in C++, to which the (native) method invocation will be addressed. We use a smart pointer to encapsulate this reference and to allow for directly invoking (native) methods for the corresponding object type.
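The reference mechanism described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the hush implementation: counter, encode_ref, decode_ref and round_trip_demo are hypothetical names, the reference is assumed to travel as a decimal string, and std::uintptr_t is used so that the encoded integer is wide enough to hold a pointer on 64-bit platforms.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <string>

// Hypothetical native class standing in for a C++ object such as midi.
struct counter {
    int value = 0;
    int bump() { return ++value; }
};

// Encode a native pointer as a decimal string that can be stored
// in the REF slot of the script-level object.
std::string encode_ref(const void* p) {
    return std::to_string(reinterpret_cast<std::uintptr_t>(p));
}

// Decode the string back into a typed pointer.
// A null or empty string yields a null pointer.
template <class T>
T* decode_ref(const char* text) {
    if (text == nullptr || *text == '\0') return nullptr;
    std::uintptr_t bits =
        static_cast<std::uintptr_t>(std::strtoull(text, nullptr, 10));
    return reinterpret_cast<T*>(bits);
}

// Round trip: the decoded pointer addresses the original object,
// so a method call through it changes that object.
int round_trip_demo() {
    counter c;
    std::string ref = encode_ref(&c);
    counter* back = decode_ref<counter>(ref.c_str());
    assert(back == &c);
    back->bump();
    return c.value;
}
```

The decoded pointer is only meaningful while the native object is alive, which is why the native side has to keep ownership of the objects it hands out as references.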

As outlined in section Reactor, in the hush framework we use an event-based mechanism to effect foreign language bindings. This means that the information concerning the native call is stored in an event object that is passed to a handler, which invokes the operator function on the occurrence of an event. In the code fragment below it is shown how native method dispatching is taken care of in the operator function of a C++ kit_object, for which a corresponding object in Prolog is assumed to exist.


  int kit_object::operator()() {
          event* e = _event;

          vm<kit> self(e);                        // smart pointer
          string method = e->_method();

          if (method == "kit") {                  // constructor
                  kit* q = new kit(e->arg(1));
                  _register(q);
                  result( reference((void*)q) );
          } else if (method == "eval") {
                  long res = self->eval(e->arg(1));
                  result( itoa(res) );
          } else if (method == "result") {
                  char* res = self->result( atoi(e->arg(1)) );
                  result(res);
          } else {                                // dispatch up in the hierarchy
                  return handler_object::operator()();
          }
          return 0;
  }

Before checking which method is invoked, which is recorded in the event, we create a smart pointer (self) by instantiating a vm instance for the kit class. (The acronym vm is somewhat inappropriately derived from virtual machine.) If the method is a constructor, the result is a reference, that is, an integer encoding of the actual pointer. Otherwise, the method is invoked simply by addressing the smart pointer self. The use of smart pointers is a C++-specific technique based on redefining the dereference operator, as illustrated below.

When no matching method can be found, the operator method for a handler object higher up in the hierarchy is invoked. In our example, both the kit_object and the midi_object are directly derived from handler_object. This hierarchy, which is intended to encapsulate the native objects, parallels the original hush class hierarchy in a straightforward way.

The smart pointer vm class, which we need for our binding of Prolog objects to native C++ objects, is relatively straightforward.

  template <class T>
  class vm  {                                     // smart pointer class
  public:
          vm(event* e) {
                  int p = 0;
                  char* id = e->option("ref");
                  if (id) {
                          p = atoi(id);
                  }
                  _self = (T*) p;
          }

          virtual inline T* operator->() { return _self; }
  private:
          T* _self;
  };

In summary, the constructor converts the event argument to a reference to the parameterized object type T, which is used as the result of the dereference operator. This allows for invoking methods for object type T without further ado. As a comment, our presentation here is somewhat simplified, since we do not take into account the possibility of upcalls, that is the invocation of Prolog code from C++. We will deal with these additional details when discussing the Java/C++ binding in the next (sub)section.
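To see the pieces work together outside the hush framework, the following self-contained sketch combines an event, a smart pointer and a handler that dispatches up the hierarchy. It is an approximation under stated assumptions: the event struct, accumulator, base_handler, accumulator_handler and call are hypothetical stand-ins and not the hush classes, the reference is assumed to be passed as a decimal string in a "ref" option, and std::uintptr_t replaces the int used above so that the value can hold a pointer on 64-bit platforms.

```cpp
#include <cstdint>
#include <cstdlib>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for the event: method name, arguments, options.
struct event {
    std::string method;
    std::vector<std::string> args;
    std::map<std::string, std::string> options;
    std::string result;
};

// Smart pointer that recovers the native object from the "ref" option.
template <class T>
class vm {
public:
    explicit vm(const event& e) {
        auto it = e.options.find("ref");
        if (it != e.options.end()) {
            _self = reinterpret_cast<T*>(static_cast<std::uintptr_t>(
                std::strtoull(it->second.c_str(), nullptr, 10)));
        }
    }
    T* operator->() const { return _self; }
    explicit operator bool() const { return _self != nullptr; }
private:
    T* _self = nullptr;
};

// Hypothetical native class.
struct accumulator {
    long total = 0;
    long add(long n) { total += n; return total; }
};

// Root of the handler hierarchy: reports methods nobody handled.
struct base_handler {
    virtual ~base_handler() = default;
    virtual int dispatch(event& e) {
        e.result = "unknown method: " + e.method;
        return 1;
    }
};

// Handler for accumulator objects; unmatched methods go to the base class.
struct accumulator_handler : base_handler {
    int dispatch(event& e) override {
        vm<accumulator> self(e);
        if (e.method == "accumulator") {            // constructor
            owned_.push_back(std::unique_ptr<accumulator>(new accumulator()));
            e.result = std::to_string(
                reinterpret_cast<std::uintptr_t>(owned_.back().get()));
        } else if (e.method == "add" && self) {
            e.result = std::to_string(
                self->add(std::atol(e.args.at(0).c_str())));
        } else {                                    // dispatch up the hierarchy
            return base_handler::dispatch(e);
        }
        return 0;
    }
private:
    std::vector<std::unique_ptr<accumulator>> owned_;  // keeps objects alive
};

// Helper that builds an event, dispatches it and returns the result string.
std::string call(base_handler& h, const std::string& method,
                 const std::vector<std::string>& args,
                 const std::string& ref = "") {
    event e;
    e.method = method;
    e.args = args;
    if (!ref.empty()) e.options["ref"] = ref;
    h.dispatch(e);
    return e.result;
}
```

A caller first invokes the constructor method to obtain a reference string and then passes that string back with every later request, much as the Prolog object passes its REF value along with each native call.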