object-oriented programming

Crossing boundaries

It is futile to hope for a single language or paradigm to solve all problems. Therefore, as our small case study concerning multimedia feature extraction indicates, components may differ in how they are realized. Some components are better implemented using knowledge-based systems technology, whereas other components require the use of a systems programming language such as C++. Even within components it may be necessary to transgress the language boundary. For example, in Java applications, wrapping legacy applications or operating system-dependent code is usually done using the Java Native Interface (JNI).

In this section we will look at some studies (executed within the hush framework) that exemplify a multi-paradigm and multi-lingual approach. We will first look at the issues that arise when embedding a logic (that is Prolog) interpreter. Then we will extend the embedded logic with objects that may correspond to (native) objects in the host language, that is C++. These sections may safely be skipped by readers not interested in logic programming. Finally, we will look at how to realize corresponding collections of objects in (native) C++ and Java.

Embedded logic -- crossing the paradigm boundary

Knowledge is a substantial ingredient in many applications. By knowledge we mean information together with rules that operate on that information to obtain derived information. As in any (software) engineering effort, maintenance, here knowledge maintenance, is of crucial importance. When knowledge and information are dispersed throughout the actual code of the system, maintenance will be difficult. Put differently, for reasons of flexibility and maintenance we need to factor out the (volatile) knowledge and information components.

Traditionally, the information components are often taken care of by a database that allows for the formulation of views to obtain (possibly aggregate) information. Logic or logic programming is a strictly more powerful mechanism to deal with information and knowledge. In our group, we have been studying the use of logic programming in knowledge-intensive software engineering applications.

embedded logic



  <query kit=pl src=local.pl cmd=X:email_address(X)>
  <param format=" \%s">
  <param result="">
  <param display="<h4>The query</h4>">
  <param header="<h4>The addresses</h4> <ul>">
  <param footer="</ul>">
  email_address(E) :-
  	person(X),
  	property(X,name:N),
  	property(X,familyname:F),
  	email(X,E),
  	cout(['<li>', N,' ',F,' has email address ']),
  	cout([ '<a href=mailto:', E, '>', E, '</a>',nl]).
  </query>

    As an example, consider the query above, which is expressed in an SGML/XML-like syntax. The query command X:email_address(X) asks for all X for which the predicate email_address(X) holds. The predicate email_address is defined between the query begin and end tags.

    The query tag is an element of one of the text processing filters to provide hypermedia support for software engineering described in  [HypermediaSE]. Processing the fragment above results in an HTML list of names and email addresses. The collection of filters itself is written in lex, yacc and C++. To process the query, an embedded logic programming interpreter is invoked. To merge the output from the query, a handler is installed for the cout command.

    The query example was motivated by the need to maintain Web pages for the administration of a colloquium within our group. The actual knowledge base consists of a list of people and some rules to determine their affiliations and email addresses. The knowledge base is made available by consulting the file local.pl.

    As concerns the implementation, the Java fragment below indicates how to access the logic programming interpreter from a (Java) program.

    
    logic.java

      query pl = new query("kit=pl src=remote.pl");

      pl.eval("X:assistant(X)");
      String res = null;
      while ( (res = pl.result()) != null ) {
              System.out.println("<li> " + res);
      }

    After creating a query object, the goal X:assistant(X) is invoked, which can be taken to mean, give me every X for which the predicate assistant(X) holds. The final output is obtained by iterating over the results of the evaluation of that goal. As a comment, multiple results may be obtained in Prolog by backtracking over the possible choice points.
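The eval/result protocol described above (evaluate a goal once, then pull answers until a null value signals that none are left) can be mimicked in a few lines of C++. The sketch below is only an illustration under stated assumptions: `mock_query` and `as_html_list` are hypothetical names, and the class holds a table of canned answers instead of an embedded Prolog interpreter, so no backtracking takes place.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the hush query class: it has no Prolog engine,
// only a table of canned answers per goal.
class mock_query {
public:
    explicit mock_query(std::map<std::string, std::vector<std::string>> answers)
        : _answers(std::move(answers)) {}

    // Select the answers for a goal; an unknown goal simply has no answers.
    void eval(const std::string& goal) {
        auto it = _answers.find(goal);
        _pending = (it == _answers.end()) ? std::vector<std::string>{} : it->second;
        _next = 0;
    }

    // Return the next answer, or a null pointer when the answers are exhausted.
    const std::string* result() {
        if (_next >= _pending.size()) return nullptr;
        return &_pending[_next++];
    }

private:
    std::map<std::string, std::vector<std::string>> _answers;
    std::vector<std::string> _pending;
    std::size_t _next = 0;
};

// Drain all answers for a goal, emitting one list item per answer.
std::string as_html_list(mock_query& q, const std::string& goal) {
    std::string out;
    q.eval(goal);
    while (const std::string* res = q.result()) {
        out += "<li> " + *res + "\n";
    }
    return out;
}
```

The null sentinel keeps the client loop independent of how many solutions the goal has, which is what allows the same loop to serve a local interpreter and, later on, a remote one.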

    Distributed knowledge servers

    Maintaining knowledge is difficult. As a rule of thumb, avoid the replication of knowledge as much as possible. However, this means that we may need to access knowledge from remote sources. One (obvious) solution that presents itself is to allow for URL-enabled consults, as illustrated in the fragment below.

    
      
    remote.pl

      :- source('www.cs.vu.nl/~eliens/db/se/people.pl').
      :- source('www.cs.vu.nl/~eliens/db/se/institute.pl').
      :- source('www.cs.vu.nl/~eliens/db/se/property.pl').
      :- source('www.cs.vu.nl/~eliens/db/se/query.pl').

    This solution has indeed been implemented in our filters, since the URL addressing scheme is straightforward and easy to implement.

    However, processing the information accessed by url is still done locally. So, the next step that may be suggested is to distribute the knowledge processing itself, for example by using CORBA.

    
    query.idl

      interface query {
              void source(in string file);
              long eval(in string cmd);
              string result(in long id);
              oneway void halt();
      };

    Exploiting the integration of CORBA and hush, we have defined an interface for query in IDL and implemented query client and query server objects. These objects may be created by giving appropriate parameters to the query constructor invocation. This approach allows for embedding remote knowledge processing transparently in our collection of filters. Nevertheless, although we showed that this approach is feasible, we have not addressed the problems that may occur due to the unavailability or faults of the server.

    Native objects -- crossing the language boundary

    Embedding (script) language interpreters is becoming standard practice, as testified by the existence of embeddable interpreters for Tcl, Perl, Python, JavaScript, Java, and Prolog. Each of these languages also supports calling native code, that is code written in C or C++, to allow for accessing system resources or simply for reasons of efficiency.

    Native bindings for these languages are available only on the level of functions. Even for Java, native methods of an object are defined as functions that receive a handle to the invoking object. Given a language with objects, possibly by adopting an object extension for the languages without objects, the problem is to find a proper correspondence between objects defined in the high-level (script) language and the native objects defined in C/C++.

    In this (sub)section we will first study an extension of Prolog with objects, and then indicate a solution to establish a close correspondence between the (Prolog) objects and their native counterparts. In the next (sub)section, we will apply this approach to establish a correspondence between Java and C++ objects.


    Objects in Prolog


    slide: Objects in Prolog

    In slide Objects our proposed object extension for Prolog (in particular SWI-Prolog,  [SWI]) is presented. Actually, there are many object extensions of Prolog around, for example the well-known Sicstus Objects. Our extension is motivated by the following considerations:

    requirements


    In our solution, objects are represented by dynamic fact clauses, containing a Handler, indicating how native calls are to be dealt with, a Class, an object identity ID, possibly a reference REF to a native C/C++ object, and a list of Ancestors.

    Objects (or classes of objects, if you prefer) are defined by a collection of clauses with a head predicate of the form class_method(This,...), specifying the class, the method and the object identity parameter. The actual invocation of the method takes the form self(This):method(...), where the colon acts as the familiar dot operator for object access. Note that the identity parameter (This) does not occur among the method parameters, but is instead contained in the object specifier. Instead of the keyword self, we may also use a class name to enforce a cast to a specific object type when invoking the method. In the actual object extension, we also support instance variables to hold object state, which are however not relevant for our discussion here.

    Object methods may be defined as native by including a goal of the form native(Handler, This, Method, Result), where Handler specifies the (native) handler to be invoked, This the object identity, Method the actual request, and Result a variable to store the possible outcome of the request. When the Handler parameter is left unspecified, the handler defined for the object will be taken to effect the native call.

    Let's look at some examples first, to augment this admittedly concise description.

    
      
              midi(This):midi,  % create midi object
              Self = self(This),
              Self:open('a.mid'),
              Self:header(0,1,480),
              Self:track(start),
              Self:melody([48,50,51,53,55]), % c d es f g, minor indeed
              Self:track(end), % end track
      
    In the fragment above we see how a midi object is created and how a simple melody is written to a file. Note that we use a variable Self for indicating the object specifier self(This). Below, the actual definition of the midi object (class) is given.

    
      
    midi

      :- use(library(midi:[midi,lily,music,process])).

      :- declare(midi:object,class(midi),[handler]).

      midi_midi(This) :- % constructor
              midi(This):handler(H), % gets Handler from class
              declare(H,new(midi(This)),[],[],_).

    The constructor for the midi object, for which the method name is equal to the class name, asks whether there is a Handler for midi objects. This handler, which is specified in the declare command above, is then passed to the declare command for the object. Since there is a handler, the constructor for the native midi object (defined in C++) is automatically invoked.

    
      
    native methods

      midi_read(This,F) :- native(_,This,read(F),_).
      midi_analyse(This,I,O) :- native(_,This,analyse(I,O),_).
      midi_open(This,F) :- native(_,This,open(F),_).
      midi_header(This,M) :- native(_,This,header(M,0,480),_).
      midi_track(This,X) :- native(_,This,track(X),_).
      midi_tempo(This,X) :- native(_,This,tempo(X),_).
      midi_event(This,D,C,M,T,V) :- native(_,This,event(D,C,M,T,V),_).

    All the methods listed above are implemented using the native midi C++ object. Note that both the Handler and the Result parameter are left unspecified. The handler is by default taken from the class declaration for the midi object class. There is no result when invoking these native methods.

    
      midi_note(This,D,C,T,V) :- 
              Self = midi(This), % cast to midi
              Self:event(D,C,note_on,T,V),
              Self:event(D,C,note_off,T,V).
      
      midi_melody(This,L) :- self(This):melody(480,1,L,64).
      
      midi_melody(_This,_,_,[],_).
      
      midi_melody(This,D,C,[X|R],V) :-
              Self = self(This),
              Self:note(D,C,X,V),  
              midi_melody(This,D,C,R,V).   % direct invocation
      
    The midi object clauses given above augment the native methods by defining additional predicates, such as note and melody. These clauses also illustrate the liberty we have in casting the object specifier to a specific class or bypassing dynamic method invocation. Clearly, a native binding for the midi object is necessary, since Prolog is highly inappropriate for reading or writing midi files directly. It is however very appropriate for specifying rules for analyzing MIDI files!

    C++ bindings

    To redirect native method calls for our (Prolog) objects to their native C++ counterparts we need some additional machinery. First of all, we have to translate a (Prolog) method call to a format that can be passed to a C++ handler, so that the C++ handler may decide which method to invoke for what object. To get a direct correspondence between objects in Prolog and objects in C++, we store a reference to the C++ object in the REF variable of the Prolog object. When a native method is called, this reference is converted into an object handler or pointer in C++, to which the (native) method invocation will be addressed. We use a smart pointer to encapsulate this reference and to allow for directly invoking (native) methods for the corresponding object type.
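The round trip from a native pointer to a storable reference and back can be sketched in isolation. This is a minimal illustration under stated assumptions, not the hush implementation: `encode_ref`, `decode_ref` and the `counter` class are hypothetical names, and the sketch deliberately uses `std::uintptr_t` where the fragments in this section use a plain integer, since an `int` need not be wide enough to hold a pointer on 64-bit platforms.

```cpp
#include <cstdint>
#include <string>

// Hypothetical native class standing in for a hush object such as kit or midi.
struct counter {
    int value = 0;
    int bump() { return ++value; }
};

// Encode a native pointer as a decimal string, suitable for a REF slot.
// std::uintptr_t is wide enough to hold a pointer; a plain int need not be.
template <class T>
std::string encode_ref(T* p) {
    return std::to_string(reinterpret_cast<std::uintptr_t>(p));
}

// Decode the string back into a typed pointer. The caller must make sure
// that the reference was created for an object of type T and is still alive.
template <class T>
T* decode_ref(const std::string& ref) {
    return reinterpret_cast<T*>(static_cast<std::uintptr_t>(std::stoull(ref)));
}
```

Note that the decoding step is unchecked: nothing in the encoded value records the type or lifetime of the object, so the correspondence between the script-level object and its native counterpart has to be guaranteed by the surrounding machinery.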

    As outlined in section Reactor, in the hush framework we use an event-based mechanism to effect foreign language bindings. This means that the information concerning the native call is stored in an event object that is passed to a handler, which invokes the operator function on the occurrence of an event. In the code fragment below it is shown how native method dispatching is taken care of in the operator function of a C++ kit_object, for which a corresponding object in Prolog is assumed to exist.

    
      int kit_object::operator()() {
              event* e = _event;
      
              vm<kit> self(e);  // smart pointer
              string method = e->_method();
      
              if (method == "kit") { // constructor
                      kit* q = new kit(e->arg(1));
                      _register(q);
                      result( reference((void*)q) );
              } else if (method == "eval") {
                      long res = self->eval(e->arg(1));
                      result( itoa(res) );
              } else if (method == "result") {
                      char* res = self->result( atoi(e->arg(1)) );
                      result(res);
              } else { // dispatch up in the hierarchy
                      return handler_object::operator()();
              }
      
              return 0;
              }
      
    Before checking which method is invoked, which is recorded in the event, we create a smart pointer (self) by instantiating a vm instance for the kit class. (The acronym vm is somewhat inappropriately derived from virtual machine.) If the method is a constructor, the result is a reference, that is an integer encoding of the actual pointer. Otherwise, the method is invoked, simply by addressing the smart pointer self. As a comment, the use of smart pointers is a C++ specific technique based on redefining the dereference operator, as illustrated below. When no matching method can be found, the operator method for a handler object higher up in the hierarchy is invoked. In our example, both the kit_object and the midi_object are directly derived from handler_object. This hierarchy, which is intended to encapsulate the native objects, parallels the original hush class hierarchy in a straightforward way. The smart pointer vm class, that we need for our binding of Prolog objects to native C++ objects, is relatively straightforward.
    
    smart pointer class

      template <class T>
      class vm  {
      public:
              vm(event* e) {
                      int p = 0;
                      char* id = e->option("ref");
                      if (id) {
                              p = atoi(id);
                      }
                      _self = (T*) p;
              }

              virtual inline T* operator->() { return _self; }

      private:
              T* _self;
      };

    In summary, the constructor converts the event argument to a reference to the parameterized object type T, which is used as the result of the dereference operator. This allows for invoking methods for object type T without further ado. As a comment, our presentation here is somewhat simplified, since we do not take into account the possibility of upcalls, that is the invocation of Prolog code from C++. We will deal with these additional details when discussing the Java/C++ binding in the next (sub)section.
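To make the mechanism concrete without the hush headers, the following self-contained sketch reproduces the idea with stand-in types. Everything named here (`ref_event`, `mini_vm`, `tone`) is hypothetical and exists only for illustration; the real event class carries far more information, and upcalls are ignored just as in the simplified presentation above.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical, much reduced event: only named string options.
struct ref_event {
    std::map<std::string, std::string> options;

    // Return the option value, or a null pointer if the option is absent.
    const std::string* option(const std::string& key) const {
        auto it = options.find(key);
        return it == options.end() ? nullptr : &it->second;
    }
};

// Smart pointer in the style of vm<T>: the "ref" option is decoded once in the
// constructor, after which operator-> forwards calls to the native object.
template <class T>
class mini_vm {
public:
    explicit mini_vm(const ref_event& e) : _self(nullptr) {
        if (const std::string* id = e.option("ref")) {
            _self = reinterpret_cast<T*>(
                static_cast<std::uintptr_t>(std::stoull(*id)));
        }
    }
    T* operator->() const { return _self; }
    explicit operator bool() const { return _self != nullptr; }

private:
    T* _self;
};

// Hypothetical native class used only to exercise the pointer.
struct tone {
    int pitch;
    int transpose(int steps) { pitch += steps; return pitch; }
};
```

With an event whose "ref" option encodes the address of a `tone`, a call such as `self->transpose(2)` on a `mini_vm<tone>` reaches the native object directly. The boolean conversion is an addition of this sketch, so that a missing reference can be detected before dereferencing.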

    Combining Java and C++

    The designers of the Java language have created an elegant facility for incorporating native C/C++ code in Java applications, the Java Native Interface (JNI). It is elegant because native methods can be mixed freely with ordinary methods. When qualifying methods as native, the implementer must provide a dynamically loadable library containing functions that define the functionality of these methods and whose names and signatures comply with the JNI standard. Nevertheless, the JNI does not provide generic means to establish a direct correspondence between an object class hierarchy in Java and an object class hierarchy in C++ that (partially) implements it. In this section, we will study how such a correspondence is realized in the hush framework, using the Java Native Interface.

    The solution to establishing corresponding object class hierarchies in Java and C++ that we have adopted relies on storing a reference to the native C++ object in the Java object and the conversion of this reference to a smart pointer encapsulating access to the native C++ object. Upcalls, which occur for example when Java handlers are invoked in response to an event, require some additional machinery, as will be explained shortly.

    Each Java class in hush is derived from the obscure class, which contains an instance variable _self that may store a C++ object reference, encoded as an integer.

    
    obscure

      package hush.dv.api;

      class obscure {
              public int _self; // peer object pointer
              ...
      };

    The class obscure has been introduced so as not to pollute the handler class, which is the base class for almost every hush class. The (Java) handler class is derived from obscure.

    As an example, look at the (partial) Java class description for kit below.

    
    kit

      package hush.dv.api;

      public class kit extends handler {

              public kit() { _self = init(); }
              protected kit(int x) { }

              private native int init();

              public native void source(String cmd);
              public native void eval(String cmd);

              public String result() {
                      String _result = getresult();
                      if (_result.equals("-")) return null;
                      else return _result;
              }

              private native String getresult();

              public native void bind(String cmd, handler h);
              ...
      };

    Recall that the kit class is used to encapsulate an embedded interpreter, such as a Tcl or Prolog interpreter. When a kit is constructed, the instance variable _self is initialized with the reference obtained from the native init method, which will be given below. The other methods of kit are either native or result in invoking a native method, possibly with some additional processing.

    Each native method must be implemented as a function, of which the name and signature are fixed by the JNI conventions, as illustrated below.

    
      
    kit.c

      #include <hush/hush.h>
      #include <hush/java.h>
      #include <native/hush_dv_api_kit.h>

      #define method(X) Java_hush_dv_api_kit_##X

      JNIEXPORT jint JNICALL method(init)(JNIEnv *env, jobject obj)
      {
        jint result = (jint) kit::_default; // (jint) new kit();
        if (!result) {
              kit* x = new kit("tk");
              session::_default->_register(x);
              result = (jint) x;
        }
        return result;
      }

    The init method, the full name of which is obtained by expanding the macro call method(init), results in an integer-encoded reference to a kit object, which is newly created if it doesn't already exist.

    
      JNIEXPORT jstring JNICALL method(getresult)(JNIEnv *env, jobject obj)
      {
        java_vm<kit> vm(env,obj);
        char *s = vm->result();
        if (s) return vm.string(s);
        else return vm.string("-");
      }
      
    In the getresult method, we see how a smart pointer, instantiated for the kit class, is used to obtain the result from the C++ kit object. The smart pointer takes care of converting the reference stored in the Java object to an appropriate pointer.

    
      JNIEXPORT void JNICALL method(bind)(JNIEnv *env, jobject obj,
      	       jstring s, jobject o)
      {
        java_vm<kit> vm(env,obj);
        java_vm<handler>* vmp = new java_vm<handler>(env,o,"Handler");
        const char *str = vm.get(s);
        handler* h = new handler();
        session::_default->_register(h);
        h->_vmp = vmp;
        h->_register(vmp);
        vm->bind(str,h);
        vm.release(s, str);
      }
      
    In the bind method, which is used to bind a (Java) handler object to some (Tcl or Prolog) command, a new C++ handler is created. This handler is modified to contain a reference to the smart pointer, which (indeed) also gives access to the Java handler object. Notice that calling the Java handler object is an upcall, when viewed from the native implementation.

    In somewhat more detail, the Java handler object is invoked through the C++ handler object created in the bind method of the kit. The C++ handler is activated when an event occurs, or a Tcl or Prolog command is given. Activating the handler amounts to calling the dispatch method with an appropriate event. To decide whether the activation must be passed through to the Java handler object, the handler::dispatch method checks for the availability of a smart pointer, as illustrated below.

    handler::dispatch


    
      
      event* handler::dispatch(event* e) {
              _event = e;
              if (_vmp) {
                      return ((vm<handler>*)_vmp)->dispatch(e);
              } else {
                      int result = this->operator()();
                      if (result != OK) return 0;
                      else return _event;
              }
      }
      
    When the C++ handler contains a smart pointer, the dispatch method is called for that pointer.
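The forwarding decision can be shown on its own, without JNI. The sketch below is hypothetical: `trace_event` and `mini_handler` are illustrative names, a `std::function` stands in for the smart pointer that reaches the Java handler, and the value 0 is assumed to play the role of OK.

```cpp
#include <functional>
#include <string>
#include <utility>

// Hypothetical reduced event: a name and a trace of who handled it.
struct trace_event {
    std::string name;
    std::string handled_by;
};

// Handler in the style of handler::dispatch: if a foreign peer is attached,
// the event is forwarded to it (an upcall); otherwise the handler's own
// operator() runs.
class mini_handler {
public:
    virtual ~mini_handler() = default;

    // Attach a callable standing in for the Java (or Prolog) handler object.
    void attach(std::function<void(trace_event&)> peer) { _peer = std::move(peer); }

    // Returns the event on success and a null pointer on failure.
    trace_event* dispatch(trace_event* e) {
        _event = e;
        if (_peer) {
            _peer(*e);
            return e;
        }
        return (*this)() == 0 ? _event : nullptr;
    }

    // Native behaviour; 0 is assumed to signal success in this sketch.
    virtual int operator()() {
        _event->handled_by = "native";
        return 0;
    }

protected:
    trace_event* _event = nullptr;

private:
    std::function<void(trace_event&)> _peer;
};
```

Because the check sits in a single dispatch method, native handlers and handlers backed by a foreign object can be registered with the same interpreter without the caller knowing the difference.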

    The Java smart pointer template class for the Java/C++ binding is derived from the smart pointer template class introduced in the previous (sub)section.

    
    java_vm

      #include <hush/vm.h>
      #include <jni.h>

      template< class T >
      class java_vm : public vm< T > {
      public:
              java_vm(JNIEnv* env_, jobject obj_) {
                      _env = env_;
                      _obj = obj_;
                      _self = self();
              }
              ...

              event* dispatch(event* e) { // java dispatch
                      call("dispatch",(int)e);
                      return e;
              }

              T* operator->() { return _self; }

              T* self() {
                      jfieldID fid = fieldID("_self","I");
                      return (T*) _env->GetIntField( _obj, fid);
              }

              void call(const char* md, int i) { // void (*)(int)
                      jmethodID mid = methodID(md,"(I)V");
                      _env->CallVoidMethod(_obj, mid, i);
              }

      private:
              JNIEnv* _env;
              jobject _obj;
              T* _self;
      };

    Notice how the value of the _self reference field is obtained from the _self attribute of the Java object. Also notice that calling dispatch for the Java handler is mediated by an additional call function, which obtains an explicit reference to the method that must be invoked. In general, there are many possible method signatures for which such a call function could be supplied, but in our case we only need one, to invoke dispatch.

    Discussion

    Interfacing Java and C++ is at first sight not very difficult, especially not when the majority of calls consists of downcalls (from Java to C++) only. The smart pointer device may then be used as a handy abbreviation. The problems occur, however, when upcalls come into play. Due to the simple design of hush, upcalls occur (almost) exclusively through the dispatch method. This is not the result of explicit design, but in retrospect just sheer luck. When upcalls are spread over the code and may vary in signature, they will most likely bring along significant software engineering and maintenance effort.

    (C) Æliens 04/09/2009

    You may not copy or print any of this material without explicit permission of the author or the publisher. In case of other copyright issues, contact the author.