introduction multimedia
intelligent multimedia @ VU
http://www.cs.vu.nl/~eliens/research/note-intelligent-multimedia.html
project leader: dr. A. Eliëns
researcher: dr. Z. Huang
programmer: drs. C. Visser
student: M. Hildebrand
project description
We are developing a high-level platform for 3D and rich media
virtual environments
based on agent-technology, using the languages DLP,
Java, and VRML.
On top of this platform we have developed a scripting language
STEP
for specifying humanoid movements and gestures, based on dynamic logic.
Our goal is to study aspects of
the deployment and architecture of virtual environments as an interface to
multimedia information systems.
Our platform also supports embodied conversational agents,
see [Eliëns et al. (2002)].
As demonstrators we have developed a distributed soccer-game
prototype with intelligent autonomous avatar-embodied agents as players [Huang et al. (2002)],
a humanoid animation demonstrating Tai-Chi, [Huang et al. (2002b)],
avatars presenting a dialog in a mixed media presentation
environment, [Eliëns et al. (2003)],
a domestic agent that can be addressed in natural language, [Hildebrand et al. (2003)],
an avatar that uses reasoning and inverse kinematics to reach for objects, [Huang et al. (2003d)],
and an avatar conducting music, [Ruttkay et al. (2003a)].
As a student project in the
Casus Practicum,
we have developed a 3D presentation
for INCCA,
the International Network for the Conservation of Contemporary
Art, in cooperation with ICN,
the Dutch Cultural Heritage Institute.
funding
Our research has been supported by two NWO projects:
- WASP -- Web Agent Support Program
- RIF -- Retrieval of Information in virtual worlds using Feature detectors
The combined effort of these projects led to the DLP+X3D
platform and the development of the STEP language.
In addition, we have one pending proposal with Michiel Hildebrand as candidate researcher:
- (submitted) IMMEDIATE -- Intelligent MultiMEDIA Transactions Environment

cooperation
The work on embodied conversational agents
is being done in cooperation with
dr. Z. Ruttkay from CWI.
The work on the cultural heritage application
is done in cooperation with drs. T. Scholte from
ICN.
publications
Apart from publications in international conference proceedings
and workshops,
we have demonstrated our work at the ICT Kenniscongres 2002
under the theme intelligent multimedia.
In preparation is a prestigious book on Life-like Characters,
for which we have contributed a chapter describing our platform
and STEP, [Huang et al. (2003c)].
The editor of this book, Helmut Prendinger from Tokyo University,
commented on our chapter:
"I really have to say that your chapter is very very well done, very
interesting, very comprehensive, and very readable. It says exactly the
things people want to know when they are looking for some scripting language
to animate their characters. Above that, your STEP system is very well
motivated and nicely embedded in other strands of computer science research."
a brief history
Although the WASP proposal was written in 1996, long
before the RIF proposal (1998), dr. Z. Huang started as a post-doc on the
WASP project when the RIF project had already been running for more than
six months.
Since we then felt that the WASP project proposal was slightly
outdated, we made an effort to merge the WASP and
RIF projects by focussing on agents in 3D virtual
environments, which resulted in a paper presenting
a taxonomy of Web agents, [Huang et al. (2000)].
However, after about a year the continuity of the RIF project
was endangered due to a change of personnel at CWI.
So, NWO was asked for permission to utilize the RIF funds
for prolonging the WASP project.
This was granted, provided that the research goals of the RIF
project were sufficiently covered within WASP.
In the RIF project we used the blaxxun
Community Server and VRML
to realize information retrieval and delivery in multi-user 3D
environments. See [van Ballegooij and Eliëns (2001)].
Agents were then conceived as extensions on the server-side
using blaxxun's native agents enriched with embedded logic.
However, at the same time that the RIF project funds were
transferred to WASP, the first prototype of DLP in Java
became available, and we decided, somewhat radically, to drop
the more low-level blaxxun technology in favor of a unified
approach in DLP.
Thus, we extended DLP with primitives for manipulating VRML
worlds, using the Java External Authoring Interface that is
part of VRML. The DLP+VRML framework proved
to be surprisingly effective, as testified by the following references:
DLP+VRML

Now, the language DLP itself has quite a long history, [Eliëns (1992)].
It has also been described in [Eliëns (2000)].
As a language, DLP offers an object-oriented extension
of (traditional, Edinburgh-style) Prolog, with multi-threaded
objects, non-logical instance or state variables,
communication by rendezvous and (distributed) backtracking.
After preliminary prototypes in C++,
we focussed on an implementation in Java, to be able to use
DLP for Web programming, [Visser and Eliëns (2000)].
The first Java implementation became available relatively late,
but just in time to create the DLP+VRML extension when needed.
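To give an impression of what communication by rendezvous means, the following is a minimal sketch in plain Java, not DLP code and not a description of how DLP is implemented. It assumes that a rendezvous can be pictured as a hand-over in which sender and receiver both block until the other side is ready; the class name RendezvousSketch and the method exchange are hypothetical names chosen for this illustration.

```java
import java.util.concurrent.SynchronousQueue;

public class RendezvousSketch {

    // Hands one message from a sender thread to the calling thread.
    // A SynchronousQueue has no capacity: put() blocks until another
    // thread calls take(), and take() blocks until another thread calls put().
    public static String exchange(String message) {
        SynchronousQueue<String> channel = new SynchronousQueue<>();
        Thread sender = new Thread(() -> {
            try {
                channel.put(message); // waits here until the receiver arrives
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        sender.start();
        try {
            String received = channel.take(); // waits here until the sender arrives
            sender.join();
            return received;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted during rendezvous", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(exchange("hello from the sender object"));
    }
}
```

The sketch covers only the synchronization aspect; backtracking, which DLP combines with rendezvous, has no counterpart here.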
As concerns the acceptance of our approach within the Web3D
community, we wish to point to the acceptance of our [Huang et al. (2002)]
paper for the highly competitive international Web3D Conference
2002 (acceptance: 1 out of 13),
and the acceptance of [Huang et al. (2003a)] for the Web3D 2003 Conference.
More recently, we have made an effort to publish in the
ECA (Embodied Conversational Agents) community, as testified
by our contribution to the Life-like Characters book, [Huang et al. (2003c)].
We also received an invitation for the Dagstuhl seminar
on Evaluating Embodied Conversational Agents in March 2004.
embedding in education: focus on multimedia
The intelligent multimedia research has a strong impact on the educational activities
for the specialisation(s) Multimedia with Computer Science and
Multimedia and Culture for Information Science.
In the first year students
start with a general Introduction to Multimedia.
This course centers around three themes:
the convergence between media, platforms and delivery technology,
the availability of
broadband communication and its impact on the development of
standards such as MPEG-4, and multimedia information retrieval
as an essential ingredient of the growing multimedia information repository
on the Web.

There are two follow-up courses, which are given in the
second and third year, respectively:
courses

The first of these courses deals with the technology for creating 3D scenes and
worlds, whereas the second is
more focused on providing intelligent services in virtual environments.
Students use the DLP+VRML framework for their assignment in the second course. See [Huang et al. (2003)].
In addition, for Multimedia and Culture there is
a Multimedia Development Casus Practicum in which the technology is applied in an assignment
developed with the Dutch Cultural Heritage Institute (ICN).
For both specialisations, Multimedia and Multimedia and Culture,
we plan to offer a course on
XML-based Multimedia Technology, to be developed by dr. Z. Huang, to make
students familiar with advanced topics in XML-based
information processing.
Note that our platform already supports the use of
XML and XSLT stylesheets, [Huang et al. (2003b)],
and we are migrating to a DLP+X3D platform,
with X3D
as the XML-based successor of VRML.
research directions
Apart from the issues involved in the modelling
and realization of embodied agents in rich media
3D environments, there are also issues
with regard to the architecture and implementation
of our DLP+X3D platform.
parallelism and synchronization
Complex humanoid gestures are of a highly parallel nature.
The STEP scripting language
supports a direct way of modelling parallel gestures
by offering a parallel construct
(par), which results in the simultaneous
execution of (possibly compound) actions.
To avoid unconstrained thread creation,
the STEP engine makes use of a thread pool,
containing a fixed number of threads,
from which threads are allocated to actions.
Once the action is finished, the thread is put back in
the pool.
This approach works well for most examples.
However, when many threads are needed,
as in the conductor example
(which requires approximately 60 threads),
problems may occur, in particular when there are many background jobs.
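The thread pool idea described above can be sketched in plain Java as follows. This is an illustration under assumptions, not the code of the STEP engine: the class ParSketch, the method par, and the gesture names are hypothetical, and the standard java.util.concurrent fixed thread pool stands in for whatever pool the engine actually uses.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParSketch {

    // Runs all actions on a pool with a fixed number of threads and waits
    // until every action has finished. Results come back in submission order.
    // If there are more actions than threads, the surplus actions wait in
    // the pool's queue until a thread is released.
    public static List<String> par(int poolSize, List<Callable<String>> actions) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<String> results = new ArrayList<>();
            for (Future<String> done : pool.invokeAll(actions)) {
                results.add(done.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown(); // release the pool's threads
        }
    }

    public static void main(String[] args) {
        // Hypothetical gesture parts, for illustration only.
        List<Callable<String>> gesture = new ArrayList<>();
        gesture.add(() -> "raise left arm");
        gesture.add(() -> "raise right arm");
        gesture.add(() -> "nod head");
        System.out.println(par(2, gesture));
    }
}
```

With a pool of two threads and three actions, the third action only starts once one of the first two has finished. This is the kind of delay that becomes noticeable when a gesture needs more simultaneous actions than the pool has threads.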
modeling and representation
Our agent model may be characterized as a
BDI-model, extended with sensors and effectors
needed for the interaction in a virtual environment.
The STEP scripting language has been developed to
facilitate the specification of communicative acts,
like gestures.
However, we would also like to explore text-to-speech
synthesis as an extra modality of communication.
One interesting research issue is how to specify
a reusable library of gestures, accommodating differences
in (personal) style.
This is currently being investigated by Z. Ruttkay from CWI.
Another interesting issue is the use of inverse kinematics
to grasp objects. However, when an object is not within
reach, the agent has to reason about the best way to get near
to the object, to be able to reach it.
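The reachability question can be made concrete with a small Java sketch. It assumes a strongly simplified arm, namely two rigid segments moving in a plane with the shoulder at the origin, which is not the humanoid model used in our platform. The class ReachSketch, its method names, and the segment lengths are hypothetical.

```java
public class ReachSketch {

    // A target at (x, y) is reachable when its distance from the shoulder
    // lies between |upper - fore| and upper + fore.
    public static boolean reachable(double upper, double fore, double x, double y) {
        double d = Math.hypot(x, y);
        return d <= upper + fore && d >= Math.abs(upper - fore);
    }

    // Returns {shoulderAngle, elbowAngle} in radians for a reachable target,
    // or null when the target is out of reach. Uses the law of cosines.
    public static double[] solve(double upper, double fore, double x, double y) {
        if (!reachable(upper, fore, x, y)) {
            return null;
        }
        double c = (x * x + y * y - upper * upper - fore * fore) / (2 * upper * fore);
        c = Math.max(-1.0, Math.min(1.0, c)); // guard against rounding errors
        double elbow = Math.acos(c);
        double shoulder = Math.atan2(y, x)
                - Math.atan2(fore * Math.sin(elbow), upper + fore * Math.cos(elbow));
        return new double[] {shoulder, elbow};
    }

    // Forward kinematics: hand position for given joint angles.
    public static double[] hand(double upper, double fore, double shoulder, double elbow) {
        return new double[] {
            upper * Math.cos(shoulder) + fore * Math.cos(shoulder + elbow),
            upper * Math.sin(shoulder) + fore * Math.sin(shoulder + elbow)
        };
    }

    // How far the agent must move towards a target that is too far away
    // before it comes within reach; 0 when it is not too far away.
    public static double distanceToApproach(double upper, double fore, double x, double y) {
        return Math.max(0.0, Math.hypot(x, y) - (upper + fore));
    }

    public static void main(String[] args) {
        double[] angles = solve(0.30, 0.25, 0.40, 0.20);
        double[] h = hand(0.30, 0.25, angles[0], angles[1]);
        System.out.println("hand at " + h[0] + ", " + h[1]);
        System.out.println("must approach by " + distanceToApproach(0.30, 0.25, 1.0, 0.0));
    }
}
```

The sketch only tells the agent whether, and by how much, it has to move. Choosing the route to get near the object is the reasoning part, which is not covered here.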
architecture and implementation
Solving the problem of reliable timing would require not
only a modification of the STEP engine,
but also a rather different implementation
of the DLP threads supporting the parallelism in STEP.
Currently, the implementation only allows for
best effort parallelism and does not provide
the means for deadline scheduling.
However, it is our impression that we have reached
the utmost efficiency feasible within the Java
platform.
Therefore, we have been considering redeveloping the
DLP+X3D platform in a .NET environment.
An additional advantage of migrating to the .NET environment would be
the possible integration of functionality such as
text-to-speech synthesis which is not readily available
in the Java environment.
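The difference between best effort parallelism and deadline scheduling, as discussed above, may be illustrated with a small Java sketch. It is an assumption-laden illustration and not part of the DLP+X3D platform: the class DeadlineSketch and the method finishedInTime are hypothetical names. The sketch can only detect afterwards that an action missed its deadline; it cannot guarantee that the deadline is met, which is exactly what best effort means.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class DeadlineSketch {

    // Runs the action on a worker thread and reports whether it completed
    // within deadlineMillis. A late action is cancelled (interrupted),
    // but nothing here makes the action run any sooner or faster.
    public static boolean finishedInTime(Runnable action, long deadlineMillis) {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        Future<?> pending = worker.submit(action);
        try {
            pending.get(deadlineMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            pending.cancel(true); // deadline missed: give up on the action
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            worker.shutdownNow();
        }
    }

    public static void main(String[] args) {
        boolean ok = finishedInTime(() -> System.out.println("beat"), 1000);
        System.out.println("finished in time: " + ok);
    }
}
```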
Eliëns (1992)
Eliëns A. (1992),
DLP -- A language for Distributed Logic Programming,
Wiley
Eliëns (2000)
Eliëns A. (2000),
Principles of Object-Oriented Software Development,
Addison-Wesley Longman, 2nd edn.
Eliëns et al. (2003)
Eliëns A., Dormann C., Huang Z. and Visser C. (2003),
A framework for mixed media -- emotive dialogs, rich media and virtual environments,
Proc. TIDSE03, 1st Int. Conf. on Technologies for Interactive Digital Storytelling and Entertainment, Göbel S., Braun N., Spierling U., Dechau J. and Diener H. (eds.), Fraunhofer IRB Verlag, Darmstadt, Germany, March 24-26, 2003
Eliëns et al. (2002)
Eliëns A., Huang Z., and Visser C. (2002),
A platform for Embodied Conversational Agents based on Distributed Logic Programming,
AAMAS Workshop -- Embodied conversational agents - let's specify and evaluate them!, Bologna 17/7/2002
Hildebrand et al. (2003)
Hildebrand M., Eliëns A., Huang Z. and Visser C. (2003),
Interactive Agents Learning their Environment,
Proc. Intelligent Virtual Agents 2003, Irsee, September 15-17, 2003, J.G. Carbonell and J. Siekmann (eds.), LNAI 2792, Springer, pp. 13-17
Huang et al. (2001)
Huang Z., Eliëns A., and De Bra P. (2001),
An Architecture for Web Agents,
Proceedings of the Conference EUROMEDIA 2001, 2001.
Huang et al. (2000)
Huang Z., Eliëns A., van Ballegooij A., De Bra P. (2000),
A Taxonomy of Web Agents,
IEEE Proceedings of the First International Workshop on Web Agent Systems and Applications (WASA '2000), 2000.
Huang et al. (2001)
Huang Z., Eliëns A., Visser C. (2001),
Programmability of Intelligent Agent Avatars,
Proceedings of the Agent'01 Workshop on Embodied Agents, June 2001, Montreal, Canada
Huang et al. (2002)
Huang Z., Eliëns A., Visser C. (2002),
3D Agent-based Virtual Communities.
In: Proc. Int. Web3D Symposium, Wagner W. and Beitler M. (eds.), ACM Press, pp. 137-144
Huang et al. (2002b)
Huang Z., Eliëns A., Visser C. (2002b),
STEP -- a scripting language for Embodied Agents,
PRICAI-02 Workshop -- Lifelike Animated Agents: Tools, Affective Functions, and Applications, Tokyo, 19/8/2002
Huang et al. (2003)
Huang Z., Eliëns A., Visser C. (2003),
Intelligent Multimedia Technology: An Approach to Combine Agent Technologies with Multimedia,
in preparation
Huang et al. (2003a)
Huang Z., Eliëns A., Visser C. (2003a),
Implementation of a scripting language for VRML/X3D-based embodied agents,
Proc. Web3D 2003 Symposium, Saint Malo, France, S. Spencer (ed.), ACM Press, pp. 91-100
Huang et al. (2003b)
Huang Z., Eliëns A., Visser C. (2003b),
XSTEP: A Markup Language for Embodied Agents,
Proc. CASA03, The 16th Int. Conf. on Computer Animation and Social Agents
Huang et al. (2003c)
Huang, Z., Eliëns, A., and Visser, C. (2003c),
STEP: a Scripting Language for Embodied Agents,
in: Helmut Prendinger and Mitsuru Ishizuka (eds.), Life-like Characters, Tools, Affective Functions and Applications, Springer-Verlag, (to appear).
Ruttkay et al. (2003a)
Ruttkay Z., Huang Z. and Eliëns A. (2003a),
The Conductor: Gestures for Embodied Agents with Logic Programming,
Joint Annual ERCIM/CoLogNet Workshop on Constraint and Logic Programming, Budapest, Hungary, 30 June - 2 July, 2003
van Ballegooij and Eliëns (2001)
van Ballegooij A. and Eliëns A. (2001),
Navigation by Query in Virtual Worlds,
Web3D 2001 Conference, Paderborn, Germany, 19-22 Feb 2001
Visser and Eliëns (2000)
Visser C. and Eliëns A. (2000),
A High-Level Symbolic Language for Distributed Web Programming.
Internet Computing 2000, June 26-29, Las Vegas
eliens@cs.vu.nl

draft version 1 (16/5/2003)