standards
- XML -- eXtensible Markup Language (SGML)
- MPEG-4 -- coding audio-visual information
- SMIL -- Synchronized Multimedia Integration Language
- RM3D -- (Web3D) Rich Media 3D (extensions of X3D/VRML)

"Perhaps the most immediate need for MPEG-4 is defensive.
It supplies tools with which to create uniform (and top-quality)
audio and video encoders on the Internet,
preempting what may become an unmanageable tangle
of proprietary formats."
MPEG-4
a toolbox of advanced compression algorithms for audiovisual information
scalability
- bitrate -- switching to lower bitrates
- bandwidth -- dynamically discarding data
- encoder and decoder complexity -- adapting signal quality to the available processing power
audiovisual information
- still images, video, audio, text
- (synthetic) talking heads and synthesized speech
- synthetic graphics and 3D scenes
- streamed data applied to media objects
- user interaction -- e.g. changes of viewpoint
example
Imagine a talking figure standing next to a desk
and a projection screen, explaining the contents of
a video that is being projected
on the screen and pointing at a globe that stands on the desk.
The user watching that scene decides to
change viewpoint to get a better look at the globe ...
media objects
- media objects -- units of aural, visual or audiovisual content
- composition -- to create compound media objects (audiovisual scene)
- transport -- multiplex and synchronize data associated with media objects
- interaction -- feedback from users' interaction with audiovisual scene
composition
- placing media objects anywhere in a given coordinate system
- applying transforms to change the appearance of a media object
- applying streamed data to media objects
- modifying the user's viewpoint (see the sketch below)
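To make the composition operations above concrete, here is a small scene-description sketch. It uses X3D-style XML purely for illustration (actual MPEG-4 scene descriptions are encoded in BIFS, or written textually in XMT, with different element names; the file name is made up): a video stream is applied as a texture to a shape, a Transform places the shape in the coordinate system, and a Viewpoint controls where the user looks from.
<!-- illustrative X3D-style composition, not actual BIFS/XMT syntax -->
<Transform translation="2 0 0">
  <Shape>
    <Appearance>
      <!-- streamed video applied to a media object -->
      <MovieTexture url="lecture.mpg" loop="true"/>
    </Appearance>
    <Box size="4 3 0.1"/>  <!-- the projection screen -->
  </Shape>
</Transform>
<!-- the user's (changeable) viewpoint -->
<Viewpoint position="0 1 10" description="closer look at the globe"/>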
transport
The data streams (Elementary Streams)
that result from the coding process can be transmitted
or stored separately; they need
to be composed at the receiver's side to create the actual
multimedia presentation.
scenegraph
- BIFS (Binary Format for Scenes) -- describes spatio-temporal arrangements of (media) objects in the scene
- OD (Object Descriptor) -- defines the relationship between the elementary streams associated with an object
- event routing -- to handle user interaction
DMIF
Delivery Multimedia Integration Framework
[figure: (a) scene graph, (b) sprites]

benefits
- end-users -- interactive media across all platforms and networks
- providers -- transparent information for transport optimization
- authors -- reusable content, protection and flexibility
managing intellectual property
XMT
eXtensible MPEG-4 Textual format
- XMT contains a subset of X3D
- SMIL is mapped (incompletely) to XMT
SMIL
TV-like multimedia presentations
parallel and sequential
Authoring a SMIL presentation basically comes down to
naming the media components for text, images, audio and video with URLs, and scheduling their presentation either in parallel or in sequence.
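A minimal sketch of such a presentation (file names are made up for illustration): an image and an audio commentary play in parallel, followed by a video clip.
<smil xmlns="http://www.w3.org/2001/SMIL20/Language">
  <body>
    <seq>
      <par>
        <!-- image and commentary play at the same time -->
        <img src="photo1.jpg" dur="10s"/>
        <audio src="commentary.mp3"/>
      </par>
      <!-- the video starts when the par above has ended -->
      <video src="clip.mpg"/>
    </seq>
  </body>
</smil>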
presentation characteristics
- The presentation is composed from several components that are accessible via URLs, e.g. files stored on a Web server.
- The components have different media types, such as audio, video, image or text. The begin and end times of different components are specified relative to events in other media components. For example, in a slide show, a particular slide is displayed when the narrator in the audio starts talking about it (see the sketch after this list).
- Familiar looking control buttons such as stop, fast-forward and rewind allow the user to interrupt the presentation and to move forwards or backwards to another point in the presentation.
- Additional functions are "random access", i.e. the presentation can be started anywhere, and "slow motion", i.e. the presentation is played slower than at its original speed.
- The user can follow hyperlinks embedded in the presentation.
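The event-based timing mentioned above (a slide appearing when the narrator starts talking about it) can be written with SMIL syncbase values, where one element's begin refers to another element's timeline. A minimal sketch, with made-up ids and file names:
<par>
  <audio id="narration" src="talk.mp3"/>
  <!-- the slide appears 5 seconds into the narration and stays visible for 20 seconds -->
  <img src="slide1.jpg" begin="narration.begin+5s" dur="20s"/>
</par>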

applications
- Photos taken with a digital camera can be coordinated with a commentary.
- Training courses can be devised integrating voice and images.
- A Web site showing the items for sale might show photos of the product range in turn on the screen, coupled with a voice talking about each as it appears.
- Slide presentations on the Web written in HTML might be timed so that bullet points come up in sequence at specified time intervals, changing color as they become the focus of attention.
- On-screen controls might be used to stop and start music.

example
The following fragment shows two buttons linking to a news story and a weather report; the excl container ensures that at most one of the two plays at a time.
<par>
  <a href="#Story"> <img src="button1.jpg"/> </a>
  <a href="#Weather"> <img src="button2.jpg"/> </a>
  <excl>
    <par id="Story" begin="0s">
      <video src="video1.mpg"/>
      <text src="captions.html"/>
    </par>
    <par id="Weather">
      <img src="weather.jpg"/>
      <audio src="weather-rpt.mp3"/>
    </par>
  </excl>
</par>

history
Experience from both the CD-ROM community and from the Web multimedia community suggested that it would be beneficial to adopt a declarative format for expressing media synchronization on the Web as an alternative and complementary approach to scripting languages.
Following a workshop in October 1996, W3C established a first working group on synchronized multimedia in March 1997. This group focused on the design of a declarative language and the work gave rise to SMIL 1.0 becoming a W3C Recommendation in June 1998.
SMIL 2.0 Modules
- The Animation Modules
- The Content Control Modules
- The Layout Modules
- The Linking Modules
- The Media Object Modules
- The Metainformation Module
- The Structure Module
- The Timing and Synchronization Module
- The Time Manipulations Module
- The Transition Effects Module

module-based reuse
- SMIL modules could be used to provide lightweight multimedia functionality on mobile phones, and to integrate timing into profiles such as the WAP Forum's WML language or XHTML Basic.
- SMIL timing, content control, and media objects could be used to coordinate broadcast and Web content in an enhanced-TV application.
- SMIL Animation is being used to integrate animation into W3C's Scalable Vector Graphics language (SVG), as sketched below.
- Several SMIL modules are being considered as part of a textual representation for MPEG-4.
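A minimal sketch of SMIL Animation hosted in SVG (the shape and attribute values are made up for illustration):
<svg xmlns="http://www.w3.org/2000/svg" width="200" height="100">
  <circle cx="30" cy="50" r="20" fill="blue">
    <!-- SMIL Animation element: slide the circle across the canvas, repeatedly -->
    <animate attributeName="cx" from="30" to="170" dur="5s" repeatCount="indefinite"/>
  </circle>
</svg>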

www.web3d.org
- VRML 1.0 -- static 3D worlds
- VRML 2.0 or VRML97 -- dynamic behaviors
- VRML200x -- extensions
- X3D -- XML syntax
- RM3D -- Rich Media in 3D
groups.yahoo.com/group/rm3d/
The Web3D Rich Media Working Group was formed to develop a Rich Media standard format (RM3D) for use in next-generation media devices. It is a highly active group with participants from a broad range of companies including 3Dlabs, ATI, Eyematic, OpenWorlds, Out of the Blue Design, Shout Interactive, Sony, Uma, and others.
RM3D
The Web3D Consortium initiative is fueled by a clear need for a standard high performance Rich Media format. Bringing together content creators with successful graphics hardware and software experts to define RM3D will ensure that the new standard addresses authoring and delivery of a new breed of interactive applications.
requirements
- rich media -- audio, video, images, 2D & 3D graphics (with support for temporal behavior, streaming and synchronization)
- applicability -- specific application areas, as determined by commercial needs and experience of working group members
- interoperability -- VRML97, X3D, MPEG-4, XML (DOM access)
- object model -- common model for representation of objects and capabilities
- extensibility -- integration of new objects (defined in Java or C++), scripting capabilities and declarative content
- high-quality realtime rendering -- realtime interactive media experiences
- platform adaptability -- query function for programmatic behavior selection
- predictable behavior -- well-defined order of execution
- high-precision number systems -- precision beyond single-precision IEEE floating point
- minimal size -- download and memory footprint
SMIL is closer to the author
and RM3D is closer to the implementer.
working draft
Since there are three vastly different proposals for this section (time model), the original <RM3D> 97 text
is kept. Once the issues concerning time-dependent nodes are resolved, this section can be
modified appropriately.
time model
- MPEG-4 -- spring metaphor
- SMIL -- cascading time
- RM3D/VRML -- event routing
MPEG-4 -- spring metaphor
- duration -- minimal, maximal, optimal (like a spring, a media object's duration may be stretched or compressed between its minimal and maximal length, with the optimal duration preferred)
SMIL -- cascading time
- time container -- speed, accelerate, decelerate, reverse, synchronize (in the fragment below, speed="2.0" plays the children of the seq at twice their normal rate)
<seq speed="2.0">
  <video src="movie1.mpg" dur="10s"/>
  <video src="movie2.mpg" dur="10s"/>
  <img src="img1.jpg" begin="2s" dur="10s">
    <animateMotion from="-100,0" to="0,0" dur="10s"/>
  </img>
  <video src="movie4.mpg" dur="10s"/>
</seq>
RM3D/VRML -- event routing
- TimeSensor -- isActive, start, end, cycleTime, fraction, loop
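Event routing can be sketched in X3D's XML syntax (node and field names follow X3D/VRML97 conventions; this is an illustration rather than RM3D syntax, and the DEF names are made up): a TimeSensor generates time events, an interpolator converts them into positions, and routes wire the events to a transform.
<TimeSensor DEF="Clock" cycleInterval="5" loop="true"/>
<PositionInterpolator DEF="Mover" key="0 1" keyValue="0 0 0  3 0 0"/>
<Transform DEF="Ball">
  <Shape>
    <Sphere radius="0.5"/>
  </Shape>
</Transform>
<!-- event routing: clock fraction -> interpolator -> transform translation -->
<ROUTE fromNode="Clock" fromField="fraction_changed" toNode="Mover" toField="set_fraction"/>
<ROUTE fromNode="Mover" fromField="value_changed" toNode="Ball" toField="set_translation"/>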