introduction multimedia


MPEG-4

The MPEG standards (in particular MPEG-1 and MPEG-2) have been a great success, as testified by the popularity of mp3 (MPEG-1 Audio Layer III) and DVD video (MPEG-2).

Now, what can we expect from MPEG-4? Will MPEG-4 provide multimedia for our time, as claimed in [Time]? The author of that article, Rob Koenen, is a senior consultant at the Dutch KPN telecom research lab, an active member of the MPEG-4 working group, and editor of the MPEG-4 standard document.

"Perhaps the most immediate need for MPEG-4 is defensive. It supplies tools with which to create uniform (and top-quality) audio and video encoders on the Internet, preempting what may become an unmanageable tangle of proprietary formats."

Indeed, if we are looking for a general characterization, it would be that MPEG-4 is primarily

MPEG-4


a toolbox of advanced compression algorithms for audiovisual information

and, moreover, one that is suitable for a variety of display devices and networks, including low bitrate mobile networks. MPEG-4 supports scalability on a variety of levels:

scalability

Depending on network resources and platform capabilities, the 'right' level of signal quality can be determined dynamically, by selecting the optimal codec.
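To make the idea of dynamic selection concrete, here is a minimal sketch in Python. It is not taken from the MPEG-4 standard: the `CodecProfile` class, the three-step `LADDER`, and the bitrate and quality numbers are all hypothetical, chosen only to illustrate picking the best encoding that both the network and the platform can handle.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CodecProfile:
    name: str      # hypothetical label, not an MPEG-4 profile name
    min_kbps: int  # bandwidth this encoding needs (assumed value)
    quality: int   # relative quality rank, higher is better


# Hypothetical ladder of encodings of the same content.
LADDER = [
    CodecProfile("low", 64, 1),
    CodecProfile("medium", 384, 2),
    CodecProfile("high", 2000, 3),
]


def select_profile(available_kbps, max_quality, ladder=LADDER):
    """Return the best profile that fits the bandwidth and the device limit."""
    candidates = [
        p for p in ladder
        if p.min_kbps <= available_kbps and p.quality <= max_quality
    ]
    if not candidates:
        # Nothing fits: fall back to the cheapest encoding.
        return min(ladder, key=lambda p: p.min_kbps)
    return max(candidates, key=lambda p: p.quality)
```

Calling `select_profile(500, 3)` picks the "medium" entry, because the network cannot carry the "high" one; calling `select_profile(5000, 2)` also picks "medium", this time because the platform limit, not the network, is the bottleneck.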

media objects

It is fair to say that MPEG-4 is a rather ambitious standard. It aims at offering support for a great variety of audiovisual information, including still images, video, audio, text, (synthetic) talking heads and synthesized speech, synthetic graphics and 3D scenes, streamed data applied to media objects, and user interaction -- e.g. changes of viewpoint.

audiovisual information


Let's give an example, taken from the MPEG-4 standard document.

example


Imagine a talking figure standing next to a desk and a projection screen, explaining the contents of a video that is being projected on the screen, pointing at a globe that stands on the desk. The user who is watching that scene decides to change viewpoint to get a better look at the globe ...

How would you describe such a scene? How would you encode it? And how would you approach decoding and user interaction?

The solution lies in defining media objects and a suitable notion of composition of media objects.

media objects


For 3D-scene description, MPEG-4 builds on concepts taken from VRML (Virtual Reality Modeling Language, discussed in chapter 7).

Composition, basically, amounts to building a scene graph, that is, a tree-like structure that specifies the relationship between the various simple and compound media objects. Composition allows for placing media objects anywhere in a given coordinate system, applying transforms to change the appearance of a media object, applying streamed data to media objects, and modifying the user's viewpoint.
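The example scene can be sketched as a small scene graph in Python. This is only an illustration of the tree idea, not the MPEG-4 scene description format: the `MediaObject` class, the `world_positions` helper, and all coordinates are hypothetical, and the only transform modelled is a translation relative to the parent.

```python
from dataclasses import dataclass, field


@dataclass
class MediaObject:
    name: str
    offset: tuple = (0.0, 0.0, 0.0)  # translation relative to the parent
    children: list = field(default_factory=list)

    def add(self, child):
        """Attach a child object and return it, so calls can be chained."""
        self.children.append(child)
        return child


def world_positions(node, origin=(0.0, 0.0, 0.0)):
    """Walk the tree and return a dict mapping name to absolute position."""
    pos = tuple(o + d for o, d in zip(origin, node.offset))
    result = {node.name: pos}
    for child in node.children:
        result.update(world_positions(child, pos))
    return result


# The scene from the example: presenter, screen, desk, and a globe on the desk.
scene = MediaObject("scene")
desk = scene.add(MediaObject("desk", (2.0, 0.0, 0.0)))
globe = desk.add(MediaObject("globe", (0.0, 1.0, 0.0)))
scene.add(MediaObject("presenter", (-1.0, 0.0, 0.0)))
scene.add(MediaObject("screen", (0.0, 1.5, -3.0)))

positions = world_positions(scene)
```

Because the globe is a child of the desk, changing `desk.offset` and recomputing the positions moves the globe along with it, which is the practical benefit of describing the scene as a tree of compound objects rather than as a flat list.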

composition


So, when we have a multimedia presentation or audiovisual scene, we need to get it across some network and deliver it to the end-user, or as phrased in [MPEG-4]:

transport


The data streams (Elementary Streams) that result from the coding process can be transmitted or stored separately and need to be composed so as to create the actual multimedia presentation at the receiver's side.

At a system level, MPEG-4 offers the following functionalities to achieve this:

scenegraph


In addition, MPEG-4 defines a set of functionalities for the delivery of streamed data, DMIF, which stands for

DMIF


Delivery Multimedia Integration Framework

that allows for transparent interaction with resources, irrespective of whether these are available from local storage, come from broadcast, or must be obtained from some remote site. Transparency with respect to network type is also supported. Quality of Service is only supported to the extent that it is possible to indicate needs for bandwidth and transmission rate. It is, however, the responsibility of the network provider to realize any of this.
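The kind of transparency described here can be illustrated with a small Python sketch. It does not reproduce the actual DMIF interface: the classes `Delivery`, `LocalDelivery`, `RemoteDelivery` and `DeliveryLayer` are hypothetical, and both back ends are in-memory stubs, so no file or network access takes place. The point is only that the application asks for a resource by URL and does not need to know where it comes from.

```python
from abc import ABC, abstractmethod
from urllib.parse import urlparse


class Delivery(ABC):
    """Common interface for every way of obtaining a stream (hypothetical)."""

    @abstractmethod
    def fetch(self, url: str) -> bytes:
        ...


class LocalDelivery(Delivery):
    """Stub for local storage: a dict stands in for the file system."""

    def __init__(self, files):
        self.files = files

    def fetch(self, url):
        return self.files[urlparse(url).path]


class RemoteDelivery(Delivery):
    """Stub for a remote site: nested dicts stand in for servers."""

    def __init__(self, servers):
        self.servers = servers

    def fetch(self, url):
        parts = urlparse(url)
        return self.servers[parts.netloc][parts.path]


class DeliveryLayer:
    """Routes a request to the right back end based on the URL scheme."""

    def __init__(self):
        self.backends = {}

    def register(self, scheme, backend):
        self.backends[scheme] = backend

    def fetch(self, url):
        return self.backends[urlparse(url).scheme].fetch(url)


layer = DeliveryLayer()
layer.register("file", LocalDelivery({"/clips/intro.bin": b"local-bytes"}))
layer.register(
    "http",
    RemoteDelivery({"example.org": {"/clips/intro.bin": b"remote-bytes"}}),
)

local_data = layer.fetch("file:///clips/intro.bin")
remote_data = layer.fetch("http://example.org/clips/intro.bin")
```

The application code calls `layer.fetch(...)` in the same way in both cases; a broadcast back end could be registered under a further scheme without changing the caller.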

authoring

What MPEG-4 offers may be summarized as follows

benefits


In effect, although MPEG-4 is primarily concerned with efficient encoding and scalable transport and delivery, the object-based approach also has clear advantages from an authoring perspective.

One advantage is the possibility of reuse. For example, one and the same background can be reused for multiple presentations or plays, so you could imagine that even an amateur game might be 'located' at the centre court of Roland Garros or Wimbledon.

Another, perhaps not so obvious, advantage is that provisions have been made for

managing intellectual property

of media objects.

And finally, media objects may potentially be annotated with meta-information to facilitate information retrieval.

syntax

In addition to the binary formats, MPEG-4 also specifies a syntactical format, called XMT, which stands for eXtensible MPEG-4 Textual format.

XMT


When discussing RM3D, we will further establish what the relations between MPEG-4, SMIL and RM3D are, and in particular where they disagree, for example with respect to the timing model underlying animations and the temporal control of media objects.

the press

Now to conclude our discussion of MPEG-4, let's see what the press has to say about it.

www.eetimes.com/story/OEG20010220S0065


MPEG-4 is "a big standard," said Tim Schaaff, vice president of engineering for Apple Computer Inc.'s Interactive Media Group. "It's got tons of tools inside." Its success, he said, will depend on the industry's willingness to home in on a small subset, winnowing from a number of profiles and levels designed for streaming a slew of digital multimedia types -- audio, several types of video, still images, and 2-D and 3-D graphics.

Some may find it too ambitious.

unfocused ambition

"MPEG-4 is a very ambitious standard, but its biggest problem is that it wasn't focused on anything," said Didier LeGall, vice president for R&D and chief technology officer at chip house C-Cube Microsystems Inc. LeGall dismissed MPEG-4's vaunted object-based coding -- one of the technologies that sets it apart from earlier MPEG spins -- as "science fiction" and "nothing more than a gadget" at this point. "I haven't seen any content with objects that really makes sense," he said.

But, then again, what it offers is clearly worthwhile.

coding


MPEG-4's chief features include highly efficient compression, error resilience, bandwidth scalability ranging from 5 kbits to 20 Mbits/second, network and transport-protocol independence, content security and object-based interactivity, or the ability to pluck a lone image -- say, the carrot Bugs Bunny is about to chomp -- out of a video scene and move it around independently.

And, not altogether unimportant, it may offer significant commercial benefits.

Broadband service providers, such as cable and DSL companies, are right behind wireless in sizing up MPEG-4, largely because its low bit rate could help them add channels in their broadband pipes while incorporating interactive features in the content. Possibilities include multiple video streams, clickable video, real-time 3-D animation and interactive advertising.



eliens@cs.vu.nl

draft version 1 (16/5/2003)