3. codecs and standards

without compression delivery is virtually impossible


learning objectives

After reading this chapter you should be able to demonstrate the necessity of compression, to discuss criteria for the selection of codecs and to mention some of the alternatives, to characterize the MPEG-4 and SMIL standards, to explain the difference between MPEG-4 and MPEG-2, and to speculate about the feasibility of a semantic multimedia web.

Without compression and decompression, digital information delivery would be virtually impossible. In this chapter we will take a more detailed look at compression and decompression, and provide the information you may need to decide on a suitable compression and decompression scheme (codec) for your future multimedia productions.

We will also discuss the standards that may govern the future (multimedia) Web, including MPEG-4, SMIL and RM3D. We will explore to what extent these standards allow us to realize the optimal multimedia platform, that is, one that embodies digital convergence to its full potential. Finally, we will investigate how these ideas may ultimately lead to a (multimedia) semantic web.

...



compression is the key to effective delivery

media                                          uncompressed   compressed
voice (8k samples/sec, 8 bits/sample)          64 kbps        2-4 kbps
slow motion video (10 fps, 176x120, 8 bits)    5.07 Mbps      8-16 kbps
audio conference (8k samples/sec, 8 bits)      64 kbps        16-64 kbps
video conference (15 fps, 352x240, 8 bits)     30.4 Mbps      64-768 kbps
audio, stereo (44.1k samples/sec, 16 bits)     1.5 Mbps       128 kbps-1.5 Mbps
video (15 fps, 352x240, 8 bits)                30.4 Mbps      384 kbps
video, CD-ROM (30 fps, 352x240, 8 bits)        60.8 Mbps      1.5-4 Mbps
video, broadcast (30 fps, 720x480, 8 bits)     248.8 Mbps     3-8 Mbps
HDTV (59.9 fps, 1280x720, 8 bits)              1.3 Gbps       20 Mbps

(phone: 56 Kb/s, ISDN: 64-128 Kb/s, cable: 0.5-1 Mb/s, DSL: 0.5-2 Mb/s)
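
To see where the numbers in the uncompressed column come from, here is a minimal sketch (not part of the original text) that reproduces them, assuming 8 bits per sample for audio and three colour components of 8 bits each per pixel for video:

  def audio_rate(samples_per_sec, bits_per_sample, channels=1):
      """Uncompressed audio bit rate in bits per second."""
      return samples_per_sec * bits_per_sample * channels

  def video_rate(width, height, fps, bits_per_component=8, components=3):
      """Uncompressed video bit rate in bits per second (RGB assumed)."""
      return width * height * components * bits_per_component * fps

  def compression_ratio(uncompressed_bps, compressed_bps):
      return uncompressed_bps / compressed_bps

  print(audio_rate(8000, 8))          # voice: 64000 bps = 64 kbps
  print(video_rate(352, 240, 30))     # CD-ROM video: ~60.8 Mbps
  print(video_rate(1280, 720, 59.9))  # HDTV: ~1.3 Gbps
  # squeezing CD-ROM video into 1.5 Mbps requires a ratio of roughly 40:1
  print(compression_ratio(video_rate(352, 240, 30), 1.5e6))

Comparing these rates with the access bandwidths listed above makes clear why compression ratios of one to two orders of magnitude are needed for delivery.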

images, video and audio are amenable to compression, due to

  • statistical redundancy in the signal
  • irrelevant information


B. Vasudev & W. Li, Memory management: Codecs


codec = (en)coder + decoder



  signal  -> source coder   ->  channel coder    (encoding)
  
  signal  <- source decoder <-  channel decoder  (decoding)
  

codec design problem


From a systems design viewpoint, one can restate the codec design problem as a bit rate minimization problem, meeting (among others) constraints concerning:

  • specified levels of signal quality,
  • implementation complexity, and
  • communication delay (start coding -- end decoding).
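
Stated slightly more formally (the symbols below are introduced here merely for illustration): with R the bit rate, Q the signal quality, C the implementation complexity and T the end-to-end delay, the design problem amounts to

  \min R \quad \text{subject to} \quad Q \ge Q_{\text{target}}, \qquad C \le C_{\max}, \qquad T \le T_{\max}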

...



tradeoffs

  • resilience to transmission errors
  • degradations in decoder output -- lossless or lossy
  • data representation -- browsing & inspection
  • data modalities -- audio & video
  • transcoding to other formats -- interoperability
  • coding efficiency -- compression ratio
  • coder complexity -- processor and memory requirements
  • signal quality -- bit error probability, signal/noise ratio

MPEG-1 video compression uses both intra-frame analysis, for the compression of individual frames (which are treated like still images), and inter-frame analysis, to detect redundant blocks or invariants between frames.
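
As an illustration of the inter-frame part, the sketch below (plain Python, not actual MPEG-1 code; block size and threshold are arbitrary illustration values) marks the blocks of a frame that are nearly identical to the co-located blocks of the previous frame, so that a coder could encode them simply as "unchanged":

  def block(frame, x, y, size=8):
      """Extract a size x size block of luminance values from a frame,
      where a frame is a list of rows of 8-bit values."""
      return [row[x:x + size] for row in frame[y:y + size]]

  def sad(block_a, block_b):
      """Sum of absolute differences, a common block-matching measure."""
      return sum(abs(a - b)
                 for row_a, row_b in zip(block_a, block_b)
                 for a, b in zip(row_a, row_b))

  def redundant_blocks(prev_frame, curr_frame, size=8, threshold=64):
      """Positions of blocks that hardly changed since the previous frame."""
      height, width = len(curr_frame), len(curr_frame[0])
      unchanged = []
      for y in range(0, height - size + 1, size):
          for x in range(0, width - size + 1, size):
              if sad(block(prev_frame, x, y, size),
                     block(curr_frame, x, y, size)) < threshold:
                  unchanged.append((x, y))
      return unchanged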

frames


GigaPort


system      spatial resolution   frame rate   uncompressed bit rate
NTSC        704 x 480            30 fps       243 Mbps
PAL/SECAM   720 x 576            25 fps       249 Mbps

item            streaming                        downloaded
bandwidth       equal to the display rate        may be arbitrarily small
disk storage    none                             the entire file must be stored
startup delay   almost none                      equal to the download time
resolution      depends on available bandwidth   depends on available disk storage
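
The startup-delay row can be quantified with a back-of-the-envelope calculation; the clip and link parameters below are illustrative only:

  def can_stream(display_rate_bps, link_bps):
      """Streaming is feasible only if the link keeps up with the display rate."""
      return link_bps >= display_rate_bps

  def download_delay(duration_s, display_rate_bps, link_bps):
      """Downloaded delivery: the whole file must arrive before playback starts."""
      return duration_s * display_rate_bps / link_bps

  display_rate = 384_000   # 384 kbps encoded video
  dsl_link = 1_000_000     # 1 Mbps DSL line

  print(can_stream(display_rate, dsl_link))          # True: playback starts almost at once
  print(download_delay(60, display_rate, dsl_link))  # ~23 s wait for a one-minute clip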

formats


QuickTime, introduced by Apple in the early 1990s for local viewing; RealVideo, a streaming video format from RealNetworks; and Windows Media, a proprietary encoding scheme from Microsoft.

Examples of these formats, encoded for various bitrates, are available at Video at VU.

...



standards


...



"Perhaps the most immediate need for MPEG-4 is defensive. It supplies tools with which to create uniform (and top-quality) audio and video encoders on the Internet, preempting what may become an unmanageable tangle of proprietary formats."

MPEG-4


a toolbox of advanced compression algorithms for audiovisual information

scalability

...



audiovisual information


example


Imagine a talking figure standing next to a desk and a projection screen, explaining the contents of a video that is being projected on the screen, while pointing at a globe that stands on the desk. The user watching that scene decides to change viewpoint to get a better look at the globe ...

media objects


composition


transport


The data streams (Elementary Streams) that result from the coding process can be transmitted or stored separately; they need to be composed at the receiver's side to create the actual multimedia presentation.

scenegraph


...



DMIF


Delivery Multimedia Integration Framework

...


(a) scene graph (b) sprites

benefits


managing intellectual property

...



XMT


...



SMIL


TV-like multimedia presentations

parallel and sequential


Authoring a SMIL presentation comes down, basically, to naming the media components (text, images, audio and video) with URLs, and scheduling their presentation either in parallel or in sequence.

presentation characteristics


applications


example



   <par>
      <a href="#Story"> <img src="button1.jpg"/> </a>
      <a href="#Weather"> <img src="button2.jpg"/></a>
       <excl>
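           <!-- in an excl container at most one child plays at a time;
                following the #Story or #Weather link above activates the
                corresponding par -->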
           <par id="Story" begin="0s">
             <video src="video1.mpg"/>
             <text src="captions.html"/>
           </par>
  
           <par id="Weather">
             <img src="weather.jpg"/>
             <audio src="weather-rpt.mp3"/>
           </par>
       </excl>
   </par>
  

history


Experience from both the CD-ROM community and from the Web multimedia community suggested that it would be beneficial to adopt a declarative format for expressing media synchronization on the Web as an alternative and complementary approach to scripting languages.

Following a workshop in October 1996, W3C established a first working group on synchronized multimedia in March 1997. This group focused on the design of a declarative language and the work gave rise to SMIL 1.0 becoming a W3C Recommendation in June 1998.

SMIL 2.0 Modules


module-based reuse


...



www.web3d.org


groups.yahoo.com/group/rm3d/


The Web3D Rich Media Working Group was formed to develop a Rich Media standard format (RM3D) for use in next-generation media devices. It is a highly active group with participants from a broad range of companies including 3Dlabs, ATI, Eyematic, OpenWorlds, Out of the Blue Design, Shout Interactive, Sony, Uma, and others.

RM3D


The Web3D Consortium initiative is fueled by a clear need for a standard high performance Rich Media format. Bringing together content creators with successful graphics hardware and software experts to define RM3D will ensure that the new standard addresses authoring and delivery of a new breed of interactive applications.

...



requirements


SMIL is closer to the author and RM3D is closer to the implementer.

...



working draft


Since there are three vastly different proposals for this section (time model), the original VRML 97 text is kept. Once the issues concerning time-dependent nodes are resolved, this section can be modified appropriately.

time model


MPEG-4 -- spring metaphor


SMIL -- cascading time



  <seq speed="2.0">
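     <!-- speed="2.0" plays the children at twice their normal rate, so each
          10s element occupies only 5s of the parent timeline; the scaling
          cascades into nested timing such as the animateMotion below -->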
     <video src="movie1.mpg" dur="10s"/>
     <video src="movie2.mpg" dur="10s"/>
     <img src="img1.jpg" begin="2s" dur="10s">
                 <animateMotion from="-100,0" to="0,0" dur="10s"/>
     </img>
     <video src="movie4.mpg" dur="10s"/>
  </seq>
  

RM3D/VRML -- event routing


...



...



web content

structure to the meaningful content of web pages,

...



meta data


Metadata is data about data. Specifically, the term refers to data used to identify, describe, or locate information resources, whether these resources are physical or electronic. While structured metadata processed by computers is relatively new, the basic concept of metadata has been used for many years in helping manage and use large collections of information. Library card catalogs are a familiar example of such metadata.

Dublin Core example



  <rdf:RDF
      xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
      xmlns:dc="http://purl.org/dc/elements/1.1/"
      xmlns:dcterms="http://purl.org/dc/terms/">
      <rdf:Description rdf:about="http://www.dlib.org/dlib/may98/miller/05miller.html">
        <dc:title>An Introduction to the Resource Description Framework</dc:title>
        <dc:creator>Eric J. Miller</dc:creator>
        <dc:description>The Resource Description Framework (RDF) is an
         infrastructure that enables the encoding, exchange and reuse of
         structured metadata. RDF is an application of XML that imposes needed
         structural constraints to provide unambiguous methods of expressing
         semantics. RDF additionally provides a means for publishing both
         human-readable and machine-processable vocabularies designed to
         encourage the reuse and extension of metadata semantics among
         disparate information communities. The structural constraints RDF
         imposes to support the consistent encoding and exchange of
         standardized metadata provide for the interchangeability of separate
         packages of metadata defined by different resource description
         communities.</dc:description>
        <dc:publisher>Corporation for National Research Initiatives</dc:publisher>
        <dc:subject>
          <rdf:Bag>
            <rdf:li>machine-readable catalog record formats</rdf:li>
            <rdf:li>applications of computer file organization and
             access methods</rdf:li>
          </rdf:Bag>
        </dc:subject>
        <dc:rights>Copyright © 1998 Eric Miller</dc:rights>
        <dc:type>Electronic Document</dc:type>
        <dc:format>text/html</dc:format>
        <dc:language>en</dc:language>
        <dcterms:isPartOf rdf:resource="http://www.dlib.org/dlib/may98/05contents.html"/>
      </rdf:Description>
  </rdf:RDF>
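
Such a record is not only human-readable but can also be processed mechanically. A small sketch, assuming the third-party rdflib library is installed and supposing the fragment above has been saved as record.rdf (a hypothetical file name):

  from rdflib import Graph
  from rdflib.namespace import DC   # Dublin Core elements, http://purl.org/dc/elements/1.1/

  g = Graph()
  g.parse("record.rdf", format="xml")   # parse the RDF/XML record shown above

  # print the Dublin Core title and creator of every described resource
  for resource in g.subjects(predicate=DC.title):
      print("title:  ", g.value(resource, DC.title))
      print("creator:", g.value(resource, DC.creator))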
  

Dublin Core


...



information repository


The Web is becoming a universal repository of human knowledge and culture, allowing the sharing of ideas and information on an unprecedented scale.

browsing & navigation


To satisfy his information need, the user might navigate the hyperspace of web links searching for information of interest. However, since the hyperspace is vast and almost unknown, such a navigation task is usually inefficient.

information agent


presentation agent


...



PERsonal and SOcial NAvigation through information spaceS

PERSONAS


investigating a new approach to navigation through information spaces, based on a personalised and social navigational paradigm.

Agneta & Frida


The AGNETA & FRIDA system seeks to integrate web browsing and narrative into a joint mode. Below the browser window (on the desktop) two female characters are placed, sitting in their living-room chairs, watching the browser during the session (more or less like watching television). Agneta and Frida (mother and daughter) physically react to, comment on, make ironic remarks about, and develop stories around the information presented in the browser (primarily to each other), but they are also sensitive to what the navigator is doing and to possible malfunctions of the browser or server.

Agneta & Frida


In this way they seek to attach emotional, comical or anecdotal connotations to the information and happenings in the browsing session. Through an activity slider, the navigator can decide how active she wants the characters to be, depending on the purpose of the browsing session (serious information seeking, wayfinding, exploration or entertainment browsing).

game as social system


actors    rule(s)          resource(s)
players   events           game space
roles     evaluation       situation
goals     facilitator(s)   context

criteria


climate star


simulation parameters


...


game play, model-based simulation, exploration

game elements


  1. game cycle -- turns in subsequent rounds (G)
  2. simulation(s) -- based on (world) climate model (W)
  3. exploration -- by means of interactive video (E)

argument(s)


...



3. codecs and standards

concepts


technology


projects & further reading

As a project, you may think of implementing, for example, JPEG compression, following [Fundamentals], or a SMIL-based application for cultural heritage.

You may further explore the technical issues of authoring DV material, using any of the Adobe tools mentioned in appendix E.

For further reading I advise you to take a look at the respective specifications of MPEG-4 and SMIL, and to compare the functionality of MPEG-4 and SMIL-based presentation environments. An invaluable book dealing with the many technical aspects of compression and standards is [Fundamentals].

the artwork

  1. costume designs -- photographed from Die Russische Avantgarde und die Bühne 1890-1930
  2. theatre scene design, also from (above)
  3. dance -- Erica Russell,  [Animovie]
  4. MPEG-4 -- bit rates, from  [MPEG-4].
  5. MPEG-4 -- scene positioning, from  [MPEG-4].
  6. MPEG-4 -- up and downstream data, from  [MPEG-4].
  7. MPEG-4 -- left: scene graph; right: sprites, from  [MPEG-4].
  8. MPEG-4 -- syntax, from  [MPEG-4].
  9. MIT Media Lab web site.
  10. student work -- multimedia authoring I, Dutch windmill.
  11. student work -- multimedia authoring I, Schröder house.
  12. student work -- multimedia authoring I, train station.
  13. animation -- Joan Gratch, from  [Animovie].
  14. animation -- Joan Gratch, from  [Animovie].
  15. animation -- Joan Gratch, from  [Animovie].
  16. animation -- Joan Gratch, from  [Animovie].
  17. Agneta and Frida example.
  18. diagram (Clima Futura) game elements
  19. signs -- people,  [Signs], p. 246, 247.