The ACOI framework
What is stored is not the multimedia objects themselves, but structural descriptions of these objects (including their location) that may be used for retrieval.
The ACOI model is based on the assumption that indexing an arbitrary multimedia object is equivalent to deriving a grammatical structure that provides a namespace to reason about the object and to access its components. However, there is an important difference from ordinary parsing: the lexical and grammatical items corresponding to the components of the multimedia object must be created dynamically by inspecting the actual object. Moreover, in general there is no fixed sequence of lexical items, as there is in natural or formal languages.
To allow for the dynamic creation of lexical and grammatical items, the ACOI framework supports both black-box and white-box (feature) detectors. Black-box detectors are algorithms, usually developed by a specialist in the media domain, that extract properties from the media object by some form of analysis. White-box detectors, on the other hand, are created by defining logical or mathematical expressions over the grammar itself. In this paper we will focus on black-box detectors only.
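The difference between the two kinds of detector can be made concrete with a small sketch. The Python below is illustrative only and is not ACOI code: the dictionary-shaped parse tree and the names `brightness_detector`, `white_box` and `is_dark` are assumptions introduced here.

```python
from typing import Any, Callable, Dict, List

# Hypothetical parse tree: maps grammar symbols to the values found so far.
ParseTree = Dict[str, Any]

def brightness_detector(pixels: List[int]) -> ParseTree:
    """Black-box detector: analyses the raw media object (here, grey values
    in 0..255) and produces lexical items that did not exist before."""
    mean = sum(pixels) / len(pixels) if pixels else 0.0
    return {"pixel_count": len(pixels), "mean_brightness": mean}

def white_box(expr: Callable[[ParseTree], bool]) -> Callable[[ParseTree], bool]:
    """White-box detector: an expression over symbols already present in the
    parse tree. It never looks at the media object itself."""
    def detector(tree: ParseTree) -> bool:
        return expr(tree)
    return detector

# A white-box detector defined purely in terms of an existing symbol.
is_dark = white_box(lambda tree: tree["mean_brightness"] < 64)

tree = brightness_detector([10, 20, 30, 40])
tree["dark"] = is_dark(tree)
print(tree)
```

The black-box function needs access to the media data, whereas the white-box function only needs the names that the grammar already provides.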
As an example, look at the (simple) feature grammar below, specifying the structure of a hypothetical community.
A community has a name. The actual purpose of this grammar is to select the persons that belong to a particular community from the input, which consists of names of potential community members. Note that the grammar specifies three detectors. These detectors correspond to functions that are invoked when expanding the corresponding non-terminal in the grammar. An example of a detector function is the personDetector function partially specified below.
        // put name(person) on tokenstream
        putAtom(tks,"name",t);
    }
    ...
}
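The role of such a detector can be sketched as follows. This is a minimal Python illustration, not the original listing: `put_atom` only loosely mirrors `putAtom`, and the set `KNOWN_PERSONS` is a hypothetical stand-in for whatever knowledge source decides whether a name denotes a person.

```python
from collections import deque
from typing import Deque, Iterable, Set, Tuple

Token = Tuple[str, str]  # (symbol, value), e.g. ("name", "alice")

# Assumption: a fixed set plays the role of the person predicate.
KNOWN_PERSONS: Set[str] = {"alice", "bob"}

def put_atom(tokens: Deque[Token], symbol: str, value: str) -> None:
    """Append a lexical item to the token stream."""
    tokens.append((symbol, value))

def person_detector(candidates: Iterable[str], tokens: Deque[Token]) -> int:
    """Put name(person) on the token stream for every candidate that is a
    person. Returns the number of tokens produced."""
    produced = 0
    for candidate in candidates:
        if candidate in KNOWN_PERSONS:
            put_atom(tokens, "name", candidate)
            produced += 1
    return produced

stream: Deque[Token] = deque()
count = person_detector(["alice", "cwi", "bob"], stream)
print(count, list(stream))
```

Candidates that fail the predicate never reach the token stream, so the parser only sees names that can expand the person non-terminal.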
The companyDetector differs from the personDetector in that it needs to inspect the complete parse tree to see whether the (implicit) company predicate is satisfied.
When parsing succeeds and the company predicate is satisfied, a given input may result in a sequence of updates of the underlying database, as illustrated below.
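A rough sketch of this step, under stated assumptions: the parse tree is a plain dictionary, updates are (relation, object id, value) tuples, and the predicate tested by `company_detector` (a named community with at least two persons) is invented for illustration. None of these reflect the actual ACOI or Monet representations.

```python
from typing import Any, Dict, List, Tuple

Update = Tuple[str, int, str]  # (relation, object id, value), all hypothetical

def company_detector(tree: Dict[str, Any]) -> bool:
    """Inspect the complete parse tree rather than a single token."""
    persons = tree.get("persons", [])
    return bool(tree.get("name")) and len(persons) >= 2

def to_updates(tree: Dict[str, Any], first_id: int = 1) -> List[Update]:
    """Flatten an accepted parse tree into an ordered sequence of updates."""
    if not company_detector(tree):
        return []  # predicate failed: nothing is stored
    updates: List[Update] = [("community_name", first_id, str(tree["name"]))]
    for offset, person in enumerate(tree["persons"], start=1):
        updates.append(("person_name", first_id + offset, str(person)))
    return updates

accepted = {"name": "casa", "persons": ["alice", "bob"]}
rejected = {"name": "solo", "persons": ["carol"]}

for update in to_updates(accepted):
    print(update)
print(to_updates(rejected))
```

Because the predicate is evaluated over the whole tree, a failed check suppresses every update for that input, not just the one for the offending node.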
The overall architecture of the ACOI framework is depicted in slide acoi. Taking a feature grammar specification, such as the simple community grammar, as a point of reference, we see that it is related to an actual feature detector (possibly containing an embedded logic component) that is invoked by the Feature Detector Engine (FDE) when an appropriate media object is presented for indexing. The feature grammar and its associated detector in turn update, respectively, the data schemas and the actual information stored in the (Monet) database.
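The flow just described can be sketched in a few lines of Python. The class below is a toy model under stated assumptions, not the FDE's real interface: the `FeatureDetectorEngine` class, its `register` and `index` methods, and the in-memory `schema` and `store` attributes (standing in for the database) are all hypothetical.

```python
from typing import Any, Callable, Dict, List

Detector = Callable[[bytes], Dict[str, Any]]

class FeatureDetectorEngine:
    """Toy engine: picks a detector by media type and records its results."""

    def __init__(self) -> None:
        self.detectors: Dict[str, Detector] = {}
        self.schema: Dict[str, List[str]] = {}   # media type -> feature names
        self.store: List[Dict[str, Any]] = []    # stands in for the database

    def register(self, media_type: str, detector: Detector) -> None:
        self.detectors[media_type] = detector

    def index(self, media_type: str, location: str, data: bytes) -> bool:
        detector = self.detectors.get(media_type)
        if detector is None:
            return False  # no detector applies to this kind of object
        features = detector(data)
        # Schema update: remember which features this media type can yield.
        known = self.schema.setdefault(media_type, [])
        for name in features:
            if name not in known:
                known.append(name)
        # Data update: store the description and location, not the object.
        self.store.append({"location": location, **features})
        return True

engine = FeatureDetectorEngine()
engine.register("text/plain", lambda data: {"length": len(data)})
engine.index("text/plain", "file:///tmp/note.txt", b"hello")
print(engine.schema)
print(engine.store)
```

Note that only the derived description and the object's location are stored, in line with the principle stated at the start of this section.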
The Monet database, which underlies the ACOI framework, is a customizable, high-performance, main-memory database developed at the CWI and the University of Amsterdam.
At the user end, a feature grammar is related to a View, a Query and a Report component, which respectively allow for inspecting a feature grammar, expressing a query, and delivering a response to a query. Some examples of these components are currently implemented as applets in Java 1.1 with Swing.
Formal specification
The anatomy of a MIDI feature detector
The grammar given below corresponds directly to the structure depicted in slide midi-structure.
To extract relevant fragments of the melody we use the melody detector, of which a partial listing is given below.
Parsing a given MIDI file, for example kortjakje.mid, results in updates to the Monet database. The updates reflect the structure of the musical information object that corresponds to the properties defined in the grammar.
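To give an impression of what a melody detector has to do at the byte level, the following Python sketch collects the pitches of note-on events from a Standard MIDI File. It is not the detector referred to above: the function names `read_varlen`, `note_ons` and `melody` are assumptions, tempo, channels and note durations are ignored, and the example input is a hand-built byte string rather than an actual file.

```python
import struct
from typing import List, Tuple

def read_varlen(data: bytes, pos: int) -> Tuple[int, int]:
    """Decode a MIDI variable-length quantity; return (value, next position)."""
    value = 0
    while True:
        byte = data[pos]
        pos += 1
        value = (value << 7) | (byte & 0x7F)
        if byte < 0x80:
            return value, pos

def note_ons(track: bytes) -> List[int]:
    """Return the pitches of all note-on events (velocity > 0) in one track."""
    pitches: List[int] = []
    pos, status = 0, 0
    while pos < len(track):
        _, pos = read_varlen(track, pos)       # delta time, ignored here
        if track[pos] & 0x80:                  # explicit status byte
            status = track[pos]
            pos += 1                           # otherwise: running status
        if status == 0xFF:                     # meta event: type, length, data
            pos += 1
            length, pos = read_varlen(track, pos)
            pos += length
        elif status in (0xF0, 0xF7):           # sysex event: length, data
            length, pos = read_varlen(track, pos)
            pos += length
        else:                                  # channel message
            kind = status & 0xF0
            n_data = 1 if kind in (0xC0, 0xD0) else 2
            if kind == 0x90 and track[pos + 1] > 0:
                pitches.append(track[pos])
            pos += n_data
    return pitches

def melody(midi: bytes) -> List[int]:
    """Collect note-on pitches from every MTrk chunk of a MIDI byte string."""
    if midi[:4] != b"MThd":
        raise ValueError("not a Standard MIDI File")
    header_len = struct.unpack(">I", midi[4:8])[0]
    pos = 8 + header_len
    pitches: List[int] = []
    while pos + 8 <= len(midi):
        chunk_id = midi[pos:pos + 4]
        size = struct.unpack(">I", midi[pos + 4:pos + 8])[0]
        if chunk_id == b"MTrk":
            pitches.extend(note_ons(midi[pos + 8:pos + 8 + size]))
        pos += 8 + size
    return pitches

# Hand-built example: three notes (60, 60, 67), partly using running status.
track = bytes([
    0x00, 0x90, 0x3C, 0x40,  0x60, 0x80, 0x3C, 0x00,   # note on 60, note off
    0x00, 0x90, 0x3C, 0x40,  0x60, 0x3C, 0x00,         # note on 60, velocity 0
    0x00, 0x43, 0x40,        0x60, 0x43, 0x00,         # note on 67, velocity 0
    0x00, 0xFF, 0x2F, 0x00,                            # end of track
])
midi = (b"MThd" + struct.pack(">IHHH", 6, 0, 1, 96)
        + b"MTrk" + struct.pack(">I", len(track)) + track)
print(melody(midi))
```

A real detector would go on to turn such a pitch sequence into tokens for the grammar; here the list of MIDI note numbers is simply printed.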
Implementation status
Queries -- the user interface
(C) Æliens 04/09/2009
You may not copy or print any of this material without explicit permission of the author or the publisher. In case of other copyright issues, contact the author.