Musical feature detection in ACOI

Anton Eliëns & Martin Kersten
CWI
email: eliens@cs.vu.nl, M.Kersten@cwi.nl

Introduction

With the growth of information spaces, retrieval based on indexing schemes becomes increasingly important. When it comes to information embedded in multimedia objects, however, progress in automatic indexing has been rather limited. Taking the World Wide Web as our information space, manual classification schemes obviously do not suffice, simply because they do not scale.

The ACOI project [ACOI] provides a large-scale experimentation platform to study issues in the indexing and retrieval of multimedia objects. The resulting ACOI framework is intended to provide a sound model for indexing and retrieval based on feature detection, as well as an effective system architecture accommodating a variety of algorithms that extract relevant properties from multimedia objects. The ACOI approach to multimedia feature detection deploys high-level feature grammars, augmented with media-specific feature detectors, to describe the structural properties of multimedia objects. The structured objects that correspond to the parse trees may then be used for the retrieval of information. The key challenges here are to find sufficiently selective properties for a broad range of multimedia objects, and realistic similarity measures for the retrieval of information.
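To illustrate the idea of a feature grammar with attached detectors, consider the following minimal sketch. It is not the actual ACOI implementation: the detectors, the grammar rule, and the note representation are assumptions chosen for illustration. Each grammar nonterminal is coupled to a detector function that extracts a feature, and the result of parsing is a tree whose leaves carry the detected features.

```python
# Illustrative sketch (not the ACOI implementation): a feature grammar
# couples nonterminals to media-specific detector functions; parsing a
# fragment yields a tree whose leaves carry the extracted features.
# A note is a hypothetical (onset-in-seconds, MIDI-pitch) pair.

def detect_tempo(notes):
    # Hypothetical detector: a coarse tempo class from inter-onset gaps.
    onsets = [t for t, _pitch in notes]
    gaps = [b - a for a, b in zip(onsets, onsets[1:])]
    return "fast" if sum(gaps) / len(gaps) < 0.3 else "slow"

def detect_melody(notes):
    # Hypothetical detector: the melody as a pitch-interval contour.
    pitches = [p for _t, p in notes]
    return [b - a for a, b in zip(pitches, pitches[1:])]

def parse_fragment(notes):
    # The grammar rule  fragment -> tempo melody  becomes a parse tree
    # whose leaves are filled in by the detectors above.
    return {"fragment": {
        "tempo": detect_tempo(notes),
        "melody": detect_melody(notes),
    }}

notes = [(0.0, 60), (0.2, 62), (0.4, 64), (0.6, 62)]
tree = parse_fragment(notes)
```

The parse tree, rather than the raw MIDI stream, is then what the index and query facilities operate on.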

In this report, we look at the indexing and retrieval of musical fragments. We aim to give users suitable support for finding a musical piece to their liking: by lyrics, genre, musical instruments, tempo, similarity to other pieces, melody, and mood. We propose an indexing scheme that allows for the efficient retrieval of musical objects using descriptive properties as well as content-based properties, including lyrics and melody.
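The distinction between descriptive and content-based properties can be made concrete with a small sketch. The field names and the inverted-index helper below are assumptions for illustration, not the ACOI schema.

```python
# Illustrative sketch: an index entry mixing descriptive properties
# (title, genre, instruments) with content-based ones (lyrics, melody
# contour). Field names are assumptions, not ACOI's actual schema.
entry = {
    "title": "Example Song",          # descriptive
    "genre": "folk",                  # descriptive
    "instruments": ["guitar"],        # descriptive
    "lyrics": "down by the river",    # content-based
    "contour": [2, 2, -2],            # content-based: pitch intervals
}

def index_lyrics(entries):
    # A minimal inverted index mapping lyric words to entry titles,
    # supporting retrieval by lyrics.
    inv = {}
    for e in entries:
        for word in e["lyrics"].split():
            inv.setdefault(word, set()).add(e["title"])
    return inv

inv = index_lyrics([entry])
```

Descriptive properties can be indexed with conventional database techniques; the content-based properties, such as the melodic contour, are the ones that require feature detection and similarity measures.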

This study is primarily aimed at establishing the architectural requirements for the detection of musical features, and at indicating directions for exploring the inherently difficult problem of finding proper discriminating features and similarity measures in the musical domain. We have limited ourselves to the analysis of music encoded in MIDI, to avoid the technical difficulties involved in extracting basic musical properties from raw sound material. We currently have a simple running prototype for extracting higher-level features from MIDI files. In our approach to musical feature detection, we extended the basic grammar-based ACOI framework with an embedded logic component to facilitate the formulation of predicates and constraints over the musical structure obtained from the input.
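The role of the embedded logic component can be sketched as follows. The predicates and the fragment representation are hypothetical; the point is only that constraints over the extracted musical structure can be stated declaratively and combined.

```python
# Illustrative sketch of an embedded logic component: predicates over
# the musical structure extracted by the detectors, combinable into
# constraints. Predicate and field names are assumptions.

def ascending(intervals):
    # Does the melody rise throughout?
    return all(i > 0 for i in intervals)

def within_octave(pitches):
    # Does the melody stay within a single octave (12 semitones)?
    return max(pitches) - min(pitches) <= 12

def satisfies(fragment, constraints):
    # Check every (predicate, field) constraint against a fragment.
    return all(pred(fragment[field]) for pred, field in constraints)

fragment = {"intervals": [2, 2, 1], "pitches": [60, 62, 64, 65]}
ok = satisfies(fragment, [(ascending, "intervals"),
                          (within_octave, "pitches")])
```

Formulating such conditions as predicates keeps them separate from the grammar itself, so new constraints can be added without changing the structural description.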

At this stage, the prototype does not include actual query facilities. However, we will discuss which query facilities need to be incorporated, and how to approach similarity matching for musical structures to achieve efficient retrieval. We will also look at the issues that play a role in content-based retrieval, briefly reviewing what we consider to be the most significant attempts in this direction.
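One common approach to similarity matching in the melody-retrieval literature, which we sketch here as an illustration rather than as the metric adopted in ACOI, is edit distance over pitch-interval sequences: a query contour matches a stored melody if few insertions, deletions, or substitutions separate them.

```python
# Illustrative sketch: approximate melody matching via edit distance
# on pitch-interval contours. This is one standard technique from the
# content-based retrieval literature, not necessarily ACOI's metric.

def edit_distance(a, b):
    # Classic dynamic-programming edit distance between two sequences.
    m, n = len(a), len(b)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i                      # delete all of a[:i]
    for j in range(n + 1):
        d[0][j] = j                      # insert all of b[:j]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # substitution
    return d[m][n]

query = [2, 2, -2]       # interval contour of a (hummed) query
candidate = [2, 2, -1]   # contour of a stored melody
dist = edit_distance(query, candidate)
```

Working on intervals rather than absolute pitches makes the match transposition-invariant; ranking candidates by this distance then yields an approximate, content-based retrieval scheme.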

Structure

The structure of this report is as follows. First we will discuss search facilities for music on the Web. We will then look at the ACOI framework and the interaction of components supporting grammar-based feature detection. We will describe a grammar for musical fragments and a corresponding feature detector for the extraction of features from a MIDI file or MIDI fragment. Also, we will discuss the options for processing queries and give a brief review of the results that have been achieved for content-based retrieval, in particular the recognition of melody based on similarity metrics. Finally, we will draw some conclusions and indicate directions for further research.