Automatic annotation of musical audio for interactive applications

Paul M. Brossier

2006

Centre for Digital Music, Queen Mary, University of London, London E1 4NS, UNITED KINGDOM.

ABSTRACT

As machines become more portable and more a part of everyday life, developing interactive and ubiquitous systems becomes an important aspect of the new music applications created by the research community. We are interested in developing a robust layer for the automatic annotation of audio signals, to be used in various applications, from music search engines to interactive installations, and in various contexts, from embedded devices to audio content servers. We propose adaptations of existing signal processing techniques to a real-time context. Among these annotation techniques, we concentrate on low- and mid-level tasks such as onset detection, pitch tracking, tempo extraction and note modelling. We present a framework to extract these annotations and evaluate the performance of different algorithms.

The first task is to detect onsets and offsets in audio streams with short latency. Segmenting audio streams into temporal objects enables various manipulations and the analysis of metrical structure. We describe the evaluation of different algorithms and their adaptation to real time. We then tackle the problem of fundamental frequency estimation, again trying to reduce both the delay and the computational cost. Different algorithms are implemented for real-time use and tested on monophonic recordings and complex signals. Spectral analysis can be used to label the temporal segments, and we also approach the estimation of higher-level descriptions. Techniques for modelling note objects and locating beats are implemented and discussed.
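
To make the idea of low-latency onset detection concrete, here is a minimal sketch of one simple causal approach: flag a frame as an onset when its energy rises sharply relative to the previous frame. This is an illustration only and is not taken from the thesis, whose algorithms are not specified in this abstract. The function names (`detect_onsets`, `tone_burst_signal`), the frame and hop sizes, and all thresholds are assumptions chosen for the example.

```python
import math


def detect_onsets(samples, sample_rate, frame_size=512, hop=256,
                  rise_ratio=4.0, floor=1e-4, min_gap=0.05):
    """Return approximate onset times in seconds.

    Hypothetical energy-rise detector: a frame is reported as an onset
    when its mean energy exceeds `rise_ratio` times the previous frame's
    energy. Only the current and previous frames are needed, so the
    decision latency is bounded by one frame.
    """
    onsets = []
    prev_energy = 0.0
    last_onset = -math.inf
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = samples[start:start + frame_size]
        energy = sum(x * x for x in frame) / frame_size
        t = start / sample_rate  # time of the frame start
        # `floor` keeps near-silent frames from producing huge ratios.
        rising = energy > floor and energy > rise_ratio * max(prev_energy, floor)
        # `min_gap` suppresses repeated triggers on the same attack.
        if rising and t - last_onset >= min_gap:
            onsets.append(t)
            last_onset = t
        prev_energy = energy
    return onsets


def tone_burst_signal(sample_rate=8000):
    """Synthetic test signal: two 440 Hz bursts starting at 0.5 s and 1.0 s."""
    signal = []
    for n in range(int(1.25 * sample_rate)):
        t = n / sample_rate
        active = 0.5 <= t < 0.75 or 1.0 <= t < 1.25
        signal.append(0.5 * math.sin(2 * math.pi * 440 * t) if active else 0.0)
    return signal


if __name__ == "__main__":
    print(detect_onsets(tone_burst_signal(), 8000))
```

Each reported time is the start of the frame in which the energy jump was observed, so it can precede the true onset by up to one frame length. Shorter frames reduce that uncertainty and the latency, at the cost of noisier energy estimates.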

Applications of our framework include live and interactive music installations and, more generally, tools for composers and sound engineers. Speed optimisations may bring a significant improvement to various automated tasks, such as automatic classification and recommendation systems. We describe the design of our software solution, both for our research purposes and with a view to its integration within other systems.