MIR PhD Thesis: Emmanuel Vincent (2004)

Instrument Models for Source Separation and Transcription of Music Recordings

Emmanuel Vincent
University of Paris 6, France (December, 2004)

ABSTRACT

For about fifteen years, the study of chamber music recordings has been approached from two distinct viewpoints: source separation and polyphonic transcription. Source separation tries to extract from a recording the signal corresponding to each musical instrument playing. Polyphonic transcription aims to describe a recording by a set of parameters: instrument names, pitch and loudness of the notes, etc. Existing methods, based on spatial and spectro-temporal analysis of the recordings, provide satisfactory results in simple cases. However, their performance generally degrades quickly in the presence of reverberation, instruments of similar pitch range, or notes at harmonic intervals.

Our hypothesis is that these methods often suffer from overly generic models of instrumental sources. We propose to address this by creating specific instrument models based on a learning framework.

In this dissertation, we justify this hypothesis by studying the relevant information present in musical recordings and its use by existing methods. We then describe new probabilistic instrument models inspired by Independent Subspace Analysis (ISA) and give a few examples of learnt instruments. Finally, we exploit these models to separate and transcribe realistic recordings, including CD tracks and synthetic convolutive or underdetermined mixtures of these tracks.
