Automatic Transcription of Piano Music with Neural Networks

Matija Marolt
University of Ljubljana, Slovenia (January 2002)


The thesis presents a system for the transcription of polyphonic piano music. Music transcription can be defined as the act of listening to a piece of music and writing down its notation. It is the process of converting an acoustic waveform into a parametric representation, in which the notes, with their pitches, starting times and durations, are extracted from the musical signal.
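
To make the idea of a parametric representation concrete, the following is a minimal illustrative sketch in Python. It is not taken from the thesis: the `NoteEvent` class, its field names, and the use of MIDI note numbers with A4 = 440 Hz in equal temperament are assumptions chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NoteEvent:
    """One transcribed note (hypothetical structure, not the thesis's format)."""
    pitch: int       # assumed encoding: MIDI note number (69 = A4)
    onset: float     # starting time in seconds
    duration: float  # duration in seconds

    @property
    def offset(self) -> float:
        # Time at which the note ends.
        return self.onset + self.duration

    @property
    def frequency_hz(self) -> float:
        # Equal-temperament fundamental frequency, assuming A4 = 440 Hz.
        return 440.0 * 2.0 ** ((self.pitch - 69) / 12)


# A polyphonic fragment: two notes sounding together, followed by a third.
transcription = [
    NoteEvent(pitch=60, onset=0.0, duration=0.5),
    NoteEvent(pitch=64, onset=0.0, duration=0.5),
    NoteEvent(pitch=67, onset=0.5, duration=1.0),
]
```

Under these assumptions, the output of a transcription system is simply a list of such events; polyphony appears as events whose time spans overlap.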

Our main motivation for this work was to evaluate connectionist approaches in different parts of the transcription system.

We developed a new partial tracking technique based on a combination of a psychoacoustic time-frequency transform and adaptive oscillators. The technique exploits the synchronization ability of adaptive oscillators to track partials in the outputs of the psychoacoustic transform. We show that the algorithm successfully tracks partials in signals with diverse characteristics, including frequency modulation and beating. By joining adaptive oscillators into networks, we extended the method from tracking individual partials to tracking groups of harmonically related partials. Oscillator networks produce a very clear time-frequency representation, and we show that this representation significantly improves the accuracy of transcription. We also show how different types of neural networks perform on the task of note recognition, and how onset detection and repeated note detection can be performed successfully with connectionist approaches.

We tested our system on several synthesized and natural recordings of piano pieces. For most pieces, the accuracy of transcription ranged between 80 and 90 percent. When compared to several other systems, our system achieved similar or better results. We believe that neural networks represent a viable alternative for building transcription systems and should be studied further.
