Thesis defence : Giovanni Bindi


Giovanni Bindi, a doctoral student at Sorbonne University in the Computer Science, Telecommunications and Electronics (EDITE) graduate school in Paris, carried out his research, entitled ‘Compositional learning of audio representations’, at the STMS laboratory (Ircam - Sorbonne University - CNRS - Ministry of Culture), as part of the Sound Analysis and Synthesis team, under the supervision of Philippe Esling.

The defence will take place in English, in the Shannon room (to be confirmed) at IRCAM, on Tuesday 18 March 2025.

It will be recorded on YouTube: https://youtube.com/live/e6SWXCkd68w

The jury will be composed of:

- George Fazekas, Queen Mary University of London (Reviewer)

- Magdalena Fuentes, New York University (Reviewer)

- Ashley Burgoyne, Universiteit van Amsterdam (Examiner)

- Mark Sandler, Queen Mary University of London (Examiner)

- Geoffroy Peeters, Télécom Paris (Examiner)

- Philippe Esling, Sorbonne University (Director)

Abstract:

This thesis explores the intersection of machine learning, generative models, and music composition. While machine learning has transformed many fields, its application to music presents unique challenges. We focus on compositional learning, which involves constructing complex musical structures from simpler, reusable components. Our goal is to provide an initial analysis of how this concept applies to musical audio.

 Our framework consists of two phases: decomposition and recomposition. In the decomposition phase, we extract meaningful representations of instruments from polyphonic mixtures without requiring labeled data. This allows us to identify and separate different sound sources. In the recomposition phase, we introduce a generative approach that builds on these representations to create new musical arrangements. By structuring the process hierarchically—starting with drums and progressively adding other elements like bass and piano—we explore a flexible way to generate accompaniment.
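The two-phase pipeline described above can be sketched schematically. The following is a minimal, purely illustrative sketch: all function names, representation shapes, and the placeholder "generation" step are assumptions made for this example, not the thesis's actual models or API. It only shows the structure of the idea, in which each new track is conditioned on the tracks generated before it, in a fixed hierarchical order.

```python
import numpy as np

def decompose(mixture, dim=8, seed=0):
    """Stand-in for the decomposition phase: extracting per-instrument
    representations from a polyphonic mixture without labels.
    Here it simply returns random vectors for illustration."""
    rng = np.random.default_rng(seed)
    return {name: rng.normal(size=dim) for name in ("drums", "bass", "piano")}

def generate_track(name, context, dim=8):
    """Stand-in generator: conditions each new track on the mean of the
    tracks generated so far (empty context for the first track)."""
    base = np.zeros(dim) if not context else np.mean(list(context.values()), axis=0)
    return base + 0.1  # placeholder for an actual generative model

def recompose(order=("drums", "bass", "piano")):
    """Recomposition phase: build the arrangement hierarchically,
    starting with drums and progressively adding the other parts."""
    tracks = {}
    for name in order:
        tracks[name] = generate_track(name, tracks)
    return tracks

arrangement = recompose()
```

The point of the sketch is the control flow in `recompose`: the bass generator sees the drums, and the piano generator sees both, mirroring the hierarchical conditioning the abstract describes.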

Our findings suggest that compositional learning can improve source separation and structured music generation. While our approach shows promise, further work is needed to assess its broader applicability and generalization. We hope this research contributes to a better understanding of generative models in music and inspires future developments in computational creativity.
