Alice COHEN-HADRIA's thesis

In Music Information Retrieval (MIR) and voice processing, machine learning tools have become increasingly standard over the last few years. In particular, many state-of-the-art systems now rely on neural networks. In this thesis, we propose a broad overview of four different MIR and voice processing tasks, using systems built with neural networks. More precisely, we will use convolutional neural networks, a class of neural networks originally designed for images.

The first task presented is music structure estimation. For this task, we will show how critical the choice of input representation can be when using convolutional neural networks.
The second task is singing voice detection. We will present how to use a voice detection system to automatically align lyrics and audio tracks. With this alignment mechanism, we have created the largest synchronized audio and speech data set, called DALI.
The third task is singing voice separation. For this task, we will present a data augmentation strategy, a way to significantly increase the size of a training set.
Finally, we tackle voice anonymization. We will present an anonymization method that both obfuscates content and masks the speaker identity, while preserving the acoustic scene.
