Hugo CARACALLA' s Thesis

  • these
Sound Texture Synthesis from Summary Statistics
The jury is:
Joshua Reiss, Centre for Digital Music (C4DM), Queen Mary University of London (rapporteur)
Lonce Wyse, Communications and New Media Department, National University of Singapore (rapporteur en visioconf)
Carlos Agon, Université Paris VI
Elaine Chew, IRCAM
Patrick Perez,
Axel Roebel, IRCAM


Sound textures are a wide class of sounds that includes the sound of the rain falling, the hubbub of a crowd and the chirping of flocks of birds. All these sounds present an element of unpredictability which is not commonly sought after in sound synthesis, requiring the use of dedicated algorithms. However, the diverse audio properties of sound textures make the designing of an algorithm able to convincingly recreate varied textures a complex task.

Our work focuses on parametric sound texture synthesis. In this paradigm, a set of summary statistics are extracted from a target texture and iteratively imposed onto a white noise. If the set of statistics is appropriate, the white noise is modified until it resemble the target, sounding as if it had been recorded moments later.

In a first part, we propose improvements to perceptual-based parametric method. These improvements aim at making its synthesis of sharp and salient events by mainly altering and simplifying its imposition process.

In a second, we adapt a parametric visual texture synthesis method based statistics extracted by a Convolutional Neural Networks (CNN) to work on sound textures. We modify the computation of its statistics to fit the properties of sound signals, alter the architecture of the CNN to best fit audio elements present in sound textures and use a time-frequency representation taking both magnitude and phase into account.


En poursuivant votre navigation sur ce site, vous acceptez l'utilisation de cookies pour nous permettre de mesurer l'audience, et pour vous permettre de partager du contenu via les boutons de partage de réseaux sociaux. En savoir plus.