Aaron Einbond will present "Machine Learning of Corpus-Based Spatial Sound Synthesis in Prestidigitation for Percussion and 3D Electronics"
Aaron Einbond, composer-researcher
in collaboration with Maxime Echardour, percussion, Ensemble L’Instant Donné,
Thibaut Carpentier, EAC, and Diemo Schwarz, ISMM
Final presentation of the artistic research residency MusAI "Music and Artificial Intelligence: Building Critical Interdisciplinary Studies", supported by the European Research Council
Please note: the performance and seminar take place in Studio 5; the livestream is available on the YouTube channel: https://youtu.be/j0kywabT0WQ
Abstract:
How can machine learning of rich spatial data from acoustic instruments be applied to re-embody the spatial presence of the live instrument and performer, and to embed the listener in a 3D experience? We present a sketch of a new composition, Prestidigitation, for percussion and 3D electronics, in which a sculptural setup of small and homemade instruments is amplified and resynthesized through spherical microphones and loudspeakers, situating the audience amidst a virtual-reality performance for the ears. The live percussionist is accompanied by the IKO compact spherical loudspeaker array, which leverages natural acoustic data from two sources: 3D amplification with the Eigenmike 32-channel microphone array, and a database of 3D instrumental radiation patterns measured at Technische Universität Berlin.

We build on the composition and research project Cosmologies for piano and 3D electronics, which in 2020 was the first to connect audio feature analysis and corpus-based synthesis (CBS) with machine learning (ML) and sound spatialization using higher-order ambisonics (HOA). While Cosmologies was diffused through an ambisonic loudspeaker dome, we now turn the situation "inside-out" by placing the IKO in the middle of the performance space. Using the IRCAM packages Spat5 and CataRT-MuBu, we apply ML of HOA models to a corpus of samples in response to their timbral descriptors. A computer improvisation algorithm then organizes the resulting spatialized samples in time, following an Audio Oracle model.

Both approaches offer a critical perspective on existing ML tools that overlook spatial listening. They also represent a novel spatialization paradigm that differs significantly from virtual speakers or focused beams: rather than point sources, sounds are resynthesized in dynamic 3D patterns with theoretically unlimited spatial polyphony. In combination, these two approaches present promising possibilities for the training and continuation of spatial gestures.
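As background for the HOA terminology above, the standard higher-order ambisonic encoding represents a source signal as a set of spherical-harmonic-weighted channels. This is the textbook formulation, not a description of the specific processing in Spat5 or in the piece:

    % Standard HOA encoding of a single source from direction (theta, phi):
    % each ambisonic channel B_{nm} is the source signal s(t) weighted by
    % a spherical harmonic evaluated at the source direction.
    \[
      B_{nm}(t) = s(t)\, Y_{nm}(\theta, \phi),
      \qquad 0 \le n \le N, \quad -n \le m \le n ,
    \]

where the Y_{nm} are the real spherical harmonics up to order N, so an order-N encoding carries (N+1)^2 channels. Decoding is a linear map from these channels to loudspeaker signals, whether on a surrounding dome (as in Cosmologies) or beamformed outward from a compact sphere such as the IKO.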
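To give a concrete flavor of the corpus-based synthesis step, the following is a minimal sketch of descriptor-driven unit selection in Python. It illustrates the general CataRT idea of matching a live analysis frame to the nearest corpus unit; it is not the CataRT-MuBu implementation (which runs in Max/MSP), and the descriptor set and corpus values are invented for the example.

    import numpy as np

    # Hypothetical toy corpus: each sample is indexed by timbral
    # descriptors, e.g. [spectral centroid (Hz), loudness (dB), flatness].
    corpus_descriptors = np.array([
        [1200.0, -20.0, 0.10],   # sample 0: dark, quiet, tonal
        [4500.0, -12.0, 0.55],   # sample 1: bright, loud, noisy
        [2800.0, -18.0, 0.30],   # sample 2: mid register
    ])

    def normalize(features, mean, std):
        """Standardize descriptors so each dimension contributes comparably."""
        return (features - mean) / std

    mean = corpus_descriptors.mean(axis=0)
    std = corpus_descriptors.std(axis=0)
    corpus_norm = normalize(corpus_descriptors, mean, std)

    def select_unit(target):
        """Return the index of the corpus unit nearest to the target descriptors."""
        t = normalize(np.asarray(target, dtype=float), mean, std)
        distances = np.linalg.norm(corpus_norm - t, axis=1)
        return int(np.argmin(distances))

    # A live analysis frame (bright-ish, moderately loud) selects sample 1.
    print(select_unit([4000.0, -14.0, 0.5]))

In the piece as described, each selected unit would additionally carry a learned HOA model of its spatial pattern, so that selection by timbre also determines 3D resynthesis.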
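Similarly, the temporal organization can be illustrated with a factor oracle, the discrete structure that the Audio Oracle (Dubnov et al.) generalizes to audio frames by replacing symbol equality with a descriptor-similarity threshold. This sketch builds the oracle over a symbol sequence and "improvises" by mixing linear continuations with suffix-link jumps; it is a simplified stand-in, not the algorithm used in Prestidigitation.

    import random

    def build_factor_oracle(sequence):
        """Build a factor oracle: forward transitions plus suffix links."""
        n = len(sequence)
        trans = [dict() for _ in range(n + 1)]   # trans[state][symbol] -> state
        sfx = [-1] * (n + 1)                     # suffix links
        for i, sym in enumerate(sequence, start=1):
            trans[i - 1][sym] = i
            k = sfx[i - 1]
            while k > -1 and sym not in trans[k]:
                trans[k][sym] = i
                k = sfx[k]
            sfx[i] = 0 if k == -1 else trans[k][sym]
        return trans, sfx

    def improvise(sequence, length, p_continue=0.7, seed=None):
        """Walk the oracle: continue linearly with probability p_continue,
        otherwise jump along a suffix link to recombine earlier material."""
        rng = random.Random(seed)
        _, sfx = build_factor_oracle(sequence)
        state, out = 0, []
        for _ in range(length):
            if state < len(sequence) and rng.random() < p_continue:
                out.append(sequence[state])     # linear continuation
                state += 1
            else:
                state = max(sfx[state], 0)      # recombination jump
                if state < len(sequence):
                    out.append(sequence[state])
                    state += 1
        return out

    print(improvise(list("abracadabra"), 16, seed=3))

Applied to spatialized corpus units rather than letters, such a walk yields continuations that recombine recorded material while preserving its local ordering, which is what the abstract means by training and continuation of spatial gestures.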