Philippe ESLING defended his habilitation à diriger des recherches (HDR) at IRCAM, entitled "Probabilistic Generative Models for Artificial Creative Intelligence", on 14 January 2022 at 4 PM.
You can watch the defense at: https://youtu.be/S0Gys2PmFYQ
and on IRCAM's Medias site: https://medias.ircam.fr/xd90aa1
The jury consisted of:
Yann LE CUN, professor, New York University, USA
Robert STURM, professor, Royal Institute of Technology (KTH), Stockholm, Sweden
Michèle SEBAG, professor, Laboratoire de Recherche en Informatique (LRI - CNRS)
Douglas ECK, professor, DIRO, Université de Montréal, Google Brain, Canada
Stephen McADAMS, professor, CIRMMT, McGill University, Montréal, Canada
Patrick GALLINARI, professor, LIP6 Sorbonne Université, Paris, France
Abstract:
In recent years, advances in artificial intelligence, and most notably deep learning, have reshaped our everyday life. Indeed, deep learning has provided astonishing results, strongly outperforming state-of-the-art models in information retrieval [6]. These models are now pervasive in modern technology, with a plethora of architectures developed for tasks in almost all domains of scientific inquiry. Despite its substantial contributions to research, deep learning first focused on a mathematico-logical approach aimed at solving formal problems through sets of supervised goals. Very recently, the community has started shifting its attention towards generative models, which can be defined as a form of unsupervised representation learning [8]. Although these approaches address some limitations of existing models, research trying to understand creative intelligence, which is the core of our project, remains scarce. The study of this new paradigm proves crucial in two main respects. On the one hand, it aims to understand creativity, the faculty that so fundamentally distinguishes human beings from the other branches of the tree of life. On the other hand, it seeks to model cognitive and perceptual phenomena that remain particularly elusive. The growing interest in these issues is reflected in the increasing use of generative systems by a wide variety of researchers from diverse horizons (from industrial to fundamental science). This trend underlines the need to study this approach for future scientific discoveries. To that end, music provides an ideal framework for developing our understanding of the creative mechanisms of intelligence.
Indeed, the mechanisms of creativity, particularly in musical improvisation and audio synthesis, bring together stimulating theoretical questions and cognitive processes that are difficult to model. Specifically, the notion of musical time is a primordial component, inseparable from music, which unfolds over multiple scales. Thus, through the understanding of musical creativity, most of the current challenges in machine learning come together: temporality, multimodal information, data scarcity, hierarchical structures, and the lack of a formal goal. These exciting issues are also reflected in a tremendous array of research areas. The objective of this project is therefore to bring new answers to the field of artificial intelligence through a bilateral approach: first, making new discoveries by exploiting the latest advances in artificial intelligence on musical data; second, applying these innovative methods to other areas of research, through partnerships in perception and environmental monitoring that face the same scientific barriers. Finally, this project aims to develop the relationship between humans and AI by targeting situations of partnership and co-creativity in musical improvisation.
Over the past years, our project has sought to extend deep learning approaches towards multivariate and multimodal data, through the analysis of musical orchestration, auditory perception and audio synthesis. In this context, multivariate analysis of temporal processes is required, given the inherently multidimensional nature of instrumental mixtures. Furthermore, time series must be scrutinized at variable time scales (termed here granularities), as a wealth of time scales co-exist in music (from the identity of single notes up to the structure of entire pieces). Moreover, orchestration lies at the exact intersection between symbolic (musical writing) and signal (audio recording) representations, which warrants multimodal approaches that can work simultaneously on both hierarchies of information. Our work has aimed to address these issues by relying on generative models, notably by trying to construct organized latent spaces [45]. These spaces provide a simple way to understand the underlying factors of variation in music, but also to control them in order to generate novel musical content. This research has produced multiple new creative systems and musical software, developed directly through existing industrial partnerships. We also performed an epistemological loop by applying similar approaches to auditory perception and metagenomics (as we will detail later on). We believe that addressing the question of creative intelligence through the analysis of orchestration could give rise to a whole new category of generic creative learning systems.
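To give a concrete flavor of the kind of model involved, a variational autoencoder is one common way to construct such an organized latent space. The following is a minimal sketch written for this announcement, not the architecture presented in the defense or in [45]; the framework (PyTorch), all layer sizes, and the idea of feeding spectrogram frames are assumptions made purely for illustration.

    # Minimal VAE sketch (illustrative only; all dimensions are assumptions).
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class VAE(nn.Module):
        def __init__(self, input_dim=1024, latent_dim=16):
            super().__init__()
            # Encoder maps an input (e.g. a spectrogram frame) to the
            # parameters of a Gaussian posterior over the latent space.
            self.encoder = nn.Sequential(
                nn.Linear(input_dim, 256), nn.ReLU(),
                nn.Linear(256, 64), nn.ReLU(),
            )
            self.mu = nn.Linear(64, latent_dim)
            self.logvar = nn.Linear(64, latent_dim)
            # Decoder maps a latent point back to the input space, so that
            # moving through the latent space generates new content.
            self.decoder = nn.Sequential(
                nn.Linear(latent_dim, 64), nn.ReLU(),
                nn.Linear(64, 256), nn.ReLU(),
                nn.Linear(256, input_dim),
            )

        def forward(self, x):
            h = self.encoder(x)
            mu, logvar = self.mu(h), self.logvar(h)
            # Reparameterization trick: sample z while keeping gradients.
            z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
            return self.decoder(z), mu, logvar

    def elbo_loss(x, x_hat, mu, logvar):
        # Reconstruction term plus the KL divergence to a unit Gaussian
        # prior; the KL term is what regularizes (organizes) the latent space.
        recon = F.mse_loss(x_hat, x, reduction="sum")
        kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
        return recon + kl

    # Usage sketch: x would be a batch of inputs, e.g. spectrogram frames.
    # model = VAE()
    # x_hat, mu, logvar = model(x)
    # loss = elbo_loss(x, x_hat, mu, logvar)

The KL regularization is the design choice that makes the latent space "organized": nearby latent points decode to similar outputs, so the underlying factors of variation can be inspected and controlled, as described in the abstract.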