Overview


This thematic activity focuses on the mathematical challenges of machine learning. The spectacular successes of machine learning across a wide range of domains raise important theoretical questions in several areas of mathematics, notably probability, statistics, combinatorics, optimization, and geometry. The CRM will bring together researchers in machine learning and in mathematics to work on these new problems. The main topics will be combinatorial statistics, online learning, and deep neural networks.

The program will include two workshops, on "Combinatorial Statistics" and "New Problems in Learning Theory," as well as seminars given by invited and resident researchers.

Opening lecture: Monday, April 16, with Yoshua Bengio.
11:30 a.m. – 12:30 p.m.

Université de Montréal, Pavillon André-Aisenstadt, Room 1360
Deep Learning for AI

There has recently been impressive progress with brain-inspired statistical learning algorithms based on the idea of learning multiple levels of representation, also known as neural networks or deep learning. They shine in artificial intelligence tasks involving the perception and generation of sensory data such as images or sounds, and to some extent in understanding and generating natural language. We have proposed new generative models, borrowing from game theory, which lead to training frameworks very different from the traditional maximum-likelihood framework. A theoretical understanding of the success of deep learning is still work in progress, but it relies on representational aspects as well as optimization aspects, and the two interact. At the heart is the ability of these learning mechanisms to capitalize on the compositional nature of the underlying data distributions: some functions can be represented exponentially more efficiently with deep distributed networks than with approaches such as standard non-parametric methods, which lack both depth and distributed representations. On the optimization side, we now have evidence that local minima (due to the highly non-convex nature of the training objective) may not be as much of a problem as was thought a few years ago, and that training with variants of stochastic gradient descent actually helps to quickly find better-generalizing solutions. Finally, interesting new questions and answers are arising regarding learning theory for deep networks: why even very large networks do not necessarily overfit, and how the representation-forming structure of these networks may give rise to better error bounds that do not depend strictly on the i.i.d. data hypothesis.

Monday, April 23 – Thursday, April 26
Workshop on Learning Theory
24 invited speakers.
Open to all researchers. Registration is full.

Monday, April 30 – Friday, May 4
Workshop on Combinatorial Statistics (by invitation only).
A mini-course by Yuval Peres (Microsoft Research).

Closing lecture: Friday, May 11.