The Statistical Physics of Deep Learning: on the Beneficial Roles of Dynamic Criticality, Random Landscapes, and the Reversal of Time

Monday, October 5, 2015 - 4:15pm
Room #1 LeConte Hall
Public Lecture: 

Neuronal networks have enjoyed a resurgence both in the worlds of neuroscience, where they yield mathematical frameworks for thinking about complex neural datasets, and in machine learning, where they achieve state of the art results on a variety of tasks, including machine vision, speech recognition, and language translation.   Despite their empirical success, a mathematical theory of how deep neural circuits, with many layers of cascaded nonlinearities, learn and compute remains elusive.  We will discuss three recent vignettes in which ideas from statistical physics can shed light on this issue.  In particular, we show how dynamical criticality can help in neural learning, how the non-intuitive geometry of high dimensional error landscapes can be exploited to speed up learning, and how modern ideas from non-equilibrium statistical physics, like the Jarzynski equality, can be extended to yield powerful algorithms for modeling complex probability distributions.  Time permitting, we will also discuss the relationship between neural network learning dynamics and the developmental time course of semantic concepts in infants.