"Why deep learning works? 

Matthieu Wyart | Professor of Theoretical Physics in the School of Basic Sciences at EPFL.

“General arguments suggest that classifying data in high dimension should be impossible (a picture of 100 × 100 pixels, for example, lives in 10,000 dimensions, whereas we live in 3): the number of training examples needed to achieve good performance should be larger than the number of atoms in the universe. Yet it works: with 1,000,000 examples you can learn to distinguish cats from dogs. This means that pictures are not generic: they have a lot of symmetry and structure that we do not yet know how to describe mathematically. We would like to build a theory that predicts how much data is necessary to learn a rule. Currently we do not even know what fixes its order of magnitude: for cats and dogs, why isn’t it one hundred, or instead a billion billion?”

I am attaching a paper we are writing on this topic; the introduction may be readable for a non-specialist:

Matthieu Wyart Publications