Math Department Invited Talks

[Math Department Invited Talk] Modern Statistical Theory Inspired by Deep Learning

Abstract: Modern learning algorithms such as deep learning have achieved great success in real applications. However, some of their empirical behaviors cannot be explained within the classical statistical learning framework. For example, deep learning algorithms achieve small testing error even when the training error is zero, i.e., even when they interpolate (over-fit) the training data. Another phenomenon is observed in image recognition, where a barely noticeable perturbation of the data can lead to a dramatic increase in the misclassification rate. Inspired by these observations, we attempt to develop new theoretical insights into data interpolation and adversarial testing using the very simple nearest neighbor algorithm. In particular, we prove the statistical optimality of interpolated nearest neighbor algorithms. More surprisingly, we discover that, under proper interpolation, the classification performance is even better than that of the best kNN in terms of the multiplicative constant. As for adversarial testing, we demonstrate that different adversarial mechanisms lead to different phase-transition phenomena in the upper bound of the misclassification rate. Additionally, the technical analysis we developed to handle adversarial samples can also be applied to other variants of kNN, e.g., pre-processed 1NN and distributed NN.
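
To make the two ideas in the abstract concrete, here is a minimal illustrative sketch in Python, not the speaker's actual construction: an interpolating weighted kNN whose weights w_i are proportional to d_i^(-gamma) (so a training point at distance zero receives all the weight and the classifier fits the training labels exactly), together with a simple heuristic adversarial perturbation of a test point. The function names, the weighting exponent gamma, and the attack rule are assumptions made for illustration only.

```python
import numpy as np

def interpolated_knn_predict(X_train, y_train, x, k=5, gamma=2.0):
    """Weighted kNN with weights w_i proportional to d_i^(-gamma).
    A neighbor at distance zero receives all the weight, so the
    classifier interpolates the training labels (zero training error).
    Assumes binary labels in {0, 1}; gamma > 0 is illustrative."""
    dists = np.linalg.norm(X_train - x, axis=1)
    idx = np.argsort(dists)[:k]              # k nearest neighbors
    d = dists[idx]
    if d[0] == 0.0:                          # exact hit: reproduce its label
        return int(y_train[idx[0]])
    w = d ** (-gamma)                        # interpolating weights
    w /= w.sum()
    return int(np.round(w @ y_train[idx]))   # weighted vote

def adversarial_perturb(X_train, y_train, x, y, eps):
    """Toy white-box attack on a nearest-neighbor classifier: nudge the
    test point x (true label y) toward the closest training point of the
    opposite class, subject to an L2 budget eps."""
    opp = X_train[y_train != y]
    target = opp[np.argmin(np.linalg.norm(opp - x, axis=1))]
    step = target - x
    norm = np.linalg.norm(step)
    return x + (min(eps, norm) / norm) * step if norm > 0 else x

# Toy usage on synthetic, linearly separable data.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
x0 = np.array([0.05, 0.05])
print(interpolated_knn_predict(X, y, x0))                 # clean prediction
x_adv = adversarial_perturb(X, y, x0, y=1, eps=0.3)
print(interpolated_knn_predict(X, y, x_adv))              # prediction under attack
```

Varying eps in such a sketch is one crude way to see how the misclassification rate degrades with the attack budget, the quantity whose upper bound exhibits the phase transitions discussed in the talk.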


Bio: Guang Cheng is a Professor of Statistics at Purdue University. He received his Ph.D. in Statistics from the University of Wisconsin-Madison in 2006. His research interests include big data and high-dimensional statistical inference, and he has more recently turned to deep learning and reinforcement learning. Cheng is a recipient of the NSF CAREER Award, the Noether Young Scholar Award, and a Simons Fellowship in Mathematics. Please visit his big data theory research group at http://www.science.purdue.edu/bigdata/