Speaker: Zhi-Qin John Xu (New York University Abu Dhabi)
Time: 2019-01-08, 16:00-17:00
Venue: Lecture Hall 415, Building 3, Huiyuan
In this work, from the Fourier analysis perspective, we propose a unified mechanism – the Frequency Principle (F-Principle), namely that Deep Neural Networks (DNNs) trained by gradient-based methods implicitly endow low frequencies with higher priority during training – to understand the strengths and limitations of DNNs, which is important to both their theoretical study and their applications. We show that the F-Principle holds not only for synthetic datasets of low-dimensional functions but also for high-dimensional real datasets, e.g., MNIST and CIFAR10. Moreover, we find that the F-Principle provides insight into how the success and failure of DNNs can be differentiated by a Fourier-domain characterization of the target dataset. Based on the F-Principle, we further propose that DNNs can be incorporated into numerical schemes to accelerate the convergence of low frequencies for a variety of computational problems, in which most conventional methods (e.g., the Jacobi method) converge slowly for low frequencies. Finally, we provide theoretical results for DNNs with one hidden layer, which shed light on the key mechanism underlying the F-Principle.
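To illustrate the phenomenon described in the abstract, the following is a minimal sketch (not the authors' code), assuming PyTorch and NumPy are available: a small network is fitted by gradient descent to a 1D target containing one low-frequency and one high-frequency component, and the Fourier spectrum of the residual is monitored; under the F-Principle the low-frequency error is expected to decay first.

```python
# Illustrative sketch of the F-Principle (hypothetical setup, not the authors' experiments).
import numpy as np
import torch
import torch.nn as nn

torch.manual_seed(0)

# Synthetic 1D target: a low frequency (k = 1) plus a high frequency (k = 10).
x = torch.linspace(-1, 1, 256).unsqueeze(1)
y = torch.sin(np.pi * x) + 0.5 * torch.sin(10 * np.pi * x)

# Small fully connected network trained by plain (full-batch) gradient descent.
net = nn.Sequential(nn.Linear(1, 200), nn.Tanh(), nn.Linear(200, 1))
opt = torch.optim.SGD(net.parameters(), lr=0.1)

def residual_spectrum(err):
    """Magnitude of the discrete Fourier transform of the fitting residual."""
    return np.abs(np.fft.rfft(err.detach().numpy().ravel()))

for step in range(5001):
    opt.zero_grad()
    loss = ((net(x) - y) ** 2).mean()
    loss.backward()
    opt.step()
    if step % 1000 == 0:
        spec = residual_spectrum(net(x) - y)
        # With 256 samples on an interval of length 2, the target frequencies
        # k = 1 and k = 10 sit (approximately) in FFT bins 1 and 10.
        print(f"step {step:5d}  loss {loss.item():.4f}  "
              f"|err(k=1)| {spec[1]:.3f}  |err(k=10)| {spec[10]:.3f}")
```

In runs of this kind, the printed residual at the low-frequency bin typically drops well before the high-frequency bin does, which is the qualitative behavior the F-Principle describes.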
This is joint work with Yaoyu Zhang, Tao Luo, Yanyang Xiao, and Zheng Ma.
Special thanks go to Weinan E, David W. McLaughlin, and Wei Cai for helpful discussions.