no code implementations • 25 Sep 2020 • Mohammad Keshavarzi, Aakash Parikh, Xiyu Zhai, Melody Mao, Luisa Caldas, Allen Y. Yang
Spatial computing experiences are constrained by the real-world surroundings of the user.
no code implementations • 27 Aug 2019 • Tengyuan Liang, Alexander Rakhlin, Xiyu Zhai
We study the risk of minimum-norm interpolants of data in Reproducing Kernel Hilbert Spaces.
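A minimal sketch of the object studied here, assuming a Gaussian kernel and synthetic data (both illustrative choices, not taken from the paper): the minimum-norm interpolant in an RKHS is the ridgeless kernel predictor $\hat{f}(x) = k(x, X) K^{-1} y$, where $K$ is the kernel Gram matrix on the training inputs.

```python
import numpy as np

def gaussian_kernel(A, B, bandwidth=1.0):
    """Gram matrix of the Gaussian (RBF) kernel between rows of A and B."""
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq_dists / (2 * bandwidth ** 2))

def min_norm_interpolant(X_train, y_train, bandwidth=1.0):
    """Minimum-RKHS-norm function interpolating (X_train, y_train): ridgeless kernel regression."""
    K = gaussian_kernel(X_train, X_train, bandwidth)
    alpha = np.linalg.solve(K, y_train)        # no regularization term
    return lambda X: gaussian_kernel(X, X_train, bandwidth) @ alpha

# toy data, for illustration only
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=50)
f_hat = min_norm_interpolant(X, y)
print(np.abs(f_hat(X) - y).max())              # ~0: the predictor interpolates the training data
```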
no code implementations • 26 Jun 2019 • Tiancheng Yu, Xiyu Zhai, Suvrit Sra
The performance of a machine learning system is usually evaluated using i.i.d. observations with true labels.
no code implementations • 28 Dec 2018 • Alexander Rakhlin, Xiyu Zhai
We show that minimum-norm interpolation in the Reproducing Kernel Hilbert Space corresponding to the Laplace kernel is not consistent if the input dimension is constant.
no code implementations • NeurIPS 2018 • Simon S. Du, Yining Wang, Xiyu Zhai, Sivaraman Balakrishnan, Ruslan R. Salakhutdinov, Aarti Singh
We show that for an $m$-dimensional convolutional filter with linear activation acting on a $d$-dimensional input, the sample complexity of achieving population prediction error of $\epsilon$ is $\widetilde{O}(m/\epsilon^2)$, whereas the sample complexity for its FNN counterpart is lower bounded by $\Omega(d/\epsilon^2)$ samples.
no code implementations • 9 Nov 2018 • Simon S. Du, Jason D. Lee, Haochuan Li, Li-Wei Wang, Xiyu Zhai
Gradient descent finds a global minimum in training deep neural networks despite the objective function being non-convex.
no code implementations • ICLR 2019 • Simon S. Du, Xiyu Zhai, Barnabas Poczos, Aarti Singh
One of the mysteries in the success of neural networks is that randomly initialized first-order methods like gradient descent can achieve zero training loss even though the objective function is non-convex and non-smooth.
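As an illustrative sketch of this phenomenon, the snippet below trains a small over-parameterized two-layer ReLU network with full-batch gradient descent on random data; the width, step size, and data are arbitrary choices for illustration, not the paper's construction, but the training loss is typically driven to essentially zero despite the non-convex, non-smooth objective.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, m = 20, 10, 2000                  # samples, input dim, hidden width (over-parameterized)
X = rng.normal(size=(n, d))
X /= np.linalg.norm(X, axis=1, keepdims=True)   # unit-norm inputs
y = rng.normal(size=n)

W = rng.normal(size=(m, d))             # first-layer weights, trained
a = rng.choice([-1.0, 1.0], size=m)     # output weights, held fixed

def loss_and_grad(W):
    pre = X @ W.T                       # (n, m) pre-activations
    H = np.maximum(pre, 0.0)            # ReLU features
    pred = H @ a / np.sqrt(m)           # network outputs, shape (n,)
    resid = pred - y
    active = (pre > 0).astype(float)    # ReLU derivative
    grad = ((resid[:, None] * active) * a[None, :]).T @ X / np.sqrt(m)
    return 0.5 * np.sum(resid ** 2), grad

lr = 0.1
for step in range(5001):
    loss, grad = loss_and_grad(W)
    if step % 1000 == 0:
        print(step, loss)               # squared loss decays toward zero
    W -= lr * grad
```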
no code implementations • NeurIPS 2018 • Simon S. Du, Yining Wang, Xiyu Zhai, Sivaraman Balakrishnan, Ruslan Salakhutdinov, Aarti Singh
It is widely believed that the practical success of Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) owes to the fact that CNNs and RNNs use a more compact parametric representation than their Fully-Connected Neural Network (FNN) counterparts, and consequently require fewer training examples to accurately estimate their parameters.
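A back-of-the-envelope illustration of this parameter-counting intuition (the sizes below are arbitrary, not from the paper): a single 1-D convolutional filter reuses the same $m$ weights at every position of a $d$-dimensional input, while a fully-connected layer producing the same number of outputs needs a separate weight for every input-output pair.

```python
# Parameter counts: 1-D convolutional filter vs. fully-connected layer
# on a d-dimensional input (illustrative sizes only).
d, m = 1024, 16                     # input dimension, filter size

conv_params = m                     # one filter of size m, shared across all positions
fc_params = d * (d - m + 1)         # dense weights producing the same number of outputs

print(f"convolutional filter: {conv_params} parameters")
print(f"fully-connected layer: {fc_params} parameters")
```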
no code implementations • 19 Jul 2017 • Wenlong Mou, Li-Wei Wang, Xiyu Zhai, Kai Zheng
This is the first algorithm-dependent result with reasonable dependence on aggregated step sizes for non-convex learning, and it has important implications for the statistical learning aspects of stochastic gradient methods in complicated models such as deep learning.