no code implementations • 25 Sep 2020 • Mohammad Keshavarzi, Aakash Parikh, Xiyu Zhai, Melody Mao, Luisa Caldas, Allen Y. Yang

Spatial computing experiences are constrained by the real-world surroundings of the user.

no code implementations • 27 Aug 2019 • Tengyuan Liang, Alexander Rakhlin, Xiyu Zhai

We study the risk of minimum-norm interpolants of data in Reproducing Kernel Hilbert Spaces.

no code implementations • 26 Jun 2019 • Tiancheng Yu, Xiyu Zhai, Suvrit Sra

The performance of a machine learning system is usually evaluated by using i.i.d. observations with true labels.

no code implementations • 28 Dec 2018 • Alexander Rakhlin, Xiyu Zhai

We show that minimum-norm interpolation in the Reproducing Kernel Hilbert Space corresponding to the Laplace kernel is not consistent if input dimension is constant.

no code implementations • NeurIPS 2018 • Simon S. Du, Yining Wang, Xiyu Zhai, Sivaraman Balakrishnan, Ruslan R. Salakhutdinov, Aarti Singh

We show that for an $m$-dimensional convolutional filter with linear activation acting on a $d$-dimensional input, the sample complexity of achieving population prediction error of $\epsilon$ is $\widetilde{O}(m/\epsilon^2)$, whereas the sample-complexity for its FNN counterpart is lower bounded by $\Omega(d/\epsilon^2)$ samples.

no code implementations • 9 Nov 2018 • Simon S. Du, Jason D. Lee, Haochuan Li, Li-Wei Wang, Xiyu Zhai

Gradient descent finds a global minimum in training deep neural networks despite the objective function being non-convex.

no code implementations • ICLR 2019 • Simon S. Du, Xiyu Zhai, Barnabas Poczos, Aarti Singh

One of the mysteries in the success of neural networks is that randomly initialized first-order methods like gradient descent can achieve zero training loss even though the objective function is non-convex and non-smooth.

no code implementations • NeurIPS 2018 • Simon S. Du, Yining Wang, Xiyu Zhai, Sivaraman Balakrishnan, Ruslan Salakhutdinov, Aarti Singh

It is widely believed that the practical success of Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) owes to the fact that CNNs and RNNs use a more compact parametric representation than their Fully-Connected Neural Network (FNN) counterparts, and consequently require fewer training examples to accurately estimate their parameters.

no code implementations • 19 Jul 2017 • Wenlong Mou, Li-Wei Wang, Xiyu Zhai, Kai Zheng

This is the first algorithm-dependent result with reasonable dependence on aggregated step sizes for non-convex learning, and has important implications for the statistical learning aspects of stochastic gradient methods in complicated models such as deep learning.
