no code implementations • 22 Feb 2024 • Jikai Jin, Vasilis Syrgkanis
Average treatment effect estimation is the most central problem in causal inference, with applications to numerous disciplines.
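To make the estimand concrete, here is a minimal hedged sketch of a generic doubly robust (AIPW) estimate of the average treatment effect on synthetic data. The nuisance models, the synthetic data-generating process, and all variable names are illustrative assumptions; this is not claimed to be the estimator analyzed in the paper.

```python
# Hedged sketch: generic AIPW (doubly robust) ATE estimate on synthetic data.
import numpy as np
from sklearn.linear_model import LogisticRegression, LinearRegression

def aipw_ate(X, T, Y):
    """X: covariates (n, d); T: binary treatment (n,); Y: outcome (n,)."""
    # Nuisance estimates: propensity score and per-arm outcome regressions.
    e = LogisticRegression(max_iter=1000).fit(X, T).predict_proba(X)[:, 1]
    mu1 = LinearRegression().fit(X[T == 1], Y[T == 1]).predict(X)
    mu0 = LinearRegression().fit(X[T == 0], Y[T == 0]).predict(X)
    # AIPW score: outcome-model difference plus inverse-propensity residual corrections.
    psi = (mu1 - mu0
           + T * (Y - mu1) / np.clip(e, 1e-3, 1 - 1e-3)
           - (1 - T) * (Y - mu0) / np.clip(1 - e, 1e-3, 1 - 1e-3))
    return psi.mean()

# Illustrative synthetic example with true ATE = 2.
rng = np.random.default_rng(0)
n, d = 2000, 5
X = rng.normal(size=(n, d))
T = rng.binomial(1, 1 / (1 + np.exp(-X[:, 0])))
Y = 2.0 * T + X @ rng.normal(size=d) + rng.normal(size=n)
print(aipw_ate(X, T, Y))
```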
1 code implementation • 30 Nov 2023 • Kaifeng Lyu, Jikai Jin, Zhiyuan Li, Simon S. Du, Jason D. Lee, Wei Hu
Recent work by Power et al. (2022) highlighted a surprising "grokking" phenomenon in learning arithmetic tasks: a neural net first "memorizes" the training set, resulting in perfect training accuracy but near-random test accuracy, and after training for sufficiently longer, it suddenly transitions to perfect test accuracy.
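As a rough illustration of the kind of setup in which grokking is typically observed, here is a hedged sketch: a small MLP trained with weight decay on modular addition, logging train and test accuracy over a long run. The architecture, optimizer, and hyperparameters are illustrative assumptions, not the configuration studied in the paper.

```python
# Hedged sketch: modular-addition task where grokking-like behavior has been reported.
import torch
import torch.nn as nn

p = 97                                          # modulus of the arithmetic task
pairs = [(a, b) for a in range(p) for b in range(p)]
perm = torch.randperm(len(pairs)).tolist()
split = int(0.4 * len(pairs))                   # small training fraction (assumption)

def encode(batch):
    # One-hot encode the pair (a, b) into a 2p-dimensional input.
    x = torch.zeros(len(batch), 2 * p)
    y = torch.empty(len(batch), dtype=torch.long)
    for i, (a, b) in enumerate(batch):
        x[i, a] = 1.0
        x[i, p + b] = 1.0
        y[i] = (a + b) % p
    return x, y

x_tr, y_tr = encode([pairs[i] for i in perm[:split]])
x_te, y_te = encode([pairs[i] for i in perm[split:]])

model = nn.Sequential(nn.Linear(2 * p, 256), nn.ReLU(), nn.Linear(256, p))
opt = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1.0)
loss_fn = nn.CrossEntropyLoss()

for step in range(20000):
    opt.zero_grad()
    loss_fn(model(x_tr), y_tr).backward()
    opt.step()
    if step % 1000 == 0:
        with torch.no_grad():
            acc_tr = (model(x_tr).argmax(1) == y_tr).float().mean().item()
            acc_te = (model(x_te).argmax(1) == y_te).float().mean().item()
        print(f"step {step}: train acc {acc_tr:.2f}, test acc {acc_te:.2f}")
```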
no code implementations • 21 Nov 2023 • Jikai Jin, Vasilis Syrgkanis
In this work, we provide the first identifiability results based on data that stem from general environments.
no code implementations • 27 Jan 2023 • Jikai Jin, Zhiyuan Li, Kaifeng Lyu, Simon S. Du, Jason D. Lee
It is believed that Gradient Descent (GD) induces an implicit bias towards good generalization in training machine learning models.
no code implementations • 28 Sep 2022 • Jikai Jin, Yiping Lu, Jose Blanchet, Lexing Ying
Learning mappings between infinite-dimensional function spaces has achieved empirical success in many disciplines of machine learning, including generative modeling, functional data analysis, causal inference, and multi-agent reinforcement learning.
no code implementations • 27 May 2022 • Binghui Li, Jikai Jin, Han Zhong, John E. Hopcroft, LiWei Wang
Moreover, we establish an improved upper bound of $\exp({\mathcal{O}}(k))$ for the network size to achieve low robust generalization error when the data lies on a manifold with intrinsic dimension $k$ ($k \ll d$).
no code implementations • 4 Nov 2021 • Jikai Jin, Suvrit Sra
We contribute to advancing the understanding of Riemannian accelerated gradient methods.
no code implementations • NeurIPS 2021 • Jikai Jin, Bohang Zhang, Haiyang Wang, LiWei Wang
Distributionally robust optimization (DRO) is a widely used approach for learning models that are robust to distribution shift.
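For context, one common way to instantiate a DRO objective is the CVaR surrogate sketched below, which up-weights the worst-performing fraction of samples. This is a hedged, generic sketch (the level alpha, toy data, and optimizer are illustrative assumptions), not the specific formulation analyzed in the paper.

```python
# Hedged sketch: CVaR-style DRO surrogate, minimized jointly over model and dual variable.
import torch

def cvar_dro_loss(per_sample_losses, eta, alpha=0.1):
    """CVaR_alpha surrogate: eta + (1/alpha) * E[(loss - eta)_+]."""
    return eta + torch.clamp(per_sample_losses - eta, min=0).mean() / alpha

# Illustrative usage: linear regression, focusing on the worst alpha-fraction of samples.
torch.manual_seed(0)
X = torch.randn(512, 3)
y = X @ torch.tensor([1.0, -2.0, 0.5]) + 0.1 * torch.randn(512)
w = torch.zeros(3, requires_grad=True)
eta = torch.tensor(0.0, requires_grad=True)
opt = torch.optim.SGD([w, eta], lr=0.05)
for _ in range(500):
    opt.zero_grad()
    losses = (X @ w - y) ** 2          # per-sample squared errors
    cvar_dro_loss(losses, eta).backward()
    opt.step()
```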
no code implementations • 10 Oct 2020 • Jikai Jin
Overall, this paper suggests that \textit{quasar-convexity} admits efficient optimization procedures, and we look forward to seeing more problems that exhibit similar properties in practice.
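For reference, a commonly used definition of quasar-convexity (included here only for context) is: for $\gamma \in (0, 1]$ and a minimizer $x^*$ of $f$, the function $f$ is $\gamma$-quasar-convex with respect to $x^*$ if, for all $x$,
$$ f(x^*) \ge f(x) + \frac{1}{\gamma} \nabla f(x)^\top (x^* - x). $$
Convex functions satisfy this with $\gamma = 1$, while smaller $\gamma$ permits more non-convexity around $x^*$.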
1 code implementation • NeurIPS 2020 • Bohang Zhang, Jikai Jin, Cong Fang, LiWei Wang
Gradient clipping is commonly used in training deep neural networks, partly because it is a practical way to mitigate the exploding gradient problem.
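To make the operation concrete, here is a minimal hedged sketch of norm-clipped gradient descent on a toy objective; the step size, clipping threshold, and test function are illustrative assumptions, not the algorithmic setting analyzed in the paper.

```python
# Hedged sketch: norm-clipped gradient descent on a toy objective.
import numpy as np

def clipped_gd(grad_fn, x0, lr=0.1, clip=1.0, steps=200):
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        g = grad_fn(x)
        norm = np.linalg.norm(g)
        if norm > clip:                 # rescale the gradient if its norm is too large
            g = g * (clip / norm)
        x = x - lr * g
    return x

# Example: quartic objective f(x) = sum(x^4), whose gradient explodes far from the origin.
grad = lambda x: 4 * x ** 3
print(clipped_gd(grad, x0=[5.0, -3.0]))
```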