1 code implementation • 27 Jun 2024 • Ruizhe Shi, Yifang Chen, Yushi Hu, Alisa Liu, Hannaneh Hajishirzi, Noah A. Smith, Simon Du
Unlike traditional methods that require careful curation of a mixture of datasets to achieve comprehensive improvement, we can quickly experiment with preference weightings using MOD to find the best combination of models.
1 code implementation • 1 Oct 2023 • Yuandong Tian, Yiping Wang, Zhenyu Zhang, Beidi Chen, Simon Du
We propose Joint MLP/Attention (JoMA) dynamics, a novel mathematical framework to understand the training procedure of multilayer Transformer architectures.
no code implementations • 22 May 2022 • Haoyuan Cai, Tengyu Ma, Simon Du
In particular, the lower bound implies that our proposed algorithm, Value-Aware Autonomous Exploration, is nearly minimax-optimal when the number of $L$-controllable states grows polynomially with respect to $L$.
no code implementations • 17 Sep 2021 • Xiaoxia Wu, Yuege Xie, Simon Du, Rachel Ward
We propose a computationally friendly adaptive learning rate schedule, "AdaLoss", which directly uses the information of the loss function to adjust the stepsize in gradient descent methods.
2 code implementations • ICLR 2021 • Zhenggang Tang, Chao Yu, Boyuan Chen, Huazhe Xu, Xiaolong Wang, Fei Fang, Simon Du, Yu Wang, Yi Wu
We propose a simple, general, and effective technique, Reward Randomization, for discovering diverse strategic policies in complex multi-agent games.
no code implementations • ICML 2018 • Simon Du, Jason Lee
We provide new theoretical insights on why over-parametrization is effective in learning neural networks.
no code implementations • ICML 2018 • Yi Wu, Siddharth Srivastava, Nicholas Hay, Simon Du, Stuart Russell
Despite the recent successes of probabilistic programming languages (PPLs) in AI applications, PPLs offer only limited support for random variables whose distributions combine discrete and continuous elements.
no code implementations • 29 Oct 2017 • Yining Wang, Simon Du, Sivaraman Balakrishnan, Aarti Singh
We consider the problem of optimizing a high-dimensional convex function using stochastic zeroth-order queries.