no code implementations • ICML 2020 • Qianli Shen, Yan Li, Haoming Jiang, Zhaoran Wang, Tuo Zhao
In contrast to policies parameterized by linear/reproducing kernel functions, where simple regularization techniques suffice to control smoothness, neural-network-based reinforcement learning algorithms have no readily available solution for learning a smooth policy.
no code implementations • 5 Sep 2024 • Zheyuan Hu, Nazanin Ahmadi Daryakenari, Qianli Shen, Kenji Kawaguchi, George Em Karniadakis
We demonstrate Mamba's superior performance in both interpolation and challenging extrapolation tasks.
no code implementations • 20 Jun 2024 • Qianli Shen, Yezhen Wang, Zhouhao Yang, Xiang Li, Haonan Wang, Yang Zhang, Jonathan Scarlett, Zhanxing Zhu, Kenji Kawaguchi
Bi-level optimization (BO) has become a fundamental mathematical framework for addressing hierarchical machine learning problems.
no code implementations • 28 May 2024 • Yang Zhang, Yawei Li, Xinpeng Wang, Qianli Shen, Barbara Plank, Bernd Bischl, Mina Rezaei, Kenji Kawaguchi
Overparametrized transformer networks are the state-of-the-art architecture for Large Language Models (LLMs).
no code implementations • 7 Jan 2024 • Haonan Wang, Qianli Shen, Yao Tong, Yang Zhang, Kenji Kawaguchi
Our method strategically embeds connections between pieces of copyrighted information and text references in poisoning data while carefully dispersing that information, making the poisoning data inconspicuous when integrated into a clean dataset.
1 code implementation • CVPR 2024 • Xiang Li, Qianli Shen, Kenji Kawaguchi
The booming use of text-to-image generative models has raised concerns about their high risk of producing copyright-infringing content.
no code implementations • 28 Nov 2022 • Chen Chen, Hongyao Tang, Yi Ma, Chao Wang, Qianli Shen, Dong Li, Jianye Hao
The key idea of SA-PP is leveraging discounted stationary state distribution ratios between the learning policy and the offline dataset to modulate the degree of behavior regularization in a state-wise manner, so that pessimism can be implemented in a more appropriate way.
no code implementations • 24 Oct 2022 • Dianbo Liu, Moksh Jain, Bonaventure Dossou, Qianli Shen, Salem Lahlou, Anirudh Goyal, Nikolay Malkin, Chris Emezue, Dinghuai Zhang, Nadhir Hassen, Xu Ji, Kenji Kawaguchi, Yoshua Bengio
These methods face two important challenges: (a) the posterior distribution over masks can be highly multi-modal, which makes it difficult to approximate with standard variational inference, and (b) it is not trivial to fully utilize sample-dependent information and correlation among dropout masks to improve posterior estimation.
no code implementations • 21 Mar 2020 • Qianli Shen, Yan Li, Haoming Jiang, Zhaoran Wang, Tuo Zhao
Deep reinforcement learning (RL) has achieved great empirical successes in various domains.