1 code implementation • 18 Jul 2024 • Yingru Li, Jiawei Xu, Zhi-Quan Luo
Foundation models often struggle with uncertainty when faced with new situations in online decision-making, necessitating scalable and efficient exploration to resolve this uncertainty.
no code implementations • 17 Mar 2024 • Yingru Li, Zhi-Quan Luo
This work advances randomized exploration in reinforcement learning (RL) with function approximation modeled by linear mixture MDPs.
no code implementations • 26 Feb 2024 • Liangqi Liu, Wenqiang Pu, Yingru Li, Bo Jiu, Zhi-Quan Luo
The dynamic competition between radar and jammer systems presents a significant challenge for modern Electronic Warfare (EW), as current active learning approaches still lack sample efficiency and fail to exploit the jammer's characteristics.
no code implementations • 16 Feb 2024 • Yingru Li
We introduce the first probabilistic framework tailored for sequential random projection, an approach rooted in the challenges of sequential decision-making under uncertainty.
no code implementations • 10 Feb 2024 • Yingru Li
We provide the first rigorous proof of the spherical construction's effectiveness and introduce a general class of sub-Gaussian constructions within this simplified framework.
no code implementations • 7 Feb 2024 • Yingru Li, Liangqi Liu, Wenqiang Pu, Hao Liang, Zhi-Quan Luo
This work tackles the complexities of multi-player scenarios in unknown games, where the primary challenge lies in navigating the uncertainty of the environment through bandit feedback alongside strategic decision-making.
2 code implementations • 5 Feb 2024 • Yingru Li, Jiawei Xu, Lei Han, Zhi-Quan Luo
We propose HyperAgent, a reinforcement learning (RL) algorithm based on the hypermodel framework for exploration in RL.
1 code implementation • ICLR 2022 • Ziniu Li, Yingru Li, Yushun Zhang, Tong Zhang, Zhi-Quan Luo
RLSVI is limited to the case where 1) a good feature is known in advance and 2) this feature is fixed during training; otherwise, it suffers an unbearable computational burden to obtain the posterior samples of the parameter in the $Q$-value function.
1 code implementation • NeurIPS 2019 • Qing Wang, Yingru Li, Jiechao Xiong, Tong Zhang
In deep reinforcement learning, policy optimization methods need to deal with issues such as function approximation and the reuse of off-policy data.
4 code implementations • 24 Feb 2017 • Kun He, Yingru Li, Sucheta Soundarajan, John E. Hopcroft
We introduce a new paradigm that is important for community detection in the realm of network analysis.