1 code implementation • 4 Mar 2024 • Changyu Chen, Xiting Wang, Ting-En Lin, Ang Lv, Yuchuan Wu, Xin Gao, Ji-Rong Wen, Rui Yan, Yongbin Li
In reasoning tasks, even a minor error can cascade into inaccurate results, leading to suboptimal performance of large language models in such domains.
1 code implementation • 7 Dec 2023 • Yuhan Chen, Ang Lv, Ting-En Lin, Changyu Chen, Yuchuan Wu, Fei Huang, Yongbin Li, Rui Yan
Specifically, the crucial information in the context will be potentially overlooked by model when it is positioned in the trough zone of the attention waveform, leading to decreased performance.
Ranked #2 on Trajectory Planning on ToolBench
1 code implementation • NeurIPS 2023 • Changyu Chen, Ramesha Karunasena, Thanh Hong Nguyen, Arunesh Sinha, Pradeep Varakantham
Many problems in Reinforcement Learning (RL) seek an optimal policy with large discrete multidimensional yet unordered action spaces; these include problems in randomized allocation of resources such as placements of multiple security resources and emergency response units, etc.
no code implementations • 25 Oct 2023 • Jixiang Hong, Quan Tu, Changyu Chen, Xing Gao, Ji Zhang, Rui Yan
With in-context learning (ICL) as the core of the cycle, the black-box models are able to rank the model-generated responses guided by human-craft instruction and demonstrations about their preferences.
1 code implementation • 16 Jun 2023 • Changyu Chen, Xiting Wang, Yiqiao Jin, Victor Ye Dong, Li Dong, Jie Cao, Yi Liu, Rui Yan
In reinforcement learning (RL), there are two major settings for interacting with the environment: online and offline.
1 code implementation • 24 Jan 2022 • Changyu Chen, Avinandan Bose, Shih-Fen Cheng, Arunesh Sinha
Recent work has used generative models (GANs in particular) for providing high-fidelity simulation of real-world systems.
no code implementations • ACL 2021 • Chongyang Tao, Changyu Chen, Jiazhan Feng, Ji-Rong Wen, Rui Yan
Recently, many studies are emerging towards building a retrieval-based dialogue system that is able to effectively leverage background knowledge (e. g., documents) when conversing with humans.