no code implementations • 29 Jan 2024 • Chenyang Zhao, Guozhong Zheng, Chun Zhang, Jiqiang Zhang, Li Chen
Punishment is a common tactic for sustaining cooperation and has long been studied extensively.
no code implementations • 22 Dec 2023 • Yin Luo, Qingchao Kong, Nan Xu, Jia Cao, Bao Hao, Baoyu Qu, Bo Chen, Chao Zhu, Chenyang Zhao, Donglei Zhang, Fan Feng, Feifei Zhao, Hailong Sun, Hanxuan Yang, Haojun Pan, Hongyu Liu, Jianbin Guo, Jiangtao Du, Jingyi Wang, Junfeng Li, Lei Sun, Liduo Liu, Lifeng Dong, Lili Liu, Lin Wang, Liwen Zhang, Minzheng Wang, Pin Wang, Ping Yu, Qingxiao Li, Rui Yan, Rui Zou, Ruiqun Li, Taiwen Huang, Xiaodong Wang, Xiaofei Wu, Xin Peng, Xina Zhang, Xing Fang, Xinglin Xiao, Yanni Hao, Yao Dong, Yigang Wang, Ying Liu, Yongyu Jiang, Yungan Wang, Yuqi Wang, Zhangsheng Wang, Zhaoxin Yu, Zhen Luo, Wenji Mao, Lei Wang, Dajun Zeng
As the latest advancement in natural language processing, large language models (LLMs) have achieved human-level language understanding and generation abilities in many real-world tasks, and have even been regarded as a potential path to artificial general intelligence.
no code implementations • 30 Nov 2023 • Tianli Liao, Chenyang Zhao, Lei Li, Heling Cao
However, the effectiveness of seam-cutting usually depends on the images being roughly aligned, so that there exists a local region where a plausible seam can be found.
1 code implementation • 22 Nov 2023 • ZiHao Zhou, Bin Hu, Chenyang Zhao, Pu Zhang, Bin Liu
By incorporating the guidance from the teacher agent, the student agent can distill the prior knowledge of the LLM into its own model.
no code implementations • 29 Oct 2023 • Nan He, Hanyu Lai, Chenyang Zhao, Zirui Cheng, Junting Pan, Ruoyu Qin, Ruofan Lu, Rui Lu, Yunchen Zhang, Gangming Zhao, Zhaohui Hou, Zhiyuan Huang, Shaoqing Lu, Ding Liang, Mingjie Zhan
Based on TeacherLM-7.1B, we augmented 58 NLP datasets and taught various student models with different parameters from the OPT and BLOOM series in a multi-task setting.
1 code implementation • 23 Aug 2023 • Vijay Viswanathan, Chenyang Zhao, Amanda Bertsch, Tongshuang Wu, Graham Neubig
In this paper, we propose Prompt2Model, a general-purpose method that takes a natural language task description like the prompts provided to LLMs, and uses it to train a special-purpose model that is conducive to deployment.
1 code implementation • 6 Jun 2023 • Bin Hu, Chenyang Zhao, Pu Zhang, ZiHao Zhou, Yuanhang Yang, Zenglin Xu, Bin Liu
In this paper, we explore how to enable intelligent cost-effective interactions between the agent and an LLM.
no code implementations • 6 May 2023 • Zhoujian Sun, Chenyang Zhao, Zhengxing Huang, Nai Ding
Policy learning (PL) is a module of a task-oriented dialogue system that trains an agent to take actions in each dialogue turn.
no code implementations • 13 Apr 2023 • Chenyang Zhao, Antoni B. Chan
We propose the gradient-weighted Object Detector Activation Maps (ODAM), a visualized explanation technique for interpreting the predictions of object detectors.
1 code implementation • 1 Apr 2023 • Chenyang Zhao, ZiHao Zhou, Bin Liu
Offline Meta Reinforcement Learning (OMRL) aims to learn transferable knowledge from offline datasets to enhance the learning process for new target tasks.
no code implementations • 25 Oct 2022 • Chenyang Zhao, Chuanfei Hu, Hang Shao, Zhe Wang, Yongxiong Wang
Automatic vision-based sewer inspection plays a key role in the sewage system of a modern city.
no code implementations • 9 Dec 2020 • Chenyang Zhao, Timothy Hospedales
In reinforcement learning, domain randomisation is an increasingly popular technique for learning more general policies that are robust to domain-shifts at deployment.
no code implementations • 19 Feb 2019 • Chenyang Zhao, Olivier Sigaud, Freek Stulp, Timothy M. Hospedales
Deep Reinforcement Learning has shown great success in a variety of control tasks.