1 code implementation • 20 Jan 2025 • Jinyu Wang, Jingjing Fu, Rui Wang, Lei Song, Jiang Bian
Despite notable advancements in Retrieval-Augmented Generation (RAG) systems that expand large language model (LLM) capabilities through external retrieval, these systems often struggle to meet the complex and diverse needs of real-world industrial applications.
1 code implementation • 10 Dec 2024 • Shukuan Wang, Ke Xue, Lei Song, Xiaobin Huang, Chao Qian
MCTS-transfer can not only provide a well-performing search space for warm-start but also adaptively identify and leverage the information of similar source tasks to reconstruct the search space during the optimization process.
no code implementations • 3 Oct 2024 • Ruohong Liu, Yuxin Pan, Linjie Xu, Lei Song, Pengcheng You, Yize Chen, Jiang Bian
Multi-objective reinforcement learning (MORL) excels at handling rapidly changing preferences in tasks that involve multiple criteria, even for unseen preferences.
no code implementations • 11 Sep 2024 • Wenhao Zhao, Qiushui Xu, Linjie Xu, Lei Song, Jinyu Wang, Chunlai Zhou, Jiang Bian
Although this cross-domain pre-training approach achieves superior performance compared to training from scratch in environments that require short-term planning ability, the mechanisms by which pre-training benefits the fine-tuning phase remain unclear.
1 code implementation • 12 Aug 2024 • Sam Khallaghi, Rahebe Abedi, Hanan Abou Ali, Mary Dziedzorm Asipunu, Ismail Alatise, Nguyen Ha, Boka Luo, Cat Mai, Lei Song, Amos Wussah, Sitian Xiong, Qi Zhang, Lyndon D. Estes
The combination of photometric augmentation, TFL loss, and MC-dropout produced the best results, although dropout alone led to more false negatives in subsequent year predictions.
1 code implementation • 20 Jul 2024 • Yunseon Choi, Sangmin Bae, Seonghyun Ban, Minchan Jeong, Chuheng Zhang, Lei Song, Li Zhao, Jiang Bian, Kee-Eung Kim
With the advent of foundation models, prompt tuning has positioned itself as an important technique for directing model behaviors and eliciting desired responses.
no code implementations • 3 Jun 2024 • Zijian Li, Qingyan Guo, Jiawei Shao, Lei Song, Jiang Bian, Jun Zhang, Rui Wang
A graph neural network (GNN) is then leveraged to exploit the relationships between passages and improve the retrieval of supporting passages.
1 code implementation • 2 Jun 2024 • Yifan Xia, Xianliang Yang, Zichuan Liu, Zhihao Liu, Lei Song, Jiang Bian
Recent advancements in solving large-scale traveling salesman problems (TSP) utilize the heatmap-guided Monte Carlo tree search (MCTS) paradigm, where machine learning (ML) models generate heatmaps, indicating the probability distribution of each edge being part of the optimal solution, to guide MCTS in solution finding.
2 code implementations • 15 May 2024 • Zichuan Liu, Tianchun Wang, Jimeng Shi, Xu Zheng, Zhuomin Chen, Lei Song, Wenqian Dong, Jayantha Obeysekera, Farhad Shirani, Dongsheng Luo
The design of the objective function builds upon the principle of information bottleneck (IB), and modifies the IB objective function to avoid trivial solutions and distributional shift issues.
2 code implementations • 22 Apr 2024 • Zichuan Liu, Zefan Wang, Linjie Xu, Jinyu Wang, Lei Song, Tianchun Wang, Chunlin Chen, Wei Cheng, Jiang Bian
The advent of large language models (LLMs) has revolutionized the field of natural language processing, yet they might be attacked to produce harmful content.
no code implementations • 15 Apr 2024 • Linjie Xu, Zichuan Liu, Alexander Dockhorn, Diego Perez-Liebana, Jinyu Wang, Lei Song, Jiang Bian
One of the notorious issues for Reinforcement Learning (RL) is poor sample efficiency.
1 code implementation • 27 Feb 2024 • Lei Song, Chenxiao Gao, Ke Xue, Chenyang Wu, Dong Li, Jianye Hao, Zongzhang Zhang, Chao Qian
In this paper, we propose RIBBO, a method to reinforce-learn a BBO algorithm from offline data in an end-to-end fashion.
1 code implementation • 16 Dec 2023 • Xiaobin Huang, Lei Song, Ke Xue, Chao Qian
Considering that the estimated PDF may have high estimation error when the true distribution is complicated, we further propose the second algorithm that optimizes the distributionally robust objective.
no code implementations • 6 Aug 2023 • Lei Song, Chuheng Zhang, Li Zhao, Jiang Bian
2) How well can GPT-4 generalize to different scenarios for HVAC control?
1 code implementation • 13 Jun 2023 • Xianliang Yang, Zhihao Liu, Wei Jiang, Chuheng Zhang, Li Zhao, Lei Song, Jiang Bian
Multi-agent reinforcement learning (MARL) models multiple agents that interact and learn within a shared environment.
1 code implementation • 6 Jun 2023 • Linjie Xu, Zhengyao Jiang, Jinyu Wang, Lei Song, Jiang Bian
Offline reinforcement learning (RL) methodologies enforce constraints on the policy to adhere closely to the behavior policy, thereby stabilizing value learning and mitigating the selection of out-of-distribution (OOD) actions during test time.
1 code implementation • 7 May 2023 • Mingrui Ma, Tao Wang, Lei Song, Weijie Wang, Guixia Liu
Furthermore, shifted window partitioning operations are inflexible, indicating that they cannot perceive the semantic information over uncertain distances and automatically bridge the global connections between windows.
1 code implementation • 19 Apr 2023 • Xuanhao Pan, Yan Jin, Yuandong Ding, Mingxiao Feng, Li Zhao, Lei Song, Jiang Bian
We propose an end-to-end learning framework based on hierarchical reinforcement learning, called H-TSP, for addressing the large-scale Travelling Salesman Problem (TSP).
2 code implementations • 19 Apr 2023 • Yan Jin, Yuandong Ding, Xuanhao Pan, Kun He, Li Zhao, Tao Qin, Lei Song, Jiang Bian
The Traveling Salesman Problem (TSP), a classic routing optimization problem originally arising in the domain of transportation and logistics, has become a critical task in broader domains, such as manufacturing and biology.
no code implementations • 15 Dec 2022 • Yuandong Ding, Mingxiao Feng, Guozi Liu, Wei Jiang, Chuheng Zhang, Li Zhao, Lei Song, Houqiang Li, Yan Jin, Jiang Bian
In this paper, we consider the inventory management (IM) problem where we need to make replenishment decisions for a large number of stock keeping units (SKUs) to balance their supply and demand.
1 code implementation • 5 Dec 2022 • Yuanying Cai, Chuheng Zhang, Li Zhao, Wei Shen, Xuyun Zhang, Lei Song, Jiang Bian, Tao Qin, Tie-Yan Liu
There are two challenges for this setting: 1) The optimal trade-off between optimizing the RL signal and the behavior cloning (BC) signal changes on different states due to the variation of the action coverage induced by different behavior policies.
1 code implementation • 4 Oct 2022 • Lei Song, Ke Xue, Xiaobin Huang, Chao Qian
Bayesian optimization (BO) is a class of popular methods for expensive black-box optimization, and has been widely applied to many scenarios.
1 code implementation • 28 Apr 2022 • Mingrui Ma, Lei Song, Yuanbo Xu, Guixia Liu
Medical image registration is a fundamental and critical task in medical image analysis.
no code implementations • 29 Sep 2021 • Mingxiao Feng, Guozi Liu, Li Zhao, Lei Song, Jiang Bian, Tao Qin, Wengang Zhou, Houqiang Li, Tie-Yan Liu
In this paper, we consider the inventory management (IM) problem for a single store with a large number of stock keeping units (SKUs), where we need to make replenishment decisions for each SKU to balance its supply and demand.