no code implementations • 4 Nov 2024 • Bingyi Kang, Yang Yue, Rui Lu, Zhijie Lin, Yang Zhao, Kaixin Wang, Gao Huang, Jiashi Feng
Our scaling experiments show perfect generalization within the distribution, measurable scaling behavior for combinatorial generalization, but failure in out-of-distribution scenarios.
1 code implementation • 4 Sep 2024 • Junwei Liu, Kaixin Wang, Yixuan Chen, Xin Peng, Zhenpeng Chen, Lingming Zhang, Yiling Lou
Recent advances in Large Language Models (LLMs) have shaped a new paradigm of AI agents, i.e., LLM-based agents.
no code implementations • 17 Jun 2024 • Xueying Du, Geng Zheng, Kaixin Wang, Jiayi Feng, Wentai Deng, Mingwei Liu, Bihuan Chen, Xin Peng, Tao Ma, Yiling Lou
In addition, our user study shows that the vulnerability knowledge generated by Vul-RAG can serve as high-quality explanations that improve the manual detection accuracy from 0.60 to 0.77.
1 code implementation • 8 Feb 2024 • Lior Cohen, Kaixin Wang, Bingyi Kang, Shie Mannor
We incorporate POP in a novel TBWM agent named REM (Retentive Environment Model), achieving 15.4x faster imagination than prior TBWMs.
no code implementations • 13 Nov 2023 • Zhenxiong Tan, Kaixin Wang, Xinchao Wang
We present C-Procgen, an enhanced suite of environments on top of the Procgen benchmark.
1 code implementation • 3 Aug 2023 • Xueying Du, Mingwei Liu, Kaixin Wang, Hanlin Wang, Junwei Liu, Yixuan Chen, Jiayi Feng, Chaofeng Sha, Xin Peng, Yiling Lou
Third, we find that generating the entire class all at once (i.e., the holistic generation strategy) is the best generation strategy only for GPT-4 and GPT-3.5, while method-by-method generation (i.e., the incremental and compositional strategies) works better for the other models, which have limited ability to understand long instructions and utilize middle information.
no code implementations • 9 Jun 2023 • Kaixin Wang, Uri Gadot, Navdeep Kumar, Kfir Levy, Shie Mannor
Robust Markov Decision Processes (RMDPs) provide a framework for sequential decision-making that is robust to perturbations on the transition kernel.
no code implementations • 31 Jan 2023 • Navdeep Kumar, Kfir Levy, Kaixin Wang, Shie Mannor
We present an efficient robust value iteration for \texttt{s}-rectangular robust Markov Decision Processes (MDPs) with a time complexity comparable to that of standard (non-robust) MDPs, significantly faster than any existing method.
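For context, the standard (non-robust) value iteration that the robust variant is benchmarked against can be sketched as follows; the MDP layout (array shapes, toy transition kernel, and rewards) is illustrative and not taken from the paper.

```python
import numpy as np

def value_iteration(P, R, gamma=0.9, tol=1e-8):
    """Standard (non-robust) value iteration.

    P: (A, S, S) transition kernel, P[a, s, t] = Pr(t | s, a).
    R: (S, A) reward matrix.
    Iterates the Bellman optimality operator until convergence.
    """
    V = np.zeros(R.shape[0])
    while True:
        # Q[s, a] = R[s, a] + gamma * sum_t P[a, s, t] * V[t]
        Q = R + gamma * np.einsum("ast,t->sa", P, V)
        V_new = Q.max(axis=1)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new
        V = V_new

# Toy 2-state, 1-action MDP: each state self-loops; only state 0 gets reward.
P = np.zeros((1, 2, 2))
P[0, 0, 0] = 1.0
P[0, 1, 1] = 1.0
R = np.array([[1.0], [0.0]])
V = value_iteration(P, R)  # V[0] = 1 / (1 - gamma) = 10
```

A robust variant would replace the expectation over `P` with a minimization over an uncertainty set around it, which is where the rectangularity structure matters.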
1 code implementation • 13 Nov 2022 • Kaixin Wang, Cheng Long, Da Yan, Jie Zhang, H. V. Jagadish
Specifically, we propose a weighted sampling algorithm called WSD for estimating subgraph counts in a fully dynamic graph stream, which samples edges according to weights that reflect their importance and properties.
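The paper's WSD algorithm is not reproduced here, but the underlying idea of retaining a fixed-size sample of stream items in proportion to their weights can be sketched with the classic Efraimidis-Spirakis weighted reservoir scheme; the edge list and weights below are made up for illustration.

```python
import heapq
import random

def weighted_sample(stream, k):
    """Weighted reservoir sampling (Efraimidis-Spirakis A-Res).

    Each stream item is a (weight, payload) pair. Items with larger
    weights are more likely to be retained, mirroring the idea of
    sampling edges by importance. A generic sketch, not the paper's WSD.
    """
    heap = []  # min-heap of (key, payload); smallest key is evicted first
    for weight, payload in stream:
        key = random.random() ** (1.0 / weight)  # larger weight -> larger key
        if len(heap) < k:
            heapq.heappush(heap, (key, payload))
        elif key > heap[0][0]:
            heapq.heapreplace(heap, (key, payload))
    return [payload for _, payload in heap]

# Hypothetical edge stream: (importance weight, edge) pairs.
edges = [(5.0, ("a", "b")), (1.0, ("b", "c")),
         (3.0, ("c", "d")), (0.5, ("d", "e"))]
sample = weighted_sample(edges, 2)
```

An unbiased count estimator would additionally reweight each sampled edge by the inverse of its inclusion probability, which the sketch omits.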
no code implementations • 24 Oct 2022 • Kaixin Wang, Kuangqi Zhou, Jiashi Feng, Bryan Hooi, Xinchao Wang
In Reinforcement Learning (RL), Laplacian Representation (LapRep) is a task-agnostic state representation that encodes the geometry of the environment.
no code implementations • 3 Oct 2022 • Navdeep Kumar, Kaixin Wang, Kfir Levy, Shie Mannor
The policy gradient theorem has proven to be a cornerstone of Linear RL due to its elegance and ease of implementation.
no code implementations • 20 Sep 2022 • Fengzhuo Zhang, Boyi Liu, Kaixin Wang, Vincent Y. F. Tan, Zhuoran Yang, Zhaoran Wang
The cooperative Multi-Agent Reinforcement Learning (MARL) framework with permutation-invariant agents has achieved tremendous empirical success in real-world applications.
1 code implementation • 28 May 2022 • Navdeep Kumar, Kfir Levy, Kaixin Wang, Shie Mannor
However, it is unclear how to exploit this equivalence to perform policy improvement steps and obtain the optimal value function or policy.
no code implementations • 23 May 2022 • Kuangqi Zhou, Kaixin Wang, Jiashi Feng, Jian Tang, Tingyang Xu, Xinchao Wang
However, the best existing deep AL methods are mostly developed for a single type of learning task (e.g., single-label classification) and hence may not perform well on molecular property prediction, which involves various task types.
no code implementations • 30 Jan 2022 • Kaixin Wang, Navdeep Kumar, Kuangqi Zhou, Bryan Hooi, Jiashi Feng, Shie Mannor
The key to this perspective is decomposing the value space, state-wise, into unions of hypersurfaces.
1 code implementation • 12 Jul 2021 • Kaixin Wang, Kuangqi Zhou, Qixin Zhang, Jie Shao, Bryan Hooi, Jiashi Feng
It enables learning high-quality Laplacian representations that faithfully approximate the ground truth.
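As a point of reference, the ground-truth Laplacian representation of a small state graph can be computed directly by eigen-decomposition; this sketch uses the textbook definition (the d smallest eigenvectors of L = D - A), not the paper's learning objective, and the 4-state chain environment is an assumption for illustration.

```python
import numpy as np

def laplacian_representation(adj, d):
    """Ground-truth Laplacian representation of a state graph.

    Given the symmetric adjacency matrix of the environment's state
    transition graph, the d eigenvectors of the graph Laplacian
    L = D - A with the smallest eigenvalues give each state a
    d-dimensional embedding encoding the environment's geometry.
    """
    deg = np.diag(adj.sum(axis=1))
    lap = deg - adj
    eigvals, eigvecs = np.linalg.eigh(lap)  # eigenvalues in ascending order
    return eigvecs[:, :d]  # one row per state

# 4-state chain environment: 0 - 1 - 2 - 3
A = np.zeros((4, 4))
for i in range(3):
    A[i, i + 1] = A[i + 1, i] = 1.0
rep = laplacian_representation(A, 2)
```

The closed-form computation only scales to small tabular environments, which is why approximating LapRep with a learned network is needed in practice.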
2 code implementations • NeurIPS 2020 • Kaixin Wang, Bingyi Kang, Jie Shao, Jiashi Feng
Deep reinforcement learning (RL) agents trained in a limited set of environments tend to suffer overfitting and fail to generalize to unseen testing environments.
2 code implementations • 12 Jun 2020 • Kuangqi Zhou, Yanfei Dong, Kaixin Wang, Wee Sun Lee, Bryan Hooi, Huan Xu, Jiashi Feng
In this work, we study performance degradation of GCNs by experimentally examining how stacking only TRANs or PROPs works.
5 code implementations • ICCV 2019 • Kaixin Wang, Jun Hao Liew, Yingtian Zou, Daquan Zhou, Jiashi Feng
In this paper, we tackle the challenging few-shot segmentation problem from a metric learning perspective and present PANet, a novel prototype alignment network to better utilize the information of the support set.
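The prototype-matching idea underlying PANet can be illustrated with a minimal sketch: build class prototypes from masked support features, then label each query pixel by cosine similarity to the nearest prototype. This omits the paper's prototype alignment regularization and network backbone; all shapes and toy data below are assumptions.

```python
import numpy as np

def masked_average_pooling(features, mask):
    """Class prototype: average of support features under a binary mask."""
    # features: (H, W, C), mask: (H, W)
    w = mask[..., None]
    return (features * w).sum(axis=(0, 1)) / (mask.sum() + 1e-8)

def segment_by_prototype(query_features, prototypes):
    """Label each query pixel with its most cosine-similar prototype."""
    q = query_features / (np.linalg.norm(query_features, axis=-1, keepdims=True) + 1e-8)
    p = np.stack([c / (np.linalg.norm(c) + 1e-8) for c in prototypes])
    sims = np.einsum("hwc,kc->hwk", q, p)  # (H, W, num_classes)
    return sims.argmax(axis=-1)

# Toy example: 2x2 feature map with 3-dim features, foreground vs background.
sup_feat = np.array([[[1., 0, 0], [1, 0, 0]],
                     [[0, 1, 0], [0, 1, 0]]])
fg_mask = np.array([[1., 1], [0, 0]])
protos = [masked_average_pooling(sup_feat, fg_mask),      # class 0 (fg)
          masked_average_pooling(sup_feat, 1 - fg_mask)]  # class 1 (bg)
labels = segment_by_prototype(sup_feat, protos)
```

Because the prototypes are computed non-parametrically from the support set, no per-class weights are learned at test time, which is what makes this style of method attractive for few-shot segmentation.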
no code implementations • ICLR 2020 • Daquan Zhou, Xiaojie Jin, Qibin Hou, Kaixin Wang, Jianchao Yang, Jiashi Feng
The recent WSNet [1] is a new model compression method that samples filter weights from a compact set and has been demonstrated to be effective for 1D convolutional neural networks (CNNs).