no code implementations • 3 Oct 2024 • Wanpeng Zhang, Zilong Xie, Yicheng Feng, Yijiang Li, Xingrun Xing, Sipeng Zheng, Zongqing Lu
Multimodal Large Language Models have made significant strides in integrating visual and textual information, yet they often struggle with effectively aligning these modalities.
no code implementations • 3 Oct 2024 • Haoqi Yuan, Bohan Zhou, Yuhui Fu, Zongqing Lu
While recent studies have primarily focused on learning policies for specific robotic hands, the development of a universal policy that controls diverse dexterous hands remains largely unexplored.
no code implementations • 3 Oct 2024 • Ziye Huang, Haoqi Yuan, Yuhui Fu, Zongqing Lu
To overcome these challenges, we introduce ResDex, a novel approach that integrates residual policy learning with a mixture-of-experts (MoE) framework.
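A minimal sketch of how residual policy learning can be combined with an MoE gate, in the spirit of the description above (layer sizes and module names are illustrative assumptions, not the paper's architecture):

```python
import torch
import torch.nn as nn

class ResidualMoEPolicy(nn.Module):
    """Sketch: a frozen base policy corrected by a gated mixture of residual experts."""

    def __init__(self, base_policy: nn.Module, obs_dim: int, act_dim: int, num_experts: int = 4):
        super().__init__()
        self.base_policy = base_policy  # pretrained; kept frozen
        for p in self.base_policy.parameters():
            p.requires_grad_(False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(obs_dim, 256), nn.ReLU(), nn.Linear(256, act_dim))
            for _ in range(num_experts)
        )
        self.gate = nn.Sequential(nn.Linear(obs_dim, num_experts), nn.Softmax(dim=-1))

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        base_action = self.base_policy(obs)                              # (B, A)
        weights = self.gate(obs)                                         # (B, E)
        residuals = torch.stack([e(obs) for e in self.experts], dim=1)   # (B, E, A)
        correction = (weights.unsqueeze(-1) * residuals).sum(dim=1)      # (B, A)
        return base_action + correction  # residual action on top of the base policy
```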
no code implementations • 3 Oct 2024 • Bohan Zhou, Haoqi Yuan, Yuhui Fu, Zongqing Lu
With BiDexHD, scalable learning of numerous bimanual dexterous skills from auto-constructed tasks becomes feasible, offering promising advances toward universal bimanual dexterous manipulation.
no code implementations • 11 Aug 2024 • Zhirui Fang, Ming Yang, Weishuai Zeng, Boyu Li, Junpeng Yue, Ziluo Ding, Xiu Li, Zongqing Lu
LMMs excel in planning long-horizon tasks over symbolic abstractions but struggle with grounding in the physical world, often failing to accurately identify object positions in images.
1 code implementation • 4 Aug 2024 • Haobin Jiang, Zongqing Lu
To approach this goal, we leverage a vision-language model (VLM) for visual grounding and transfer its vision-language knowledge into reinforcement learning (RL) for object-centric tasks, which makes the agent capable of zero-shot generalization to unseen objects and instructions.
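One way to picture the transfer: score observations by how well a pretrained VLM's image embedding matches the instruction's text embedding, and use that as a shaping signal. The two-tower encoders below stand in for a CLIP-style model; this is an illustration of the idea, not the paper's exact pipeline.

```python
import torch
import torch.nn.functional as F

def grounding_reward(image_encoder, text_encoder, obs_image, instruction_tokens):
    """Cosine similarity between a VLM's image and text embeddings, usable as
    a dense shaping reward for object-centric RL (hypothetical encoders)."""
    with torch.no_grad():
        img = F.normalize(image_encoder(obs_image), dim=-1)
        txt = F.normalize(text_encoder(instruction_tokens), dim=-1)
    return (img * txt).sum(dim=-1)  # in [-1, 1]; higher means better grounded
```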
no code implementations • 2 Aug 2024 • Bohan Zhou, Jiangxing Wang, Zongqing Lu
The in-context learning ability of Transformer models has brought new possibilities to visual navigation.
no code implementations • 1 Jul 2024 • Siwei Li, Yifan Yang, Yifei Shen, Fangyun Wei, Zongqing Lu, Lili Qiu, Yuqing Yang
Efficient fine-tuning plays a fundamental role in modern large models, with low-rank adaptation emerging as a particularly promising approach.
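For reference, the core of low-rank adaptation is to freeze the pretrained weight and learn a low-rank update, h = Wx + (α/r)·BAx, with B initialized to zero so training starts exactly from the pretrained model. A minimal sketch:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Low-rank adaptation of a frozen linear layer: h = Wx + (alpha/r) * B(Ax)."""

    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)  # only A and B are trained
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init: no change at start
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)
```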
no code implementations • 30 May 2024 • Yujiao Jiang, Qingmin Liao, Zhaolong Wang, Xiangru Lin, Zongqing Lu, Yuxi Zhao, Hanqing Wei, Jingrui Ye, Yu Zhang, Zhijing Shao
We present SMPLX-Lite dataset, the most comprehensive clothing avatar dataset with multi-view RGB sequences, keypoints annotations, textured scanned meshes, and textured SMPLX-Lite-D models.
1 code implementation • 24 May 2024 • Jiafei Lyu, Chenjia Bai, Jingwen Yang, Zongqing Lu, Xiu Li
We perform representation learning only in the target domain and measure the representation deviations on the transitions from the source domain, which we show can be a signal of dynamics mismatch.
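A sketch of the idea as described: train an encoder and latent predictor only on target-domain data, then read the latent prediction error on source-domain transitions as a dynamics-mismatch signal (function and module names here are assumptions for illustration):

```python
import torch
import torch.nn.functional as F

def representation_deviation(encoder, predictor, s, a, s_next):
    """Per-transition deviation for source-domain data. `encoder` and
    `predictor` are hypothetical modules trained only in the target domain;
    a large deviation suggests the source transition disagrees with target
    dynamics and can be down-weighted or penalized."""
    with torch.no_grad():
        z, z_next = encoder(s), encoder(s_next)
        z_pred = predictor(z, a)  # predicted next-state representation
    return F.mse_loss(z_pred, z_next, reduction="none").mean(dim=-1)
```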
no code implementations • 12 Apr 2024 • Jingrui Ye, Zongkai Zhang, Yujiao Jiang, Qingmin Liao, Wenming Yang, Zongqing Lu
OccGaussian initializes 3D Gaussian distributions in canonical space and performs occlusion feature queries at occluded regions; the aggregated pixel-aligned features are extracted to compensate for the missing information.
no code implementations • 1 Apr 2024 • Liwen Zhu, Peixi Peng, Zongqing Lu, Yonghong Tian
Traffic signal control has a great impact on alleviating traffic congestion in modern cities.
no code implementations • 18 Mar 2024 • Yujiao Jiang, Qingmin Liao, Xiaoyu Li, Li Ma, Qi Zhang, Chaopeng Zhang, Zongqing Lu, Ying Shan
Therefore, we propose UV Gaussians, which model the 3D human body by jointly learning mesh deformations and 2D UV-space Gaussian textures.
no code implementations • 14 Mar 2024 • Sipeng Zheng, Bohan Zhou, Yicheng Feng, Ye Wang, Zongqing Lu
In this paper, we propose UniCode, a novel approach within the domain of multimodal large language models (MLLMs) that learns a unified codebook to efficiently tokenize visual, text, and potentially other types of signals.
1 code implementation • 5 Mar 2024 • Weihao Tan, Wentao Zhang, Xinrun Xu, Haochong Xia, Ziluo Ding, Boyu Li, Bohan Zhou, Junpeng Yue, Jiechuan Jiang, Yewen Li, Ruyi An, Molei Qin, Chuqiao Zong, Longtao Zheng, Yujie Wu, Xiaoqiang Chai, Yifei Bi, Tianbao Xie, Pengjie Gu, Xiyun Li, Ceyao Zhang, Long Tian, Chaojie Wang, Xinrun Wang, Börje F. Karlsson, Bo An, Shuicheng Yan, Zongqing Lu
To handle this issue, we propose the General Computer Control (GCC) setting to restrict foundation agents to interact with software through the most unified and standardized interface, i.e., using screenshots as input and keyboard and mouse actions as output.
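The GCC setting boils down to a very small agent interface; a hedged sketch of what it implies (the type names are mine, not the paper's API):

```python
from dataclasses import dataclass
from typing import Protocol, Sequence, Union

@dataclass
class MouseAction:
    x: int
    y: int
    button: str = "left"  # e.g. "left" or "right"

@dataclass
class KeyAction:
    key: str              # e.g. "w", "enter"

Action = Union[MouseAction, KeyAction]

class GCCAgent(Protocol):
    """An agent under General Computer Control: it may only observe raw
    screenshots and may only act through keyboard and mouse events."""

    def act(self, screenshot: bytes) -> Sequence[Action]: ...
```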
no code implementations • 29 Feb 2024 • Shaoteng Liu, Haoqi Yuan, Minda Hu, Yanwei Li, Yukang Chen, Shu Liu, Zongqing Lu, Jiaya Jia
To seamlessly integrate both modalities, we introduce a two-level hierarchical framework, RL-GPT, comprising a slow agent and a fast agent.
1 code implementation • 6 Feb 2024 • Jiafei Lyu, Xiaoteng Ma, Le Wan, Runze Liu, Xiu Li, Zongqing Lu
Offline reinforcement learning (RL) has attracted much attention due to its ability to learn from static offline datasets, eliminating the need to interact with the environment.
no code implementations • 5 Feb 2024 • Jiafei Lyu, Le Wan, Xiu Li, Zongqing Lu
Recently, many efforts have attempted to learn useful policies for continuous control in visual reinforcement learning (RL).
1 code implementation • 3 Feb 2024 • Haobin Jiang, Ziluo Ding, Zongqing Lu
The other is how agents can explore in a coordinated way.
no code implementations • 10 Jan 2024 • Jiechuan Jiang, Kefan Su, Zongqing Lu
Cooperative multi-agent reinforcement learning is a powerful tool to solve many real-world cooperative tasks, but restrictions of real-world applications may require training the agents in a fully decentralized manner.
Tasks: Multi-agent Reinforcement Learning, Reinforcement Learning (+1)
1 code implementation • 5 Dec 2023 • Chi Zhang, Penglin Cai, Yuhui Fu, Haoqi Yuan, Zongqing Lu
We benchmark creative tasks with the challenging open-world game Minecraft, where the agents are asked to create diverse buildings given free-form language instructions.
no code implementations • 20 Oct 2023 • Sipeng Zheng, Jiazheng Liu, Yicheng Feng, Zongqing Lu
Steve-Eye integrates the LLM with a visual encoder, enabling it to process visual-text inputs and generate multimodal feedback.
no code implementations • 13 Oct 2023 • Yicheng Feng, Yuxuan Wang, Jiazheng Liu, Sipeng Zheng, Zongqing Lu
Recently, various studies have leveraged Large Language Models (LLMs) to help with decision-making and planning in environments, and to align the LLMs' knowledge with world conditions.
1 code implementation • 29 Sep 2023 • Wanpeng Zhang, Zongqing Lu
Large Language Models (LLMs) have demonstrated significant success across various domains.
1 code implementation • 12 Sep 2023 • Qinpeng Cui, Xinyi Zhang, Zongqing Lu, Qingmin Liao
In this work, we formulate the sampling process as an extended reverse-time SDE (ER SDE), unifying prior explorations into ODEs and SDEs.
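For orientation, these are the standard forward and reverse-time diffusion SDEs (following Song et al.) that the ER SDE family generalizes; my reading is that the extension introduces a free noise-scale term interpolating between the probability-flow ODE and the full SDE, but the exact formulation is the paper's:

```latex
\begin{align*}
\mathrm{d}\mathbf{x} &= f(\mathbf{x},t)\,\mathrm{d}t + g(t)\,\mathrm{d}\mathbf{w}
  && \text{(forward SDE)} \\
\mathrm{d}\mathbf{x} &= \bigl[f(\mathbf{x},t) - g(t)^{2}\,\nabla_{\mathbf{x}}\log p_t(\mathbf{x})\bigr]\,\mathrm{d}t
  + g(t)\,\mathrm{d}\bar{\mathbf{w}}
  && \text{(reverse-time SDE)}
\end{align*}
```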
no code implementations • NeurIPS 2023 • Bohan Zhou, Ke Li, Jiechuan Jiang, Zongqing Lu
Learning from visual observation (LfVO), which aims at recovering policies from visual observation data alone, is a promising yet challenging problem.
1 code implementation • 5 Jun 2023 • Wanpeng Zhang, Yilin Li, Boyu Yang, Zongqing Lu
COREP primarily employs a guided updating mechanism to learn a stable graph representation for the state, termed the causal-origin representation.
no code implementations • 29 May 2023 • Jiafei Lyu, Le Wan, Zongqing Lu, Xiu Li
Empirical results show that SMR significantly boosts the sample efficiency of the base methods across most of the evaluated tasks without any hyperparameter tuning or additional tricks.
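As the name suggests, the mechanism is simple: update the agent several times on the same sampled batch rather than once. A minimal sketch (the reuse count and the `agent.update` interface are illustrative assumptions):

```python
def smr_step(agent, replay_buffer, batch_size=256, reuse_m=5):
    """Sample Multiple Reuse: one training step that performs `reuse_m`
    gradient updates on a single sampled batch."""
    batch = replay_buffer.sample(batch_size)
    for _ in range(reuse_m):
        agent.update(batch)  # e.g., one actor-critic gradient step
```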
no code implementations • 17 Apr 2023 • Jiawei Xu, Zongqing Lu, Qingmin Liao
Lack of texture often causes ambiguity in matching, and handling this issue is an important challenge in optical flow estimation.
no code implementations • 29 Mar 2023 • Haoqi Yuan, Chi Zhang, Hongcheng Wang, Feiyang Xie, Penglin Cai, Hao Dong, Zongqing Lu
Our method outperforms baselines by a large margin and is the most sample-efficient demonstration-free RL method to solve Minecraft Tech Tree tasks.
1 code implementation • 19 Mar 2023 • Haobin Jiang, Junpeng Yue, Hao Luo, Ziluo Ding, Zongqing Lu
One of the essential missions in the AI research community is to build an autonomous embodied agent that can achieve high-level performance across a wide spectrum of tasks.
no code implementations • 16 Feb 2023 • Hao Luo, Jiechuan Jiang, Zongqing Lu
To help the policy improvement be stable and monotonic, we propose model-based decentralized policy optimization (MDPO), which incorporates a latent variable function to help construct the transition and reward function from an individual perspective.
no code implementations • 16 Feb 2023 • Yicheng Feng, Boshi An, Zongqing Lu
The study of emergent communication has been dedicated to interactive artificial intelligence.
no code implementations • 2 Feb 2023 • Jiechuan Jiang, Zongqing Lu
To tackle this challenge, we propose the best possible operator, a novel decentralized operator, and prove that the policies of agents will converge to the optimal joint policy if each agent independently updates its individual state-action value with the operator.
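A minimal tabular sketch of an optimistic decentralized update in this spirit: each agent moves its individual value toward the best return observed for its own action, implicitly assuming the other agents act favorably. The paper's actual operator and its convergence analysis are more refined than this illustration.

```python
from collections import defaultdict

# Q[state][individual_action] for a single agent
Q = defaultdict(lambda: defaultdict(float))

def optimistic_update(s, a_i, r, s_next, gamma=0.99):
    """Keep the most optimistic estimate seen for (s, a_i): a crude stand-in
    for a 'best possible' decentralized value update."""
    target = r + gamma * max(Q[s_next].values(), default=0.0)
    Q[s][a_i] = max(Q[s][a_i], target)
```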
no code implementations • 8 Jan 2023 • Wenzhe Li, Hao Luo, Zichuan Lin, Chongjie Zhang, Zongqing Lu, Deheng Ye
The Transformer has been considered the dominant neural architecture in NLP and CV, mostly under supervised settings.
no code implementations • ICCV 2023 • Jun Hoong Chan, Bohan Yu, Heng Guo, Jieji Ren, Zongqing Lu, Boxin Shi
Illumination planning in photometric stereo aims to find a balance between surface normal estimation accuracy and image capture efficiency by selecting optimal light configurations.
no code implementations • 6 Nov 2022 • Kefan Su, Zongqing Lu
In this paper, we propose decentralized policy optimization (DPO), a decentralized actor-critic algorithm with monotonic improvement and convergence guarantee.
no code implementations • 25 Oct 2022 • Ziluo Ding, Wanpeng Zhang, Junpeng Yue, Xiangjun Wang, Tiejun Huang, Zongqing Lu
We investigate the use of natural language to drive the generalization of policies in multi-agent settings.
Tasks: Multi-agent Reinforcement Learning, Reinforcement Learning (+2)
no code implementations • CVPR 2023 • Zhaozhi Wang, Kefan Su, Jian Zhang, Huizhu Jia, Qixiang Ye, Xiaodong Xie, Zongqing Lu
In this paper, we propose multi-agent automated machine learning (MA2ML) with the aim to effectively handle joint optimization of modules in automated machine learning (AutoML).
no code implementations • 9 Oct 2022 • Jiafei Lyu, Aicheng Gong, Le Wan, Zongqing Lu, Xiu Li
We present state advantage weighting for offline reinforcement learning (RL).
no code implementations • 26 Sep 2022 • Ziluo Ding, Kefan Su, Weixin Hong, Liwen Zhu, Tiejun Huang, Zongqing Lu
Communication helps agents to obtain information about others so that better coordinated behavior can be learned.
1 code implementation • 26 Sep 2022 • Jiangxing Wang, Deheng Ye, Zongqing Lu
To this end, we propose multi-agent conditional policy factorization (MACPF), which takes more centralized training but still enables decentralized execution.
no code implementations • 17 Sep 2022 • Kefan Su, Siyuan Zhou, Jiechuan Jiang, Chuang Gan, Xiangjun Wang, Zongqing Lu
Decentralized learning has shown great promise for cooperative multi-agent reinforcement learning (MARL).
1 code implementation • 21 Jun 2022 • Haoqi Yuan, Zongqing Lu
We study offline meta-reinforcement learning, a practical reinforcement learning paradigm that learns from offline data to adapt to new tasks.
1 code implementation • 17 Jun 2022 • Yuanpei Chen, Tianhao Wu, Shengjie Wang, Xidong Feng, Jiechuan Jiang, Stephen Marcus McAleer, Yiran Geng, Hao Dong, Zongqing Lu, Song-Chun Zhu, Yaodong Yang
In this study, we propose the Bimanual Dexterous Hands Benchmark (Bi-DexHands), a simulator that involves two dexterous hands with tens of bimanual manipulation tasks and thousands of target objects.
1 code implementation • 16 Jun 2022 • Jiafei Lyu, Xiu Li, Zongqing Lu
Model-based RL methods offer a richer dataset and benefit generalization by generating imaginary trajectories with either a trained forward or reverse dynamics model.
3 code implementations • 9 Jun 2022 • Jiafei Lyu, Xiaoteng Ma, Xiu Li, Zongqing Lu
The distribution shift between the learned policy and the behavior policy makes it necessary for the value function to stay conservative such that out-of-distribution (OOD) actions will not be severely overestimated.
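One common way to keep the value function conservative, shown here in a CQL-flavored form rather than this paper's specific method: push Q down on sampled (potentially OOD) actions while pushing it up on actions actually in the dataset. The `q_net(state, action)` signature is an assumption.

```python
import torch

def conservative_penalty(q_net, states, dataset_actions, num_random=10):
    """CQL-style conservatism term: logsumexp of Q over random actions minus
    Q on dataset actions; adding this to the critic loss suppresses OOD values."""
    B, act_dim = dataset_actions.shape
    random_actions = torch.rand(B, num_random, act_dim) * 2 - 1  # uniform in [-1, 1]
    s_rep = states.unsqueeze(1).expand(-1, num_random, -1)
    q_rand = q_net(s_rep.reshape(B * num_random, -1),
                   random_actions.reshape(B * num_random, -1)).reshape(B, num_random)
    q_data = q_net(states, dataset_actions).squeeze(-1)
    return (torch.logsumexp(q_rand, dim=1) - q_data).mean()
```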
2 code implementations • 16 Dec 2021 • Yuxuan Yi, Ge Li, YaoWei Wang, Zongqing Lu
Inspired by the fact that sharing plays a key role in how humans learn to cooperate, we propose LToS, a hierarchically decentralized MARL framework that enables agents to learn to dynamically share reward with neighbors so as to encourage agents to cooperate on the global objective through collectives.
Tasks: Multi-agent Reinforcement Learning, Reinforcement Learning (+2)
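A minimal sketch of the reward-sharing idea behind LToS: each agent gives away learned fractions of its own reward to its neighbors, so locally selfish learning is pulled toward the collective objective (the data layout below is an assumption, not the paper's implementation):

```python
import numpy as np

def shared_rewards(rewards, share, neighbors):
    """rewards: per-agent rewards; share[i][j]: fraction of agent i's reward
    given to neighbor j (learned by a high-level policy in LToS-style schemes);
    neighbors[i]: list of agent i's neighbors."""
    shaped = np.asarray(rewards, dtype=float).copy()
    for i, r in enumerate(rewards):
        for j in neighbors[i]:
            shaped[i] -= share[i][j] * r
            shaped[j] += share[i][j] * r
    return shaped
```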
no code implementations • 24 Nov 2021 • Jiacheng Chen, Bin-Bin Gao, Zongqing Lu, Jing-Hao Xue, Chengjie Wang, Qingmin Liao
In practice, it can adaptively generate multiple class-agnostic prototypes for query images and learn feature alignment in a self-contrastive manner.
Ranked #48 on Few-Shot Semantic Segmentation on COCO-20i (1-shot)
no code implementations • 1 Oct 2021 • Kefan Su, Zongqing Lu
Though divergence regularization has been proposed to address this problem, it cannot be trivially applied to cooperative multi-agent reinforcement learning (MARL).
Tasks: Multi-agent Reinforcement Learning, Reinforcement Learning (+3)
no code implementations • 29 Sep 2021 • Yicheng Feng, Zongqing Lu
We find that symbolic mapping learned in simple referential games can notably promote language learning in difficult tasks.
no code implementations • 29 Sep 2021 • Ziluo Ding, Weixin Hong, Liwen Zhu, Tiejun Huang, Zongqing Lu
Agents determine the priority of decision-making by comparing the values of their intentions.
no code implementations • 29 Sep 2021 • Jiechuan Jiang, Zongqing Lu
OTC is simple yet effective in increasing data efficiency and improving agent policies during online tuning.
Tasks: Multi-agent Reinforcement Learning, Reinforcement Learning (+2)
no code implementations • 4 Aug 2021 • Jiechuan Jiang, Zongqing Lu
In this paper, we propose a framework for offline decentralized multi-agent reinforcement learning, which exploits value deviation and transition normalization to deliberately modify the transition probabilities.
no code implementations • 4 Aug 2021 • Xiaopeng Yu, Jiechuan Jiang, Wanpeng Zhang, Haobin Jiang, Zongqing Lu
When one agent interacts with a multi-agent environment, it is challenging to deal with various previously unseen opponents.
no code implementations • 10 Jun 2021 • Haobin Jiang, Yifan Yu, Zongqing Lu
In multi-agent reinforcement learning, the inherent non-stationarity of the environment, caused by other agents' actions, poses significant difficulties for an agent to learn a good policy independently.
no code implementations • 19 Apr 2021 • Jiacheng Chen, Bin-Bin Gao, Zongqing Lu, Jing-Hao Xue, Chengjie Wang, Qingmin Liao
To this end, we generate self-contrastive background prototypes directly from the query image, with which we enable the construction of complete sample pairs and thus a complementary and auxiliary segmentation task to achieve the training of a better segmentation model.
2 code implementations • 5 Feb 2021 • Ang A. Li, Zongqing Lu, Chenglin Miao
Furthermore, we successfully extend our theoretical framework to maximum-entropy RL by deriving the lower and upper bounds of these value metrics for soft Q-learning, which turn out to be the product of $|\text{TD}|$ and "on-policyness" of the experiences.
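A hedged sketch of the flavor of that result, not its exact statement: the usefulness of an experience scales with the product of its absolute TD error and how on-policy it is, e.g. a priority of the form

```latex
\text{priority}(s,a)\;\propto\;|\delta(s,a)|\cdot\frac{\pi(a\mid s)}{\mu(a\mid s)},
\qquad
\delta(s,a)=r+\gamma\,\mathbb{E}_{a'\sim\pi}\bigl[Q(s',a')\bigr]-Q(s,a),
```

where $\pi$ is the current policy and $\mu$ the behavior policy that collected the experience.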
no code implementations • 26 Jan 2021 • Kai Lv, Zongqing Lu, Qingmin Liao
With the new descriptor, we can obtain more high-confidence matching points without extremum operations.
3 code implementations • 4 Jan 2021 • Liwen Zhu, Peixi Peng, Zongqing Lu, Xiangqian Wang, Yonghong Tian
To make the policy learned from a training scenario generalizable to new unseen scenarios, a novel Meta Variationally Intrinsic Motivated (MetaVIM) RL method is proposed to learn the decentralized policy for each intersection that considers neighbor information in a latent way.
no code implementations • 1 Jan 2021 • Ziluo Ding, Tiejun Huang, Zongqing Lu
The emergence of language is a mystery.
no code implementations • 1 Jan 2021 • Jiechuan Jiang, Zongqing Lu
In multi-agent reinforcement learning (MARL), the learning rates of the actors and critics are mostly hand-tuned and fixed.
Tasks: Multi-agent Reinforcement Learning, Reinforcement Learning (+2)
no code implementations • 28 Sep 2020 • Jiechuan Jiang, Zongqing Lu
EOI learns a probabilistic classifier that predicts a probability distribution over agents given their observation and gives each agent an intrinsic reward of being correctly predicted by the classifier.
Tasks: Multi-agent Reinforcement Learning, Reinforcement Learning (+2)
1 code implementation • NeurIPS 2020 • Ziluo Ding, Tiejun Huang, Zongqing Lu
Empirically, we show that I2C can not only reduce communication overhead but also improve performance in a variety of multi-agent cooperative scenarios, compared to existing methods.
2 code implementations • 10 Jun 2020 • Jiechuan Jiang, Zongqing Lu
EOI learns a probabilistic classifier that predicts a probability distribution over agents given their observation and gives each agent an intrinsic reward of being correctly predicted by the classifier.
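The intrinsic reward described above is easy to picture in code; a minimal sketch (network sizes are illustrative):

```python
import torch
import torch.nn as nn

class AgentClassifier(nn.Module):
    """Predicts a distribution over agent identities from an observation;
    the intrinsic reward for agent i is p(i | o_i), the probability of
    being correctly recognized from its own observation."""

    def __init__(self, obs_dim: int, n_agents: int):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim, 128), nn.ReLU(),
                                 nn.Linear(128, n_agents))

    def intrinsic_reward(self, obs: torch.Tensor, agent_id: int) -> torch.Tensor:
        probs = torch.softmax(self.net(obs), dim=-1)  # (B, n_agents)
        return probs[:, agent_id]                     # p(agent_id | o)
```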
no code implementations • 18 May 2020 • Yang Chen, Zongqing Lu, Xuechen Zhang, Lei Chen, Qingmin Liao
Recent end-to-end deep neural networks for disparity regression have achieved state-of-the-art performance.
no code implementations • 27 Feb 2020 • Jiarong Chen, Zongqing Lu, Jing-Hao Xue, Qingmin Liao
Depthwise convolution has gradually become an indispensable operation for modern efficient neural networks, and larger kernel sizes (≥ 5) have been applied to it recently.
2 code implementations • NeurIPS 2019 • Jiechuan Jiang, Zongqing Lu
Fairness is essential for human society, contributing to stability and productivity.
1 code implementation • 30 Jul 2019 • Huy Phan, Oliver Y. Chén, Philipp Koch, Zongqing Lu, Ian McLoughlin, Alfred Mertins, Maarten De Vos
We employ the Montreal Archive of Sleep Studies (MASS) database consisting of 200 subjects as the source domain and study deep transfer learning on three different target domains: the Sleep Cassette subset and the Sleep Telemetry subset of the Sleep-EDF Expanded database, and the Surrey-cEEGrid database.
Ranked #1 on Multimodal Sleep Stage Detection on Surrey-PSG
Tasks: Automatic Sleep Stage Classification, Multimodal Sleep Stage Detection (+2)
no code implementations • 21 Apr 2019 • Jiechuan Jiang, Zongqing Lu
Sparse reward is one of the biggest challenges in reinforcement learning (RL).
4 code implementations • ICLR 2020 • Jiechuan Jiang, Chen Dun, Tiejun Huang, Zongqing Lu
The key is to understand the mutual interplay between agents.
no code implementations • 3 Sep 2018 • Liping Zhang, Zongqing Lu, Qingmin Liao
Motivated by the success of various convolutional neural network (CNN) structures in the single-image super-resolution (SISR) task, an end-to-end convolutional neural network is proposed to reconstruct the high-resolution (HR) optical flow field from the initial low-resolution (LR) optical flow, with the guidance of the first frame used in optical flow estimation.
no code implementations • NeurIPS 2018 • Jiechuan Jiang, Zongqing Lu
Our model leads to efficient and effective communication for large-scale multi-agent cooperation.
no code implementations • 27 Sep 2017 • Zongqing Lu, Swati Rallapalli, Kevin Chan, Thomas La Porta
In doing so, Augur tackles several challenges: (i) how to overcome profiling and measurement overhead; (ii) how to capture the variance in different mobile platforms with different processors, memory, and cache sizes; and (iii) how to account for the variance in the number, type, and size of layers of the different CNN configurations.