no code implementations • ECCV 2020 • Sinan Tan, Weilai Xiang, Huaping Liu, Di Guo, Fuchun Sun
We investigate a new AI task --- Multi-Agent Interactive Question Answering --- where several agents explore the scene jointly in interactive environments to answer a question.
no code implementations • 10 Jan 2025 • Duanpeng Shi, Wendong Zheng, Di Guo, Huaping Liu
The method consists of a pre-imaging module, a conditional diffusion model for reconstruction, a forward voltage constraint network, and a voltage consistency constraint scheme applied during the sampling process.
no code implementations • 18 Dec 2024 • Xinghang Li, Peiyan Li, Minghuan Liu, Dong Wang, Jirong Liu, Bingyi Kang, Xiao Ma, Tao Kong, Hanbo Zhang, Huaping Liu
The obtained results firmly convince us of why we need VLAs, and we develop a new family of VLAs, RoboVLMs, which require very few manual designs and achieve new state-of-the-art performance in three simulation tasks and real-world experiments.
no code implementations • 11 Dec 2024 • Hang Gao, Chenhao Zhang, Fengge Wu, Junsuo Zhao, Changwen Zheng, Huaping Liu
To address these limitations, we propose a novel method that leverages the strengths of both LLM and GNN, allowing for the processing of graph data with any format and type of nodes and edges without the need for type information or special preprocessing.
no code implementations • 23 Oct 2024 • Bang You, Huaping Liu
We argue that compressing information in the learned joint representations about raw multimodal observations is helpful, and propose a multimodal information bottleneck model to learn task-relevant joint representations from egocentric images and proprioception.
no code implementations • 26 Sep 2024 • Nan Sun, Bo Mao, Yongchang Li, Lumeng Ma, Di Guo, Huaping Liu
The increasing demand for intelligent assistants in human-populated environments has motivated significant research in autonomous robotic systems.
no code implementations • 23 Sep 2024 • Guokang Wang, Hang Li, Shuyuan Zhang, Yanhong Liu, Huaping Liu
In real-world scenarios, many robotic manipulation tasks are hindered by occlusions and limited fields of view, posing significant challenges for passive observation-based models that rely on fixed or wrist-mounted cameras.
1 code implementation • CVPR 2024 • He Liu, Yikai Wang, Huaping Liu, Fuchun Sun, Anbang Yao
In this line of research, existing methods typically follow an inversion-and-distillation paradigm in which a generative adversarial network, trained on the fly under the guidance of the pre-trained teacher network, is used to synthesize a large-scale sample set for knowledge distillation.
1 code implementation • 28 May 2024 • Kangyao Huang, Di Guo, Xinyu Zhang, Xiangyang Ji, Huaping Liu
Training an agent to adapt to specific tasks through co-optimization of morphology and control has widely attracted attention.
no code implementations • 24 May 2024 • Jialin Zhao, Yingtao Zhang, Xinghang Li, Huaping Liu, Carlo Vittorio Cannistraci
The growing computational demands posed by the increasing number of neural network parameters necessitate low-memory-consumption training approaches.
no code implementations • 7 May 2024 • Zhiwei Li, Bozhen Zhang, Lei Yang, Tianyu Shen, Nuo Xu, Ruosen Hao, Weiting Li, Tao Yan, Huaping Liu
V2X cooperation, through the integration of sensor data from both vehicles and infrastructure, is considered a pivotal approach to advancing autonomous driving technology.
1 code implementation • 8 Apr 2024 • Chenxu Wang, Bin Dai, Huaping Liu, Baoyuan Wang
To gauge the significance of agent architecture, we implement a target-driven planning (TDP) module as an adjunct to the existing agent.
no code implementations • 22 Mar 2024 • Jun Guo, Xiaojian Ma, Yue Fan, Huaping Liu, Qing Li
Unlike existing methods, we design a versatile projection approach that maps various 2D semantic features from pre-trained image encoders into a novel semantic component of 3D Gaussians, which is based on spatial relationships and needs no additional training.
no code implementations • 18 Mar 2024 • Hang Gao, Jiaguo Yuan, Jiangmeng Li, Peng Qiao, Fengge Wu, Changwen Zheng, Huaping Liu
To address this issue, we have introduced Partial Label Learning (PLL) into graph representation learning.
no code implementations • 15 Mar 2024 • Kangyao Huang, Di Guo, Xinyu Zhang, Xiangyang Ji, Huaping Liu
It is common for us to feel pressure in a competitive environment, which arises from the desire to succeed compared with other individuals or opponents.
1 code implementation • 11 Mar 2024 • Zilong Chen, Yikai Wang, Feng Wang, Zhengyi Wang, Huaping Liu
To fully unleash the potential of video diffusion to perceive the 3D world, we further introduce a geometrical consistency prior and extend the video diffusion model to a multi-view consistent 3D generator.
1 code implementation • CVPR 2024 • Xin Gao, Tianheng Qiu, Xinyu Zhang, Hanlin Bai, Kang Liu, Xuan Huang, Hu Wei, Guoying Zhang, Huaping Liu
Coarse-to-fine schemes are widely used in traditional single-image motion deblurring; however, in the context of deep learning, existing multi-scale algorithms not only require the use of complex modules for feature fusion of low-scale RGB images and deep semantics, but also manually generate low-resolution pairs of images that do not have sufficient confidence.
Ranked #1 on Deblurring on RSBlur
1 code implementation • 15 Dec 2023 • Hang Gao, Chengyu Yao, Jiangmeng Li, Lingyu Si, Yifan Jin, Fengge Wu, Changwen Zheng, Huaping Liu
In order to comprehensively analyze various GNN models from a causal learning perspective, we constructed an artificially synthesized dataset with known and controllable causal relationships between data and labels.
1 code implementation • CVPR 2024 • Sijie Cheng, Zhicheng Guo, Jingwen Wu, Kechen Fang, Peng Li, Huaping Liu, Yang Liu
However, the capability of VLMs to "think" from a first-person perspective, a crucial attribute for advancing autonomous agents and robotics, remains largely unexplored.
1 code implementation • CVPR 2024 • YiWen Chen, Zilong Chen, Chi Zhang, Feng Wang, Xiaofeng Yang, Yikai Wang, Zhongang Cai, Lei Yang, Huaping Liu, Guosheng Lin
3D editing plays a crucial role in many areas such as gaming and virtual reality.
no code implementations • 2 Nov 2023 • Xinghang Li, Minghuan Liu, Hanbo Zhang, Cunjun Yu, Jie Xu, Hongtao Wu, Chilam Cheang, Ya Jing, Weinan Zhang, Huaping Liu, Hang Li, Tao Kong
We believe RoboFlamingo has the potential to be a cost-effective and easy-to-use solution for robotics manipulation, empowering everyone with the ability to fine-tune their own robotics policy.
no code implementations • 21 Oct 2023 • Li Wang, Xinyu Zhang, Fachuan Zhao, Chuze Wu, Yichen Wang, Ziying Song, Lei Yang, Jun Li, Huaping Liu
The proposed Fuzzy-NMS module combines the volume and clustering density of candidate bounding boxes, refining them with a fuzzy classification method and optimizing the appropriate suppression thresholds to reduce uncertainty in the NMS process.
1 code implementation • CVPR 2024 • Zilong Chen, Feng Wang, Yikai Wang, Huaping Liu
Specifically, our method adopts a progressive optimization strategy, which includes a geometry optimization stage and an appearance refinement stage.
no code implementations • 24 Aug 2023 • Yan Gong, Xinyu Zhang, Hao liu, Xinmin Jiang, Zhiwei Li, Xin Gao, Lei Lin, Dafeng Jin, Jun Li, Huaping Liu
Specifically, the skip-cross fusion strategy connects each layer to every other layer in a feed-forward manner: for each layer, the feature maps of all previous layers are used as input, and its own feature maps are used as input to all subsequent layers of the other modality, enhancing feature propagation and multi-modal feature fusion.
1 code implementation • 18 Aug 2023 • Yixuan Li, Huaping Liu, Qiang Jin, Miaomiao Cai, Peng Li
Optical Music Recognition (OMR) is an important technology in music and has been researched for a long time.
no code implementations • 26 Jul 2023 • Xinzhu Liu, Di Guo, Huaping Liu
To study collaboration among heterogeneous agents, we propose the heterogeneous multi-agent tidying-up task, in which multiple heterogeneous agents with different capabilities collaborate with each other to detect misplaced objects and place them in reasonable locations.
1 code implementation • 16 May 2023 • Xiaoheng Sun, Yuejie Gao, Hanyao Lin, Huaping Liu
In this paper, a data-driven model TG-Critic is proposed to introduce timbre embeddings as one of the model inputs to guide the evaluation of singing quality.
no code implementations • 14 May 2023 • Xiaoyu Wang, Kangyao Huang, Xinyu Zhang, Honglin Sun, Wenzhuo LIU, Huaping Liu, Jun Li, Pingping Lu
A robot for field application environments is proposed, along with a lightweight global spatial planning technique based on a graph-search algorithm that takes mode-switching point optimization into account, with an emphasis on energy efficiency, search speed, and the viability of real deployment.
no code implementations • 23 Apr 2023 • Xinyu Zhang, Zhiwei Li, Zhenhong Zou, Xin Gao, Yijin Xiong, Dafeng Jin, Jun Li, Huaping Liu
To quantify the correlation in multi-modal information, we model the uncertainty, as the inverse of data information, in different modalities and embed it in the bounding box generation.
no code implementations • 21 Feb 2023 • Yuhong Deng, Xiaofeng Guo, Yixuan Wei, Kai Lu, Bin Fang, Di Guo, Huaping Liu, Fuchun Sun
A composite robotic hand composed of a suction cup and a gripper is designed for grasping the object stably.
1 code implementation • ICCV 2023 • Feng Wang, Sinan Tan, Xinghang Li, Zeyue Tian, Yafei Song, Huaping Liu
In this paper, we present a novel method named MixVoxels to better represent the dynamic scenes with fast training speed and competitive rendering qualities.
no code implementations • 6 Sep 2022 • Li Wang, Xinyu Zhang, Wenyuan Qin, Xiaoyu Li, Lei Yang, Zhiwei Li, Lei Zhu, Hong Wang, Jun Li, Huaping Liu
As such, we propose a novel camera-LiDAR fusion 3D MOT framework based on the Combined Appearance-Motion Optimization (CAMO-MOT), which uses both camera and LiDAR data and significantly reduces tracking failures caused by occlusion and false detection.
no code implementations • 1 Feb 2022 • Chengliang Zhong, Chao Yang, Jinshan Qi, Fuchun Sun, Huaping Liu, Xiaodong Mu, Wenbing Huang
Keypoint detection and description play a central role in computer vision.
no code implementations • ICLR 2022 • Hao liu, Huaping Liu
Learning multiple tasks sequentially without forgetting previous knowledge, called Continual Learning (CL), remains a long-standing challenge for neural networks.
no code implementations • 26 Jan 2022 • Sinan Tan, Hui Xue, Qiyu Ren, Huaping Liu, Jing Bai
Our framework is based on an innovative evolution algorithm, which is stable and suitable for multi-dataset scenarios.
no code implementations • 26 Jan 2022 • Sinan Tan, Mengmeng Ge, Di Guo, Huaping Liu, Fuchun Sun
In the Vision-and-Language Navigation task, the embodied agent follows linguistic instructions and navigates to a specific goal.
1 code implementation • 23 Jan 2022 • Xiaoli Liu, Jianqin Yin, Di Guo, Huaping Liu
Next, we build a bi-directional semantic graph for the teacher network and a single-directional semantic graph for the student network to model rich ASCK among partial videos.
2 code implementations • 14 Oct 2021 • Feng Wang, Tao Kong, Rufeng Zhang, Huaping Liu, Hang Li
To solve this problem, we propose to maximize the mutual information between the input and the class predictions.
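The mutual-information objective mentioned in the entry above can be illustrated with a small sketch. It assumes the common batch estimator I(X; Y) = H(mean prediction) - mean(H(prediction)), which may differ from the exact formulation in the paper; the function names `entropy` and `mutual_information` are hypothetical and chosen for illustration only.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one discrete distribution."""
    return -sum(q * math.log(q) for q in probs if q > 0.0)


def mutual_information(predictions):
    """Estimate I(X; Y) from a batch of class-probability vectors.

    Assumed estimator (not confirmed by the source):
    entropy of the batch-averaged prediction minus the average
    entropy of the individual predictions.
    """
    n = len(predictions)
    k = len(predictions[0])
    # Marginal class distribution, averaged over the batch.
    marginal = [sum(p[c] for p in predictions) / n for c in range(k)]
    # Average uncertainty of the per-sample predictions.
    conditional = sum(entropy(p) for p in predictions) / n
    return entropy(marginal) - conditional


if __name__ == "__main__":
    confident_and_balanced = [[1.0, 0.0], [0.0, 1.0]]
    uninformative = [[0.5, 0.5], [0.5, 0.5]]
    print(mutual_information(confident_and_balanced))
    print(mutual_information(uninformative))
```

Under this estimator, the value is largest when each prediction is confident and the classes are used evenly across the batch, and it drops to zero when predictions are uniform or collapse onto a single class.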
Ranked #1 on Image Classification on Oxford-IIIT Pet Dataset
no code implementations • 20 Sep 2021 • Xinzhu Liu, Di Guo, Huaping Liu, Fuchun Sun
In this paper, we propose multi-agent visual semantic navigation, in which multiple agents collaborate with each other to find multiple target objects.
no code implementations • 20 Mar 2021 • Zhenhong Zou, Xinyu Zhang, Huaping Liu, Zhiwei Li, Amir Hussain, Jun Li
There has recently been growing interest in utilizing multimodal sensors to achieve robust lane line segmentation.
no code implementations • CVPR 2021 • Feng Wang, Huaping Liu
We will show that the contrastive loss is a hardness-aware loss function, and the temperature τ controls the strength of penalties on hard negative samples.
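The hardness-aware behavior described in the entry above can be illustrated with a minimal sketch. It assumes a standard InfoNCE-style softmax contrastive loss for one anchor with one positive and several negatives, which may not match the paper's exact setup; the function names and the similarity values are hypothetical.

```python
import math


def contrastive_loss(pos_sim, neg_sims, tau):
    """InfoNCE-style loss for one anchor (assumed form, for illustration)."""
    logits = [pos_sim / tau] + [s / tau for s in neg_sims]
    m = max(logits)  # subtract the max for numerical stability
    log_denominator = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_denominator - logits[0]


def negative_penalty_shares(neg_sims, tau):
    """Fraction of the total negative gradient that each negative receives.

    For the loss above, the gradient with respect to a negative similarity
    is proportional to exp(s_j / tau), so the shares form a softmax over
    the negatives at temperature tau.
    """
    scaled = [s / tau for s in neg_sims]
    m = max(scaled)
    weights = [math.exp(v - m) for v in scaled]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    negatives = [0.9, 0.5, 0.1]  # the first entry is the hardest negative
    for tau in (0.07, 1.0):
        print(tau, negative_penalty_shares(negatives, tau))
```

With a small temperature, almost all of the penalty falls on the negative most similar to the anchor; with a large temperature, the penalty is spread more evenly across negatives.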
no code implementations • NeurIPS 2020 • Feng Wang, Huaping Liu, Di Guo, Sun Fuchun
In this paper, we propose Invariance Propagation to focus on learning representations invariant to category-level variations, which are provided by different instances from the same category.
no code implementations • 17 Nov 2020 • Fan Yang, Chao Yang, Di Guo, Huaping Liu, Fuchun Sun
Robots have limited adaptation ability compared to humans and animals in the case of damage.
1 code implementation • 7 Oct 2020 • Feng Wang, Huaping Liu, Di Guo, Fuchun Sun
In this paper, we propose Invariance Propagation to focus on learning representations invariant to category-level variations, which are provided by different instances from the same category.
1 code implementation • 25 May 2020 • Xiaoli Liu, Jianqin Yin, Huaping Liu, Jun Liu
In contrast to prior works, we improve the multi-order modeling ability of human motion systems for more accurate predictions by building a deep state-space model (DeepSSM).
no code implementations • 30 Apr 2020 • Sinan Tan, Huaping Liu, Di Guo, Xin-Yu Zhang, Fuchun Sun
Embodiment is an important characteristic for all intelligent agents (creatures and robots), while existing scene description tasks mainly focus on analyzing images passively and the semantic understanding of the scenario is separated from the interaction between the agent and the environment.
no code implementations • 15 Mar 2020 • Jianqin Yin, Yanchun Wu, Huaping Liu, Yonghao Dang, Zhiyi Liu, Jun Liu
Our contribution is two-fold: 1) We present an important insight that deep features extracted for action recognition can model the self-similarity periodicity of repetitive actions well.
1 code implementation • 10 Mar 2020 • Yuhong Deng, Di Guo, Xiaofeng Guo, Naifu Zhang, Huaping Liu, Fuchun Sun
In this paper, we propose a novel task, Manipulation Question Answering (MQA), where the robot performs manipulation actions to change the environment in order to answer a given question.
no code implementations • ICLR 2020 • Tao Kong, Fuchun Sun, Huaping Liu, Yuning Jiang, Lei LI, Jianbo Shi
While almost all state-of-the-art object detectors utilize predefined anchors to enumerate possible locations, scales and aspect ratios for the search of the objects, their performance and generalization ability are also limited by the design of anchors.
no code implementations • 16 Nov 2019 • Mingxuan Jing, Xiaojian Ma, Wenbing Huang, Fuchun Sun, Chao Yang, Bin Fang, Huaping Liu
In this paper, we study Reinforcement Learning from Demonstrations (RLfD) that improves the exploration efficiency of Reinforcement Learning (RL) by providing expert demonstrations.
no code implementations • 15 Oct 2019 • Xiaoli Liu, Jianqin Yin, Jin Liu, Pengxiang Ding, Jun Liu, Huaping Liu
The global temporal co-occurrence features represent the co-occurrence relationship in which different subsequences of a complex motion sequence appear simultaneously; they can be obtained automatically with our proposed TrajectoryNet by reorganizing the temporal information as the depth dimension of the input tensor.
no code implementations • NeurIPS 2019 • Chao Yang, Xiaojian Ma, Wenbing Huang, Fuchun Sun, Huaping Liu, Junzhou Huang, Chuang Gan
This paper studies Learning from Observations (LfO) for imitation learning with access to state-only demonstrations.
1 code implementation • 17 Sep 2019 • Luxuan Li, Tao Kong, Fuchun Sun, Huaping Liu
Detecting actions in videos is an important yet challenging task.
no code implementations • arXiv:1909.01818 2019 • Xiaoli Liu, Jianqin Yin, Huaping Liu, Yilong Yin
Specifically, a skeletal representation is proposed by transforming the joint coordinate sequence into an image sequence, which can model the different correlations of different joints.
Ranked #1 on Pose Prediction on Filtered NTU RGB+D
7 code implementations • 8 Apr 2019 • Tao Kong, Fuchun Sun, Huaping Liu, Yuning Jiang, Lei LI, Jianbo Shi
In FoveaBox, an instance is assigned to adjacent feature levels to make the model more accurate. We demonstrate its effectiveness on standard benchmarks and report extensive experimental analysis.
Ranked #91 on Object Detection on COCO test-dev (APM metric)
no code implementations • 19 Jan 2019 • Tao Kong, Fuchun Sun, Huaping Liu, Yuning Jiang, Jianbo Shi
We present consistent optimization for single stage object detection.
no code implementations • ECCV 2018 • Tao Kong, Fuchun Sun, Wenbing Huang, Huaping Liu
In this paper, we begin by investigating current feature pyramid solutions, and then reformulate the feature pyramid construction as a feature reconfiguration process.
no code implementations • 18 May 2018 • Mingxuan Jing, Xiaojian Ma, Fuchun Sun, Huaping Liu
Learning and inferring movement is a very challenging problem due to its high dimensionality and dependency on varied environments or tasks.
no code implementations • 12 May 2018 • Mingxuan Jing, Xiaojian Ma, Wenbing Huang, Fuchun Sun, Huaping Liu
The goal of task transfer in reinforcement learning is migrating the action policy of an agent to the target task from the source task.
1 code implementation • CVPR 2017 • Tao Kong, Fuchun Sun, Anbang Yao, Huaping Liu, Ming Lu, Yurong Chen
To address (a), we design the reverse connection, which enables the network to detect objects at multiple levels of CNNs.
no code implementations • CVPR 2016 • Wenbing Huang, Fuchun Sun, Lele Cao, Deli Zhao, Huaping Liu, Mehrtash Harandi
To enhance the performance of LDSs, in this paper, we address the challenging issue of performing sparse coding on the space of LDSs, where both data and dictionary atoms are LDSs.