no code implementations • 27 May 2024 • Jian Zhao, Lei Jin, Jianshu Li, Zheng Zhu, Yinglei Teng, Jiaojiao Zhao, Sadaf Gulshad, Zheng Wang, Bo Zhao, Xiangbo Shu, Yunchao Wei, Xuecheng Nie, Xiaojie Jin, Xiaodan Liang, Shin'ichi Satoh, Yandong Guo, Cewu Lu, Junliang Xing, Jane Shen Shengmei
The SkatingVerse Workshop & Challenge aims to encourage research in developing novel and accurate methods for human action understanding.
no code implementations • 12 May 2024 • Yang Jin, Jun Lv, Shuqiang Jiang, Cewu Lu
In this paper, we propose DiffGen, a novel framework that integrates differentiable physics simulation, differentiable rendering, and a vision-language model to enable automatic and efficient generation of robot demonstrations.
no code implementations • 16 Apr 2024 • Pengfei Xie, Wenqiang Xu, Tutian Tang, Zhenjun Yu, Cewu Lu
To address this, we integrate a musculoskeletal system with a learnable parametric hand model, MANO, to create a new model, MS-MANO.
no code implementations • 4 Apr 2024 • Kailin Li, Jingbo Wang, Lixin Yang, Cewu Lu, Bo Dai
We introduce a discrete representation that aligns the grasp space with semantic space, enabling the generation of grasp postures in accordance with language instructions.
no code implementations • 28 Mar 2024 • Xinyu Zhan, Lixin Yang, Yifei Zhao, Kangrui Mao, Hanlin Xu, Zenan Lin, Kailin Li, Cewu Lu
Based on the 3-level abstraction of OAKINK2, we explore a task-oriented framework for Complex Task Completion (CTC).
no code implementations • 28 Mar 2024 • Zeren Chen, Zhelun Shi, Xiaoya Lu, Lehan He, Sucheng Qian, Hao Shu Fang, Zhenfei Yin, Wanli Ouyang, Jing Shao, Yu Qiao, Cewu Lu, Lu Sheng
The ultimate goal of robotic learning is to acquire a comprehensive and generalizable robotic system capable of performing both seen skills within the training distribution and unseen skills in novel environments.
no code implementations • 24 Mar 2024 • JunBo Wang, Wenhai Liu, Qiaojun Yu, Yang You, Liu Liu, Weiming Wang, Cewu Lu
Our primary contribution is a Robust Articulation Network (RoArtNet) that is able to predict both joint parameters and affordable points robustly by local feature learning and point tuple voting.
2 code implementations • 21 Mar 2024 • Sanqing Qu, Tianpei Zou, Florian Röhrbein, Cewu Lu, Guang Chen, DaCheng Tao, Changjun Jiang
GLC++ enhances the novel category clustering accuracy of GLC by 4.3% in open-set scenarios on Office-Home.
no code implementations • 20 Mar 2024 • Qiaojun Yu, Ce Hao, JunBo Wang, Wenhai Liu, Liu Liu, Yao Mu, Yang You, Hengxu Yan, Cewu Lu
Robotic manipulation in everyday scenarios, especially in unstructured environments, requires skills in pose-aware object manipulation (POM), which adapts robots' grasping and handling according to an object's 6D pose.
no code implementations • 2 Mar 2024 • Siyuan Bian, Jiefeng Li, Jiasheng Tang, Cewu Lu
Accurate human shape recovery from a monocular RGB image is a challenging task because humans come in different shapes and sizes and wear different clothes.
1 code implementation • 26 Dec 2023 • Tai Wang, Xiaohan Mao, Chenming Zhu, Runsen Xu, Ruiyuan Lyu, Peisen Li, Xiao Chen, Wenwei Zhang, Kai Chen, Tianfan Xue, Xihui Liu, Cewu Lu, Dahua Lin, Jiangmiao Pang
In the realm of computer vision and robotics, embodied agents are expected to explore their environment and carry out human instructions.
1 code implementation • 23 Dec 2023 • Yang You, Kai Xiong, Zhening Yang, Zhengxiang Huang, Junwei Zhou, Ruoxi Shi, Zhou Fang, Adam W. Harley, Leonidas Guibas, Cewu Lu
We introduce PACE (Pose Annotations in Cluttered Environments), a large-scale benchmark designed to advance the development and evaluation of pose estimation methods in cluttered scenarios.
no code implementations • 17 Dec 2023 • SiQi Liu, Yong-Lu Li, Zhou Fang, Xinpeng Liu, Yang You, Cewu Lu
To explore an effective embedding of HAOI for the machine, we build a new benchmark on 3D HAOI consisting of primitives together with their images and propose a task requiring machines to recover 3D HAOI using primitives from images.
no code implementations • 5 Dec 2023 • Xinpeng Liu, Haowen Hou, Yanchao Yang, Yong-Lu Li, Cewu Lu
Human-scene Interaction (HSI) generation is a challenging task and crucial for various downstream tasks.
1 code implementation • 1 Dec 2023 • Ziyu Wang, Yue Xu, Cewu Lu, Yong-Lu Li
It first distills the videos into still images as static memory and then compensates the dynamic and motion information with a learnable dynamic memory block.
no code implementations • NeurIPS 2023 • Xiaoqian Wu, Yong-Lu Li, Jianhua Sun, Cewu Lu
One possible path of activity reasoning is building a symbolic system composed of symbols and rules, where one rule connects multiple symbols, implying human knowledge and reasoning abilities.
no code implementations • 21 Nov 2023 • Tutian Tang, Jiyu Liu, Jieyi Zhang, Haoyuan Fu, Wenqiang Xu, Cewu Lu
By leveraging refractive flow as an intermediate representation, the proposed method circumvents the drawbacks of directly predicting the geometry (e.g., surface normal) from images and helps bridge the sim-to-real gap.
no code implementations • 2 Nov 2023 • Han Xue, Yutong Li, Wenqiang Xu, Huanyu Li, Dongzhe Zheng, Cewu Lu
Training data is collected via a human-centric process with offline and online stages.
no code implementations • 6 Oct 2023 • Xinpeng Liu, Yong-Lu Li, Ailing Zeng, Zizheng Zhou, Yang You, Cewu Lu
The goal of motion understanding is to establish a reliable mapping between motion and action semantics, but this is a challenging many-to-many problem.
1 code implementation • 28 Sep 2023 • Qiaojun Yu, JunBo Wang, Wenhai Liu, Ce Hao, Liu Liu, Lin Shao, Weiming Wang, Cewu Lu
Results show that GAMMA significantly outperforms SOTA articulation modeling and manipulation algorithms in unseen and cross-category articulated objects.
no code implementations • ICCV 2023 • Yue Xu, Yong-Lu Li, Zhemin Huang, Michael Xu Liu, Cewu Lu, Yu-Wing Tai, Chi-Keung Tang
With the surge in attention to Egocentric Hand-Object Interaction (Ego-HOI), large-scale datasets such as Ego4D and EPIC-KITCHENS have been proposed.
no code implementations • ICCV 2023 • Kailin Li, Lixin Yang, Haoyu Zhen, Zenan Lin, Xinyu Zhan, Licheng Zhong, Jian Xu, Kejian Wu, Cewu Lu
This can be attributed to the fact that humans have mastered the shape prior of the 'mug' category, and can quickly establish the corresponding relations between different mug instances and the prior, such as where the rim and handle are located.
no code implementations • ICCV 2023 • Bingyang Zhou, Haoyu Zhou, Tianhai Liang, Qiaojun Yu, Siheng Zhao, Yuwei Zeng, Jun Lv, Siyuan Luo, Qiancai Wang, Xinyuan Yu, Haonan Chen, Cewu Lu, Lin Shao
We present ClothesNet: a large-scale dataset of 3D clothes objects with information-rich annotations.
1 code implementation • 14 Aug 2023 • Licheng Zhong, Lixin Yang, Kailin Li, Haoyu Zhen, Mei Han, Cewu Lu
Mesh is extracted from the signed distance function (SDF) network for the surface, and color for each surface vertex is drawn from the global color network.
no code implementations • 2 Jul 2023 • Hao-Shu Fang, Hongjie Fang, Zhenyu Tang, Jirong Liu, Chenxi Wang, JunBo Wang, Haoyi Zhu, Cewu Lu
A key challenge in robotic manipulation in open domains is how to acquire diverse and generalizable skills for robots.
1 code implementation • 28 May 2023 • Yue Xu, Yong-Lu Li, Kaitong Cui, Ziyu Wang, Cewu Lu, Yu-Wing Tai, Chi-Keung Tang
Our method consistently enhances the distillation algorithms, even on much larger-scale and more heterogeneous datasets, e.g., ImageNet-1K and Kinetics-400.
1 code implementation • CVPR 2023 • Jiefeng Li, Siyuan Bian, Qi Liu, Jiasheng Tang, Fan Wang, Cewu Lu
In this work, we present NIKI (Neural Inverse Kinematics with Invertible Neural Network), which models bi-directional errors to improve the robustness to occlusions and obtain pixel-aligned accuracy.
Ranked #1 on 3D Human Pose Estimation on AGORA
1 code implementation • 12 Apr 2023 • Jiefeng Li, Siyuan Bian, Chao Xu, Zhicun Chen, Lixin Yang, Cewu Lu
To address these issues, this paper presents a novel hybrid inverse kinematics solution, HybrIK, that integrates the merits of 3D keypoint estimation and body mesh recovery in a unified framework.
Ranked #1 on 3D Human Reconstruction on AGORA
1 code implementation • CVPR 2023 • Lixin Yang, Jian Xu, Licheng Zhong, Xinyu Zhan, Zhicheng Wang, Kejian Wu, Cewu Lu
Enabling neural networks to capture 3D geometry-aware features is essential in multi-view based vision tasks.
no code implementations • 2 Apr 2023 • Yong-Lu Li, Xiaoqian Wu, Xinpeng Liu, Zehao Wang, Yiming Dou, Yikun Ji, Junyi Zhang, Yixing Li, Jingru Tan, Xudong Lu, Cewu Lu
By aligning the classes of previous datasets to our semantic space, we gather (image/video/skeleton/MoCap) datasets into a unified database in a unified label system, i.e., bridging "isolated islands" into a "Pangea".
no code implementations • CVPR 2023 • Wenqiang Xu, Zhenjun Yu, Han Xue, Ruolin Ye, Siqiong Yao, Cewu Lu
We propose a simulation environment, VT-Sim, which supports generating hand-object interaction for both rigid and deformable objects.
1 code implementation • CVPR 2023 • Han Xue, Wenqiang Xu, Jieyi Zhang, Tutian Tang, Yutong Li, Wenxin Du, Ruolin Ye, Cewu Lu
In this work, we present a complete package to address the category-level garment pose tracking task: (1) A recording system VR-Garment, with which users can manipulate virtual garment models in simulation through a VR interface.
3 code implementations • CVPR 2023 • Sanqing Qu, Tianpei Zou, Florian Roehrbein, Cewu Lu, Guang Chen, DaCheng Tao, Changjun Jiang
We examine the superiority of our GLC on multiple benchmarks with different category shift scenarios, including partial-set, open-set, and open-partial-set DA.
Ranked #2 on Universal Domain Adaptation on VisDA2017
1 code implementation • 6 Mar 2023 • Yujing Lou, Zelin Ye, Yang You, Nianjuan Jiang, Jiangbo Lu, Weiming Wang, Lizhuang Ma, Cewu Lu
CRIN directly takes the coordinates of points as input and transforms local points into rotation-invariant representations via centrifugal reference frames.
no code implementations • CVPR 2023 • Jirong Liu, Ruo Zhang, Hao-Shu Fang, Minghao Gou, Hongjie Fang, Chenxi Wang, Sheng Xu, Hengxu Yan, Cewu Lu
Reactive grasping, which enables the robot to successfully grasp dynamic moving objects, is of great interest in robotics.
no code implementations • CVPR 2023 • Jianhua Sun, YuXuan Li, Liang Chai, Cewu Lu
To comprehensively cover the uncertainty of the future, the common practice of multi-modal human trajectory prediction is to first generate a set/distribution of candidate future trajectories and then sample required numbers of trajectories from them as final predictions.
no code implementations • ICCV 2023 • Wenqiang Xu, Wenxin Du, Han Xue, Yutong Li, Ruolin Ye, Yan-Feng Wang, Cewu Lu
In this work, we propose a recording system, GarmentTwin, which can track garment poses in dynamic settings such as manipulation.
no code implementations • CVPR 2023 • Bo Pang, Hongchi Xia, Cewu Lu
In this paper, we design the Triangle Constrained Contrast (TriCC) framework tailored for autonomous driving scenes, which learns 3D unsupervised representations through both the multimodal information and the dynamics of temporal sequences.
no code implementations • ICCV 2023 • Yong-Lu Li, Yue Xu, Xinyu Xu, Xiaohan Mao, Yuan YAO, SiQi Liu, Cewu Lu
To support OCL, we build a densely annotated knowledge base including extensive labels for three levels of object concept (category, attribute, affordance), and the causal relations among the three levels.
1 code implementation • 24 Nov 2022 • Yang You, Zhuochen Miao, Kai Xiong, Weiming Wang, Cewu Lu
In contrast, our proposed OneLoc algorithm efficiently finds the object center and bounding box size by a special voting scheme.
2 code implementations • 24 Nov 2022 • Yang You, Wenhao He, Jin Liu, Hongkai Xiong, Weiming Wang, Cewu Lu
We introduce a novel method, CPPF++, designed for sim-to-real pose estimation.
1 code implementation • 14 Nov 2022 • Yong-Lu Li, Hongwei Fan, Zuoyu Qiu, Yiming Dou, Liang Xu, Hao-Shu Fang, Peiyang Guo, Haisheng Su, Dongliang Wang, Wei Wu, Cewu Lu
In daily HOIs, humans often interact with a variety of objects, e.g., holding and touching dozens of household items when cleaning.
7 code implementations • 7 Nov 2022 • Hao-Shu Fang, Jiefeng Li, Hongyang Tang, Chao Xu, Haoyi Zhu, Yuliang Xiu, Yong-Lu Li, Cewu Lu
Accurate whole-body multi-person pose estimation and tracking is an important yet challenging topic in computer vision.
no code implementations • 27 Oct 2022 • Jun Lv, Yunhai Feng, Cheng Zhang, Shuang Zhao, Lin Shao, Cewu Lu
Model-based reinforcement learning (MBRL) is recognized as having the potential to be significantly more sample-efficient than model-free RL.
Deformable Object Manipulation • Model-based Reinforcement Learning • +2
1 code implementation • 23 Oct 2022 • Zhijie Deng, Jiaxin Shi, Hao Zhang, Peng Cui, Cewu Lu, Jun Zhu
Unlike prior spectral methods such as Laplacian Eigenmap that operate in a nonparametric manner, Neural Eigenmap leverages NeuralEF to parametrically model eigenfunctions using a neural network.
1 code implementation • 14 Oct 2022 • Daiheng Gao, Yuliang Xiu, Kailin Li, Lixin Yang, Feng Wang, Peng Zhang, Bang Zhang, Cewu Lu, Ping Tan
A Unity GUI is also provided to generate synthetic hand data with user-defined settings, e.g., pose, camera, background, lighting, textures, and accessories.
1 code implementation • 11 Oct 2022 • Haoyi Zhu, Hao-Shu Fang, Cewu Lu
In this paper, we focus on a rarely discussed but important setting: can we train one model that can represent multiple scenes, with 360° insufficient views and RGB-D images?
1 code implementation • 19 Sep 2022 • Jiefeng Li, Siyuan Bian, Chao Xu, Gang Liu, Gang Yu, Cewu Lu
In this work, we present D&D (Learning Human Dynamics from Dynamic Camera), which leverages the laws of physics to reconstruct 3D human motion from in-the-wild videos with a moving camera.
1 code implementation • 4 Aug 2022 • Yue Xu, Yong-Lu Li, Jiefeng Li, Cewu Lu
Previous methods tackle data imbalance from the viewpoints of data distribution, feature space, model design, etc.
1 code implementation • 28 Jul 2022 • Xiaoqian Wu, Yong-Lu Li, Xinpeng Liu, Junyi Zhang, Yuzhe Wu, Cewu Lu
Though significant progress has been made, interactiveness learning remains a challenging problem in HOI detection: existing methods usually generate redundant negative H-O pair proposals and fail to effectively extract interactive pairs.
Ranked #9 on Human-Object Interaction Detection on V-COCO
1 code implementation • 13 Jul 2022 • Bo Pang, Yifan Zhang, Yaoyi Li, Jia Cai, Cewu Lu
In this paper, we propose a genuine group-level contrastive visual representation learning method whose linear evaluation performance on ImageNet surpasses vanilla supervised learning.
Ranked #38 on Self-Supervised Image Classification
no code implementations • 23 Jun 2022 • Minghao Gou, Haolin Pan, Hao-Shu Fang, Ziyuan Liu, Cewu Lu, Ping Tan
In this paper, we propose a new task that enables and facilitates algorithms to estimate the 6D pose of novel objects during testing.
1 code implementation • CVPR 2022 • Xinpeng Liu, Yong-Lu Li, Xiaoqian Wu, Yu-Wing Tai, Cewu Lu, Chi-Keung Tang
Human-Object Interaction (HOI) detection plays a core role in activity understanding.
1 code implementation • CVPR 2022 • Xinyu Xu, Yong-Lu Li, Cewu Lu
Anticipating future events is an essential feature for intelligent systems and embodied AI.
1 code implementation • CVPR 2022 • Lixin Yang, Kailin Li, Xinyu Zhan, Fei Wu, Anran Xu, Liu Liu, Cewu Lu
We start by collecting 1,800 common household objects and annotating their affordances to construct the first knowledge base: Oak.
1 code implementation • CVPR 2022 • Yifan Zhang, Bo Pang, Cewu Lu
Typical vision backbones manipulate structured features.
1 code implementation • CVPR 2022 • Yang You, Ruoxi Shi, Weiming Wang, Cewu Lu
Drawing inspiration from traditional point pair features (PPFs), in this paper, we design a novel Category-level PPF (CPPF) voting method to achieve accurate, robust and generalizable 9D pose estimation in the wild.
Ranked #8 on 6D Pose Estimation using RGBD on REAL275
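For readers unfamiliar with the traditional point pair features mentioned in the CPPF entry above, the following is a minimal sketch of the classic four-dimensional PPF for two oriented points: the distance between them and three angles involving their normals. It illustrates only the traditional feature, not the CPPF voting method; the function names and the example points are hypothetical.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _angle(u, v):
    """Angle in radians between two non-zero 3D vectors."""
    cos = _dot(u, v) / math.sqrt(_dot(u, u) * _dot(v, v))
    return math.acos(max(-1.0, min(1.0, cos)))  # clamp against rounding


def point_pair_feature(p1, n1, p2, n2):
    """Classic PPF: (distance, angle(n1, d), angle(n2, d), angle(n1, n2))."""
    d = _sub(p2, p1)  # displacement between the two points; must be non-zero
    return (math.sqrt(_dot(d, d)), _angle(n1, d), _angle(n2, d), _angle(n1, n2))


# Hypothetical example: two points one unit apart, both normals pointing up.
feature = point_pair_feature((0, 0, 0), (0, 0, 1), (1, 0, 0), (0, 0, 1))
```

Because the feature uses only a distance and angles, it does not change when both points and normals are rotated or translated together, which is what makes it attractive for pose estimation.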
1 code implementation • 19 Feb 2022 • Xinpeng Liu, Yong-Lu Li, Cewu Lu
To achieve OC-immunity, we propose an OC-immune network that decouples the inputs from OC, extracts OC-immune representations, and leverages uncertainty quantification to generalize to unseen objects.
1 code implementation • 17 Feb 2022 • Hongjie Fang, Hao-Shu Fang, Sheng Xu, Cewu Lu
However, the majority of current grasping algorithms would fail in this case since they heavily rely on the depth image, while ordinary depth sensors usually fail to produce accurate depth information for transparent objects owing to the reflection and refraction of light.
Ranked #1 on Transparent Object Depth Estimation on TransCG
no code implementations • CVPR 2022 • Liu Liu, Wenqiang Xu, Haoyuan Fu, Sucheng Qian, Yang Han, Cewu Lu
To bridge the gap, we present AKB-48: a large-scale Articulated object Knowledge Base which consists of 2,037 real-world 3D articulated object models of 48 categories.
3 code implementations • 14 Feb 2022 • Yong-Lu Li, Xinpeng Liu, Xiaoqian Wu, Yizhuo Li, Zuoyu Qiu, Liang Xu, Yue Xu, Hao-Shu Fang, Cewu Lu
Human activity understanding is of widespread interest in artificial intelligence and spans diverse applications like health care and behavior analysis.
no code implementations • CVPR 2022 • Jianhua Sun, YuXuan Li, Liang Chai, Hao-Shu Fang, Yong-Lu Li, Cewu Lu
The human trajectory prediction task aims to analyze future human movements given their past status, which is a crucial step for many autonomous systems such as self-driving cars and social robots.
no code implementations • 24 Dec 2021 • Sucheng Qian, Liu Liu, Wenqiang Xu, Cewu Lu
It can obtain a satisfactory segmentation result with minimal human clicks (< 10).
no code implementations • 14 Dec 2021 • Han Xue, Liu Liu, Wenqiang Xu, Haoyuan Fu, Cewu Lu
With the full representation of the object shape and joint states, we can address several tasks including category-level object pose estimation and articulated object retrieval.
1 code implementation • 7 Dec 2021 • Shoubin Yu, Zhongyin Zhao, Haoshu Fang, Andong Deng, Haisheng Su, Dongliang Wang, Weihao Gan, Cewu Lu, Wei Wu
Different from pixel-based anomaly detection methods, pose-based methods utilize highly-structured skeleton data, which decreases the computational burden and also avoids the negative impact of background noise.
Anomaly Detection In Surveillance Videos • Optical Flow Estimation • +1
no code implementations • 29 Nov 2021 • Jun Lv, Qiaojun Yu, Lin Shao, Wenhai Liu, Wenqiang Xu, Cewu Lu
We apply our system to perform articulated object manipulation tasks, both in simulation and in the real world.
no code implementations • 21 Nov 2021 • Yang You, Chengkun Li, Yujing Lou, Zhoujun Cheng, Liangwei Li, Lizhuang Ma, Weiming Wang, Cewu Lu
Pixel-level 2D object semantic understanding is an important topic in computer vision and could help machines deeply understand objects (e.g., functionality and affordance) in our daily life.
no code implementations • 28 Oct 2021 • Liang Xu, Cuiling Lan, Wenjun Zeng, Cewu Lu
Skeleton data carries valuable motion information and is widely explored in human action recognition.
1 code implementation • NeurIPS 2021 • Jiefeng Li, Tong Chen, Ruiqi Shi, Yujing Lou, Yong-Lu Li, Cewu Lu
In this work, we propose sampling-argmax, a differentiable training method that imposes implicit constraints to the shape of the probability map by minimizing the expectation of the localization error.
Ranked #165 on 3D Human Pose Estimation on Human3.6M
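To make the phrase "expectation of the localization error" in the sampling-argmax entry above concrete, here is a toy sketch on a 1D discrete probability map. It contrasts the error of the expected location (what a plain soft-argmax loss measures) with the expectation of the error. This is an illustration under simplifying assumptions (a tiny 1D map, exact enumeration, absolute error), not the authors' implementation; all names are hypothetical.

```python
def expected_location(probs):
    """Soft-argmax: the mean index under a discrete probability map."""
    return sum(i * p for i, p in enumerate(probs))


def error_of_expectation(probs, target):
    """Absolute error of the soft-argmax output; blind to the map's shape."""
    return abs(expected_location(probs) - target)


def expectation_of_error(probs, target):
    """Mean absolute error over locations drawn from the map."""
    return sum(p * abs(i - target) for i, p in enumerate(probs))


target = 2
peaked = [0.0, 0.0, 1.0, 0.0, 0.0]   # all mass on the target
bimodal = [0.5, 0.0, 0.0, 0.0, 0.5]  # mass far from the target, mean still 2

peaked_scores = (error_of_expectation(peaked, target),
                 expectation_of_error(peaked, target))
bimodal_scores = (error_of_expectation(bimodal, target),
                  expectation_of_error(bimodal, target))
```

Both maps have zero error of expectation, but only the peaked map has zero expectation of error, so minimizing the latter constrains the shape of the map as well as its mean.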
1 code implementation • 9 Oct 2021 • Yong-Lu Li, Yue Xu, Xinyu Xu, Xiaohan Mao, Cewu Lu
To model the compositional nature of these concepts, it is a good choice to learn them as transformations, e.g., coupling and decoupling.
2 code implementations • CVPR 2022 • Kailin Li, Lixin Yang, Xinyu Zhan, Jun Lv, Wenqiang Xu, Jiefeng Li, Cewu Lu
In contrast, data synthesis can easily ensure those diversities separately.
Ranked #3 on hand-object pose on HO-3D (using extra training data)
3 code implementations • ICCV 2021 • Jiefeng Li, Siyuan Bian, Ailing Zeng, Can Wang, Bo Pang, Wentao Liu, Cewu Lu
In light of this, we propose a novel regression paradigm with Residual Log-likelihood Estimation (RLE) to capture the underlying output distribution.
Ranked #63 on 3D Human Pose Estimation on Human3.6M
no code implementations • 7 Jun 2021 • Tutian Tang, Wenqiang Xu, Ruolin Ye, Yan-Feng Wang, Cewu Lu
In addition, we specifically select a subset from COCO val2017 named COCO ContourHard-val to further demonstrate the contour quality improvements.
no code implementations • 7 May 2021 • Liu Liu, Han Xue, Wenqiang Xu, Haoyuan Fu, Cewu Lu
This setting allows varied kinematic structures within a semantic category, and multiple instances to co-exist in an observation of the real world.
no code implementations • ICCV 2021 • Ruolin Ye, Wenqiang Xu, Zhendong Xue, Tutian Tang, Yanfeng Wang, Cewu Lu
Besides, we also report the hand and object pose errors with existing baselines and show that the dataset can serve as the video demonstrations for robot imitation learning on the handover task.
no code implementations • 21 Apr 2021 • Yunyan Hong, Ailing Zeng, Min Li, Cewu Lu, Li Jiang, Qiang Xu
Video action recognition (VAR) is a primary task of video understanding, and untrimmed videos are more common in real-life scenes.
no code implementations • 23 Mar 2021 • Hanwen Cao, Hao-Shu Fang, Wenhai Liu, Cewu Lu
Meanwhile, we propose a method to predict numerous suction poses from an RGB-D image of a cluttered scene and demonstrate our superiority against several previous methods.
1 code implementation • CVPR 2021 • Bo Pang, Gao Peng, Yizhuo Li, Cewu Lu
This progressive training (PGT) method is able to train long videos end-to-end with limited resources and ensures the effective transmission of information.
1 code implementation • CVPR 2021 • Ruoxi Shi, Zhengrong Xue, Yang You, Cewu Lu
In this paper, we propose an unsupervised aligned keypoint detector, Skeleton Merger, which utilizes skeletons to reconstruct objects.
1 code implementation • ICCV 2021 • Jianhua Sun, YuXuan Li, Hao-Shu Fang, Cewu Lu
Multimodal prediction results are essential for the trajectory prediction task as there is no single correct answer for the future.
1 code implementation • 3 Mar 2021 • Minghao Gou, Hao-Shu Fang, Zhanda Zhu, Sheng Xu, Chenxi Wang, Cewu Lu
In the first stage, an encoder-decoder-like convolutional neural network, Angle-View Net (AVN), is proposed to predict the SO(3) orientation of the gripper at every location of the image.
2 code implementations • 24 Feb 2021 • Yang You, Yujing Lou, Ruoxi Shi, Qi Liu, Yu-Wing Tai, Lizhuang Ma, Weiming Wang, Cewu Lu
Spherical Voxel Convolution and Point Re-sampling are proposed to extract rotation invariant features for each point.
2 code implementations • 18 Feb 2021 • Jun Lv, Wenqiang Xu, Lixin Yang, Sucheng Qian, Chongzhao Mao, Cewu Lu
3D hand pose estimation and shape recovery are challenging tasks in computer vision.
1 code implementation • 25 Jan 2021 • Yong-Lu Li, Xinpeng Liu, Xiaoqian Wu, Xijie Huang, Liang Xu, Cewu Lu
Human-Object Interaction (HOI) detection is an important problem to understand how humans interact with objects.
Ranked #28 on Human-Object Interaction Detection on V-COCO
1 code implementation • ICCV 2021 • Chenxi Wang, Hao-Shu Fang, Minghao Gou, Hongjie Fang, Jin Gao, Cewu Lu
To quickly detect graspness in practice, we develop a neural network named graspness model to approximate the searching process.
Ranked #3 on Robotic Grasping on GraspNet-1Billion
no code implementations • 14 Dec 2020 • Bo Pang, Yizhuo Li, Jiefeng Li, Muchen Li, Hanwen Cao, Cewu Lu
Such spatial and attention features are nested deeply; therefore, the proposed framework works in a mixed top-down and bottom-up manner.
1 code implementation • ICCV 2021 • Lixin Yang, Xinyu Zhan, Kailin Li, Wenqiang Xu, Jiefeng Li, Cewu Lu
In this paper, we present an explicit contact representation namely Contact Potential Field (CPF), and a learning-fitting hybrid framework namely MIHO to Modeling the Interaction of Hand and Object.
1 code implementation • 2 Dec 2020 • Tutian Tang, Wenqiang Xu, Ruolin Ye, Lixin Yang, Cewu Lu
First, it learns a dictionary from a large collection of shape datasets, so that any shape can be decomposed into a linear combination of dictionary entries.
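As a rough illustration of decomposing a shape into a linear combination over a dictionary, the sketch below finds least-squares coefficients for a shape vector over two dictionary atoms by solving the 2x2 normal equations. The atoms and the shape vector are made-up toy data, and the fixed two-atom dictionary is an assumption for brevity; the entry above does not specify how the dictionary is learned or how many atoms it has.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def decompose(shape, atom_a, atom_b):
    """Least-squares (ca, cb) such that shape ~ ca * atom_a + cb * atom_b."""
    aa, ab, bb = dot(atom_a, atom_a), dot(atom_a, atom_b), dot(atom_b, atom_b)
    sa, sb = dot(shape, atom_a), dot(shape, atom_b)
    det = aa * bb - ab * ab  # determinant of the 2x2 Gram matrix
    if abs(det) < 1e-12:
        raise ValueError("atoms are linearly dependent")
    # Cramer's rule on the normal equations.
    return (sa * bb - sb * ab) / det, (sb * aa - sa * ab) / det


def reconstruct(coeffs, atom_a, atom_b):
    ca, cb = coeffs
    return tuple(ca * x + cb * y for x, y in zip(atom_a, atom_b))


# Hypothetical toy dictionary and a shape built as 2 * atom_a + 3 * atom_b.
atom_a = (1.0, 0.0, 1.0, 0.0)
atom_b = (0.0, 1.0, 0.0, 1.0)
shape = (2.0, 3.0, 2.0, 3.0)
coeffs = decompose(shape, atom_a, atom_b)
```

A shape that lies in the span of the atoms is recovered exactly by `reconstruct`; otherwise the result is its closest approximation in the least-squares sense.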
3 code implementations • CVPR 2021 • Jiefeng Li, Chao Xu, Zhicun Chen, Siyuan Bian, Lixin Yang, Cewu Lu
We show that HybrIK preserves both the accuracy of 3D pose and the realistic body structure of the parametric human model, leading to a pixel-aligned 3D body mesh and a more accurate 3D pose than the pure 3D keypoint estimation methods.
Ranked #3 on 3D Human Pose Estimation on EMDB
1 code implementation • CVPR 2022 • Yang You, Zelin Ye, Yujing Lou, Chengkun Li, Yong-Lu Li, Lizhuang Ma, Weiming Wang, Cewu Lu
In this work, we disentangle the direct offset into Local Canonical Coordinates (LCC), box scales and box orientations.
1 code implementation • CVPR 2022 • Yang You, Wenhai Liu, Yanjie Ze, Yong-Lu Li, Weiming Wang, Cewu Lu
Keypoint detection is an essential component for object registration and alignment.
2 code implementations • NeurIPS 2020 • Yong-Lu Li, Xinpeng Liu, Xiaoqian Wu, Yizhuo Li, Cewu Lu
Meanwhile, isolated human and object can also be integrated into coherent HOI again.
Ranked #20 on Human-Object Interaction Detection on V-COCO
no code implementations • 2 Oct 2020 • Yichen Xie, Hao-Shu Fang, Dian Shao, Yong-Lu Li, Cewu Lu
Human-object interaction (HOI) detection requires a large amount of annotated data.
Ranked #68 on Domain Generalization on PACS
1 code implementation • 2 Oct 2020 • Hao-Shu Fang, Yichen Xie, Dian Shao, Cewu Lu
On the other hand, existing one-stage methods mainly focus on the union regions of interactions, which introduce unnecessary visual information as disturbances to HOI detection.
Ranked #15 on Human-Object Interaction Detection on V-COCO
1 code implementation • 12 Aug 2020 • Hanwen Cao, Yongyi Lu, Cewu Lu, Bo Pang, Gongshen Liu, Alan Yuille
In this paper, we further improve spatio-temporal point cloud feature learning with a flexible module called ASAP considering both attention and structure information across frames, which we find to be two important factors for successful segmentation in dynamic point clouds.
1 code implementation • 12 Aug 2020 • Lixin Yang, Jiasen Li, Wenqiang Xu, Yiqun Diao, Cewu Lu
Inside each stage, BiHand adopts a novel bisecting design which allows the networks to encapsulate two closely related pieces of information (e.g., 2D keypoints and silhouette in the 2D seeding stage, 3D joints and depth map in the 3D lifting stage, joint rotations and shape parameters in the mesh generation stage) in a single forward pass.
no code implementations • ECCV 2020 • Jiefeng Li, Can Wang, Wentao Liu, Chen Qian, Cewu Lu
The HMOR encodes interaction information as the ordinal relations of depths and angles hierarchically, which captures the body-part and joint level semantics and maintains global consistency at the same time.
3D Multi-Person Pose Estimation (absolute) • 3D Multi-Person Pose Estimation (root-relative) • +2
1 code implementation • ICCV 2019 • Xinqi Zhu, Chang Xu, Langwen Hui, Cewu Lu, DaCheng Tao
Specifically, we show how two-layer subnets in CNNs can be converted to temporal bilinear modules by adding an auxiliary-branch.
1 code implementation • CVPR 2020 • Bo Pang, Yizhuo Li, Yifan Zhang, Muchen Li, Cewu Lu
As deep learning brings excellent performances to object detection algorithms, Tracking by Detection (TBD) has become the mainstream tracking framework.
1 code implementation • 30 May 2020 • Bo Pang, Kaiwen Zha, Hanwen Cao, Jiajun Tang, Minghui Yu, Cewu Lu
Understanding sequential information is a fundamental task for artificial intelligence.
no code implementations • 5 May 2020 • Dario Fuoli, Zhiwu Huang, Martin Danelljan, Radu Timofte, Hua Wang, Longcun Jin, Dewei Su, Jing Liu, Jaehoon Lee, Michal Kudelski, Lukasz Bala, Dmitry Hrybov, Marcin Mozejko, Muchen Li, Si-Yao Li, Bo Pang, Cewu Lu, Chao Li, Dongliang He, Fu Li, Shilei Wen
For track 2, some existing methods are evaluated, showing promising solutions to the weakly-supervised video quality mapping problem.
1 code implementation • 28 Apr 2020 • Xiangyu Chen, Zelin Ye, Jiankai Sun, Yuda Fan, Fang Hu, Chenxi Wang, Cewu Lu
Grasping in cluttered scenes is challenging for robot vision systems, as detection accuracy can be hindered by partial occlusion of objects.
no code implementations • CVPR 2020 • Jianhua Sun, Qinhong Jiang, Cewu Lu
Social interaction is an important topic in human trajectory prediction to generate plausible paths.
1 code implementation • 20 Apr 2020 • Yang You, Chengkun Li, Yujing Lou, Zhoujun Cheng, Lizhuang Ma, Cewu Lu, Weiming Wang
Visual semantic correspondence is an important topic in computer vision and could help machines understand objects in our daily life.
1 code implementation • CVPR 2020 • Yong-Lu Li, Xinpeng Liu, Han Lu, Shiyi Wang, Junqi Liu, Jiefeng Li, Cewu Lu
In light of these, we propose a detailed 2D-3D joint representation learning method.
Ranked #1 on Human-Object Interaction Detection on Ambiguious-HOI
2 code implementations • ECCV 2020 • Jiajun Tang, Jin Xia, Xinzhi Mu, Bo Pang, Cewu Lu
We propose the Asynchronous Interaction Aggregation network (AIA) that leverages different interactions to boost action detection.
2 code implementations • CVPR 2020 • Yong-Lu Li, Liang Xu, Xinpeng Liu, Xijie Huang, Yue Xu, Shiyi Wang, Hao-Shu Fang, Ze Ma, Mingyang Chen, Cewu Lu
In light of this, we propose a new path: infer human part states first and then reason out the activities based on part-level semantics.
Ranked #3 on Human-Object Interaction Detection on HICO
1 code implementation • CVPR 2020 • Yong-Lu Li, Yue Xu, Xiaohan Mao, Cewu Lu
To model the compositional nature of these general concepts, it is a good choice to learn them through transformations, such as coupling and decoupling.
Ranked #1 on Compositional Zero-Shot Learning on MIT-States (Top-1 accuracy % metric)
1 code implementation • CVPR 2020 • Yang You, Yujing Lou, Chengkun Li, Zhoujun Cheng, Liangwei Li, Lizhuang Ma, Weiming Wang, Cewu Lu
Detecting 3D object keypoints is of great interest to the areas of both graphics and computer vision.
no code implementations • 12 Feb 2020 • Dong Wang, Feng Zhou, Zheng Yan, Guang Yao, Zongxuan Liu, Wennan Ma, Cewu Lu
Our model builds upon a variational encoder which transforms the input video into a latent feature space and a Luenberger-type observer which captures the dynamic evolution of the latent features.
no code implementations • 31 Dec 2019 • Hao-Shu Fang, Chenxi Wang, Minghao Gou, Cewu Lu
Object grasping is critical for many applications and is also a challenging computer vision problem.
1 code implementation • ECCV 2020 • Yujing Lou, Yang You, Chengkun Li, Zhoujun Cheng, Liangwei Li, Lizhuang Ma, Weiming Wang, Cewu Lu
Semantic understanding of 3D objects is crucial in many applications such as object manipulation.
no code implementations • 5 Dec 2019 • Zelin Ye, Yan Hao, Liang Xu, Rui Zhu, Cewu Lu
Further ablation study also demonstrates the effectiveness of our grouping predictor and regret mechanism.
no code implementations • 30 Nov 2019 • Junfeng Ding, Chen Wang, Cewu Lu
We present learning-based force-torque dynamics to achieve model-based control for the contact-rich peg-in-hole task using force-only inputs.
1 code implementation • 25 Nov 2019 • Chaoqin Huang, Fei Ye, Jinkun Cao, Maosen Li, Ya zhang, Cewu Lu
We here propose to break this equivalence by erasing selected attributes from the original data and reformulate it as a restoration task, where the normal and the anomalous data are expected to be distinguishable based on restoration errors.
Ranked #21 on Anomaly Detection on One-class CIFAR-10
2 code implementations • 23 Oct 2019 • Chen Wang, Roberto Martín-Martín, Danfei Xu, Jun Lv, Cewu Lu, Li Fei-Fei, Silvio Savarese, Yuke Zhu
We present 6-PACK, a deep learning approach to category-level 6D object pose tracking on RGB-D data.
Ranked #1 on 6D Pose Estimation using RGBD on REAL275 (Rerr metric)
no code implementations • 16 Oct 2019 • Wenqiang Xu, Yanjun Fu, Yuchen Luo, Chang Liu, Cewu Lu
The fine-grained recognition task deals with the sub-category classification problem, which is important for real-world applications.
no code implementations • 12 Oct 2019 • Yao Xiao, Dan Meng, Cewu Lu, Chi-Keung Tang
The long-standing challenges for offline handwritten Chinese character recognition (HCCR) are twofold: Chinese characters can be very diverse and complicated while similarly looking, and cursive handwriting (due to increased writing speed and infrequent pen lifting) makes strokes and even characters connected together in a flowing manner.
3 code implementations • ICCV 2019 • Hao-Shu Fang, Jianhua Sun, Runzhong Wang, Minghao Gou, Yong-Lu Li, Cewu Lu
With the guidance of such a map, we boost the performance of R101-Mask R-CNN on instance segmentation from 35.7 mAP to 37.9 mAP without modifying the backbone or network structure.
Ranked #78 on Instance Segmentation on COCO test-dev
no code implementations • ICCV 2019 • Jinkun Cao, Hongyang Tang, Hao-Shu Fang, Xiaoyong Shen, Cewu Lu, Yu-Wing Tai
Therefore, the easily available human pose dataset, which is of a much larger scale than our labeled animal dataset, provides important prior knowledge to boost performance on animal pose estimation.
no code implementations • 13 Aug 2019 • Jin Xia, Jiajun Tang, Cewu Lu
We present our three-branch solutions for the International Challenge on Activity Recognition at CVPR 2019.
1 code implementation • ICCV 2019 • Wenqiang Xu, Haiyang Wang, Fubo Qi, Cewu Lu
In this paper, we propose a novel top-down instance segmentation framework based on explicit shape encoding, named ESE-Seg.
Ranked #3 on Semantic Contour Prediction on Sbd val
4 code implementations • 13 Apr 2019 • Yong-Lu Li, Liang Xu, Xinpeng Liu, Xijie Huang, Yue Xu, Mingyang Chen, Ze Ma, Shiyi Wang, Hao-Shu Fang, Cewu Lu
To address these and promote activity understanding, we build a large-scale Human Activity Knowledge Engine (HAKE) based on human body part states.
Ranked #2 on Human-Object Interaction Detection on HICO (using extra training data)
1 code implementation • 24 Jan 2019 • Yang You, Liangwei Li, Baisong Guo, Weiming Wang, Cewu Lu
Deep reinforcement learning (DRL) has gained a lot of attention in recent years, and has been proven to be able to play Atari games and Go at or above human levels.
8 code implementations • CVPR 2019 • Chen Wang, Danfei Xu, Yuke Zhu, Roberto Martín-Martín, Cewu Lu, Li Fei-Fei, Silvio Savarese
A key technical challenge in performing 6D object pose estimation from RGB-D images is to fully leverage the two complementary data sources.
Ranked #4 on 6D Pose Estimation on LineMOD
6 code implementations • 4 Dec 2018 • Zelin Zhao, Gao Peng, Haoyu Wang, Hao-Shu Fang, Chengkun Li, Cewu Lu
In this paper, we present an accurate yet effective solution for 6D pose estimation from an RGB image.
Ranked #17 on 6D Pose Estimation using RGB on LineMOD
3 code implementations • CVPR 2019 • Jiefeng Li, Can Wang, Hao Zhu, Yihuan Mao, Hao-Shu Fang, Cewu Lu
In this paper, we propose a novel and efficient method to tackle the problem of pose estimation in the crowd and a new dataset to better evaluate algorithms.
Ranked #6 on Multi-Person Pose Estimation on OCHuman
1 code implementation • CVPR 2019 • Bo Pang, Kaiwen Zha, Hanwen Cao, Chen Shi, Cewu Lu
There are mainly two novel designs in our deep RNN framework: one is a new RNN module called Context Bridge Module (CBM) which splits the information flowing along the sequence (temporal direction) and along depth (spatial representation direction), making it easier to train when building deep by balancing these two directions; the other is the Overlap Coherence Training Scheme that reduces the training complexity for long visual sequential tasks on account of the limitation of computing resources.
1 code implementation • 23 Nov 2018 • Yang You, Yujing Lou, Qi Liu, Yu-Wing Tai, Lizhuang Ma, Cewu Lu, Weiming Wang
Point cloud analysis without pose priors is very challenging in real applications, as the orientations of point clouds are often unknown.
3 code implementations • CVPR 2019 • Yong-Lu Li, Siyuan Zhou, Xijie Huang, Liang Xu, Ze Ma, Hao-Shu Fang, Yan-Feng Wang, Cewu Lu
On account of the generalization of interactiveness, the interactiveness network is a transferable knowledge learner and can cooperate with any HOI detection model to achieve desirable results.
Ranked #29 on Human-Object Interaction Detection on V-COCO
no code implementations • 25 Aug 2018 • He Huang, Yujing Shen, Jiankai Sun, Cewu Lu
Indoor navigation aims at performing navigation within buildings.
1 code implementation • ECCV 2018 • Hao-Shu Fang, Jinkun Cao, Yu-Wing Tai, Cewu Lu
We propose a new pairwise body-part attention model which can learn to focus on crucial parts, and their correlations for HOI recognition.
Ranked #5 on Human-Object Interaction Detection on HICO
2 code implementations • 27 Jul 2018 • Zhenghao Peng, Xuyang Chen, Chengwen Xu, Naifeng Jing, Xiaoyao Liang, Cewu Lu, Li Jiang
To guarantee the approximation quality, existing works deploy two neural networks (NNs), e.g., an approximator and a predictor.
4 code implementations • 2 Jul 2018 • Mingyang Jiang, Yiran Wu, Tianqi Zhao, Zelin Zhao, Cewu Lu
Recently, 3D understanding research sheds light on extracting features from point clouds directly, which requires effective shape pattern description of point clouds.
no code implementations • CVPR 2018 • Yiping Chen, Jingkang Wang, Jonathan Li, Cewu Lu, Zhipeng Luo, Han Xue, Cheng Wang
Learning autonomous-driving policies is one of the most challenging but promising tasks for computer vision.
no code implementations • CVPR 2018 • Shuqin Xie, Zitian Chen, Chao Xu, Cewu Lu
We propose a training algorithm for this framework to address the different training demands of agent and environment.
1 code implementation • CVPR 2018 • Hao-Shu Fang, Guansong Lu, Xiaolin Fang, Jianwen Xie, Yu-Wing Tai, Cewu Lu
In this paper, we present a novel method to generate synthetic human part segmentation data using easily-obtained human keypoint annotations.
Ranked #4 on Human Part Segmentation on PASCAL-Part (using extra training data)
no code implementations • CVPR 2018 • Bowen Pan, Wuwei Lin, Xiaolin Fang, Chaoqin Huang, Bolei Zhou, Cewu Lu
Deep convolutional neural networks (CNNs) have made impressive progress in many video recognition tasks such as video pose estimation and video object detection.
no code implementations • 4 Feb 2018 • Bo Pang, Kaiwen Zha, Cewu Lu
We introduce the first benchmark for a new problem --- recognizing human action adverbs (HAA): "Adverbs Describing Human Actions" (ADHA).
1 code implementation • 3 Feb 2018 • Yuliang Xiu, Jiefeng Li, Haoyu Wang, Yinghong Fang, Cewu Lu
Multi-person articulated pose tracking in unconstrained videos is an important yet challenging problem.
Ranked #9 on Pose Tracking on PoseTrack2017 (using extra training data)
no code implementations • 1 Feb 2018 • Zheng Wu, Ruiheng Chang, Jiaxu Ma, Cewu Lu, Chi-Keung Tang
We propose a novel approach for instance segmentation given an image of homogeneous object cluster (HOC).
1 code implementation • ECCV 2018 • Wenqiang Xu, Yonglu Li, Cewu Lu
Instance segmentation is a problem of significance in computer vision.
no code implementations • ICLR 2018 • Chen Wang, Xiangyu Chen, Zelin Ye, Jialu Wang, Ziruo Cai, Shixiang Gu, Cewu Lu
However, tasks with sparse rewards remain challenging when the state space is large.
no code implementations • ICCV 2017 • Yongyi Lu, Cewu Lu, Chi-Keung Tang
Video object detection is a fundamental tool for many applications.
6 code implementations • 13 Apr 2017 • Xinlei Pan, Yurong You, Ziyan Wang, Cewu Lu
To our knowledge, this is the first successful case of driving policy trained by reinforcement learning that can adapt to real world driving data.
no code implementations • CVPR 2018 • Cewu Lu, Hao Su, Yongyi Lu, Li Yi, Chi-Keung Tang, Leonidas Guibas
Important high-level vision tasks such as human-object interaction, image captioning and robotic manipulation require rich semantic descriptions of objects at part level.
15 code implementations • ICCV 2017 • Hao-Shu Fang, Shuqin Xie, Yu-Wing Tai, Cewu Lu
In this paper, we propose a novel regional multi-person pose estimation (RMPE) framework to facilitate pose estimation in the presence of inaccurate human bounding boxes.
Ranked #1 on Pose Estimation on UAV-Human
no code implementations • 31 Jul 2016 • Cewu Lu, Ranjay Krishna, Michael Bernstein, Li Fei-Fei
We improve on prior work by leveraging language priors from semantic word embeddings to finetune the likelihood of a predicted relationship.
Ranked #2 on Scene Graph Generation on VRD
no code implementations • ICCV 2015 • Cewu Lu, Shu Liu, Jiaya Jia, Chi-Keung Tang
Closed contour is an important objectness indicator.
no code implementations • ICCV 2015 • Shu Liu, Cewu Lu, Jiaya Jia
Regions-with-convolutional-neural-network (RCNN) is now a commonly employed object detection pipeline.
no code implementations • ICCV 2015 • Cewu Lu, Yongyi Lu, Hao Chen, Chi-Keung Tang
In the testing phase, sliding CNN models are applied, producing a set of response maps that can be effectively filtered by the learned co-presence prior to output the final bounding boxes for localizing an object.
no code implementations • CVPR 2015 • Yao Xiao, Cewu Lu, Efstratios Tsougenis, Yongyi Lu, Chi-Keung Tang
Distance metric plays a key role in grouping superpixels to produce object proposals for object detection.
no code implementations • CVPR 2015 • Di Lin, Xiaoyong Shen, Cewu Lu, Jiaya Jia
Our major contribution is to propose a valve linkage function (VLF) for back-propagation chaining and form our deep localization, alignment and classification (LAC) system.
no code implementations • 22 Sep 2014 • Cewu Lu, Hao Chen, Qifeng Chen, Hei Law, Yao Xiao, Chi-Keung Tang
We participated in the object detection track of ILSVRC 2014 and placed fourth among the 38 teams.
no code implementations • CVPR 2014 • Cewu Lu, Jiaya Jia, Chi-Keung Tang
We propose a binary range-sample feature in depth.
no code implementations • CVPR 2014 • Shuai Yi, Xiaogang Wang, Cewu Lu, Jiaya Jia
We tackle stationary crowd analysis in this paper, which is as important as modeling mobile groups in crowd scenes and finds many applications in surveillance.
no code implementations • CVPR 2014 • Di Lin, Cewu Lu, Renjie Liao, Jiaya Jia
We address the false response influence problem when learning and applying discriminative parts to construct the mid-level representation in scene classification.
no code implementations • CVPR 2014 • Cewu Lu, Di Lin, Jiaya Jia, Chi-Keung Tang
Given a single outdoor image, this paper proposes a collaborative learning approach for labeling it as either sunny or cloudy.
no code implementations • CVPR 2013 • Cewu Lu, Jiaping Shi, Jiaya Jia
Online dictionary learning is particularly useful for processing large-scale and dynamic data in computer vision.