Search Results for author: Cewu Lu

Found 157 papers, 85 papers with code

MS-MANO: Enabling Hand Pose Tracking with Biomechanical Constraints

no code implementations 16 Apr 2024 Pengfei Xie, Wenqiang Xu, Tutian Tang, Zhenjun Yu, Cewu Lu

To address this, we integrate a musculoskeletal system with a learnable parametric hand model, MANO, to create a new model, MS-MANO.

Pose Tracking

SemGrasp: Semantic Grasp Generation via Language Aligned Discretization

no code implementations 4 Apr 2024 Kailin Li, Jingbo Wang, Lixin Yang, Cewu Lu, Bo Dai

We introduce a discrete representation that aligns the grasp space with semantic space, enabling the generation of grasp postures in accordance with language instructions.

Grasp Generation Language Modelling +2

RH20T-P: A Primitive-Level Robotic Dataset Towards Composable Generalization Agents

no code implementations 28 Mar 2024 Zeren Chen, Zhelun Shi, Xiaoya Lu, Lehan He, Sucheng Qian, Hao-Shu Fang, Zhenfei Yin, Wanli Ouyang, Jing Shao, Yu Qiao, Cewu Lu, Lu Sheng

The ultimate goal of robotic learning is to acquire a comprehensive and generalizable robotic system capable of performing both seen skills within the training distribution and unseen skills in novel environments.

Motion Planning

OAKINK2: A Dataset of Bimanual Hands-Object Manipulation in Complex Task Completion

no code implementations 28 Mar 2024 Xinyu Zhan, Lixin Yang, Yifei Zhao, Kangrui Mao, Hanlin Xu, Zenan Lin, Kailin Li, Cewu Lu

Based on the 3-level abstraction of OAKINK2, we explore a task-oriented framework for Complex Task Completion (CTC).

Motion Synthesis Object

RPMArt: Towards Robust Perception and Manipulation for Articulated Objects

no code implementations 24 Mar 2024 JunBo Wang, Wenhai Liu, Qiaojun Yu, Yang You, Liu Liu, Weiming Wang, Cewu Lu

Our primary contribution is a Robust Articulation Network (RoArtNet) that is able to predict both joint parameters and affordable points robustly by local feature learning and point tuple voting.

ManiPose: A Comprehensive Benchmark for Pose-aware Object Manipulation in Robotics

no code implementations 20 Mar 2024 Qiaojun Yu, Ce Hao, JunBo Wang, Wenhai Liu, Liu Liu, Yao Mu, Yang You, Hengxu Yan, Cewu Lu

Robotic manipulation in everyday scenarios, especially in unstructured environments, requires skills in pose-aware object manipulation (POM), which adapts robots' grasping and handling according to an object's 6D pose.

Motion Planning Pose Estimation

ShapeBoost: Boosting Human Shape Estimation with Part-Based Parameterization and Clothing-Preserving Augmentation

no code implementations 2 Mar 2024 Siyuan Bian, Jiefeng Li, Jiasheng Tang, Cewu Lu

Accurate human shape recovery from a monocular RGB image is a challenging task because humans come in different shapes and sizes and wear different clothes.

Data Augmentation

EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI

1 code implementation 26 Dec 2023 Tai Wang, Xiaohan Mao, Chenming Zhu, Runsen Xu, Ruiyuan Lyu, Peisen Li, Xiao Chen, Wenwei Zhang, Kai Chen, Tianfan Xue, Xihui Liu, Cewu Lu, Dahua Lin, Jiangmiao Pang

In the realm of computer vision and robotics, embodied agents are expected to explore their environment and carry out human instructions.

Scene Understanding

PACE: A Large-Scale Dataset with Pose Annotations in Cluttered Environments

1 code implementation 23 Dec 2023 Yang You, Kai Xiong, Zhening Yang, Zhengxiang Huang, Junwei Zhou, Ruoxi Shi, Zhou Fang, Adam W. Harley, Leonidas Guibas, Cewu Lu

We introduce PACE (Pose Annotations in Cluttered Environments), a large-scale benchmark designed to advance the development and evaluation of pose estimation methods in cluttered scenarios.

Pose Estimation Pose Tracking

Primitive-based 3D Human-Object Interaction Modelling and Programming

no code implementations 17 Dec 2023 SiQi Liu, Yong-Lu Li, Zhou Fang, Xinpeng Liu, Yang You, Cewu Lu

To explore an effective embedding of HAOI for the machine, we build a new benchmark on 3D HAOI consisting of primitives together with their images and propose a task requiring machines to recover 3D HAOI using primitives from images.

3D Reconstruction Human-Object Interaction Detection +2

Revisit Human-Scene Interaction via Space Occupancy

no code implementations 5 Dec 2023 Xinpeng Liu, Haowen Hou, Yanchao Yang, Yong-Lu Li, Cewu Lu

Human-scene Interaction (HSI) generation is a challenging task and crucial for various downstream tasks.

Dancing with Still Images: Video Distillation via Static-Dynamic Disentanglement

1 code implementation 1 Dec 2023 Ziyu Wang, Yue Xu, Cewu Lu, Yong-Lu Li

It first distills the videos into still images as static memory and then compensates the dynamic and motion information with a learnable dynamic memory block.


Symbol-LLM: Leverage Language Models for Symbolic System in Visual Human Activity Reasoning

no code implementations NeurIPS 2023 Xiaoqian Wu, Yong-Lu Li, Jianhua Sun, Cewu Lu

One possible path of activity reasoning is building a symbolic system composed of symbols and rules, where one rule connects multiple symbols, implying human knowledge and reasoning abilities.

RFTrans: Leveraging Refractive Flow of Transparent Objects for Surface Normal Estimation and Manipulation

no code implementations 21 Nov 2023 Tutian Tang, Jiyu Liu, Jieyi Zhang, Haoyuan Fu, Wenqiang Xu, Cewu Lu

By leveraging refractive flow as an intermediate representation, the proposed method circumvents the drawbacks of directly predicting the geometry (e.g., surface normal) from images and helps bridge the sim-to-real gap.

Surface Normal Estimation Transparent objects

Bridging the Gap between Human Motion and Action Semantics via Kinematic Phrases

no code implementations 6 Oct 2023 Xinpeng Liu, Yong-Lu Li, Ailing Zeng, Zizheng Zhou, Yang You, Cewu Lu

The goal of motion understanding is to establish a reliable mapping between motion and action semantics, while it is a challenging many-to-many problem.

GAMMA: Generalizable Articulation Modeling and Manipulation for Articulated Objects

1 code implementation 28 Sep 2023 Qiaojun Yu, JunBo Wang, Wenhai Liu, Ce Hao, Liu Liu, Lin Shao, Weiming Wang, Cewu Lu

Results show that GAMMA significantly outperforms SOTA articulation modeling and manipulation algorithms in unseen and cross-category articulated objects.

Manner Of Articulation Detection Robot Manipulation +1

EgoPCA: A New Framework for Egocentric Hand-Object Interaction Understanding

no code implementations ICCV 2023 Yue Xu, Yong-Lu Li, Zhemin Huang, Michael Xu Liu, Cewu Lu, Yu-Wing Tai, Chi-Keung Tang

With the surge in attention to Egocentric Hand-Object Interaction (Ego-HOI), large-scale datasets such as Ego4D and EPIC-KITCHENS have been proposed.

Action Recognition Temporal Action Localization

CHORD: Category-level Hand-held Object Reconstruction via Shape Deformation

no code implementations ICCV 2023 Kailin Li, Lixin Yang, Haoyu Zhen, Zenan Lin, Xinyu Zhan, Licheng Zhong, Jian Xu, Kejian Wu, Cewu Lu

This can be attributed to the fact that humans have mastered the shape prior of the 'mug' category, and can quickly establish the corresponding relations between different mug instances and the prior, such as where the rim and handle are located.

Object Reconstruction

Color-NeuS: Reconstructing Neural Implicit Surfaces with Color

1 code implementation 14 Aug 2023 Licheng Zhong, Lixin Yang, Kailin Li, Haoyu Zhen, Mei Han, Cewu Lu

Mesh is extracted from the signed distance function (SDF) network for the surface, and color for each surface vertex is drawn from the global color network.

Distill Gold from Massive Ores: Efficient Dataset Distillation via Critical Samples Selection

1 code implementation 28 May 2023 Yue Xu, Yong-Lu Li, Kaitong Cui, Ziyu Wang, Cewu Lu, Yu-Wing Tai, Chi-Keung Tang

Our method consistently enhances the distillation algorithms, even on much larger-scale and more heterogeneous datasets, e.g., ImageNet-1K and Kinetics-400.

NIKI: Neural Inverse Kinematics with Invertible Neural Networks for 3D Human Pose and Shape Estimation

1 code implementation CVPR 2023 Jiefeng Li, Siyuan Bian, Qi Liu, Jiasheng Tang, Fan Wang, Cewu Lu

In this work, we present NIKI (Neural Inverse Kinematics with Invertible Neural Network), which models bi-directional errors to improve the robustness to occlusions and obtain pixel-aligned accuracy.

3D human pose and shape estimation

HybrIK-X: Hybrid Analytical-Neural Inverse Kinematics for Whole-body Mesh Recovery

1 code implementation 12 Apr 2023 Jiefeng Li, Siyuan Bian, Chao Xu, Zhicun Chen, Lixin Yang, Cewu Lu

To address these issues, this paper presents a novel hybrid inverse kinematics solution, HybrIK, that integrates the merits of 3D keypoint estimation and body mesh recovery in a unified framework.

3D Human Pose Estimation 3D Human Reconstruction +1

POEM: Reconstructing Hand in a Point Embedded Multi-view Stereo

1 code implementation CVPR 2023 Lixin Yang, Jian Xu, Licheng Zhong, Xinyu Zhan, Zhicheng Wang, Kejian Wu, Cewu Lu

Enabling neural networks to capture 3D geometry-aware features is essential in multi-view-based vision tasks.

From Isolated Islands to Pangea: Unifying Semantic Space for Human Action Understanding

no code implementations 2 Apr 2023 Yong-Lu Li, Xiaoqian Wu, Xinpeng Liu, Zehao Wang, Yiming Dou, Yikun Ji, Junyi Zhang, Yixing Li, Jingru Tan, Xudong Lu, Cewu Lu

By aligning the classes of previous datasets to our semantic space, we gather (image/video/skeleton/MoCap) datasets into a unified database in a unified label system, i.e., bridging "isolated islands" into a "Pangea".

Action Understanding Transfer Learning

Visual-Tactile Sensing for In-Hand Object Reconstruction

no code implementations CVPR 2023 Wenqiang Xu, Zhenjun Yu, Han Xue, Ruolin Ye, Siqiong Yao, Cewu Lu

We propose a simulation environment, VT-Sim, which supports generating hand-object interaction for both rigid and deformable objects.

Object Object Reconstruction

GarmentTracking: Category-Level Garment Pose Tracking

1 code implementation CVPR 2023 Han Xue, Wenqiang Xu, Jieyi Zhang, Tutian Tang, Yutong Li, Wenxin Du, Ruolin Ye, Cewu Lu

In this work, we present a complete package to address the category-level garment pose tracking task: (1) A recording system VR-Garment, with which users can manipulate virtual garment models in simulation through a VR interface.

Pose Tracking

Upcycling Models under Domain and Category Shift

3 code implementations CVPR 2023 Sanqing Qu, Tianpei Zou, Florian Roehrbein, Cewu Lu, Guang Chen, DaCheng Tao, Changjun Jiang

We examine the superiority of our GLC on multiple benchmarks with different category shift scenarios, including partial-set, open-set, and open-partial-set DA.

Clustering Source-Free Domain Adaptation +2

CRIN: Rotation-Invariant Point Cloud Analysis and Rotation Estimation via Centrifugal Reference Frame

1 code implementation 6 Mar 2023 Yujing Lou, Zelin Ye, Yang You, Nianjuan Jiang, Jiangbo Lu, Weiming Wang, Lizhuang Ma, Cewu Lu

CRIN directly takes the coordinates of points as input and transforms local points into rotation-invariant representations via centrifugal reference frames.

Target-Referenced Reactive Grasping for Dynamic Objects

no code implementations CVPR 2023 Jirong Liu, Ruo Zhang, Hao-Shu Fang, Minghao Gou, Hongjie Fang, Chenxi Wang, Sheng Xu, Hengxu Yan, Cewu Lu

Reactive grasping, which enables the robot to successfully grasp dynamic moving objects, is of great interest in robotics.

Stimulus Verification Is a Universal and Effective Sampler in Multi-Modal Human Trajectory Prediction

no code implementations CVPR 2023 Jianhua Sun, YuXuan Li, Liang Chai, Cewu Lu

To comprehensively cover the uncertainty of the future, the common practice of multi-modal human trajectory prediction is to first generate a set/distribution of candidate future trajectories and then sample required numbers of trajectories from them as final predictions.

Trajectory Prediction

Unsupervised 3D Point Cloud Representation Learning by Triangle Constrained Contrast for Autonomous Driving

no code implementations CVPR 2023 Bo Pang, Hongchi Xia, Cewu Lu

In this paper, we design the Triangle Constrained Contrast (TriCC) framework tailored for autonomous driving scenes, which learns 3D unsupervised representations through both the multimodal information and the dynamics of temporal sequences.

Autonomous Driving Representation Learning +2

Beyond Object Recognition: A New Benchmark towards Object Concept Learning

no code implementations ICCV 2023 Yong-Lu Li, Yue Xu, Xinyu Xu, Xiaohan Mao, Yuan YAO, SiQi Liu, Cewu Lu

To support OCL, we build a densely annotated knowledge base including extensive labels for three levels of object concept (category, attribute, affordance), and the causal relations of three levels.

Attribute Object +1

One-Shot General Object Localization

1 code implementation 24 Nov 2022 Yang You, Zhuochen Miao, Kai Xiong, Weiming Wang, Cewu Lu

In contrast, our proposed OneLoc algorithm efficiently finds the object center and bounding box size by a special voting scheme.

Object Object Localization

Neural Eigenfunctions Are Structured Representation Learners

1 code implementation 23 Oct 2022 Zhijie Deng, Jiaxin Shi, Hao Zhang, Peng Cui, Cewu Lu, Jun Zhu

Unlike prior spectral methods such as Laplacian Eigenmap that operate in a nonparametric manner, Neural Eigenmap leverages NeuralEF to parametrically model eigenfunctions using a neural network.

Contrastive Learning Data Augmentation +7

DART: Articulated Hand Model with Diverse Accessories and Rich Textures

1 code implementation 14 Oct 2022 Daiheng Gao, Yuliang Xiu, Kailin Li, Lixin Yang, Feng Wang, Peng Zhang, Bang Zhang, Cewu Lu, Ping Tan

Unity GUI is also provided to generate synthetic hand data with user-defined settings, e.g., pose, camera, background, lighting, textures, and accessories.

Hand Pose Estimation Unity

X-NeRF: Explicit Neural Radiance Field for Multi-Scene 360$^{\circ}$ Insufficient RGB-D Views

1 code implementation 11 Oct 2022 Haoyi Zhu, Hao-Shu Fang, Cewu Lu

In this paper, we focus on a rarely discussed but important setting: can we train one model that can represent multiple scenes, with 360$^\circ$ insufficient views and RGB-D images?

Novel View Synthesis

D&D: Learning Human Dynamics from Dynamic Camera

1 code implementation 19 Sep 2022 Jiefeng Li, Siyuan Bian, Chao Xu, Gang Liu, Gang Yu, Cewu Lu

In this work, we present D&D (Learning Human Dynamics from Dynamic Camera), which leverages the laws of physics to reconstruct 3D human motion from in-the-wild videos with a moving camera.

3D Human Pose Estimation Human Dynamics

Constructing Balance from Imbalance for Long-tailed Image Recognition

1 code implementation 4 Aug 2022 Yue Xu, Yong-Lu Li, Jiefeng Li, Cewu Lu

Previous methods tackle data imbalance from the viewpoints of data distribution, feature space, model design, etc.

Mining Cross-Person Cues for Body-Part Interactiveness Learning in HOI Detection

1 code implementation 28 Jul 2022 Xiaoqian Wu, Yong-Lu Li, Xinpeng Liu, Junyi Zhang, Yuzhe Wu, Cewu Lu

Though significant progress has been made, interactiveness learning remains a challenging problem in HOI detection: existing methods usually generate redundant negative H-O pair proposals and fail to effectively extract interactive pairs.

Human-Object Interaction Detection

Unsupervised Visual Representation Learning by Synchronous Momentum Grouping

1 code implementation 13 Jul 2022 Bo Pang, Yifan Zhang, Yaoyi Li, Jia Cai, Cewu Lu

In this paper, we propose a genuine group-level contrastive visual representation learning method whose linear evaluation performance on ImageNet surpasses that of vanilla supervised learning.

Clustering Contrastive Learning +2

Unseen Object 6D Pose Estimation: A Benchmark and Baselines

no code implementations 23 Jun 2022 Minghao Gou, Haolin Pan, Hao-Shu Fang, Ziyuan Liu, Cewu Lu, Ping Tan

In this paper, we propose a new task that enables and facilitates algorithms to estimate the 6D pose of novel objects during testing.

6D Pose Estimation

Learning to Anticipate Future with Dynamic Context Removal

1 code implementation CVPR 2022 Xinyu Xu, Yong-Lu Li, Cewu Lu

Anticipating future events is an essential feature for intelligent systems and embodied AI.

OakInk: A Large-scale Knowledge Repository for Understanding Hand-Object Interaction

1 code implementation CVPR 2022 Lixin Yang, Kailin Li, Xinyu Zhan, Fei Wu, Anran Xu, Liu Liu, Cewu Lu

We start to collect 1,800 common household objects and annotate their affordances to construct the first knowledge base: Oak.

Grasp Generation Object +1

CPPF: Towards Robust Category-Level 9D Pose Estimation in the Wild

1 code implementation CVPR 2022 Yang You, Ruoxi Shi, Weiming Wang, Cewu Lu

Drawing inspiration from traditional point pair features (PPFs), in this paper, we design a novel Category-level PPF (CPPF) voting method to achieve accurate, robust and generalizable 9D pose estimation in the wild.

6D Pose Estimation using RGBD

Highlighting Object Category Immunity for the Generalization of Human-Object Interaction Detection

1 code implementation 19 Feb 2022 Xinpeng Liu, Yong-Lu Li, Cewu Lu

To achieve OC-immunity, we propose an OC-immune network that decouples the inputs from OC, extracts OC-immune representations, and leverages uncertainty quantification to generalize to unseen objects.

Human-Object Interaction Detection Object +1

AKB-48: A Real-World Articulated Object Knowledge Base

no code implementations CVPR 2022 Liu Liu, Wenqiang Xu, Haoyuan Fu, Sucheng Qian, Yang Han, Cewu Lu

To bridge the gap, we present AKB-48: a large-scale Articulated object Knowledge Base which consists of 2,037 real-world 3D articulated object models of 48 categories.

Object Object Reconstruction +1

TransCG: A Large-Scale Real-World Dataset for Transparent Object Depth Completion and a Grasping Baseline

1 code implementation 17 Feb 2022 Hongjie Fang, Hao-Shu Fang, Sheng Xu, Cewu Lu

However, the majority of current grasping algorithms would fail in this case since they heavily rely on the depth image, while ordinary depth sensors usually fail to produce accurate depth information for transparent objects owing to the reflection and refraction of light.

Depth Completion Robotic Grasping +2

HAKE: A Knowledge Engine Foundation for Human Activity Understanding

3 code implementations 14 Feb 2022 Yong-Lu Li, Xinpeng Liu, Xiaoqian Wu, Yizhuo Li, Zuoyu Qiu, Liang Xu, Yue Xu, Hao-Shu Fang, Cewu Lu

Human activity understanding is of widespread interest in artificial intelligence and spans diverse applications like health care and behavior analysis.

Action Recognition Human-Object Interaction Detection +2

Human Trajectory Prediction With Momentary Observation

no code implementations CVPR 2022 Jianhua Sun, YuXuan Li, Liang Chai, Hao-Shu Fang, Yong-Lu Li, Cewu Lu

The human trajectory prediction task aims to analyze future human movements given their past status, which is a crucial step for many autonomous systems such as self-driving cars and social robots.

Self-Driving Cars Trajectory Prediction

iSeg3D: An Interactive 3D Shape Segmentation Tool

no code implementations 24 Dec 2021 Sucheng Qian, Liu Liu, Wenqiang Xu, Cewu Lu

It can obtain a satisfactory segmentation result with minimal human clicks (< 10).


OMAD: Object Model with Articulated Deformations for Pose Estimation and Retrieval

no code implementations 14 Dec 2021 Han Xue, Liu Liu, Wenqiang Xu, Haoyuan Fu, Cewu Lu

With the full representation of the object shape and joint states, we can address several tasks including category-level object pose estimation and articulated object retrieval.

Object Pose Estimation +1

Regularity Learning via Explicit Distribution Modeling for Skeletal Video Anomaly Detection

1 code implementation 7 Dec 2021 Shoubin Yu, Zhongyin Zhao, Haoshu Fang, Andong Deng, Haisheng Su, Dongliang Wang, Weihao Gan, Cewu Lu, Wei Wu

Different from pixel-based anomaly detection methods, pose-based methods utilize highly-structured skeleton data, which decreases the computational burden and also avoids the negative impact of background noise.

Anomaly Detection In Surveillance Videos Optical Flow Estimation +1

SAGCI-System: Towards Sample-Efficient, Generalizable, Compositional, and Incremental Robot Learning

no code implementations 29 Nov 2021 Jun Lv, Qiaojun Yu, Lin Shao, Wenhai Liu, Wenqiang Xu, Cewu Lu

We apply our system to perform articulated object manipulation tasks, both in the simulation and the real world.

Understanding Pixel-level 2D Image Semantics with 3D Keypoint Knowledge Engine

no code implementations 21 Nov 2021 Yang You, Chengkun Li, Yujing Lou, Zhoujun Cheng, Liangwei Li, Lizhuang Ma, Weiming Wang, Cewu Lu

Pixel-level 2D object semantic understanding is an important topic in computer vision and could help machines deeply understand objects (e.g., functionality and affordance) in our daily life.

Localization with Sampling-Argmax

1 code implementation NeurIPS 2021 Jiefeng Li, Tong Chen, Ruiqi Shi, Yujing Lou, Yong-Lu Li, Cewu Lu

In this work, we propose sampling-argmax, a differentiable training method that imposes implicit constraints on the shape of the probability map by minimizing the expectation of the localization error.

3D Human Pose Estimation

Learning Single/Multi-Attribute of Object with Symmetry and Group

1 code implementation 9 Oct 2021 Yong-Lu Li, Yue Xu, Xinyu Xu, Xiaohan Mao, Cewu Lu

To model the compositional nature of these concepts, it is a good choice to learn them as transformations, e.g., coupling and decoupling.

Attribute Compositional Zero-Shot Learning

Human Pose Regression with Residual Log-likelihood Estimation

3 code implementations ICCV 2021 Jiefeng Li, Siyuan Bian, Ailing Zeng, Can Wang, Bo Pang, Wentao Liu, Cewu Lu

In light of this, we propose a novel regression paradigm with Residual Log-likelihood Estimation (RLE) to capture the underlying output distribution.

3D Human Pose Estimation Multi-Person Pose Estimation +1

ContourRender: Detecting Arbitrary Contour Shape For Instance Segmentation In One Pass

no code implementations 7 Jun 2021 Tutian Tang, Wenqiang Xu, Ruolin Ye, Yan-Feng Wang, Cewu Lu

In addition, we specifically select a subset from COCO val2017 named COCO ContourHard-val to further demonstrate the contour quality improvements.

Instance Segmentation Semantic Segmentation

Towards Real-World Category-level Articulation Pose Estimation

no code implementations 7 May 2021 Liu Liu, Han Xue, Wenqiang Xu, Haoyuan Fu, Cewu Lu

This setting allows varied kinematic structures within a semantic category, and multiple instances to co-exist in an observation of the real world.

Mixed Reality Pose Estimation

H2O: A Benchmark for Visual Human-human Object Handover Analysis

no code implementations ICCV 2021 Ruolin Ye, Wenqiang Xu, Zhendong Xue, Tutian Tang, Yanfeng Wang, Cewu Lu

Besides, we also report the hand and object pose errors with existing baselines and show that the dataset can serve as the video demonstrations for robot imitation learning on the handover task.

Imitation Learning Object

Skimming and Scanning for Untrimmed Video Action Recognition

no code implementations 21 Apr 2021 Yunyan Hong, Ailing Zeng, Min Li, Cewu Lu, Li Jiang, Qiang Xu

Video action recognition (VAR) is a primary task of video understanding, and untrimmed videos are more common in real-life scenes.

Action Recognition Temporal Action Localization +1

SuctionNet-1Billion: A Large-Scale Benchmark for Suction Grasping

no code implementations 23 Mar 2021 Hanwen Cao, Hao-Shu Fang, Wenhai Liu, Cewu Lu

Meanwhile, we propose a method to predict numerous suction poses from an RGB-D image of a cluttered scene and demonstrate our superiority against several previous methods.

Robotic Grasping

PGT: A Progressive Method for Training Models on Long Videos

1 code implementation CVPR 2021 Bo Pang, Gao Peng, Yizhuo Li, Cewu Lu

This progressive training (PGT) method is able to train long videos end-to-end with limited resources and ensures the effective transmission of information.

Skeleton Merger: an Unsupervised Aligned Keypoint Detector

1 code implementation CVPR 2021 Ruoxi Shi, Zhengrong Xue, Yang You, Cewu Lu

In this paper, we propose an unsupervised aligned keypoint detector, Skeleton Merger, which utilizes skeletons to reconstruct objects.

Object Tracking Retrieval

RGB Matters: Learning 7-DoF Grasp Poses on Monocular RGBD Images

1 code implementation 3 Mar 2021 Minghao Gou, Hao-Shu Fang, Zhanda Zhu, Sheng Xu, Chenxi Wang, Cewu Lu

In the first stage, an encoder-decoder-like convolutional neural network, Angle-View Net (AVN), is proposed to predict the SO(3) orientation of the gripper at every location of the image.

PRIN/SPRIN: On Extracting Point-wise Rotation Invariant Features

2 code implementations 24 Feb 2021 Yang You, Yujing Lou, Ruoxi Shi, Qi Liu, Yu-Wing Tai, Lizhuang Ma, Weiming Wang, Cewu Lu

Spherical Voxel Convolution and Point Re-sampling are proposed to extract rotation invariant features for each point.

3D Feature Matching Data Augmentation

Graspness Discovery in Clutters for Fast and Accurate Grasp Detection

1 code implementation ICCV 2021 Chenxi Wang, Hao-Shu Fang, Minghao Gou, Hongjie Fang, Jin Gao, Cewu Lu

To quickly detect graspness in practice, we develop a neural network named graspness model to approximate the searching process.

Robotic Grasping

TDAF: Top-Down Attention Framework for Vision Tasks

no code implementations 14 Dec 2020 Bo Pang, Yizhuo Li, Jiefeng Li, Muchen Li, Hanwen Cao, Cewu Lu

Such spatial and attention features are nested deeply; therefore, the proposed framework works in a mixed top-down and bottom-up manner.

Action Recognition object-detection +2

Learning Universal Shape Dictionary for Realtime Instance Segmentation

1 code implementation 2 Dec 2020 Tutian Tang, Wenqiang Xu, Ruolin Ye, Lixin Yang, Cewu Lu

First, it learns a dictionary from a large collection of shape datasets, so that any shape can be decomposed into a linear combination through the dictionary.

Explainable Models Instance Segmentation +3

CPF: Learning a Contact Potential Field to Model the Hand-Object Interaction

1 code implementation ICCV 2021 Lixin Yang, Xinyu Zhan, Kailin Li, Wenqiang Xu, Jiefeng Li, Cewu Lu

In this paper, we present an explicit contact representation, namely Contact Potential Field (CPF), and a learning-fitting hybrid framework, namely MIHO, for Modeling the Interaction of Hand and Object.

Object Pose Estimation

HybrIK: A Hybrid Analytical-Neural Inverse Kinematics Solution for 3D Human Pose and Shape Estimation

3 code implementations CVPR 2021 Jiefeng Li, Chao Xu, Zhicun Chen, Siyuan Bian, Lixin Yang, Cewu Lu

We show that HybrIK preserves both the accuracy of 3D pose and the realistic body structure of the parametric human model, leading to a pixel-aligned 3D body mesh and a more accurate 3D pose than the pure 3D keypoint estimation methods.

3D human pose and shape estimation Keypoint Estimation

DIRV: Dense Interaction Region Voting for End-to-End Human-Object Interaction Detection

1 code implementation 2 Oct 2020 Hao-Shu Fang, Yichen Xie, Dian Shao, Cewu Lu

On the other hand, existing one-stage methods mainly focus on the union regions of interactions, which introduce unnecessary visual information as disturbances to HOI detection.

Human-Object Interaction Detection

ASAP-Net: Attention and Structure Aware Point Cloud Sequence Segmentation

1 code implementation 12 Aug 2020 Hanwen Cao, Yongyi Lu, Cewu Lu, Bo Pang, Gongshen Liu, Alan Yuille

In this paper, we further improve spatio-temporal point cloud feature learning with a flexible module called ASAP considering both attention and structure information across frames, which we find to be two important factors for successful segmentation in dynamic point clouds.


BiHand: Recovering Hand Mesh with Multi-stage Bisected Hourglass Networks

1 code implementation 12 Aug 2020 Lixin Yang, Jiasen Li, Wenqiang Xu, Yiqun Diao, Cewu Lu

Inside each stage, BiHand adopts a novel bisecting design which allows the networks to encapsulate two closely related pieces of information (e.g., 2D keypoints and silhouette in the 2D seeding stage, 3D joints and depth map in the 3D lifting stage, and joint rotations and shape parameters in the mesh generation stage) in a single forward pass.

Pose Tracking

HMOR: Hierarchical Multi-Person Ordinal Relations for Monocular Multi-Person 3D Pose Estimation

no code implementations ECCV 2020 Jiefeng Li, Can Wang, Wentao Liu, Chen Qian, Cewu Lu

The HMOR encodes interaction information as the ordinal relations of depths and angles hierarchically, which captures body-part and joint-level semantics and maintains global consistency at the same time.

3D Multi-Person Pose Estimation (absolute) 3D Multi-Person Pose Estimation (root-relative) +2

Approximated Bilinear Modules for Temporal Modeling

1 code implementation ICCV 2019 Xinqi Zhu, Chang Xu, Langwen Hui, Cewu Lu, DaCheng Tao

Specifically, we show how two-layer subnets in CNNs can be converted to temporal bilinear modules by adding an auxiliary branch.

Action Recognition Video Classification

TubeTK: Adopting Tubes to Track Multi-Object in a One-Step Training Model

1 code implementation CVPR 2020 Bo Pang, Yizhuo Li, Yifan Zhang, Muchen Li, Cewu Lu

As deep learning brings excellent performance to object detection algorithms, Tracking by Detection (TBD) has become the mainstream tracking framework.

Multi-Object Tracking Object +2

Transferable Active Grasping and Real Embodied Dataset

1 code implementation 28 Apr 2020 Xiangyu Chen, Zelin Ye, Jiankai Sun, Yuda Fan, Fang Hu, Chenxi Wang, Cewu Lu

Grasping in cluttered scenes is challenging for robot vision systems, as detection accuracy can be hindered by partial occlusion of objects.

Reinforcement Learning (RL)

Recursive Social Behavior Graph for Trajectory Prediction

no code implementations CVPR 2020 Jianhua Sun, Qinhong Jiang, Cewu Lu

Social interaction is an important topic in human trajectory prediction to generate plausible paths.

Trajectory Prediction

Semantic Correspondence via 2D-3D-2D Cycle

1 code implementation 20 Apr 2020 Yang You, Chengkun Li, Yujing Lou, Zhoujun Cheng, Lizhuang Ma, Cewu Lu, Weiming Wang

Visual semantic correspondence is an important topic in computer vision and could help machines understand objects in our daily life.

Semantic correspondence

Asynchronous Interaction Aggregation for Action Detection

2 code implementations ECCV 2020 Jiajun Tang, Jin Xia, Xinzhi Mu, Bo Pang, Cewu Lu

We propose the Asynchronous Interaction Aggregation network (AIA) that leverages different interactions to boost action detection.

Action Detection

Symmetry and Group in Attribute-Object Compositions

1 code implementation CVPR 2020 Yong-Lu Li, Yue Xu, Xiaohan Mao, Cewu Lu

To model the compositional nature of these general concepts, it is a good choice to learn them through transformations, such as coupling and decoupling.

Ranked #1 on Compositional Zero-Shot Learning on MIT-States (Top-1 accuracy % metric)

Attribute Compositional Zero-Shot Learning +1

Deep Variational Luenberger-type Observer for Stochastic Video Prediction

no code implementations 12 Feb 2020 Dong Wang, Feng Zhou, Zheng Yan, Guang Yao, Zongxuan Liu, Wennan Ma, Cewu Lu

Our model builds upon a variational encoder which transforms the input video into a latent feature space and a Luenberger-type observer which captures the dynamic evolution of the latent features.

Representation Learning Video Prediction +1

GraspNet: A Large-Scale Clustered and Densely Annotated Dataset for Object Grasping

no code implementations 31 Dec 2019 Hao-Shu Fang, Chenxi Wang, Minghao Gou, Cewu Lu

Object grasping is critical for many applications and is also a challenging computer vision problem.

3D Objectness Estimation via Bottom-up Regret Grouping

no code implementations 5 Dec 2019 Zelin Ye, Yan Hao, Liang Xu, Rui Zhu, Cewu Lu

Further ablation study also demonstrates the effectiveness of our grouping predictor and regret mechanism.

Transferable Force-Torque Dynamics Model for Peg-in-hole Task

no code implementations 30 Nov 2019 Junfeng Ding, Chen Wang, Cewu Lu

We present a learning-based force-torque dynamics model to achieve model-based control for the contact-rich peg-in-hole task using force-only inputs.

Model-based Reinforcement Learning Model Predictive Control

Attribute Restoration Framework for Anomaly Detection

1 code implementation 25 Nov 2019 Chaoqin Huang, Fei Ye, Jinkun Cao, Maosen Li, Ya Zhang, Cewu Lu

We here propose to break this equivalence by erasing selected attributes from the original data and reformulate it as a restoration task, where the normal and the anomalous data are expected to be distinguishable based on restoration errors.

Anomaly Detection Attribute +1

RGB-D Individual Segmentation

no code implementations 16 Oct 2019 Wenqiang Xu, Yanjun Fu, Yuchen Luo, Chang Liu, Cewu Lu

The fine-grained recognition task deals with the sub-category classification problem, which is important for real-world applications.

CoLA Segmentation

Template-Instance Loss for Offline Handwritten Chinese Character Recognition

no code implementations 12 Oct 2019 Yao Xiao, Dan Meng, Cewu Lu, Chi-Keung Tang

The long-standing challenges for offline handwritten Chinese character recognition (HCCR) are twofold: Chinese characters can be very diverse and complicated yet look similar, and cursive handwriting (due to increased writing speed and infrequent pen lifting) makes strokes and even characters connected together in a flowing manner.

Offline Handwritten Chinese Character Recognition

InstaBoost: Boosting Instance Segmentation via Probability Map Guided Copy-Pasting

3 code implementations ICCV 2019 Hao-Shu Fang, Jianhua Sun, Runzhong Wang, Minghao Gou, Yong-Lu Li, Cewu Lu

With the guidance of such a map, we boost the performance of R101-Mask R-CNN on instance segmentation from 35.7 mAP to 37.9 mAP without modifying the backbone or network structure.

Data Augmentation Instance Segmentation +3

Cross-Domain Adaptation for Animal Pose Estimation

no code implementations ICCV 2019 Jinkun Cao, Hongyang Tang, Hao-Shu Fang, Xiaoyong Shen, Cewu Lu, Yu-Wing Tai

Therefore, the easily available human pose dataset, which is of a much larger scale than our labeled animal dataset, provides important prior knowledge to boost performance on animal pose estimation.

Animal Pose Estimation Domain Adaptation

Three Branches: Detecting Actions With Richer Features

no code implementations 13 Aug 2019 Jin Xia, Jiajun Tang, Cewu Lu

We present our three-branch solutions for the International Challenge on Activity Recognition at CVPR 2019.

Activity Recognition Spatio-Temporal Action Localization +1

Explicit Shape Encoding for Real-Time Instance Segmentation

1 code implementation ICCV 2019 Wenqiang Xu, Haiyang Wang, Fubo Qi, Cewu Lu

In this paper, we propose a novel top-down instance segmentation framework based on explicit shape encoding, named ESE-Seg.

Object object-detection +4

HAKE: Human Activity Knowledge Engine

4 code implementations 13 Apr 2019 Yong-Lu Li, Liang Xu, Xinpeng Liu, Xijie Huang, Yue Xu, Mingyang Chen, Ze Ma, Shiyi Wang, Hao-Shu Fang, Cewu Lu

To address these and promote activity understanding, we build a large-scale Human Activity Knowledge Engine (HAKE) based on the human body part states.

Ranked #2 on Human-Object Interaction Detection on HICO (using extra training data)

Action Detection Human-Object Interaction Detection +1

Combinational Q-Learning for Dou Di Zhu

1 code implementation 24 Jan 2019 Yang You, Liangwei Li, Baisong Guo, Weiming Wang, Cewu Lu

Deep reinforcement learning (DRL) has gained a lot of attention in recent years, and has been proven to be able to play Atari games and Go at or above human levels.

Atari Games Card Games +1

CrowdPose: Efficient Crowded Scenes Pose Estimation and A New Benchmark

3 code implementations CVPR 2019 Jiefeng Li, Can Wang, Hao Zhu, Yihuan Mao, Hao-Shu Fang, Cewu Lu

In this paper, we propose a novel and efficient method to tackle the problem of pose estimation in the crowd and a new dataset to better evaluate algorithms.

Keypoint Detection Multi-Person Pose Estimation

Deep RNN Framework for Visual Sequential Applications

1 code implementation CVPR 2019 Bo Pang, Kaiwen Zha, Hanwen Cao, Chen Shi, Cewu Lu

There are mainly two novel designs in our deep RNN framework: one is a new RNN module called Context Bridge Module (CBM) which splits the information flowing along the sequence (temporal direction) and along depth (spatial representation direction), making it easier to train when building deep by balancing these two directions; the other is the Overlap Coherence Training Scheme that reduces the training complexity for long visual sequential tasks on account of the limitation of computing resources.

Future prediction SSIM +1

Pointwise Rotation-Invariant Network with Adaptive Sampling and 3D Spherical Voxel Convolution

1 code implementation 23 Nov 2018 Yang You, Yujing Lou, Qi Liu, Yu-Wing Tai, Lizhuang Ma, Cewu Lu, Weiming Wang

Point cloud analysis without pose priors is very challenging in real applications, as the orientations of point clouds are often unknown.

3D Feature Matching Data Augmentation

Transferable Interactiveness Knowledge for Human-Object Interaction Detection

3 code implementations CVPR 2019 Yong-Lu Li, Siyuan Zhou, Xijie Huang, Liang Xu, Ze Ma, Hao-Shu Fang, Yan-Feng Wang, Cewu Lu

On account of the generalization of interactiveness, the interactiveness network is a transferable knowledge learner and can cooperate with any HOI detection model to achieve desirable results.

Human-Object Interaction Detection Object

AXNet: ApproXimate computing using an end-to-end trainable neural network

2 code implementations 27 Jul 2018 Zhenghao Peng, Xuyang Chen, Chengwen Xu, Naifeng Jing, Xiaoyao Liang, Cewu Lu, Li Jiang

To guarantee the approximation quality, existing works deploy two neural networks (NNs), e.g., an approximator and a predictor.

Multi-Task Learning Philosophy

PointSIFT: A SIFT-like Network Module for 3D Point Cloud Semantic Segmentation

4 code implementations 2 Jul 2018 Mingyang Jiang, Yiran Wu, Tianqi Zhao, Zelin Zhao, Cewu Lu

Recently, 3D understanding research has shed light on extracting features directly from point clouds, which requires effective shape pattern description of point clouds.

Point Cloud Segmentation Semantic Segmentation

Weakly and Semi Supervised Human Body Part Parsing via Pose-Guided Knowledge Transfer

1 code implementation CVPR 2018 Hao-Shu Fang, Guansong Lu, Xiaolin Fang, Jianwen Xie, Yu-Wing Tai, Cewu Lu

In this paper, we present a novel method to generate synthetic human part segmentation data using easily-obtained human keypoint annotations.

Ranked #4 on Human Part Segmentation on PASCAL-Part (using extra training data)

Human Parsing Human Part Segmentation +3

Recurrent Residual Module for Fast Inference in Videos

no code implementations CVPR 2018 Bowen Pan, Wuwei Lin, Xiaolin Fang, Chaoqin Huang, Bolei Zhou, Cewu Lu

Deep convolutional neural networks (CNNs) have made impressive progress in many video recognition tasks such as video pose estimation and video object detection.

object-detection Pose Estimation +2

Human Action Adverb Recognition: ADHA Dataset and A Three-Stream Hybrid Model

no code implementations 4 Feb 2018 Bo Pang, Kaiwen Zha, Cewu Lu

We introduce the first benchmark for a new problem --- recognizing human action adverbs (HAA): "Adverbs Describing Human Actions" (ADHA).

Action Recognition Image Captioning +1

Pose Flow: Efficient Online Pose Tracking

1 code implementation 3 Feb 2018 Yuliang Xiu, Jiefeng Li, Haoyu Wang, Yinghong Fang, Cewu Lu

Multi-person articulated pose tracking in unconstrained videos is an important yet challenging problem.

Ranked #9 on Pose Tracking on PoseTrack2017 (using extra training data)

Pose Tracking

Virtual to Real Reinforcement Learning for Autonomous Driving

6 code implementations 13 Apr 2017 Xinlei Pan, Yurong You, Ziyan Wang, Cewu Lu

To our knowledge, this is the first successful case of driving policy trained by reinforcement learning that can adapt to real world driving data.

Autonomous Driving Domain Adaptation +5

Beyond Holistic Object Recognition: Enriching Image Understanding with Part States

no code implementations CVPR 2018 Cewu Lu, Hao Su, Yongyi Lu, Li Yi, Chi-Keung Tang, Leonidas Guibas

Important high-level vision tasks such as human-object interaction, image captioning and robotic manipulation require rich semantic descriptions of objects at part level.

Human-Object Interaction Detection Image Captioning +1

RMPE: Regional Multi-person Pose Estimation

14 code implementations ICCV 2017 Hao-Shu Fang, Shuqin Xie, Yu-Wing Tai, Cewu Lu

In this paper, we propose a novel regional multi-person pose estimation (RMPE) framework to facilitate pose estimation in the presence of inaccurate human bounding boxes.

2D Human Pose Estimation Human Detection +2

Visual Relationship Detection with Language Priors

no code implementations 31 Jul 2016 Cewu Lu, Ranjay Krishna, Michael Bernstein, Li Fei-Fei

We improve on prior work by leveraging language priors from semantic word embeddings to finetune the likelihood of a predicted relationship.

Content-Based Image Retrieval Relationship Detection +3

Square Localization for Efficient and Accurate Object Detection

no code implementations ICCV 2015 Cewu Lu, Yongyi Lu, Hao Chen, Chi-Keung Tang

In the testing phase, sliding CNN models are applied, which produce a set of response maps that can be effectively filtered by the learned co-presence prior to output the final bounding boxes for localizing an object.

Object object-detection +2

Deep LAC: Deep Localization, Alignment and Classification for Fine-Grained Recognition

no code implementations CVPR 2015 Di Lin, Xiaoyong Shen, Cewu Lu, Jiaya Jia

Our major contribution is to propose a valve linkage function (VLF) for back-propagation chaining and form our deep localization, alignment and classification (LAC) system.

Classification General Classification

1-HKUST: Object Detection in ILSVRC 2014

no code implementations 22 Sep 2014 Cewu Lu, Hao Chen, Qifeng Chen, Hei Law, Yao Xiao, Chi-Keung Tang

We participated in the object detection track of ILSVRC 2014 and placed fourth among the 38 teams.

Object object-detection +3

L0 Regularized Stationary Time Estimation for Crowd Group Analysis

no code implementations CVPR 2014 Shuai Yi, Xiaogang Wang, Cewu Lu, Jiaya Jia

We tackle stationary crowd analysis in this paper, which is as important as modeling mobile groups in crowd scenes and finds many applications in surveillance.

Learning Important Spatial Pooling Regions for Scene Classification

no code implementations CVPR 2014 Di Lin, Cewu Lu, Renjie Liao, Jiaya Jia

We address the false response influence problem when learning and applying discriminative parts to construct the mid-level representation in scene classification.

Classification General Classification +1

Two-Class Weather Classification

no code implementations CVPR 2014 Cewu Lu, Di Lin, Jiaya Jia, Chi-Keung Tang

Given a single outdoor image, this paper proposes a collaborative learning approach for labeling it as either sunny or cloudy.

Classification General Classification +1

Online Robust Dictionary Learning

no code implementations CVPR 2013 Cewu Lu, Jiaping Shi, Jiaya Jia

Online dictionary learning is particularly useful for processing large-scale and dynamic data in computer vision.

Dictionary Learning
