Search Results for author: Cewu Lu

Found 157 papers, 85 papers with code

MS-MANO: Enabling Hand Pose Tracking with Biomechanical Constraints

no code implementations • 16 Apr 2024 • Pengfei Xie, Wenqiang Xu, Tutian Tang, Zhenjun Yu, Cewu Lu

To address this, we integrate a musculoskeletal system with a learnable parametric hand model, MANO, to create a new model, MS-MANO.

Pose Tracking


SemGrasp: Semantic Grasp Generation via Language Aligned Discretization

no code implementations • 4 Apr 2024 • Kailin Li, Jingbo Wang, Lixin Yang, Cewu Lu, Bo Dai

We introduce a discrete representation that aligns the grasp space with semantic space, enabling the generation of grasp postures in accordance with language instructions.

Grasp Generation Language Modelling +2


OAKINK2: A Dataset of Bimanual Hands-Object Manipulation in Complex Task Completion

no code implementations • 28 Mar 2024 • Xinyu Zhan, Lixin Yang, Yifei Zhao, Kangrui Mao, Hanlin Xu, Zenan Lin, Kailin Li, Cewu Lu

Based on the 3-level abstraction of OAKINK2, we explore a task-oriented framework for Complex Task Completion (CTC).

Motion Synthesis Object


RH20T-P: A Primitive-Level Robotic Dataset Towards Composable Generalization Agents

no code implementations • 28 Mar 2024 • Zeren Chen, Zhelun Shi, Xiaoya Lu, Lehan He, Sucheng Qian, Hao-Shu Fang, Zhenfei Yin, Wanli Ouyang, Jing Shao, Yu Qiao, Cewu Lu, Lu Sheng

The ultimate goal of robotic learning is to acquire a comprehensive and generalizable robotic system capable of performing both seen skills within the training distribution and unseen skills in novel environments.

Motion Planning


RPMArt: Towards Robust Perception and Manipulation for Articulated Objects

no code implementations • 24 Mar 2024 • JunBo Wang, Wenhai Liu, Qiaojun Yu, Yang You, Liu Liu, Weiming Wang, Cewu Lu

Our primary contribution is a Robust Articulation Network (RoArtNet) that is able to predict both joint parameters and affordable points robustly by local feature learning and point tuple voting.


GLC++: Source-Free Universal Domain Adaptation through Global-Local Clustering and Contrastive Affinity Learning

2 code implementations • 21 Mar 2024 • Sanqing Qu, Tianpei Zou, Florian Röhrbein, Cewu Lu, Guang Chen, DaCheng Tao, Changjun Jiang

GLC++ enhances the novel category clustering accuracy of GLC by 4.3% in open-set scenarios on Office-Home.

Clustering Contrastive Learning +2


ManiPose: A Comprehensive Benchmark for Pose-aware Object Manipulation in Robotics

no code implementations • 20 Mar 2024 • Qiaojun Yu, Ce Hao, JunBo Wang, Wenhai Liu, Liu Liu, Yao Mu, Yang You, Hengxu Yan, Cewu Lu

Robotic manipulation in everyday scenarios, especially in unstructured environments, requires skills in pose-aware object manipulation (POM), which adapts robots' grasping and handling according to an object's 6D pose.

Motion Planning Pose Estimation


ShapeBoost: Boosting Human Shape Estimation with Part-Based Parameterization and Clothing-Preserving Augmentation

no code implementations • 2 Mar 2024 • Siyuan Bian, Jiefeng Li, Jiasheng Tang, Cewu Lu

Accurate human shape recovery from a monocular RGB image is a challenging task because humans come in different shapes and sizes and wear different clothes.

Data Augmentation


EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI

1 code implementation • 26 Dec 2023 • Tai Wang, Xiaohan Mao, Chenming Zhu, Runsen Xu, Ruiyuan Lyu, Peisen Li, Xiao Chen, Wenwei Zhang, Kai Chen, Tianfan Xue, Xihui Liu, Cewu Lu, Dahua Lin, Jiangmiao Pang

In the realm of computer vision and robotics, embodied agents are expected to explore their environment and carry out human instructions.

Scene Understanding



PACE: A Large-Scale Dataset with Pose Annotations in Cluttered Environments

1 code implementation • 23 Dec 2023 • Yang You, Kai Xiong, Zhening Yang, Zhengxiang Huang, Junwei Zhou, Ruoxi Shi, Zhou Fang, Adam W. Harley, Leonidas Guibas, Cewu Lu

We introduce PACE (Pose Annotations in Cluttered Environments), a large-scale benchmark designed to advance the development and evaluation of pose estimation methods in cluttered scenarios.

Pose Estimation Pose Tracking


Primitive-based 3D Human-Object Interaction Modelling and Programming

no code implementations • 17 Dec 2023 • SiQi Liu, Yong-Lu Li, Zhou Fang, Xinpeng Liu, Yang You, Cewu Lu

To explore an effective embedding of HAOI for the machine, we build a new benchmark on 3D HAOI consisting of primitives together with their images and propose a task requiring machines to recover 3D HAOI using primitives from images.

3D Reconstruction Human-Object Interaction Detection +2


Revisit Human-Scene Interaction via Space Occupancy

no code implementations • 5 Dec 2023 • Xinpeng Liu, Haowen Hou, Yanchao Yang, Yong-Lu Li, Cewu Lu

Human-scene Interaction (HSI) generation is a challenging task and crucial for various downstream tasks.


Dancing with Still Images: Video Distillation via Static-Dynamic Disentanglement

1 code implementation • 1 Dec 2023 • Ziyu Wang, Yue Xu, Cewu Lu, Yong-Lu Li

It first distills the videos into still images as static memory and then compensates for the dynamic and motion information with a learnable dynamic memory block.

Disentanglement


Symbol-LLM: Leverage Language Models for Symbolic System in Visual Human Activity Reasoning

no code implementations • NeurIPS 2023 • Xiaoqian Wu, Yong-Lu Li, Jianhua Sun, Cewu Lu

One possible path of activity reasoning is building a symbolic system composed of symbols and rules, where one rule connects multiple symbols, implying human knowledge and reasoning abilities.


RFTrans: Leveraging Refractive Flow of Transparent Objects for Surface Normal Estimation and Manipulation

no code implementations • 21 Nov 2023 • Tutian Tang, Jiyu Liu, Jieyi Zhang, Haoyuan Fu, Wenqiang Xu, Cewu Lu

By leveraging refractive flow as an intermediate representation, the proposed method circumvents the drawbacks of directly predicting the geometry (e.g., surface normal) from images and helps bridge the sim-to-real gap.

Surface Normal Estimation Transparent objects


UniFolding: Towards Sample-efficient, Scalable, and Generalizable Robotic Garment Folding

no code implementations • 2 Nov 2023 • Han Xue, Yutong Li, Wenqiang Xu, Huanyu Li, Dongzhe Zheng, Cewu Lu

Training data is collected via a human-centric process with offline and online stages.


Bridging the Gap between Human Motion and Action Semantics via Kinematic Phrases

no code implementations • 6 Oct 2023 • Xinpeng Liu, Yong-Lu Li, Ailing Zeng, Zizheng Zhou, Yang You, Cewu Lu

The goal of motion understanding is to establish a reliable mapping between motion and action semantics, which is a challenging many-to-many problem.


GAMMA: Generalizable Articulation Modeling and Manipulation for Articulated Objects

1 code implementation • 28 Sep 2023 • Qiaojun Yu, JunBo Wang, Wenhai Liu, Ce Hao, Liu Liu, Lin Shao, Weiming Wang, Cewu Lu

Results show that GAMMA significantly outperforms SOTA articulation modeling and manipulation algorithms in unseen and cross-category articulated objects.

Manner Of Articulation Detection Robot Manipulation +1


EgoPCA: A New Framework for Egocentric Hand-Object Interaction Understanding

no code implementations • ICCV 2023 • Yue Xu, Yong-Lu Li, Zhemin Huang, Michael Xu Liu, Cewu Lu, Yu-Wing Tai, Chi-Keung Tang

With the surge in attention to Egocentric Hand-Object Interaction (Ego-HOI), large-scale datasets such as Ego4D and EPIC-KITCHENS have been proposed.

Action Recognition Temporal Action Localization


CHORD: Category-level Hand-held Object Reconstruction via Shape Deformation

no code implementations • ICCV 2023 • Kailin Li, Lixin Yang, Haoyu Zhen, Zenan Lin, Xinyu Zhan, Licheng Zhong, Jian Xu, Kejian Wu, Cewu Lu

This can be attributed to the fact that humans have mastered the shape prior of the 'mug' category, and can quickly establish the corresponding relations between different mug instances and the prior, such as where the rim and handle are located.

Object Reconstruction


ClothesNet: An Information-Rich 3D Garment Model Repository with Simulated Clothes Environment

no code implementations • ICCV 2023 • Bingyang Zhou, Haoyu Zhou, Tianhai Liang, Qiaojun Yu, Siheng Zhao, Yuwei Zeng, Jun Lv, Siyuan Luo, Qiancai Wang, Xinyuan Yu, Haonan Chen, Cewu Lu, Lin Shao

We present ClothesNet: a large-scale dataset of 3D clothes objects with information-rich annotations.

Keypoint Detection


Color-NeuS: Reconstructing Neural Implicit Surfaces with Color

1 code implementation • 14 Aug 2023 • Licheng Zhong, Lixin Yang, Kailin Li, Haoyu Zhen, Mei Han, Cewu Lu

Mesh is extracted from the signed distance function (SDF) network for the surface, and color for each surface vertex is drawn from the global color network.



RH20T: A Comprehensive Robotic Dataset for Learning Diverse Skills in One-Shot

no code implementations • 2 Jul 2023 • Hao-Shu Fang, Hongjie Fang, Zhenyu Tang, Jirong Liu, Chenxi Wang, JunBo Wang, Haoyi Zhu, Cewu Lu

A key challenge in robotic manipulation in open domains is how to acquire diverse and generalizable skills for robots.

Imitation Learning Motion Planning +2


Distill Gold from Massive Ores: Efficient Dataset Distillation via Critical Samples Selection

1 code implementation • 28 May 2023 • Yue Xu, Yong-Lu Li, Kaitong Cui, Ziyu Wang, Cewu Lu, Yu-Wing Tai, Chi-Keung Tang

Our method consistently enhances the distillation algorithms, even on much larger-scale and more heterogeneous datasets, e.g., ImageNet-1K and Kinetics-400.


NIKI: Neural Inverse Kinematics with Invertible Neural Networks for 3D Human Pose and Shape Estimation

1 code implementation • CVPR 2023 • Jiefeng Li, Siyuan Bian, Qi Liu, Jiasheng Tang, Fan Wang, Cewu Lu

In this work, we present NIKI (Neural Inverse Kinematics with Invertible Neural Network), which models bi-directional errors to improve the robustness to occlusions and obtain pixel-aligned accuracy.

Ranked #1 on 3D Human Pose Estimation on AGORA

3D human pose and shape estimation



HybrIK-X: Hybrid Analytical-Neural Inverse Kinematics for Whole-body Mesh Recovery

1 code implementation • 12 Apr 2023 • Jiefeng Li, Siyuan Bian, Chao Xu, Zhicun Chen, Lixin Yang, Cewu Lu

To address these issues, this paper presents a novel hybrid inverse kinematics solution, HybrIK, that integrates the merits of 3D keypoint estimation and body mesh recovery in a unified framework.

Ranked #1 on 3D Human Reconstruction on AGORA

3D Human Pose Estimation 3D Human Reconstruction +1



POEM: Reconstructing Hand in a Point Embedded Multi-view Stereo

1 code implementation • CVPR 2023 • Lixin Yang, Jian Xu, Licheng Zhong, Xinyu Zhan, Zhicheng Wang, Kejian Wu, Cewu Lu

Enabling neural networks to capture 3D geometry-aware features is essential in multi-view based vision tasks.


From Isolated Islands to Pangea: Unifying Semantic Space for Human Action Understanding

no code implementations • 2 Apr 2023 • Yong-Lu Li, Xiaoqian Wu, Xinpeng Liu, Zehao Wang, Yiming Dou, Yikun Ji, Junyi Zhang, Yixing Li, Jingru Tan, Xudong Lu, Cewu Lu

By aligning the classes of previous datasets to our semantic space, we gather (image/video/skeleton/MoCap) datasets into a unified database in a unified label system, i.e., bridging "isolated islands" into a "Pangea".

Action Understanding Transfer Learning


Visual-Tactile Sensing for In-Hand Object Reconstruction

no code implementations • CVPR 2023 • Wenqiang Xu, Zhenjun Yu, Han Xue, Ruolin Ye, Siqiong Yao, Cewu Lu

We propose a simulation environment, VT-Sim, which supports generating hand-object interaction for both rigid and deformable objects.

Object Object Reconstruction


GarmentTracking: Category-Level Garment Pose Tracking

1 code implementation • CVPR 2023 • Han Xue, Wenqiang Xu, Jieyi Zhang, Tutian Tang, Yutong Li, Wenxin Du, Ruolin Ye, Cewu Lu

In this work, we present a complete package to address the category-level garment pose tracking task: (1) A recording system VR-Garment, with which users can manipulate virtual garment models in simulation through a VR interface.

Pose Tracking


Upcycling Models under Domain and Category Shift

3 code implementations • CVPR 2023 • Sanqing Qu, Tianpei Zou, Florian Roehrbein, Cewu Lu, Guang Chen, DaCheng Tao, Changjun Jiang

We examine the superiority of our GLC on multiple benchmarks with different category shift scenarios, including partial-set, open-set, and open-partial-set DA.

Ranked #2 on Universal Domain Adaptation on VisDA2017

Clustering Source-Free Domain Adaptation +2


CRIN: Rotation-Invariant Point Cloud Analysis and Rotation Estimation via Centrifugal Reference Frame

1 code implementation • 6 Mar 2023 • Yujing Lou, Zelin Ye, Yang You, Nianjuan Jiang, Jiangbo Lu, Weiming Wang, Lizhuang Ma, Cewu Lu

CRIN directly takes the coordinates of points as input and transforms local points into rotation-invariant representations via centrifugal reference frames.


Unsupervised 3D Point Cloud Representation Learning by Triangle Constrained Contrast for Autonomous Driving

no code implementations • CVPR 2023 • Bo Pang, Hongchi Xia, Cewu Lu

In this paper, we design the Triangle Constrained Contrast (TriCC) framework, tailored for autonomous driving scenes, which learns 3D unsupervised representations through both the multimodal information and the dynamics of temporal sequences.

Autonomous Driving Representation Learning +2


ClothPose: A Real-world Benchmark for Visual Analysis of Garment Pose via An Indirect Recording Solution

no code implementations • ICCV 2023 • Wenqiang Xu, Wenxin Du, Han Xue, Yutong Li, Ruolin Ye, Yan-Feng Wang, Cewu Lu

In this work, we propose a recording system, GarmentTwin, which can track garment poses in dynamic settings such as manipulation.

2k Pose Estimation


Stimulus Verification Is a Universal and Effective Sampler in Multi-Modal Human Trajectory Prediction

no code implementations • CVPR 2023 • Jianhua Sun, YuXuan Li, Liang Chai, Cewu Lu

To comprehensively cover the uncertainty of the future, the common practice of multi-modal human trajectory prediction is to first generate a set/distribution of candidate future trajectories and then sample required numbers of trajectories from them as final predictions.

Trajectory Prediction


Target-Referenced Reactive Grasping for Dynamic Objects

no code implementations • CVPR 2023 • Jirong Liu, Ruo Zhang, Hao-Shu Fang, Minghao Gou, Hongjie Fang, Chenxi Wang, Sheng Xu, Hengxu Yan, Cewu Lu

Reactive grasping, which enables the robot to successfully grasp dynamic moving objects, is of great interest in robotics.


Beyond Object Recognition: A New Benchmark towards Object Concept Learning

no code implementations • ICCV 2023 • Yong-Lu Li, Yue Xu, Xinyu Xu, Xiaohan Mao, Yuan YAO, SiQi Liu, Cewu Lu

To support OCL, we build a densely annotated knowledge base including extensive labels for three levels of object concept (category, attribute, affordance), and the causal relations of three levels.

Attribute Object +1


One-Shot General Object Localization

1 code implementation • 24 Nov 2022 • Yang You, Zhuochen Miao, Kai Xiong, Weiming Wang, Cewu Lu

In contrast, our proposed OneLoc algorithm efficiently finds the object center and bounding box size by a special voting scheme.

Object Object Localization


CPPF++: Uncertainty-Aware Sim2Real Object Pose Estimation by Vote Aggregation

2 code implementations • 24 Nov 2022 • Yang You, Wenhao He, Jin Liu, Hongkai Xiong, Weiming Wang, Cewu Lu

We introduce a novel method, CPPF++, designed for sim-to-real pose estimation.

Pose Estimation


Discovering A Variety of Objects in Spatio-Temporal Human-Object Interactions

1 code implementation • 14 Nov 2022 • Yong-Lu Li, Hongwei Fan, Zuoyu Qiu, Yiming Dou, Liang Xu, Hao-Shu Fang, Peiyang Guo, Haisheng Su, Dongliang Wang, Wei Wu, Cewu Lu

In daily HOIs, humans often interact with a variety of objects, e.g., holding and touching dozens of household items in cleaning.

Human-Object Interaction Detection Object +3


AlphaPose: Whole-Body Regional Multi-Person Pose Estimation and Tracking in Real-Time

7 code implementations • 7 Nov 2022 • Hao-Shu Fang, Jiefeng Li, Hongyang Tang, Chao Xu, Haoyi Zhu, Yuliang Xiu, Yong-Lu Li, Cewu Lu

Accurate whole-body multi-person pose estimation and tracking is an important yet challenging topic in computer vision.

Knowledge Distillation Multi-Person Pose Estimation +1



SAM-RL: Sensing-Aware Model-Based Reinforcement Learning via Differentiable Physics-Based Simulation and Rendering

no code implementations • 27 Oct 2022 • Jun Lv, Yunhai Feng, Cheng Zhang, Shuang Zhao, Lin Shao, Cewu Lu

Model-based reinforcement learning (MBRL) is recognized as having the potential to be significantly more sample-efficient than model-free RL.

Deformable Object Manipulation Model-based Reinforcement Learning +2


Neural Eigenfunctions Are Structured Representation Learners

1 code implementation • 23 Oct 2022 • Zhijie Deng, Jiaxin Shi, Hao Zhang, Peng Cui, Cewu Lu, Jun Zhu

Unlike prior spectral methods such as Laplacian Eigenmap that operate in a nonparametric manner, Neural Eigenmap leverages NeuralEF to parametrically model eigenfunctions using a neural network.

Contrastive Learning Data Augmentation +7


DART: Articulated Hand Model with Diverse Accessories and Rich Textures

1 code implementation • 14 Oct 2022 • Daiheng Gao, Yuliang Xiu, Kailin Li, Lixin Yang, Feng Wang, Peng Zhang, Bang Zhang, Cewu Lu, Ping Tan

Unity GUI is also provided to generate synthetic hand data with user-defined settings, e.g., pose, camera, background, lighting, textures, and accessories.

Hand Pose Estimation Unity



X-NeRF: Explicit Neural Radiance Field for Multi-Scene 360$^{\circ}$ Insufficient RGB-D Views

1 code implementation • 11 Oct 2022 • Haoyi Zhu, Hao-Shu Fang, Cewu Lu

In this paper, we focus on a rarely discussed but important setting: can we train one model that can represent multiple scenes, with 360$^\circ$ insufficient views and RGB-D images?

Novel View Synthesis


D&D: Learning Human Dynamics from Dynamic Camera

1 code implementation • 19 Sep 2022 • Jiefeng Li, Siyuan Bian, Chao Xu, Gang Liu, Gang Yu, Cewu Lu

In this work, we present D&D (Learning Human Dynamics from Dynamic Camera), which leverages the laws of physics to reconstruct 3D human motion from the in-the-wild videos with a moving camera.

3D Human Pose Estimation Human Dynamics



Constructing Balance from Imbalance for Long-tailed Image Recognition

1 code implementation • 4 Aug 2022 • Yue Xu, Yong-Lu Li, Jiefeng Li, Cewu Lu

Previous methods tackle data imbalance from the viewpoints of data distribution, feature space, model design, etc.


Mining Cross-Person Cues for Body-Part Interactiveness Learning in HOI Detection

1 code implementation • 28 Jul 2022 • Xiaoqian Wu, Yong-Lu Li, Xinpeng Liu, Junyi Zhang, Yuzhe Wu, Cewu Lu

Though significant progress has been made, interactiveness learning remains a challenging problem in HOI detection: existing methods usually generate redundant negative H-O pair proposals and fail to effectively extract interactive pairs.

Ranked #9 on Human-Object Interaction Detection on V-COCO

Human-Object Interaction Detection


Unsupervised Visual Representation Learning by Synchronous Momentum Grouping

1 code implementation • 13 Jul 2022 • Bo Pang, Yifan Zhang, Yaoyi Li, Jia Cai, Cewu Lu

In this paper, we propose a genuine group-level contrastive visual representation learning method whose linear evaluation performance on ImageNet surpasses the vanilla supervised learning.

Ranked #38 on Self-Supervised Image Classification on ImageNet

Clustering Contrastive Learning +2



Unseen Object 6D Pose Estimation: A Benchmark and Baselines

no code implementations • 23 Jun 2022 • Minghao Gou, Haolin Pan, Hao-Shu Fang, Ziyuan Liu, Cewu Lu, Ping Tan

In this paper, we propose a new task that enables and facilitates algorithms to estimate the 6D pose of novel objects during testing.

6D Pose Estimation


Interactiveness Field in Human-Object Interactions

1 code implementation • CVPR 2022 • Xinpeng Liu, Yong-Lu Li, Xiaoqian Wu, Yu-Wing Tai, Cewu Lu, Chi-Keung Tang

Human-Object Interaction (HOI) detection plays a core role in activity understanding.

Human-Object Interaction Detection Object


Learning to Anticipate Future with Dynamic Context Removal

1 code implementation • CVPR 2022 • Xinyu Xu, Yong-Lu Li, Cewu Lu

Anticipating future events is an essential feature for intelligent systems and embodied AI.


OakInk: A Large-scale Knowledge Repository for Understanding Hand-Object Interaction

1 code implementation • CVPR 2022 • Lixin Yang, Kailin Li, Xinyu Zhan, Fei Wu, Anran Xu, Liu Liu, Cewu Lu

We start by collecting 1,800 common household objects and annotating their affordances to construct the first knowledge base: Oak.

Grasp Generation Object +1


Semantic Segmentation by Early Region Proxy

1 code implementation • CVPR 2022 • Yifan Zhang, Bo Pang, Cewu Lu

Typical vision backbones manipulate structured features.

Segmentation Semantic Segmentation


CPPF: Towards Robust Category-Level 9D Pose Estimation in the Wild

1 code implementation • CVPR 2022 • Yang You, Ruoxi Shi, Weiming Wang, Cewu Lu

Drawing inspiration from traditional point pair features (PPFs), in this paper, we design a novel Category-level PPF (CPPF) voting method to achieve accurate, robust and generalizable 9D pose estimation in the wild.

Ranked #8 on 6D Pose Estimation using RGBD on REAL275

6D Pose Estimation using RGBD


Highlighting Object Category Immunity for the Generalization of Human-Object Interaction Detection

1 code implementation • 19 Feb 2022 • Xinpeng Liu, Yong-Lu Li, Cewu Lu

To achieve OC-immunity, we propose an OC-immune network that decouples the inputs from OC, extracts OC-immune representations, and leverages uncertainty quantification to generalize to unseen objects.

Human-Object Interaction Detection Object +1


AKB-48: A Real-World Articulated Object Knowledge Base

no code implementations • CVPR 2022 • Liu Liu, Wenqiang Xu, Haoyuan Fu, Sucheng Qian, Yang Han, Cewu Lu

To bridge the gap, we present AKB-48: a large-scale Articulated object Knowledge Base which consists of 2,037 real-world 3D articulated object models of 48 categories.

Object Object Reconstruction +1


TransCG: A Large-Scale Real-World Dataset for Transparent Object Depth Completion and a Grasping Baseline

1 code implementation • 17 Feb 2022 • Hongjie Fang, Hao-Shu Fang, Sheng Xu, Cewu Lu

However, the majority of current grasping algorithms would fail in this case since they heavily rely on the depth image, while ordinary depth sensors usually fail to produce accurate depth information for transparent objects owing to the reflection and refraction of light.

Ranked #1 on Transparent Object Depth Estimation on TransCG

Depth Completion Robotic Grasping +2


HAKE: A Knowledge Engine Foundation for Human Activity Understanding

3 code implementations • 14 Feb 2022 • Yong-Lu Li, Xinpeng Liu, Xiaoqian Wu, Yizhuo Li, Zuoyu Qiu, Liang Xu, Yue Xu, Hao-Shu Fang, Cewu Lu

Human activity understanding is of widespread interest in artificial intelligence and spans diverse applications like health care and behavior analysis.

Action Recognition Human-Object Interaction Detection +2



Human Trajectory Prediction With Momentary Observation

no code implementations • CVPR 2022 • Jianhua Sun, YuXuan Li, Liang Chai, Hao-Shu Fang, Yong-Lu Li, Cewu Lu

The human trajectory prediction task aims to analyze future human movements given their past status, which is a crucial step for many autonomous systems such as self-driving cars and social robots.

Self-Driving Cars Trajectory Prediction


iSeg3D: An Interactive 3D Shape Segmentation Tool

no code implementations • 24 Dec 2021 • Sucheng Qian, Liu Liu, Wenqiang Xu, Cewu Lu

It can obtain a satisfactory segmentation result with minimal human clicks (< 10).

Segmentation


OMAD: Object Model with Articulated Deformations for Pose Estimation and Retrieval

no code implementations • 14 Dec 2021 • Han Xue, Liu Liu, Wenqiang Xu, Haoyuan Fu, Cewu Lu

With the full representation of the object shape and joint states, we can address several tasks including category-level object pose estimation and the articulated object retrieval.

Object Pose Estimation +1


Regularity Learning via Explicit Distribution Modeling for Skeletal Video Anomaly Detection

1 code implementation • 7 Dec 2021 • Shoubin Yu, Zhongyin Zhao, Haoshu Fang, Andong Deng, Haisheng Su, Dongliang Wang, Weihao Gan, Cewu Lu, Wei Wu

Different from pixel-based anomaly detection methods, pose-based methods utilize highly-structured skeleton data, which decreases the computational burden and also avoids the negative impact of background noise.

Anomaly Detection In Surveillance Videos Optical Flow Estimation +1


SAGCI-System: Towards Sample-Efficient, Generalizable, Compositional, and Incremental Robot Learning

no code implementations • 29 Nov 2021 • Jun Lv, Qiaojun Yu, Lin Shao, Wenhai Liu, Wenqiang Xu, Cewu Lu

We apply our system to perform articulated object manipulation tasks, both in the simulation and the real world.


Understanding Pixel-level 2D Image Semantics with 3D Keypoint Knowledge Engine

no code implementations • 21 Nov 2021 • Yang You, Chengkun Li, Yujing Lou, Zhoujun Cheng, Liangwei Li, Lizhuang Ma, Weiming Wang, Cewu Lu

Pixel-level 2D object semantic understanding is an important topic in computer vision and could help machines deeply understand objects (e.g., functionality and affordance) in our daily life.


Skeleton-Based Mutually Assisted Interacted Object Localization and Human Action Recognition

no code implementations • 28 Oct 2021 • Liang Xu, Cuiling Lan, Wenjun Zeng, Cewu Lu

Skeleton data carries valuable motion information and is widely explored in human action recognition.

Action Recognition Object +2


Localization with Sampling-Argmax

1 code implementation • NeurIPS 2021 • Jiefeng Li, Tong Chen, Ruiqi Shi, Yujing Lou, Yong-Lu Li, Cewu Lu

In this work, we propose sampling-argmax, a differentiable training method that imposes implicit constraints on the shape of the probability map by minimizing the expectation of the localization error.

Ranked #158 on 3D Human Pose Estimation on Human3.6M

3D Human Pose Estimation


Learning Single/Multi-Attribute of Object with Symmetry and Group

1 code implementation • 9 Oct 2021 • Yong-Lu Li, Yue Xu, Xinyu Xu, Xiaohan Mao, Cewu Lu

To model the compositional nature of these concepts, it is a good choice to learn them as transformations, e.g., coupling and decoupling.

Attribute Compositional Zero-Shot Learning


ArtiBoost: Boosting Articulated 3D Hand-Object Pose Estimation via Online Exploration and Synthesis

2 code implementations • CVPR 2022 • Kailin Li, Lixin Yang, Xinyu Zhan, Jun Lv, Wenqiang Xu, Jiefeng Li, Cewu Lu

In contrast, data synthesis can easily ensure those diversities separately.

Ranked #3 on hand-object pose on HO-3D (using extra training data)

hand-object pose Object +2



Human Pose Regression with Residual Log-likelihood Estimation

3 code implementations • ICCV 2021 • Jiefeng Li, Siyuan Bian, Ailing Zeng, Can Wang, Bo Pang, Wentao Liu, Cewu Lu

In light of this, we propose a novel regression paradigm with Residual Log-likelihood Estimation (RLE) to capture the underlying output distribution.

Ranked #59 on 3D Human Pose Estimation on Human3.6M

3D Human Pose Estimation Multi-Person Pose Estimation +1



ContourRender: Detecting Arbitrary Contour Shape For Instance Segmentation In One Pass

no code implementations • 7 Jun 2021 • Tutian Tang, Wenqiang Xu, Ruolin Ye, Yan-Feng Wang, Cewu Lu

In addition, we specifically select a subset from COCO val2017 named COCO ContourHard-val to further demonstrate the contour quality improvements.

Instance Segmentation Semantic Segmentation


Towards Real-World Category-level Articulation Pose Estimation

no code implementations • 7 May 2021 • Liu Liu, Han Xue, Wenqiang Xu, Haoyuan Fu, Cewu Lu

This setting allows varied kinematic structures within a semantic category, and multiple instances to co-exist in an observation of the real world.

Mixed Reality Pose Estimation

Paper
Add Code

H2O: A Benchmark for Visual Human-human Object Handover Analysis

no code implementations • ICCV 2021 • Ruolin Ye, Wenqiang Xu, Zhendong Xue, Tutian Tang, Yanfeng Wang, Cewu Lu

Besides, we also report the hand and object pose errors with existing baselines and show that the dataset can serve as the video demonstrations for robot imitation learning on the handover task.

Imitation Learning Object


Skimming and Scanning for Untrimmed Video Action Recognition

no code implementations • 21 Apr 2021 • Yunyan Hong, Ailing Zeng, Min Li, Cewu Lu, Li Jiang, Qiang Xu

Video action recognition (VAR) is a primary task of video understanding, and untrimmed videos are more common in real-life scenes.

Action Recognition Temporal Action Localization +1


SuctionNet-1Billion: A Large-Scale Benchmark for Suction Grasping

no code implementations • 23 Mar 2021 • Hanwen Cao, Hao-Shu Fang, Wenhai Liu, Cewu Lu

Meanwhile, we propose a method to predict numerous suction poses from an RGB-D image of a cluttered scene and demonstrate our superiority against several previous methods.

Robotic Grasping


PGT: A Progressive Method for Training Models on Long Videos

1 code implementation • CVPR 2021 • Bo Pang, Gao Peng, Yizhuo Li, Cewu Lu

This progressive training (PGT) method is able to train on long videos end-to-end with limited resources and ensures the effective transmission of information.


Skeleton Merger: an Unsupervised Aligned Keypoint Detector

1 code implementation • CVPR 2021 • Ruoxi Shi, Zhengrong Xue, Yang You, Cewu Lu

In this paper, we propose an unsupervised aligned keypoint detector, Skeleton Merger, which utilizes skeletons to reconstruct objects.

Object Tracking Retrieval


Three Steps to Multimodal Trajectory Prediction: Modality Clustering, Classification and Synthesis

1 code implementation • ICCV 2021 • Jianhua Sun, YuXuan Li, Hao-Shu Fang, Cewu Lu

Multimodal prediction results are essential for the trajectory prediction task as there is no single correct answer for the future.

Clustering General Classification +1


RGB Matters: Learning 7-DoF Grasp Poses on Monocular RGBD Images

1 code implementation • 3 Mar 2021 • Minghao Gou, Hao-Shu Fang, Zhanda Zhu, Sheng Xu, Chenxi Wang, Cewu Lu

In the first stage, an encoder-decoder-like convolutional neural network, Angle-View Net (AVN), is proposed to predict the SO(3) orientation of the gripper at every location of the image.


PRIN/SPRIN: On Extracting Point-wise Rotation Invariant Features

2 code implementations • 24 Feb 2021 • Yang You, Yujing Lou, Ruoxi Shi, Qi Liu, Yu-Wing Tai, Lizhuang Ma, Weiming Wang, Cewu Lu

Spherical Voxel Convolution and Point Re-sampling are proposed to extract rotation invariant features for each point.

3D Feature Matching Data Augmentation

Paper
Code

HandTailor: Towards High-Precision Monocular 3D Hand Recovery

2 code implementations • 18 Feb 2021 • Jun Lv, Wenqiang Xu, Lixin Yang, Sucheng Qian, Chongzhao Mao, Cewu Lu

3D hand pose estimation and shape recovery are challenging tasks in computer vision.

3D Hand Pose Estimation Vocal Bursts Intensity Prediction

131

Paper
Code

Transferable Interactiveness Knowledge for Human-Object Interaction Detection

1 code implementation • 25 Jan 2021 • Yong-Lu Li, Xinpeng Liu, Xiaoqian Wu, Xijie Huang, Liang Xu, Cewu Lu

Human-Object Interaction (HOI) detection is an important problem to understand how humans interact with objects.

Ranked #28 on Human-Object Interaction Detection on V-COCO

Human-Object Interaction Detection Object

228

Paper
Code

Graspness Discovery in Clutters for Fast and Accurate Grasp Detection

1 code implementation • ICCV 2021 • Chenxi Wang, Hao-Shu Fang, Minghao Gou, Hongjie Fang, Jin Gao, Cewu Lu

To quickly detect graspness in practice, we develop a neural network named graspness model to approximate the searching process.

Ranked #3 on Robotic Grasping on GraspNet-1Billion

Robotic Grasping

101

Paper
Code

TDAF: Top-Down Attention Framework for Vision Tasks

no code implementations • 14 Dec 2020 • Bo Pang, Yizhuo Li, Jiefeng Li, Muchen Li, Hanwen Cao, Cewu Lu

Such spatial and attention features are nested deeply; therefore, the proposed framework works in a mixed top-down and bottom-up manner.

Action Recognition object-detection +2

Paper
Add Code

Learning Universal Shape Dictionary for Realtime Instance Segmentation

1 code implementation • 2 Dec 2020 • Tutian Tang, Wenqiang Xu, Ruolin Ye, Lixin Yang, Cewu Lu

First, it learns a dictionary from a large collection of shape datasets, so that any shape can be decomposed into a linear combination through the dictionary.

Explainable Models Instance Segmentation +3

Paper
Code

CPF: Learning a Contact Potential Field to Model the Hand-Object Interaction

1 code implementation • ICCV 2021 • Lixin Yang, Xinyu Zhan, Kailin Li, Wenqiang Xu, Jiefeng Li, Cewu Lu

In this paper, we present an explicit contact representation, namely Contact Potential Field (CPF), and a learning-fitting hybrid framework, namely MIHO, for Modeling the Interaction of Hand and Object.

Object Pose Estimation

115

Paper
Code

HybrIK: A Hybrid Analytical-Neural Inverse Kinematics Solution for 3D Human Pose and Shape Estimation

3 code implementations • CVPR 2021 • Jiefeng Li, Chao Xu, Zhicun Chen, Siyuan Bian, Lixin Yang, Cewu Lu

We show that HybrIK preserves both the accuracy of 3D pose and the realistic body structure of the parametric human model, leading to a pixel-aligned 3D body mesh and a more accurate 3D pose than the pure 3D keypoint estimation methods.

Ranked #2 on 3D Human Pose Estimation on EMDB

3D human pose and shape estimation Keypoint Estimation

1,117

Paper
Code

Canonical Voting: Towards Robust Oriented Bounding Box Detection in 3D Scenes

1 code implementation • CVPR 2022 • Yang You, Zelin Ye, Yujing Lou, Chengkun Li, Yong-Lu Li, Lizhuang Ma, Weiming Wang, Cewu Lu

In this work, we disentangle the direct offset into Local Canonical Coordinates (LCC), box scales and box orientations.

3D Object Detection object-detection

Paper
Code

UKPGAN: A General Self-Supervised Keypoint Detector

1 code implementation • CVPR 2022 • Yang You, Wenhai Liu, Yanjie Ze, Yong-Lu Li, Weiming Wang, Cewu Lu

Keypoint detection is an essential component of object registration and alignment.

Keypoint Detection Object

Paper
Code

HOI Analysis: Integrating and Decomposing Human-Object Interaction

2 code implementations • NeurIPS 2020 • Yong-Lu Li, Xinpeng Liu, Xiaoqian Wu, Yizhuo Li, Cewu Lu

Meanwhile, isolated human and object can also be integrated into coherent HOI again.

Ranked #20 on Human-Object Interaction Detection on V-COCO

Human-Object Interaction Detection Object

214

Paper
Code

DIRV: Dense Interaction Region Voting for End-to-End Human-Object Interaction Detection

1 code implementation • 2 Oct 2020 • Hao-Shu Fang, Yichen Xie, Dian Shao, Cewu Lu

On the other hand, existing one-stage methods mainly focus on the union regions of interactions, which introduce unnecessary visual information as disturbances to HOI detection.

Ranked #15 on Human-Object Interaction Detection on V-COCO

Human-Object Interaction Detection

Paper
Code

DecAug: Augmenting HOI Detection via Decomposition

no code implementations • 2 Oct 2020 • Yichen Xie, Hao-Shu Fang, Dian Shao, Yong-Lu Li, Cewu Lu

Human-object interaction (HOI) detection requires a large amount of annotated data.

Ranked #68 on Domain Generalization on PACS

Data Augmentation Domain Generalization +2

Paper
Add Code

BiHand: Recovering Hand Mesh with Multi-stage Bisected Hourglass Networks

1 code implementation • 12 Aug 2020 • Lixin Yang, Jiasen Li, Wenqiang Xu, Yiqun Diao, Cewu Lu

Inside each stage, BiHand adopts a novel bisecting design which allows the networks to encapsulate two closely related pieces of information (e.g., 2D keypoints and silhouette in the 2D seeding stage, 3D joints and depth map in the 3D lifting stage, joint rotations and shape parameters in the mesh generation stage) in a single forward pass.

Pose Tracking

Paper
Code

ASAP-Net: Attention and Structure Aware Point Cloud Sequence Segmentation

1 code implementation • 12 Aug 2020 • Hanwen Cao, Yongyi Lu, Cewu Lu, Bo Pang, Gongshen Liu, Alan Yuille

In this paper, we further improve spatio-temporal point cloud feature learning with a flexible module called ASAP that considers both attention and structure information across frames, which we find to be two important factors for successful segmentation in dynamic point clouds.

Segmentation

Paper
Code

HMOR: Hierarchical Multi-Person Ordinal Relations for Monocular Multi-Person 3D Pose Estimation

no code implementations • ECCV 2020 • Jiefeng Li, Can Wang, Wentao Liu, Chen Qian, Cewu Lu

The HMOR encodes interaction information as the ordinal relations of depths and angles hierarchically, which captures the body-part and joint level semantic and maintains global consistency at the same time.

Ranked #6 on 3D Multi-Person Pose Estimation (absolute) on MuPoTS-3D

3D Multi-Person Pose Estimation (absolute) 3D Multi-Person Pose Estimation (root-relative) +2

Paper
Add Code

Approximated Bilinear Modules for Temporal Modeling

1 code implementation • ICCV 2019 • Xinqi Zhu, Chang Xu, Langwen Hui, Cewu Lu, DaCheng Tao

Specifically, we show how two-layer subnets in CNNs can be converted to temporal bilinear modules by adding an auxiliary-branch.

Action Recognition Video Classification

Paper
Code

TubeTK: Adopting Tubes to Track Multi-Object in a One-Step Training Model

1 code implementation • CVPR 2020 • Bo Pang, Yizhuo Li, Yifan Zhang, Muchen Li, Cewu Lu

As deep learning brings excellent performances to object detection algorithms, Tracking by Detection (TBD) has become the mainstream tracking framework.

Multi-Object Tracking Object +2

137

Paper
Code

Complex Sequential Understanding through the Awareness of Spatial and Temporal Concepts

1 code implementation • 30 May 2020 • Bo Pang, Kaiwen Zha, Hanwen Cao, Jiajun Tang, Minghui Yu, Cewu Lu

Understanding sequential information is a fundamental task for artificial intelligence.

Action Recognition Temporal Action Localization

Paper
Code

NTIRE 2020 Challenge on Video Quality Mapping: Methods and Results

no code implementations • 5 May 2020 • Dario Fuoli, Zhiwu Huang, Martin Danelljan, Radu Timofte, Hua Wang, Longcun Jin, Dewei Su, Jing Liu, Jaehoon Lee, Michal Kudelski, Lukasz Bala, Dmitry Hrybov, Marcin Mozejko, Muchen Li, Si-Yao Li, Bo Pang, Cewu Lu, Chao Li, Dongliang He, Fu Li, Shilei Wen

For track 2, some existing methods are evaluated, showing promising solutions to the weakly-supervised video quality mapping problem.

Paper
Add Code

Transferable Active Grasping and Real Embodied Dataset

1 code implementation • 28 Apr 2020 • Xiangyu Chen, Zelin Ye, Jiankai Sun, Yuda Fan, Fang Hu, Chenxi Wang, Cewu Lu

Grasping in cluttered scenes is challenging for robot vision systems, as detection accuracy can be hindered by partial occlusion of objects.

Reinforcement Learning (RL)

Paper
Code

Recursive Social Behavior Graph for Trajectory Prediction

no code implementations • CVPR 2020 • Jianhua Sun, Qinhong Jiang, Cewu Lu

Social interaction is an important topic in human trajectory prediction to generate plausible paths.

Trajectory Prediction

Paper
Add Code

Semantic Correspondence via 2D-3D-2D Cycle

1 code implementation • 20 Apr 2020 • Yang You, Chengkun Li, Yujing Lou, Zhoujun Cheng, Lizhuang Ma, Cewu Lu, Weiming Wang

Visual semantic correspondence is an important topic in computer vision and could help machines understand objects in our daily life.

Semantic correspondence

Paper
Code

Detailed 2D-3D Joint Representation for Human-Object Interaction

1 code implementation • CVPR 2020 • Yong-Lu Li, Xinpeng Liu, Han Lu, Shiyi Wang, Junqi Liu, Jiefeng Li, Cewu Lu

In light of these, we propose a detailed 2D-3D joint representation learning method.

Ranked #1 on Human-Object Interaction Detection on Ambiguous-HOI

Action Understanding Human-Object Interaction Detection +3

Paper
Code

Asynchronous Interaction Aggregation for Action Detection

2 code implementations • ECCV 2020 • Jiajun Tang, Jin Xia, Xinzhi Mu, Bo Pang, Cewu Lu

We propose the Asynchronous Interaction Aggregation network (AIA) that leverages different interactions to boost action detection.

Action Detection

387

Paper
Code

PaStaNet: Toward Human Activity Knowledge Engine

2 code implementations • CVPR 2020 • Yong-Lu Li, Liang Xu, Xinpeng Liu, Xijie Huang, Yue Xu, Shiyi Wang, Hao-Shu Fang, Ze Ma, Mingyang Chen, Cewu Lu

In light of this, we propose a new path: infer human part states first and then reason out the activities based on part-level semantics.

Ranked #3 on Human-Object Interaction Detection on HICO

Action Detection Human-Object Interaction Detection +1

217

Paper
Code

Symmetry and Group in Attribute-Object Compositions

1 code implementation • CVPR 2020 • Yong-Lu Li, Yue Xu, Xiaohan Mao, Cewu Lu

To model the compositional nature of these general concepts, it is a good choice to learn them through transformations, such as coupling and decoupling.

Ranked #1 on Compositional Zero-Shot Learning on MIT-States (Top-1 accuracy % metric)

Attribute Compositional Zero-Shot Learning +1

Paper
Code

KeypointNet: A Large-scale 3D Keypoint Dataset Aggregated from Numerous Human Annotations

1 code implementation • CVPR 2020 • Yang You, Yujing Lou, Chengkun Li, Zhoujun Cheng, Liangwei Li, Lizhuang Ma, Weiming Wang, Cewu Lu

Detecting 3D object keypoints is of great interest to the areas of both graphics and computer vision.

144

Paper
Code

Deep Variational Luenberger-type Observer for Stochastic Video Prediction

no code implementations • 12 Feb 2020 • Dong Wang, Feng Zhou, Zheng Yan, Guang Yao, Zongxuan Liu, Wennan Ma, Cewu Lu

Our model builds upon a variational encoder, which transforms the input video into a latent feature space, and a Luenberger-type observer, which captures the dynamic evolution of the latent features.

Representation Learning Video Prediction +1

Paper
Add Code

GraspNet: A Large-Scale Clustered and Densely Annotated Dataset for Object Grasping

no code implementations • 31 Dec 2019 • Hao-Shu Fang, Chenxi Wang, Minghao Gou, Cewu Lu

Object grasping is critical for many applications and is also a challenging computer vision problem.

Paper
Add Code

Human Correspondence Consensus for 3D Object Semantic Understanding

1 code implementation • ECCV 2020 • Yujing Lou, Yang You, Chengkun Li, Zhoujun Cheng, Liangwei Li, Lizhuang Ma, Weiming Wang, Cewu Lu

Semantic understanding of 3D objects is crucial in many applications such as object manipulation.

3D Feature Matching 3D Point Cloud Matching +1

Paper
Code

3D Objectness Estimation via Bottom-up Regret Grouping

no code implementations • 5 Dec 2019 • Zelin Ye, Yan Hao, Liang Xu, Rui Zhu, Cewu Lu

Further ablation study also demonstrates the effectiveness of our grouping predictor and regret mechanism.

Paper
Add Code

Transferable Force-Torque Dynamics Model for Peg-in-hole Task

no code implementations • 30 Nov 2019 • Junfeng Ding, Chen Wang, Cewu Lu

We present a learning-based force-torque dynamics model to achieve model-based control for the contact-rich peg-in-hole task using force-only inputs.

Model-based Reinforcement Learning Model Predictive Control

Paper
Add Code

Attribute Restoration Framework for Anomaly Detection

1 code implementation • 25 Nov 2019 • Chaoqin Huang, Fei Ye, Jinkun Cao, Maosen Li, Ya zhang, Cewu Lu

We here propose to break this equivalence by erasing selected attributes from the original data and reformulate it as a restoration task, where the normal and the anomalous data are expected to be distinguishable based on restoration errors.

Ranked #21 on Anomaly Detection on One-class CIFAR-10

Anomaly Detection Attribute +1

Paper
Code

6-PACK: Category-level 6D Pose Tracker with Anchor-Based Keypoints

2 code implementations • 23 Oct 2019 • Chen Wang, Roberto Martín-Martín, Danfei Xu, Jun Lv, Cewu Lu, Li Fei-Fei, Silvio Savarese, Yuke Zhu

We present 6-PACK, a deep learning approach to category-level 6D object pose tracking on RGB-D data.

Ranked #1 on 6D Pose Estimation using RGBD on REAL275 (Rerr metric)

6D Pose Estimation 6D Pose Estimation using RGBD +2

286

Paper
Code

RGB-D Individual Segmentation

no code implementations • 16 Oct 2019 • Wenqiang Xu, Yanjun Fu, Yuchen Luo, Chang Liu, Cewu Lu

The fine-grained recognition task deals with the sub-category classification problem, which is important for real-world applications.

CoLA Segmentation

Paper
Add Code

Template-Instance Loss for Offline Handwritten Chinese Character Recognition

no code implementations • 12 Oct 2019 • Yao Xiao, Dan Meng, Cewu Lu, Chi-Keung Tang

The long-standing challenges for offline handwritten Chinese character recognition (HCCR) are twofold: Chinese characters can be very diverse and complicated while similarly looking, and cursive handwriting (due to increased writing speed and infrequent pen lifting) makes strokes and even characters connected together in a flowing manner.

Offline Handwritten Chinese Character Recognition

Paper
Add Code

InstaBoost: Boosting Instance Segmentation via Probability Map Guided Copy-Pasting

3 code implementations • ICCV 2019 • Hao-Shu Fang, Jianhua Sun, Runzhong Wang, Minghao Gou, Yong-Lu Li, Cewu Lu

With the guidance of such a map, we boost the performance of R101-Mask R-CNN on instance segmentation from 35.7 mAP to 37.9 mAP without modifying the backbone or network structure.

Ranked #78 on Instance Segmentation on COCO test-dev

Data Augmentation Instance Segmentation +3

27,790

Paper
Code

Cross-Domain Adaptation for Animal Pose Estimation

no code implementations • ICCV 2019 • Jinkun Cao, Hongyang Tang, Hao-Shu Fang, Xiaoyong Shen, Cewu Lu, Yu-Wing Tai

Therefore, the easily available human pose dataset, which is of a much larger scale than our labeled animal dataset, provides important prior knowledge to boost up the performance on animal pose estimation.

Animal Pose Estimation Domain Adaptation

Paper
Add Code

Three Branches: Detecting Actions With Richer Features

no code implementations • 13 Aug 2019 • Jin Xia, Jiajun Tang, Cewu Lu

We present our three-branch solutions for the International Challenge on Activity Recognition at CVPR 2019.

Activity Recognition Spatio-Temporal Action Localization +1

Paper
Add Code

Explicit Shape Encoding for Real-Time Instance Segmentation

1 code implementation • ICCV 2019 • Wenqiang Xu, Haiyang Wang, Fubo Qi, Cewu Lu

In this paper, we propose a novel top-down instance segmentation framework based on explicit shape encoding, named ESE-Seg.

Ranked #3 on Semantic Contour Prediction on Sbd val

Object object-detection +4

106

Paper
Code

HAKE: Human Activity Knowledge Engine

4 code implementations • 13 Apr 2019 • Yong-Lu Li, Liang Xu, Xinpeng Liu, Xijie Huang, Yue Xu, Mingyang Chen, Ze Ma, Shiyi Wang, Hao-Shu Fang, Cewu Lu

To address these and promote the activity understanding, we build a large-scale Human Activity Knowledge Engine (HAKE) based on the human body part states.

Ranked #2 on Human-Object Interaction Detection on HICO (using extra training data)

Action Detection Human-Object Interaction Detection +1

217

Paper
Code

Combinational Q-Learning for Dou Di Zhu

1 code implementation • 24 Jan 2019 • Yang You, Liangwei Li, Baisong Guo, Weiming Wang, Cewu Lu

Deep reinforcement learning (DRL) has gained a lot of attention in recent years, and has been proven to be able to play Atari games and Go at or above human levels.

Atari Games Card Games +1

156

Paper
Code

DenseFusion: 6D Object Pose Estimation by Iterative Dense Fusion

8 code implementations • CVPR 2019 • Chen Wang, Danfei Xu, Yuke Zhu, Roberto Martín-Martín, Cewu Lu, Li Fei-Fei, Silvio Savarese

A key technical challenge in performing 6D object pose estimation from RGB-D images is to fully leverage the two complementary data sources.

Ranked #4 on 6D Pose Estimation on LineMOD

6D Pose Estimation 6D Pose Estimation using RGBD +1

1,041

Paper
Code

Estimating 6D Pose From Localizing Designated Surface Keypoints

6 code implementations • 4 Dec 2018 • Zelin Zhao, Gao Peng, Haoyu Wang, Hao-Shu Fang, Chengkun Li, Cewu Lu

In this paper, we present an accurate yet effective solution for 6D pose estimation from an RGB image.

Ranked #17 on 6D Pose Estimation using RGB on LineMOD

6D Pose Estimation 6D Pose Estimation using RGB +1

Paper
Code

CrowdPose: Efficient Crowded Scenes Pose Estimation and A New Benchmark

3 code implementations • CVPR 2019 • Jiefeng Li, Can Wang, Hao Zhu, Yihuan Mao, Hao-Shu Fang, Cewu Lu

In this paper, we propose a novel and efficient method to tackle the problem of pose estimation in the crowd and a new dataset to better evaluate algorithms.

Ranked #6 on Multi-Person Pose Estimation on OCHuman

Keypoint Detection Multi-Person Pose Estimation

4,995

Paper
Code

Deep RNN Framework for Visual Sequential Applications

1 code implementation • CVPR 2019 • Bo Pang, Kaiwen Zha, Hanwen Cao, Chen Shi, Cewu Lu

There are mainly two novel designs in our deep RNN framework: one is a new RNN module called Context Bridge Module (CBM) which splits the information flowing along the sequence (temporal direction) and along depth (spatial representation direction), making it easier to train when building deep by balancing these two directions; the other is the Overlap Coherence Training Scheme that reduces the training complexity for long visual sequential tasks on account of the limitation of computing resources.

Future prediction SSIM +1

Paper
Code

Pointwise Rotation-Invariant Network with Adaptive Sampling and 3D Spherical Voxel Convolution

1 code implementation • 23 Nov 2018 • Yang You, Yujing Lou, Qi Liu, Yu-Wing Tai, Lizhuang Ma, Cewu Lu, Weiming Wang

Point cloud analysis without pose priors is very challenging in real applications, as the orientations of point clouds are often unknown.

3D Feature Matching Data Augmentation

Paper
Code

Transferable Interactiveness Knowledge for Human-Object Interaction Detection

3 code implementations • CVPR 2019 • Yong-Lu Li, Siyuan Zhou, Xijie Huang, Liang Xu, Ze Ma, Hao-Shu Fang, Yan-Feng Wang, Cewu Lu

On account of the generalization of interactiveness, the interactiveness network is a transferable knowledge learner and can cooperate with any HOI detection model to achieve desirable results.

Ranked #29 on Human-Object Interaction Detection on V-COCO

Human-Object Interaction Detection Object

228

Paper
Code

NavigationNet: A Large-scale Interactive Indoor Navigation Dataset

no code implementations • 25 Aug 2018 • He Huang, Yujing Shen, Jiankai Sun, Cewu Lu

Indoor navigation aims at performing navigation within buildings.

reinforcement-learning Reinforcement Learning (RL) +1

Paper
Add Code

Pairwise Body-Part Attention for Recognizing Human-Object Interactions

1 code implementation • ECCV 2018 • Hao-Shu Fang, Jinkun Cao, Yu-Wing Tai, Cewu Lu

We propose a new pairwise body-part attention model which can learn to focus on crucial parts, and their correlations for HOI recognition.

Ranked #5 on Human-Object Interaction Detection on HICO

feature selection Human-Object Interaction Detection +1

Paper
Code

AXNet: ApproXimate computing using an end-to-end trainable neural network

2 code implementations • 27 Jul 2018 • Zhenghao Peng, Xuyang Chen, Chengwen Xu, Naifeng Jing, Xiaoyao Liang, Cewu Lu, Li Jiang

To guarantee the approximation quality, existing works deploy two neural networks (NNs), e.g., an approximator and a predictor.

Multi-Task Learning Philosophy

Paper
Code

PointSIFT: A SIFT-like Network Module for 3D Point Cloud Semantic Segmentation

4 code implementations • 2 Jul 2018 • Mingyang Jiang, Yiran Wu, Tianqi Zhao, Zelin Zhao, Cewu Lu

Recently, 3D understanding research sheds light on extracting features from point cloud directly, which requires effective shape pattern description of point clouds.

Point Cloud Segmentation Semantic Segmentation

636

Paper
Code

LiDAR-Video Driving Dataset: Learning Driving Policies Effectively

no code implementations • CVPR 2018 • Yiping Chen, Jingkang Wang, Jonathan Li, Cewu Lu, Zhipeng Luo, Han Xue, Cheng Wang

Learning autonomous-driving policies is one of the most challenging but promising tasks for computer vision.

Autonomous Driving

Paper
Add Code

Environment Upgrade Reinforcement Learning for Non-Differentiable Multi-Stage Pipelines

no code implementations • CVPR 2018 • Shuqin Xie, Zitian Chen, Chao Xu, Cewu Lu

We propose a training algorithm for this framework to address the different training demands of agent and environment.

Instance Segmentation Pose Estimation +3

Paper
Add Code

Weakly and Semi Supervised Human Body Part Parsing via Pose-Guided Knowledge Transfer

1 code implementation • CVPR 2018 • Hao-Shu Fang, Guansong Lu, Xiaolin Fang, Jianwen Xie, Yu-Wing Tai, Cewu Lu

In this paper, we present a novel method to generate synthetic human part segmentation data using easily-obtained human keypoint annotations.

Ranked #4 on Human Part Segmentation on PASCAL-Part (using extra training data)

Human Parsing Human Part Segmentation +3

300

Paper
Code

Recurrent Residual Module for Fast Inference in Videos

no code implementations • CVPR 2018 • Bowen Pan, Wuwei Lin, Xiaolin Fang, Chaoqin Huang, Bolei Zhou, Cewu Lu

Deep convolutional neural networks (CNNs) have made impressive progress in many video recognition tasks such as video pose estimation and video object detection.

object-detection Pose Estimation +2

Paper
Add Code

Human Action Adverb Recognition: ADHA Dataset and A Three-Stream Hybrid Model

no code implementations • 4 Feb 2018 • Bo Pang, Kaiwen Zha, Cewu Lu

We introduce the first benchmark for a new problem, recognizing human action adverbs (HAA): "Adverbs Describing Human Actions" (ADHA).

Action Recognition Image Captioning +1

Paper
Add Code

Pose Flow: Efficient Online Pose Tracking

1 code implementation • 3 Feb 2018 • Yuliang Xiu, Jiefeng Li, Haoyu Wang, Yinghong Fang, Cewu Lu

Multi-person articulated pose tracking in unconstrained videos is an important yet challenging problem.

Ranked #9 on Pose Tracking on PoseTrack2017 (using extra training data)

Pose Tracking

420

Paper
Code

Annotation-Free and One-Shot Learning for Instance Segmentation of Homogeneous Object Clusters

no code implementations • 1 Feb 2018 • Zheng Wu, Ruiheng Chang, Jiaxu Ma, Cewu Lu, Chi-Keung Tang

We propose a novel approach for instance segmentation given an image of a homogeneous object cluster (HOC).

Instance Segmentation One-Shot Learning +1

Paper
Add Code

SRDA: Generating Instance Segmentation Annotation Via Scanning, Reasoning And Domain Adaptation

1 code implementation • ECCV 2018 • Wenqiang Xu, Yonglu Li, Cewu Lu

Instance segmentation is a problem of significance in computer vision.

Domain Adaptation Instance Segmentation +2

Paper
Code

TRL: Discriminative Hints for Scalable Reverse Curriculum Learning

no code implementations • ICLR 2018 • Chen Wang, Xiangyu Chen, Zelin Ye, Jialu Wang, Ziruo Cai, Shixiang Gu, Cewu Lu

However, tasks with sparse rewards remain challenging when the state space is large.

Robot Manipulation

Paper
Add Code

Online Video Object Detection Using Association LSTM

no code implementations • ICCV 2017 • Yongyi Lu, Cewu Lu, Chi-Keung Tang

Video object detection is a fundamental tool for many applications.

Object object-detection +1

Paper
Add Code

Virtual to Real Reinforcement Learning for Autonomous Driving

6 code implementations • 13 Apr 2017 • Xinlei Pan, Yurong You, Ziyan Wang, Cewu Lu

To our knowledge, this is the first successful case of a driving policy trained by reinforcement learning that can adapt to real-world driving data.

Autonomous Driving Domain Adaptation +5

Paper
Code

Beyond Holistic Object Recognition: Enriching Image Understanding with Part States

no code implementations • CVPR 2018 • Cewu Lu, Hao Su, Yongyi Lu, Li Yi, Chi-Keung Tang, Leonidas Guibas

Important high-level vision tasks such as human-object interaction, image captioning and robotic manipulation require rich semantic descriptions of objects at part level.

Human-Object Interaction Detection Image Captioning +1

Paper
Add Code

RMPE: Regional Multi-person Pose Estimation

14 code implementations • ICCV 2017 • Hao-Shu Fang, Shuqin Xie, Yu-Wing Tai, Cewu Lu

In this paper, we propose a novel regional multi-person pose estimation (RMPE) framework to facilitate pose estimation in the presence of inaccurate human bounding boxes.

Ranked #1 on Pose Estimation on UAV-Human

2D Human Pose Estimation Human Detection +2

7,713

Paper
Code

Visual Relationship Detection with Language Priors

no code implementations • 31 Jul 2016 • Cewu Lu, Ranjay Krishna, Michael Bernstein, Li Fei-Fei

We improve on prior work by leveraging language priors from semantic word embeddings to finetune the likelihood of a predicted relationship.

Ranked #2 on Scene Graph Generation on VRD

Content-Based Image Retrieval Relationship Detection +3

Paper
Add Code

Contour Box: Rejecting Object Proposals Without Explicit Closed Contours

no code implementations • ICCV 2015 • Cewu Lu, Shu Liu, Jiaya Jia, Chi-Keung Tang

Closed contour is an important objectness indicator.

Object

Paper
Add Code

Square Localization for Efficient and Accurate Object Detection

no code implementations • ICCV 2015 • Cewu Lu, Yongyi Lu, Hao Chen, Chi-Keung Tang

In the testing phase, sliding CNN models are applied, which produce a set of response maps that can be effectively filtered by the learned co-presence prior to output the final bounding boxes for localizing an object.

Object object-detection +2

Paper
Add Code

Box Aggregation for Proposal Decimation: Last Mile of Object Detection

no code implementations • ICCV 2015 • Shu Liu, Cewu Lu, Jiaya Jia

Regions-with-convolutional-neural-network (RCNN) is now a commonly employed object detection pipeline.

Object object-detection +1

Paper
Add Code

Complexity-Adaptive Distance Metric for Object Proposals Generation

no code implementations • CVPR 2015 • Yao Xiao, Cewu Lu, Efstratios Tsougenis, Yongyi Lu, Chi-Keung Tang

Distance metric plays a key role in grouping superpixels to produce object proposals for object detection.

Object object-detection +2

Paper
Add Code

Deep LAC: Deep Localization, Alignment and Classification for Fine-Grained Recognition

no code implementations • CVPR 2015 • Di Lin, Xiaoyong Shen, Cewu Lu, Jiaya Jia

Our major contribution is to propose a valve linkage function(VLF) for back-propagation chaining and form our deep localization, alignment and classification (LAC) system.

Classification General Classification

Paper
Add Code

1-HKUST: Object Detection in ILSVRC 2014

no code implementations • 22 Sep 2014 • Cewu Lu, Hao Chen, Qifeng Chen, Hei Law, Yao Xiao, Chi-Keung Tang

We participated in the object detection track of ILSVRC 2014 and placed fourth among the 38 teams.

Object object-detection +3

Paper
Add Code

Two-Class Weather Classification

no code implementations • CVPR 2014 • Cewu Lu, Di Lin, Jiaya Jia, Chi-Keung Tang

Given a single outdoor image, this paper proposes a collaborative learning approach for labeling it as either sunny or cloudy.

Classification General Classification +1

Paper
Add Code

Range-Sample Depth Feature for Action Recognition

no code implementations • CVPR 2014 • Cewu Lu, Jiaya Jia, Chi-Keung Tang

We propose a binary range-sample feature in depth.

Action Recognition Temporal Action Localization

Paper
Add Code

L0 Regularized Stationary Time Estimation for Crowd Group Analysis

no code implementations • CVPR 2014 • Shuai Yi, Xiaogang Wang, Cewu Lu, Jiaya Jia

We tackle stationary crowd analysis in this paper, which is as important as modeling mobile groups in crowd scenes and finds many applications in surveillance.

Paper
Add Code

Learning Important Spatial Pooling Regions for Scene Classification

no code implementations • CVPR 2014 • Di Lin, Cewu Lu, Renjie Liao, Jiaya Jia

We address the false response influence problem when learning and applying discriminative parts to construct the mid-level representation in scene classification.

Classification General Classification +1

Paper
Add Code

Online Robust Dictionary Learning

no code implementations • CVPR 2013 • Cewu Lu, Jiaping Shi, Jiaya Jia

Online dictionary learning is particularly useful for processing large-scale and dynamic data in computer vision.

Dictionary Learning

Paper
Add Code
