SuctionNet-1Billion: A Large-Scale Benchmark for Suction Grasping

Hanwen Cao, Hao-Shu Fang, Wenhai Liu, Cewu Lu

We also propose a method to predict numerous suction poses from an RGB-D image of a cluttered scene and demonstrate its superiority over several previous methods.

Robotic Grasping

TDAF: Top-Down Attention Framework for Vision Tasks

Bo Pang, Yizhuo Li, Jiefeng Li, Muchen Li, Hanwen Cao, Cewu Lu

Such spatial and attention features are deeply nested; therefore, the proposed framework works in a mixed top-down and bottom-up manner.

Action Recognition, Object Detection

ASAP-Net: Attention and Structure Aware Point Cloud Sequence Segmentation

Hanwen Cao, Yongyi Lu, Cewu Lu, Bo Pang, Gongshen Liu, Alan Yuille

In this paper, we further improve spatio-temporal point cloud feature learning with a flexible module called ASAP, which considers both attention and structure information across frames. We find these to be two important factors for successful segmentation in dynamic point clouds.

Deep RNN Framework for Visual Sequential Applications

Bo Pang, Kaiwen Zha, Hanwen Cao, Chen Shi, Cewu Lu

Our deep RNN framework has two main novel designs. The first is a new RNN module called the Context Bridge Module (CBM), which splits the information flow along the sequence (temporal direction) and along depth (spatial representation direction); balancing these two directions makes deep models easier to train. The second is the Overlap Coherence Training Scheme, which reduces the training complexity of long visual sequential tasks under limited computing resources.

Future prediction, SSIM

