ECT: Fine-grained Edge Detection with Learned Cause Tokens

1 code implementation • 6 Aug 2023 • Shaocong Xu, Xiaoxue Chen, Yuhang Zheng, Guyue Zhou, Yurong Chen, Hongbin Zha, Hao Zhao

To address these three issues, we propose a two-stage transformer-based network sequentially predicting generic edges and fine-grained edges, which has a global receptive field thanks to the attention mechanism.

Edge Detection

STRAP: Structured Object Affordance Segmentation with Point Supervision

1 code implementation • 17 Apr 2023 • Leiyao Cui, Xiaoxue Chen, Hao Zhao, Guyue Zhou, Yixin Zhu

By label affinity, we refer to affordance segmentation as a multi-label prediction problem: A plate can be both holdable and containable.

Scene Understanding
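
The multi-label framing in the STRAP summary above can be illustrated with a minimal sketch. This is not the STRAP implementation; the label set, logits, and threshold are hypothetical values chosen for illustration. Each affordance gets its own independent sigmoid score, so one object can carry several labels at once, unlike a softmax that would force a single winner.

```python
import math

# Hypothetical affordance label set, chosen only for illustration.
AFFORDANCES = ["holdable", "containable", "sittable"]


def sigmoid(x: float) -> float:
    """Map a raw logit to an independent probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_affordances(logits, threshold=0.5):
    """Return every label whose own sigmoid score clears the threshold.

    Labels are scored independently, so zero, one, or several may be returned.
    """
    return [name for name, z in zip(AFFORDANCES, logits) if sigmoid(z) >= threshold]


# Made-up logits for a plate: high for holdable and containable, low for sittable.
plate_logits = [2.0, 1.5, -3.0]
plate_labels = predict_affordances(plate_logits)
print(plate_labels)
```

With these made-up logits, the plate receives both "holdable" and "containable", matching the example in the summary.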

DPF: Learning Dense Prediction Fields with Weak Supervision

1 code implementation • CVPR 2023 • Xiaoxue Chen, Yuhang Zheng, Yupeng Zheng, Qiang Zhou, Hao Zhao, Guyue Zhou, Ya-Qin Zhang

We showcase the effectiveness of DPFs using two substantially different tasks: high-level semantic parsing and low-level intrinsic image decomposition.

Intrinsic Image Decomposition, Scene Understanding

TOIST: Task Oriented Instance Segmentation Transformer with Noun-Pronoun Distillation

1 code implementation • 19 Oct 2022 • Pengfei Li, Beiwen Tian, Yongliang Shi, Xiaoxue Chen, Hao Zhao, Guyue Zhou, Ya-Qin Zhang

As such, we study the challenging problem of task-oriented detection, which aims to find the objects that best afford an action indicated by a verb phrase such as "sit comfortably on".

Instance Segmentation, Referring Expression

Understanding Embodied Reference with Touch-Line Transformer

1 code implementation • 11 Oct 2022 • Yang Li, Xiaoxue Chen, Hao Zhao, Jiangtao Gong, Guyue Zhou, Federico Rossano, Yixin Zhu

Human studies have revealed that objects referred to or pointed to do not lie on the elbow-wrist line, a common misconception; instead, they lie on the so-called virtual touch line.

Distance-Aware Occlusion Detection with Focused Attention

1 code implementation • 23 Aug 2022 • Yang Li, Yucheng Tu, Xiaoxue Chen, Hao Zhao, Guyue Zhou

In this work, (1) we propose a novel three-decoder architecture as the infrastructure for focused attention; (2) we use the generalized intersection box prediction task to effectively guide our model to focus on occlusion-specific regions; (3) our model achieves new state-of-the-art performance on distance-aware relationship detection.

Human-Object Interaction Detection, Relationship Detection

SNAKE: Shape-aware Neural 3D Keypoint Field

1 code implementation • 3 Jun 2022 • Chengliang Zhong, Peixing You, Xiaoxue Chen, Hao Zhao, Fuchun Sun, Guyue Zhou, Xiaodong Mu, Chuang Gan, Wenbing Huang

Detecting 3D keypoints from point clouds is important for shape reconstruction, while this work investigates the dual question: can shape reconstruction benefit 3D keypoint detection?

Keypoint Detection

Cerberus Transformer: Joint Semantic, Affordance and Attribute Parsing

1 code implementation • CVPR 2022 • Xiaoxue Chen, Tianyu Liu, Hao Zhao, Guyue Zhou, Ya-Qin Zhang

Multi-task indoor scene understanding is widely considered an intriguing formulation, as the affinity between different tasks may lead to improved performance.

Scene Understanding, Semantic Segmentation

PQ-Transformer: Jointly Parsing 3D Objects and Layouts from Point Clouds

1 code implementation • 12 Sep 2021 • Xiaoxue Chen, Hao Zhao, Guyue Zhou, Ya-Qin Zhang

Such a scheme has two limitations: 1) Storing and running several networks for different tasks is expensive for typical robotic platforms.

Object Detection

Text Recognition in the Wild: A Survey

1 code implementation • 7 May 2020 • Xiaoxue Chen, Lianwen Jin, Yuanzhi Zhu, Canjie Luo, Tianwei Wang

This paper aims to (1) summarize the fundamental problems and the state-of-the-art associated with scene text recognition; (2) introduce new insights and ideas; (3) provide a comprehensive review of publicly available resources; (4) point out directions for future work.

Scene Text Recognition

Decoupled Attention Network for Text Recognition

4 code implementations • 21 Dec 2019 • Tianwei Wang, Yuanzhi Zhu, Lianwen Jin, Canjie Luo, Xiaoxue Chen, Yaqiang Wu, Qianying Wang, Mingxiang Cai

To remedy this issue, we propose a decoupled attention network (DAN), which decouples the alignment operation from using historical decoding results.

Handwritten Text Recognition, Scene Text Recognition

Adaptive Embedding Gate for Attention-Based Scene Text Recognition

no code implementations • 26 Aug 2019 • Xiaoxue Chen, Tianwei Wang, Yuanzhi Zhu, Lianwen Jin, Canjie Luo

Scene text recognition has attracted particular research interest because it is a very challenging problem and has various applications.

Scene Text Recognition

