MV-DETR: Multi-modality indoor object detection by Multi-View DEtecton TRansformers

no code implementations • 13 Aug 2024 • ZiChao Dong, Yilin Zhang, Xufeng Huang, Hang Ji, Zhan Shi, Xin Zhan, Junbo Chen

Unfortunately, a single RGBD dataset with thousands of samples is not enough for training a discriminating filter for visual texture feature extraction.

3D Object Detection object-detection

Low-Resolution Self-Attention for Semantic Segmentation

1 code implementation • 8 Oct 2023 • Yu-Huan Wu, Shi-Chen Zhang, Yun Liu, Le Zhang, Xin Zhan, Daquan Zhou, Jiashi Feng, Ming-Ming Cheng, Liangli Zhen

Semantic segmentation tasks naturally require high-resolution information for pixel-wise segmentation and global context information for class prediction.

Decoder Segmentation +1

Construction of Paired Knowledge Graph-Text Datasets Informed by Cyclic Evaluation

no code implementations • 20 Sep 2023 • Ali Mousavi, Xin Zhan, He Bai, Peng Shi, Theo Rekatsinas, Benjamin Han, Yunyao Li, Jeff Pound, Josh Susskind, Natalie Schluter, Ihab Ilyas, Navdeep Jaitly

Guided by these observations, we construct a new, improved dataset called LAGRANGE using heuristics meant to improve equivalence between KG and text and show the impact of each of the heuristics on cyclic evaluation.

Hallucination Knowledge Graphs

PUPS: Point Cloud Unified Panoptic Segmentation

no code implementations • 13 Feb 2023 • Shihao Su, Jianyun Xu, Huanyu Wang, Zhenwei Miao, Xin Zhan, Dayang Hao, Xi Li

Point cloud panoptic segmentation is a challenging task that seeks a holistic solution for both semantic and instance segmentation to predict groupings of coherent points.

Decoder Instance Segmentation +2

INT: Towards Infinite-frames 3D Detection with An Efficient Framework

1 code implementation • 30 Sep 2022 • Jianyun Xu, Zhenwei Miao, Da Zhang, Hongyu Pan, Kaixuan Liu, Peihan Hao, Jun Zhu, Zhengyang Sun, Hongmin Li, Xin Zhan

By employing INT on CenterPoint, we obtain a performance boost of around 7% (Waymo) and 15% (nuScenes) with only 2~4 ms of latency overhead, and are currently SOTA on the Waymo 3D Detection leaderboard.

Ret3D: Rethinking Object Relations for Efficient 3D Object Detection in Driving Scenes

no code implementations • 18 Aug 2022 • Yu-Huan Wu, Da Zhang, Le Zhang, Xin Zhan, Dengxin Dai, Yun Liu, Ming-Ming Cheng

Current efficient LiDAR-based detection frameworks fall short in exploiting object relations, which are naturally present both spatially and temporally.

3D Object Detection Object +2

BE-STI: Spatial-Temporal Integrated Network for Class-Agnostic Motion Prediction With Bidirectional Enhancement

no code implementations • CVPR 2022 • Yunlong Wang, Hongyu Pan, Jun Zhu, Yu-Huan Wu, Xin Zhan, Kun Jiang, Diange Yang

In this paper, we propose a novel Spatial-Temporal Integrated network with Bidirectional Enhancement, BE-STI, to improve temporal motion prediction performance using spatial semantic features, which points to an efficient way to combine semantic segmentation and motion prediction.

Autonomous Driving motion prediction +1

P2T: Pyramid Pooling Transformer for Scene Understanding

4 code implementations • 22 Jun 2021 • Yu-Huan Wu, Yun Liu, Xin Zhan, Ming-Ming Cheng

A popular solution to this problem is to use a single pooling operation to reduce the sequence length.

Ranked #7 on RGB Salient Object Detection on DUTS-TE (max F-measure metric)

Image Classification Instance Segmentation +5
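
The P2T entry above mentions using a single pooling operation to reduce the sequence length. As a rough illustration of that general idea only (not code from the paper, and not the pyramid pooling that P2T itself proposes), the sketch below average-pools a token sequence with a non-overlapping window. The function name `avg_pool_tokens`, the window size, and the toy token values are all assumptions made for this example.

```python
def avg_pool_tokens(tokens, window):
    """Average-pool a list of token vectors with a non-overlapping window.

    Hypothetical helper for illustration: a sequence of N tokens becomes
    ceil(N / window) tokens, each the mean of one chunk.
    """
    if window < 1:
        raise ValueError("window must be >= 1")
    pooled = []
    for start in range(0, len(tokens), window):
        chunk = tokens[start:start + window]
        dim = len(chunk[0])
        # Mean of each feature dimension over the tokens in this chunk.
        pooled.append([sum(vec[d] for vec in chunk) / len(chunk) for d in range(dim)])
    return pooled


# Toy sequence: 8 tokens with 2 features each (made-up values).
tokens = [[float(i), float(2 * i)] for i in range(8)]
pooled = avg_pool_tokens(tokens, window=4)

# If queries keep full length and keys/values use the pooled sequence,
# the number of query-key pairs drops from 8 * 8 to 8 * 2.
full_pairs = len(tokens) * len(tokens)
pooled_pairs = len(tokens) * len(pooled)
```

With a window of 4, the 8-token sequence shrinks to 2 tokens, so attention computed against the pooled keys and values touches a quarter as many query-key pairs in this toy setting.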

