Search Results for author: Jiangmiao Pang

Found 36 papers, 28 papers with code

3DGSR: Implicit Surface Reconstruction with 3D Gaussian Splatting

no code implementations • 30 Mar 2024 • Xiaoyang Lyu, Yang-tian Sun, Yi-Hua Huang, Xiuzhe Wu, ZiYi Yang, Yilun Chen, Jiangmiao Pang, Xiaojuan Qi

In this paper, we present an implicit surface reconstruction method with 3D Gaussian Splatting (3DGS), namely 3DGSR, that allows for accurate 3D reconstruction with intricate details while inheriting the high efficiency and rendering quality of 3DGS.

3D Reconstruction • Surface Reconstruction

GenNBV: Generalizable Next-Best-View Policy for Active 3D Reconstruction

no code implementations • 25 Feb 2024 • Xiao Chen, Quanyi Li, Tai Wang, Tianfan Xue, Jiangmiao Pang

Previous works attempt to automate this process using the Next-Best-View (NBV) policy for active 3D reconstruction.

3D Reconstruction • Reinforcement Learning (RL)

Multi-Object Tracking by Hierarchical Visual Representations

no code implementations • 24 Feb 2024 • Jinkun Cao, Jiangmiao Pang, Kris Kitani

We propose a new visual hierarchical representation paradigm for multi-object tracking.

Multi-Object Tracking • Object

Mixed Gaussian Flow for Diverse Trajectory Prediction

no code implementations • 19 Feb 2024 • Jiahe Chen, Jinkun Cao, Dahua Lin, Kris Kitani, Jiangmiao Pang

However, mapping from a standard Gaussian by a flow-based model hurts the capacity to capture complicated patterns of trajectories, ignoring the under-represented motion intentions in the training data.

Trajectory Prediction

EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI

1 code implementation • 26 Dec 2023 • Tai Wang, Xiaohan Mao, Chenming Zhu, Runsen Xu, Ruiyuan Lyu, Peisen Li, Xiao Chen, Wenwei Zhang, Kai Chen, Tianfan Xue, Xihui Liu, Cewu Lu, Dahua Lin, Jiangmiao Pang

In the realm of computer vision and robotics, embodied agents are expected to explore their environment and carry out human instructions.

Scene Understanding

291

Hybrid Internal Model: Learning Agile Legged Locomotion with Simulated Robot Response

1 code implementation • 18 Dec 2023 • Junfeng Long, ZiRui Wang, Quanyi Li, Jiawei Gao, Liu Cao, Jiangmiao Pang

Robust locomotion control depends on accurate state estimations.

Contrastive Learning

138

OV-PARTS: Towards Open-Vocabulary Part Segmentation

1 code implementation • NeurIPS 2023 • Meng Wei, Xiaoyu Yue, Wenwei Zhang, Shu Kong, Xihui Liu, Jiangmiao Pang

Secondly, part segmentation introduces an open granularity challenge due to the diverse and often ambiguous definitions of parts in the open world.

Open Vocabulary Semantic Segmentation • Segmentation • +1

Understanding Masked Autoencoders From a Local Contrastive Perspective

no code implementations • 3 Oct 2023 • Xiaoyu Yue, Lei Bai, Meng Wei, Jiangmiao Pang, Xihui Liu, Luping Zhou, Wanli Ouyang

Masked AutoEncoder (MAE) has revolutionized the field of self-supervised learning with its simple yet effective masking and reconstruction strategies.

Contrastive Learning • Data Augmentation • +1

Unified Human-Scene Interaction via Prompted Chain-of-Contacts

1 code implementation • 14 Sep 2023 • Zeqi Xiao, Tai Wang, Jingbo Wang, Jinkun Cao, Wenwei Zhang, Bo Dai, Dahua Lin, Jiangmiao Pang

Based on the definition, UniHSI comprises a Large Language Model (LLM) Planner that translates language prompts into task plans in the form of CoC, and a Unified Controller that turns CoC into uniform task execution.

Language Modelling • Large Language Model

117

PointLLM: Empowering Large Language Models to Understand Point Clouds

3 code implementations • 31 Aug 2023 • Runsen Xu, Xiaolong Wang, Tai Wang, Yilun Chen, Jiangmiao Pang, Dahua Lin

The unprecedented advancements in Large Language Models (LLMs) have shown a profound impact on natural language processing but are yet to fully embrace the realm of 3D understanding.

Ranked #3 on 3D Question Answering (3D-QA) on 3D MM-Vet

3D Object Classification • 3D Question Answering (3D-QA) • +2

379

Transformer-Based Visual Segmentation: A Survey

2 code implementations • 19 Apr 2023 • Xiangtai Li, Henghui Ding, Haobo Yuan, Wenwei Zhang, Jiangmiao Pang, Guangliang Cheng, Kai Chen, Ziwei Liu, Chen Change Loy

Recently, transformers, a type of neural network based on self-attention originally designed for natural language processing, have considerably surpassed previous convolutional or recurrent approaches in various vision processing tasks.

Autonomous Driving • Point Cloud Segmentation • +1

567

DORT: Modeling Dynamic Objects in Recurrent for Multi-Camera 3D Object Detection and Tracking

1 code implementation • 29 Mar 2023 • Qing Lian, Tai Wang, Dahua Lin, Jiangmiao Pang

Recent multi-camera 3D object detectors usually leverage temporal information to construct multi-view stereo, which alleviates the ill-posedness of depth estimation.

3D Object Detection • Depth Estimation • +3

Position-Guided Point Cloud Panoptic Segmentation Transformer

1 code implementation • 23 Mar 2023 • Zeqi Xiao, Wenwei Zhang, Tai Wang, Chen Change Loy, Dahua Lin, Jiangmiao Pang

DEtection TRansformer (DETR) started a trend that uses a group of learnable queries for unified visual perception.

Ranked #1 on Panoptic Segmentation on SemanticKITTI

Instance Segmentation • Panoptic Segmentation • +3

MV-JAR: Masked Voxel Jigsaw and Reconstruction for LiDAR-Based Self-Supervised Pre-Training

1 code implementation • CVPR 2023 • Runsen Xu, Tai Wang, Wenwei Zhang, Runjian Chen, Jinkun Cao, Jiangmiao Pang, Dahua Lin

This paper introduces the Masked Voxel Jigsaw and Reconstruction (MV-JAR) method for LiDAR-based self-supervised pre-training and a carefully designed data-efficient 3D object detection benchmark on the Waymo dataset.

3D Object Detection • object-detection

Dense Distinct Query for End-to-End Object Detection

1 code implementation • CVPR 2023 • Shilong Zhang, Xinjiang Wang, Jiaqi Wang, Jiangmiao Pang, Chengqi Lyu, Wenwei Zhang, Ping Luo, Kai Chen

Concretely, we first lay dense queries like traditional detectors and then select distinct ones for one-to-one assignments.

Ranked #3 on Object Detection on CrowdHuman (full body)

Object • object-detection • +1

236

Tube-Link: A Flexible Cross Tube Framework for Universal Video Segmentation

2 code implementations • ICCV 2023 • Xiangtai Li, Haobo Yuan, Wenwei Zhang, Guangliang Cheng, Jiangmiao Pang, Chen Change Loy

Our framework is a near-online approach that takes a short subclip as input and outputs the corresponding spatial-temporal tube masks.

Ranked #3 on Video Semantic Segmentation on VSPW

Contrastive Learning • Segmentation • +4

105

QDTrack: Quasi-Dense Similarity Learning for Appearance-Only Multiple Object Tracking

2 code implementations • 12 Oct 2022 • Tobias Fischer, Thomas E. Huang, Jiangmiao Pang, Linlu Qiu, Haofeng Chen, Trevor Darrell, Fisher Yu

In this paper, we present Quasi-Dense Similarity Learning, which densely samples hundreds of object regions on a pair of images for contrastive learning.

Ranked #4 on Multiple Object Tracking on BDD100K test

Contrastive Learning • Multiple Object Tracking • +1

377

Monocular 3D Object Detection with Depth from Motion

1 code implementation • 26 Jul 2022 • Tai Wang, Jiangmiao Pang, Dahua Lin

Perceiving 3D objects from monocular inputs is crucial for robotic systems, given its economy compared to multi-sensor settings.

Depth Estimation • Monocular 3D Object Detection • +2

295

What Are Expected Queries in End-to-End Object Detection?

1 code implementation • 2 Jun 2022 • Shilong Zhang, Xinjiang Wang, Jiaqi Wang, Jiangmiao Pang, Kai Chen

As both sparse and dense queries are imperfect, what are expected queries in end-to-end object detection?

Instance Segmentation • object-detection • +2

236

Video K-Net: A Simple, Strong, and Unified Baseline for Video Segmentation

1 code implementation • CVPR 2022 • Xiangtai Li, Wenwei Zhang, Jiangmiao Pang, Kai Chen, Guangliang Cheng, Yunhai Tong, Chen Change Loy

We hope this simple, yet effective method can serve as a new, flexible baseline in unified video segmentation design.

Ranked #1 on Video Panoptic Segmentation on KITTI-STEP (using extra training data)

Image Segmentation • Instance Segmentation • +5

149

Observation-Centric SORT: Rethinking SORT for Robust Multi-Object Tracking

7 code implementations • CVPR 2023 • Jinkun Cao, Jiangmiao Pang, Xinshuo Weng, Rawal Khirodkar, Kris Kitani

Instead of relying only on the linear state estimate (i.e., an estimation-centric approach), we use object observations (i.e., the measurements from the object detector) to compute a virtual trajectory over the occlusion period, which corrects the error that accumulates in the filter parameters during occlusion.

Ranked #2 on Multiple Object Tracking on KITTI Tracking test

Multi-Object Tracking • Multiple Object Tracking • +3

12,034

Dense Siamese Network for Dense Unsupervised Learning

1 code implementation • 21 Mar 2022 • Wenwei Zhang, Jiangmiao Pang, Kai Chen, Chen Change Loy

It also extracts a batch of region embeddings that correspond to some sub-regions in the overlapped area to be contrasted for region consistency.

Ranked #2 on Unsupervised Semantic Segmentation on COCO-All (mIoU metric)

Self-Supervised Learning • Unsupervised Semantic Segmentation

Towards Balanced Learning for Instance Recognition

no code implementations • 23 Aug 2021 • Jiangmiao Pang, Kai Chen, Qi Li, Zhihai Xu, Huajun Feng, Jianping Shi, Wanli Ouyang, Dahua Lin

In this work, we carefully revisit the standard training practice of detectors and find that detection performance is often limited by imbalance during the training process, which generally arises at three levels: sample level, feature level, and objective level.

Self-Adversarial Disentangling for Specific Domain Adaptation

no code implementations • 8 Aug 2021 • Qianyu Zhou, Qiqi Gu, Jiangmiao Pang, Xuequan Lu, Lizhuang Ma

In this paper, we study a practical setting called Specific Domain Adaptation (SDA) that aligns the source and target domains in a demanded-specific dimension.

Ranked #10 on Unsupervised Domain Adaptation on Cityscapes to Foggy Cityscapes

Image-to-Image Translation on Cityscapes-to-Foggy Cityscapes object-detection +3

Context-Aware Mixup for Domain Adaptive Semantic Segmentation

1 code implementation • 8 Aug 2021 • Qianyu Zhou, Zhengyang Feng, Qiqi Gu, Jiangmiao Pang, Guangliang Cheng, Xuequan Lu, Jianping Shi, Lizhuang Ma

The generated contextual mask is critical in this work and will guide the context-aware domain mixup on three different levels.

Ranked #5 on Image-to-Image Translation on SYNTHIA-to-Cityscapes

Semantic Segmentation • Synthetic-to-Real Translation • +1

Probabilistic and Geometric Depth: Detecting Objects in Perspective

1 code implementation • 29 Jul 2021 • Tai Wang, Xinge Zhu, Jiangmiao Pang, Dahua Lin

As the preliminary depth estimation of each instance is usually inaccurate in this ill-posed setting, we incorporate a probabilistic representation to capture the uncertainty.

Ranked #10 on 3D Object Detection on KITTI Cars Hard val

Attribute • Depth Estimation • +2

4,785

K-Net: Towards Unified Image Segmentation

1 code implementation • NeurIPS 2021 • Wenwei Zhang, Jiangmiao Pang, Kai Chen, Chen Change Loy

The framework, named K-Net, segments both instances and semantic categories consistently by a group of learnable kernels, where each kernel is responsible for generating a mask for either a potential instance or a stuff class.

Ranked #7 on Panoptic Segmentation on COCO test-dev

Image Segmentation • Instance Segmentation • +2

457

FCOS3D: Fully Convolutional One-Stage Monocular 3D Object Detection

8 code implementations • 22 Apr 2021 • Tai Wang, Xinge Zhu, Jiangmiao Pang, Dahua Lin

In this paper, we study this problem with a practice built on a fully convolutional single-stage detector and propose a general framework FCOS3D.

Ranked #323 on 3D Object Detection on nuScenes

Autonomous Driving • Monocular 3D Object Detection • +2

4,785

Seesaw Loss for Long-Tailed Instance Segmentation

2 code implementations • CVPR 2021 • Jiaqi Wang, Wenwei Zhang, Yuhang Zang, Yuhang Cao, Jiangmiao Pang, Tao Gong, Kai Chen, Ziwei Liu, Chen Change Loy, Dahua Lin

Instances of head classes dominate a long-tailed dataset and they serve as negative samples of tail categories.

Instance Segmentation • Semantic Segmentation

27,708

Quasi-Dense Similarity Learning for Multiple Object Tracking

3 code implementations • CVPR 2021 • Jiangmiao Pang, Linlu Qiu, Xia Li, Haofeng Chen, Qi Li, Trevor Darrell, Fisher Yu

Compared to methods with similar detectors, it improves MOTA by almost 10 points and significantly decreases the number of ID switches on the BDD100K and Waymo datasets.

Ranked #1 on One-Shot Object Detection on PASCAL VOC 2012 val

Contrastive Learning • Metric Learning • +4

377

Side-Aware Boundary Localization for More Precise Object Detection

3 code implementations • ECCV 2020 • Jiaqi Wang, Wenwei Zhang, Yuhang Cao, Kai Chen, Jiangmiao Pang, Tao Gong, Jianping Shi, Chen Change Loy, Dahua Lin

To tackle the difficulty of precise localization in the presence of displacements with large variance, we further propose a two-step localization scheme, which first predicts a range of movement through bucket prediction and then pinpoints the precise position within the predicted bucket.

Object • object-detection • +2

27,708

MMDetection: Open MMLab Detection Toolbox and Benchmark

144 code implementations • 17 Jun 2019 • Kai Chen, Jiaqi Wang, Jiangmiao Pang, Yuhang Cao, Yu Xiong, Xiaoxiao Li, Shuyang Sun, Wansen Feng, Ziwei Liu, Jiarui Xu, Zheng Zhang, Dazhi Cheng, Chenchen Zhu, Tianheng Cheng, Qijie Zhao, Buyu Li, Xin Lu, Rui Zhu, Yue Wu, Jifeng Dai, Jingdong Wang, Jianping Shi, Wanli Ouyang, Chen Change Loy, Dahua Lin

In this paper, we introduce the various features of this toolbox.

Benchmarking • Instance Segmentation • +2

27,708

Libra R-CNN: Towards Balanced Learning for Object Detection

6 code implementations • CVPR 2019 • Jiangmiao Pang, Kai Chen, Jianping Shi, Huajun Feng, Wanli Ouyang, Dahua Lin

Ranked #149 on Object Detection on COCO test-dev

object-detection • Object Detection

27,708

R$^2$-CNN: Fast Tiny Object Detection in Large-Scale Remote Sensing Images

no code implementations • 16 Feb 2019 • Jiangmiao Pang, Cong Li, Jianping Shi, Zhihai Xu, Huajun Feng

To tackle these problems, we propose a unified and self-reinforced network called remote sensing region-based convolutional neural network ($\mathcal{R}^2$-CNN), composed of a backbone (Tiny-Net), an intermediate global attention block, and a final classifier and detector.

object-detection • Object Detection

Hybrid Task Cascade for Instance Segmentation

5 code implementations • CVPR 2019 • Kai Chen, Jiangmiao Pang, Jiaqi Wang, Yu Xiong, Xiaoxiao Li, Shuyang Sun, Wansen Feng, Ziwei Liu, Jianping Shi, Wanli Ouyang, Chen Change Loy, Dahua Lin

In exploring a more effective approach, we find that the key to a successful instance segmentation cascade is to fully leverage the reciprocal relationship between detection and segmentation.

Ranked #32 on Object Detection on COCO-O

Instance Segmentation • object-detection • +4

27,708

FishNet: A Versatile Backbone for Image, Region, and Pixel Level Prediction

6 code implementations • NeurIPS 2018 • Shuyang Sun, Jiangmiao Pang, Jianping Shi, Shuai Yi, Wanli Ouyang

The basic principles for designing convolutional neural network (CNN) structures that predict objects at different levels (e.g., image level, region level, and pixel level) are diverging.

Image Classification

2,917
