no code implementations • 30 Mar 2024 • Xiaoyang Lyu, Yang-tian Sun, Yi-Hua Huang, Xiuzhe Wu, ZiYi Yang, Yilun Chen, Jiangmiao Pang, Xiaojuan Qi
In this paper, we present an implicit surface reconstruction method with 3D Gaussian Splatting (3DGS), namely 3DGSR, that allows for accurate 3D reconstruction with intricate details while inheriting the high efficiency and rendering quality of 3DGS.
no code implementations • 25 Feb 2024 • Xiao Chen, Quanyi Li, Tai Wang, Tianfan Xue, Jiangmiao Pang
Previous works attempt to automate this process using the Next-Best-View (NBV) policy for active 3D reconstruction.
no code implementations • 24 Feb 2024 • Jinkun Cao, Jiangmiao Pang, Kris Kitani
We propose a new visual hierarchical representation paradigm for multi-object tracking.
no code implementations • 19 Feb 2024 • Jiahe Chen, Jinkun Cao, Dahua Lin, Kris Kitani, Jiangmiao Pang
However, mapping from a standard Gaussian by a flow-based model hurts the capacity to capture complicated patterns of trajectories, ignoring the under-represented motion intentions in the training data.
1 code implementation • 26 Dec 2023 • Tai Wang, Xiaohan Mao, Chenming Zhu, Runsen Xu, Ruiyuan Lyu, Peisen Li, Xiao Chen, Wenwei Zhang, Kai Chen, Tianfan Xue, Xihui Liu, Cewu Lu, Dahua Lin, Jiangmiao Pang
In the realm of computer vision and robotics, embodied agents are expected to explore their environment and carry out human instructions.
1 code implementation • 18 Dec 2023 • Junfeng Long, ZiRui Wang, Quanyi Li, Jiawei Gao, Liu Cao, Jiangmiao Pang
Robust locomotion control depends on accurate state estimations.
1 code implementation • NeurIPS 2023 • Meng Wei, Xiaoyu Yue, Wenwei Zhang, Shu Kong, Xihui Liu, Jiangmiao Pang
Secondly, part segmentation introduces an open granularity challenge due to the diverse and often ambiguous definitions of parts in the open world.
no code implementations • 3 Oct 2023 • Xiaoyu Yue, Lei Bai, Meng Wei, Jiangmiao Pang, Xihui Liu, Luping Zhou, Wanli Ouyang
Masked AutoEncoder (MAE) has revolutionized the field of self-supervised learning with its simple yet effective masking and reconstruction strategies.
1 code implementation • 14 Sep 2023 • Zeqi Xiao, Tai Wang, Jingbo Wang, Jinkun Cao, Wenwei Zhang, Bo Dai, Dahua Lin, Jiangmiao Pang
Based on the definition, UniHSI constitutes a Large Language Model (LLM) Planner to translate language prompts into task plans in the form of CoC, and a Unified Controller that turns CoC into uniform task execution.
3 code implementations • 31 Aug 2023 • Runsen Xu, Xiaolong Wang, Tai Wang, Yilun Chen, Jiangmiao Pang, Dahua Lin
The unprecedented advancements in Large Language Models (LLMs) have shown a profound impact on natural language processing but are yet to fully embrace the realm of 3D understanding.
Ranked #3 on 3D Question Answering (3D-QA) on 3D MM-Vet
2 code implementations • 19 Apr 2023 • Xiangtai Li, Henghui Ding, Haobo Yuan, Wenwei Zhang, Jiangmiao Pang, Guangliang Cheng, Kai Chen, Ziwei Liu, Chen Change Loy
Recently, transformers, a type of neural network based on self-attention originally designed for natural language processing, have considerably surpassed previous convolutional or recurrent approaches in various vision processing tasks.
1 code implementation • 29 Mar 2023 • Qing Lian, Tai Wang, Dahua Lin, Jiangmiao Pang
Recent multi-camera 3D object detectors usually leverage temporal information to construct multi-view stereo that alleviates the ill-posed depth estimation.
1 code implementation • 23 Mar 2023 • Zeqi Xiao, Wenwei Zhang, Tai Wang, Chen Change Loy, Dahua Lin, Jiangmiao Pang
DEtection TRansformer (DETR) started a trend that uses a group of learnable queries for unified visual perception.
Ranked #1 on Panoptic Segmentation on SemanticKITTI
1 code implementation • CVPR 2023 • Runsen Xu, Tai Wang, Wenwei Zhang, Runjian Chen, Jinkun Cao, Jiangmiao Pang, Dahua Lin
This paper introduces the Masked Voxel Jigsaw and Reconstruction (MV-JAR) method for LiDAR-based self-supervised pre-training and a carefully designed data-efficient 3D object detection benchmark on the Waymo dataset.
1 code implementation • CVPR 2023 • Shilong Zhang, Xinjiang Wang, Jiaqi Wang, Jiangmiao Pang, Chengqi Lyu, Wenwei Zhang, Ping Luo, Kai Chen
Concretely, we first lay dense queries like traditional detectors and then select distinct ones for one-to-one assignments.
Ranked #3 on Object Detection on CrowdHuman (full body)
2 code implementations • ICCV 2023 • Xiangtai Li, Haobo Yuan, Wenwei Zhang, Guangliang Cheng, Jiangmiao Pang, Chen Change Loy
Our framework is a near-online approach that takes a short subclip as input and outputs the corresponding spatial-temporal tube masks.
Ranked #3 on Video Semantic Segmentation on VSPW
2 code implementations • 12 Oct 2022 • Tobias Fischer, Thomas E. Huang, Jiangmiao Pang, Linlu Qiu, Haofeng Chen, Trevor Darrell, Fisher Yu
In this paper, we present Quasi-Dense Similarity Learning, which densely samples hundreds of object regions on a pair of images for contrastive learning.
Ranked #4 on Multiple Object Tracking on BDD100K test
1 code implementation • 26 Jul 2022 • Tai Wang, Jiangmiao Pang, Dahua Lin
Perceiving 3D objects from monocular inputs is crucial for robotic systems, given its economy compared to multi-sensor settings.
1 code implementation • 2 Jun 2022 • Shilong Zhang, Xinjiang Wang, Jiaqi Wang, Jiangmiao Pang, Kai Chen
As both sparse and dense queries are imperfect, then \emph{what are expected queries in end-to-end object detection}?
1 code implementation • CVPR 2022 • Xiangtai Li, Wenwei Zhang, Jiangmiao Pang, Kai Chen, Guangliang Cheng, Yunhai Tong, Chen Change Loy
We hope this simple, yet effective method can serve as a new, flexible baseline in unified video segmentation design.
Ranked #1 on Video Panoptic Segmentation on KITTI-STEP (using extra training data)
7 code implementations • CVPR 2023 • Jinkun Cao, Jiangmiao Pang, Xinshuo Weng, Rawal Khirodkar, Kris Kitani
Instead of relying only on the linear state estimate (i. e., estimation-centric approach), we use object observations (i. e., the measurements by object detector) to compute a virtual trajectory over the occlusion period to fix the error accumulation of filter parameters during the occlusion period.
Ranked #2 on Multiple Object Tracking on KITTI Tracking test
1 code implementation • 21 Mar 2022 • Wenwei Zhang, Jiangmiao Pang, Kai Chen, Chen Change Loy
It also extracts a batch of region embeddings that correspond to some sub-regions in the overlapped area to be contrasted for region consistency.
Ranked #2 on Unsupervised Semantic Segmentation on COCO-All (mIoU metric)
no code implementations • 23 Aug 2021 • Jiangmiao Pang, Kai Chen, Qi Li, Zhihai Xu, Huajun Feng, Jianping Shi, Wanli Ouyang, Dahua Lin
In this work, we carefully revisit the standard training practice of detectors, and find that the detection performance is often limited by the imbalance during the training process, which generally consists in three levels - sample level, feature level, and objective level.
no code implementations • 8 Aug 2021 • Qianyu Zhou, Qiqi Gu, Jiangmiao Pang, Xuequan Lu, Lizhuang Ma
In this paper, we study a practical setting called Specific Domain Adaptation (SDA) that aligns the source and target domains in a demanded-specific dimension.
Image-to-Image Translation on Cityscapes-to-Foggy Cityscapes object-detection +3
1 code implementation • 8 Aug 2021 • Qianyu Zhou, Zhengyang Feng, Qiqi Gu, Jiangmiao Pang, Guangliang Cheng, Xuequan Lu, Jianping Shi, Lizhuang Ma
The generated contextual mask is critical in this work and will guide the context-aware domain mixup on three different levels.
Ranked #5 on Image-to-Image Translation on SYNTHIA-to-Cityscapes
1 code implementation • 29 Jul 2021 • Tai Wang, Xinge Zhu, Jiangmiao Pang, Dahua Lin
As the preliminary depth estimation of each instance is usually inaccurate in this ill-posed setting, we incorporate a probabilistic representation to capture the uncertainty.
Ranked #10 on 3D Object Detection on KITTI Cars Hard val
1 code implementation • NeurIPS 2021 • Wenwei Zhang, Jiangmiao Pang, Kai Chen, Chen Change Loy
The framework, named K-Net, segments both instances and semantic categories consistently by a group of learnable kernels, where each kernel is responsible for generating a mask for either a potential instance or a stuff class.
Ranked #7 on Panoptic Segmentation on COCO test-dev
8 code implementations • 22 Apr 2021 • Tai Wang, Xinge Zhu, Jiangmiao Pang, Dahua Lin
In this paper, we study this problem with a practice built on a fully convolutional single-stage detector and propose a general framework FCOS3D.
Ranked #323 on 3D Object Detection on nuScenes
2 code implementations • CVPR 2021 • Jiaqi Wang, Wenwei Zhang, Yuhang Zang, Yuhang Cao, Jiangmiao Pang, Tao Gong, Kai Chen, Ziwei Liu, Chen Change Loy, Dahua Lin
Instances of head classes dominate a long-tailed dataset and they serve as negative samples of tail categories.
3 code implementations • CVPR 2021 • Jiangmiao Pang, Linlu Qiu, Xia Li, Haofeng Chen, Qi Li, Trevor Darrell, Fisher Yu
Compared to methods with similar detectors, it boosts almost 10 points of MOTA and significantly decreases the number of ID switches on BDD100K and Waymo datasets.
Ranked #1 on One-Shot Object Detection on PASCAL VOC 2012 val
3 code implementations • ECCV 2020 • Jiaqi Wang, Wenwei Zhang, Yuhang Cao, Kai Chen, Jiangmiao Pang, Tao Gong, Jianping Shi, Chen Change Loy, Dahua Lin
To tackle the difficulty of precise localization in the presence of displacements with large variance, we further propose a two-step localization scheme, which first predicts a range of movement through bucket prediction and then pinpoints the precise position within the predicted bucket.
144 code implementations • 17 Jun 2019 • Kai Chen, Jiaqi Wang, Jiangmiao Pang, Yuhang Cao, Yu Xiong, Xiaoxiao Li, Shuyang Sun, Wansen Feng, Ziwei Liu, Jiarui Xu, Zheng Zhang, Dazhi Cheng, Chenchen Zhu, Tianheng Cheng, Qijie Zhao, Buyu Li, Xin Lu, Rui Zhu, Yue Wu, Jifeng Dai, Jingdong Wang, Jianping Shi, Wanli Ouyang, Chen Change Loy, Dahua Lin
In this paper, we introduce the various features of this toolbox.
6 code implementations • CVPR 2019 • Jiangmiao Pang, Kai Chen, Jianping Shi, Huajun Feng, Wanli Ouyang, Dahua Lin
In this work, we carefully revisit the standard training practice of detectors, and find that the detection performance is often limited by the imbalance during the training process, which generally consists in three levels - sample level, feature level, and objective level.
Ranked #149 on Object Detection on COCO test-dev
no code implementations • 16 Feb 2019 • Jiangmiao Pang, Cong Li, Jianping Shi, Zhihai Xu, Huajun Feng
To tackle these problems, we propose a unified and self-reinforced network called remote sensing region-based convolutional neural network ($\mathcal{R}^2$-CNN), composing of backbone Tiny-Net, intermediate global attention block, and final classifier and detector.
5 code implementations • CVPR 2019 • Kai Chen, Jiangmiao Pang, Jiaqi Wang, Yu Xiong, Xiaoxiao Li, Shuyang Sun, Wansen Feng, Ziwei Liu, Jianping Shi, Wanli Ouyang, Chen Change Loy, Dahua Lin
In exploring a more effective approach, we find that the key to a successful instance segmentation cascade is to fully leverage the reciprocal relationship between detection and segmentation.
Ranked #32 on Object Detection on COCO-O
6 code implementations • NeurIPS 2018 • Shuyang Sun, Jiangmiao Pang, Jianping Shi, Shuai Yi, Wanli Ouyang
The basic principles in designing convolutional neural network (CNN) structures for predicting objects on different levels, e. g., image-level, region-level, and pixel-level are diverging.