Search Results for author: Jiale Cao

Found 31 papers, 15 papers with code

VFMM3D: Releasing the Potential of Image by Vision Foundation Model for Monocular 3D Object Detection

no code implementations • 15 Apr 2024 • Bonan Ding, Jin Xie, Jing Nie, Jiale Cao

Therefore, an effective solution involves transforming monocular images into LiDAR-like representations and employing a LiDAR-based 3D object detector to predict the 3D coordinates of objects.

Autonomous Driving Monocular 3D Object Detection +2

Implicit and Explicit Language Guidance for Diffusion-based Visual Perception

no code implementations • 11 Apr 2024 • Hefeng Wang, Jiale Cao, Jin Xie, Aiping Yang, Yanwei Pang

The explicit branch utilizes the ground-truth labels of corresponding images as text prompts to condition feature extraction of the diffusion model.

Depth Estimation Image Generation +1

SGD: Street View Synthesis with Gaussian Splatting and Diffusion Prior

no code implementations • 29 Mar 2024 • Zhongrui Yu, Haoran Wang, Jinze Yang, Hanzhang Wang, Zeke Xie, Yunfeng Cai, Jiale Cao, Zhong Ji, Mingming Sun

To tackle this problem, we propose a novel approach that enhances the capacity of 3DGS by leveraging prior from a Diffusion Model along with complementary multi-modal data.

Autonomous Driving Neural Rendering +1

CLIP-VIS: Adapting CLIP for Open-Vocabulary Video Instance Segmentation

1 code implementation • 19 Mar 2024 • Wenqi Zhu, Jiale Cao, Jin Xie, Shuangming Yang, Yanwei Pang

Given a set of initial queries, class-agnostic mask generation employs a transformer decoder to predict query masks and corresponding object scores and mask IoU scores.

Instance Segmentation Language Modelling +4

SED: A Simple Encoder-Decoder for Open-Vocabulary Semantic Segmentation

1 code implementation • 27 Nov 2023 • Bin Xie, Jiale Cao, Jin Xie, Fahad Shahbaz Khan, Yanwei Pang

In this paper, we propose a simple encoder-decoder, named SED, for open-vocabulary semantic segmentation, which comprises a hierarchical encoder-based cost map generation and a gradual fusion decoder with category early rejection.

Ranked #3 on Open Vocabulary Semantic Segmentation on PASCAL Context-459

Open Vocabulary Semantic Segmentation Segmentation +1

Global Context Aggregation Network for Lightweight Saliency Detection of Surface Defects

no code implementations • 22 Sep 2023 • Feng Yan, Xiaoheng Jiang, Yang Lu, Lisha Cui, Shupan Li, Jiale Cao, Mingliang Xu, DaCheng Tao

To this end, we develop a Global Context Aggregation Network (GCANet) for lightweight saliency detection of surface defects on the encoder-decoder structure.

Defect Detection Saliency Detection

CINFormer: Transformer network with multi-stage CNN feature injection for surface defect segmentation

no code implementations • 22 Sep 2023 • Xiaoheng Jiang, Kaiyi Guo, Yang Lu, Feng Yan, Hao Liu, Jiale Cao, Mingliang Xu, DaCheng Tao

To address these issues, we propose a transformer network with multi-stage CNN (Convolutional Neural Network) feature injection for surface defect segmentation, which is a UNet-like structure named CINFormer.

Defect Detection

A Spatial-Temporal Deformable Attention based Framework for Breast Lesion Detection in Videos

1 code implementation • 9 Sep 2023 • Chao Qin, Jiale Cao, Huazhu Fu, Rao Muhammad Anwer, Fahad Shahbaz Khan

Existing video-based breast lesion detection approaches typically perform temporal feature aggregation of deep backbone features based on the self-attention operation.

Lesion Detection

DFormer: Diffusion-guided Transformer for Universal Image Segmentation

1 code implementation • 6 Jun 2023 • Hefeng Wang, Jiale Cao, Rao Muhammad Anwer, Jin Xie, Fahad Shahbaz Khan, Yanwei Pang

Our DFormer outperforms the recent diffusion-based panoptic segmentation method Pix2Seq-D with a gain of 3.6% on MS COCO val2017 set.

Denoising Image Segmentation +3

Transformer-based stereo-aware 3D object detection from binocular images

no code implementations • 24 Apr 2023 • Hanqing Sun, Yanwei Pang, Jiale Cao, Jin Xie, Xuelong Li

In this paper, we explore the model design of Transformers in binocular 3D object detection, focusing particularly on extracting and encoding the task-specific image correspondence information.

3D Object Detection Object +1

LEAPS: End-to-End One-Step Person Search With Learnable Proposals

no code implementations • 21 Mar 2023 • Zhiqiang Dong, Jiale Cao, Rao Muhammad Anwer, Jin Xie, Fahad Khan, Yanwei Pang

Given a set of sparse and learnable proposals, LEAPS employs a dynamic person search head to directly perform person detection and corresponding re-id feature generation without non-maximum suppression post-processing.

Human Detection Person Search

Deep Intra-Image Contrastive Learning for Weakly Supervised One-Step Person Search

1 code implementation • 9 Feb 2023 • Jiabei Wang, Yanwei Pang, Jiale Cao, Hanqing Sun, Zhuang Shao, Xuelong Li

We hope that our simple intra-image contrastive learning can provide more paradigms on weakly supervised person search.

Contrastive Learning Pedestrian Detection +1

3D Vision with Transformers: A Survey

1 code implementation • 8 Aug 2022 • Jean Lahoud, Jiale Cao, Fahad Shahbaz Khan, Hisham Cholakkal, Rao Muhammad Anwer, Salman Khan, Ming-Hsuan Yang

The success of the transformer architecture in natural language processing has recently triggered attention in the computer vision field.

Pose Estimation

366

PSTR: End-to-End One-Step Person Search With Transformers

1 code implementation • CVPR 2022 • Jiale Cao, Yanwei Pang, Rao Muhammad Anwer, Hisham Cholakkal, Jin Xie, Mubarak Shah, Fahad Shahbaz Khan

We propose a novel one-step transformer-based person search framework, PSTR, that jointly performs person detection and re-identification (re-id) in a single architecture.

Human Detection Person Search

Video Instance Segmentation via Multi-scale Spatio-temporal Split Attention Transformer

1 code implementation • 24 Mar 2022 • Omkar Thawakar, Sanath Narayan, Jiale Cao, Hisham Cholakkal, Rao Muhammad Anwer, Muhammad Haris Khan, Salman Khan, Michael Felsberg, Fahad Shahbaz Khan

When using the ResNet50 backbone, our MS-STS achieves a mask AP of 50.1%, outperforming the best reported results in the literature by 2.7% and by 4.8% at the higher overlap threshold of AP_75, while being comparable in model size and speed on YouTube-VIS 2019 val.

Instance Segmentation Semantic Segmentation +2

ESGN: Efficient Stereo Geometry Network for Fast 3D Object Detection

no code implementations • 28 Nov 2021 • Aqi Gao, Yanwei Pang, Jing Nie, Jiale Cao, Yishun Guo

The key in our ESGN is an efficient geometry-aware feature generation (EGFG) module.

3D Object Detection Knowledge Distillation +1

Shape Prior Non-Uniform Sampling Guided Real-time Stereo 3D Object Detection

no code implementations • 18 Jun 2021 • Aqi Gao, Jiale Cao, Yanwei Pang

Compared with the baseline RTS3D, our proposed method has 2.57% improvement on AP3d almost without extra network parameters.

3D Object Detection Object +1

Track to Detect and Segment: An Online Multi-Object Tracker

1 code implementation • CVPR 2021 • Jialian Wu, Jiale Cao, Liangchen Song, Yu Wang, Ming Yang, Junsong Yuan

Most online multi-object trackers perform object detection stand-alone in a neural net without any input from tracking.

Ranked #1 on Instance Segmentation on nuScenes

3D Multi-Object Tracking Instance Segmentation +7

545

Co-mining: Self-Supervised Learning for Sparsely Annotated Object Detection

1 code implementation • 3 Dec 2020 • Tiancai Wang, Tong Yang, Jiale Cao, Xiangyu Zhang

Object detectors usually achieve promising results with the supervision of complete instance annotations.

MULTI-VIEW LEARNING Object +4

TJU-DHD: A Diverse High-Resolution Dataset for Object Detection

1 code implementation • 18 Nov 2020 • Yanwei Pang, Jiale Cao, Yazhao Li, Jin Xie, Hanqing Sun, Jinfeng Gong

In addition, a new diverse pedestrian dataset is further built.

object-detection Object Detection +2

135

From Handcrafted to Deep Features for Pedestrian Detection: A Survey

2 code implementations • 1 Oct 2020 • Jiale Cao, Yanwei Pang, Jin Xie, Fahad Shahbaz Khan, Ling Shao

In addition to single-spectral pedestrian detection, we also review multi-spectral pedestrian detection, which provides more robust features for illumination variance.

Pedestrian Detection

166

SipMask: Spatial Information Preservation for Fast Image and Video Instance Segmentation

1 code implementation • ECCV 2020 • Jiale Cao, Rao Muhammad Anwer, Hisham Cholakkal, Fahad Shahbaz Khan, Yanwei Pang, Ling Shao

In terms of real-time capabilities, SipMask outperforms YOLACT with an absolute gain of 3.0% (mask AP) under similar settings, while operating at comparable speed on a Titan Xp.

Ranked #12 on Real-time Instance Segmentation on MSCOCO

object-detection Object Detection +4

334

D2Det: Towards High Quality Object Detection and Instance Segmentation

1 code implementation • CVPR 2020 • Jiale Cao, Hisham Cholakkal, Rao Muhammad Anwer, Fahad Shahbaz Khan, Yanwei Pang, Ling Shao

For precise localization, we introduce a dense local regression that predicts multiple dense box offsets for an object proposal.

Ranked #69 on Instance Segmentation on COCO test-dev

Instance Segmentation Object +5

297

NETNet: Neighbor Erasing and Transferring Network for Better Single Shot Object Detection

no code implementations • CVPR 2020 • Yazhao Li, Yanwei Pang, Jianbing Shen, Jiale Cao, Ling Shao

With this observation, we propose a new Neighbor Erasing and Transferring (NET) mechanism to reconfigure the pyramid features and explore scale-aware features.

Object object-detection +1

Hierarchical Shot Detector

1 code implementation • ICCV 2019 • Jiale Cao, Yanwei Pang, Jungong Han, Xuelong Li

To further solve the second problem, a hierarchical shot detector (HSD) is proposed, which stacks two ROC modules and one feature enhanced module.

Ranked #4 on Object Detection on PASCAL VOC 2007

General Classification object-detection +2

Triply Supervised Decoder Networks for Joint Detection and Segmentation

no code implementations • CVPR 2019 • Jiale Cao, Yanwei Pang, Xuelong Li

Experimental results on the VOC2007 and VOC2012 datasets demonstrate that the proposed TripleNet is able to improve both the detection and segmentation accuracies without adding extra computational costs.

Ranked #18 on Semantic Segmentation on PASCAL VOC 2012 test

object-detection Object Detection +3

Exploring Multi-Branch and High-Level Semantic Networks for Improving Pedestrian Detection

no code implementations • 3 Apr 2018 • Jiale Cao, Yanwei Pang, Xuelong Li

In this paper, we propose a multi-branch and high-level semantic network by gradually splitting a base network into multiple different branches.

object-detection Object Detection +1

Learning Multilayer Channel Features for Pedestrian Detection

no code implementations • 1 Mar 2016 • Jiale Cao, Yanwei Pang, Xuelong Li

For example, CNN classifies these proposals by the fully-connected layer features while proposal scores and the features in the inner layers of CNN are ignored.

Ranked #25 on Pedestrian Detection on Caltech

Pedestrian Detection

Pedestrian Detection Inspired by Appearance Constancy and Shape Symmetry

no code implementations • CVPR 2016 • Jiale Cao, Yanwei Pang, Xuelong Li

Finally, we propose to combine both non-neighboring and neighboring features for pedestrian detection.

Ranked #28 on Pedestrian Detection on Caltech

Pedestrian Detection

Learning Sampling Distributions for Efficient Object Detection

no code implementations • 23 Aug 2015 • Yanwei Pang, Jiale Cao, Xuelong Li

Multistage particle windows (MPW), proposed by Gualdi et al., is an algorithm for fast and accurate object detection.

Face Detection Object +2

Cascade Learning by Optimally Partitioning

no code implementations • 18 Aug 2015 • Yanwei Pang, Jiale Cao, Xuelong Li

iCascade searches for the optimal number r_i of weak classifiers in each stage i by directly minimizing the computation cost of the cascade.

Face Detection object-detection +1
