Search Results for author: Jiaya Jia

Found 177 papers, 100 papers with code

Memory Selection Network for Video Propagation

no code implementations • ECCV 2020 • Ruizheng Wu, Huaijia Lin, Xiaojuan Qi, Jiaya Jia

Video propagation is a fundamental problem in video processing where guidance frame predictions are propagated to guide predictions of the target frame.

Colorization Semantic Segmentation +3

Particularity beyond Commonality: Unpaired Identity Transfer with Multiple References

no code implementations • ECCV 2020 • Ruizheng Wu, Xin Tao, Ying-Cong Chen, Xiaoyong Shen, Jiaya Jia

Unpaired image-to-image translation aims to translate images from the source class to a target one by providing sufficient data for these classes.

Image-to-Image Translation Translation

CN: Channel Normalization For Point Cloud Recognition

no code implementations • ECCV 2020 • Zetong Yang, Yanan Sun, Shu Liu, Xiaojuan Qi, Jiaya Jia

In 3D recognition, to fuse multi-scale structure information, existing methods apply hierarchical frameworks stacked by multiple fusion layers for integrating current relative locations with structure information from the previous level.

Scalable Language Model with Generalized Continual Learning

2 code implementations • 11 Apr 2024 • Bohao Peng, Zhuotao Tian, Shu Liu, MingChang Yang, Jiaya Jia

In this study, we introduce the Scalable Language Model (SLM) to overcome these limitations within a more challenging and generalized setting, representing a significant advancement toward practical applications for continual learning.

Continual Learning Language Modelling +1

140

Unified Language-driven Zero-shot Domain Adaptation

no code implementations • 10 Apr 2024 • Senqiao Yang, Zhuotao Tian, Li Jiang, Jiaya Jia

This paper introduces Unified Language-driven Zero-shot Domain Adaptation (ULDA), a novel task setting that enables a single model to adapt to diverse target domains without explicit domain-ID knowledge.

Domain Adaptation Representation Learning

Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models

2 code implementations • 27 Mar 2024 • Yanwei Li, Yuechen Zhang, Chengyao Wang, Zhisheng Zhong, Yixin Chen, Ruihang Chu, Shaoteng Liu, Jiaya Jia

We try to narrow the gap by mining the potential of VLMs for better performance and any-to-any workflow from three aspects, i.e., high-resolution visual tokens, high-quality data, and VLM-guided generation.

Ranked #8 on Visual Question Answering on MM-Vet

Image Comprehension Visual Dialog +1

2,792

OA-CNNs: Omni-Adaptive Sparse CNNs for 3D Semantic Segmentation

1 code implementation • 21 Mar 2024 • Bohao Peng, Xiaoyang Wu, Li Jiang, Yukang Chen, Hengshuang Zhao, Zhuotao Tian, Jiaya Jia

This exploration led to the creation of Omni-Adaptive 3D CNNs (OA-CNNs), a family of networks that integrates a lightweight module to greatly enhance the adaptivity of sparse CNNs at minimal computational cost.

Ranked #5 on 3D Semantic Segmentation on ScanNet200

3D Semantic Segmentation LIDAR Semantic Segmentation

1,115

GroupContrast: Semantic-aware Self-supervised Representation Learning for 3D Understanding

1 code implementation • 14 Mar 2024 • Chengyao Wang, Li Jiang, Xiaoyang Wu, Zhuotao Tian, Bohao Peng, Hengshuang Zhao, Jiaya Jia

To address this issue, we propose GroupContrast, a novel approach that combines segment grouping and semantic-aware contrastive learning.

Contrastive Learning Representation Learning +2

RL-GPT: Integrating Reinforcement Learning and Code-as-policy

no code implementations • 29 Feb 2024 • Shaoteng Liu, Haoqi Yuan, Minda Hu, Yanwei Li, Yukang Chen, Shu Liu, Zongqing Lu, Jiaya Jia

To seamlessly integrate both modalities, we introduce a two-level hierarchical framework, RL-GPT, comprising a slow agent and a fast agent.

reinforcement-learning Reinforcement Learning (RL)

VLPose: Bridging the Domain Gap in Pose Estimation with Language-Vision Tuning

no code implementations • 22 Feb 2024 • Jingyao Li, Pengguang Chen, Xuan Ju, Hong Xu, Jiaya Jia

Our research aims to bridge the domain gap between natural and artificial scenarios with efficient tuning strategies.

Pose Estimation

MOODv2: Masked Image Modeling for Out-of-Distribution Detection

no code implementations • 5 Jan 2024 • Jingyao Li, Pengguang Chen, Shaozuo Yu, Shu Liu, Jiaya Jia

The crux of effective out-of-distribution (OOD) detection lies in acquiring a robust in-distribution (ID) representation, distinct from OOD samples.

Out-of-Distribution Detection Out of Distribution (OOD) Detection

MR-GSM8K: A Meta-Reasoning Revolution in Large Language Model Evaluation

2 code implementations • 28 Dec 2023 • Zhongshen Zeng, Pengguang Chen, Shu Liu, Haiyun Jiang, Jiaya Jia

In this work, we introduce a novel evaluation paradigm for Large Language Models, one that challenges them to engage in meta-reasoning.

GSM8K Language Modelling +2

LISA++: An Improved Baseline for Reasoning Segmentation with Large Language Model

no code implementations • 28 Dec 2023 • Senqiao Yang, Tianyuan Qu, Xin Lai, Zhuotao Tian, Bohao Peng, Shu Liu, Jiaya Jia

While LISA effectively bridges the gap between segmentation and large language models to enable reasoning segmentation, it poses certain limitations: unable to distinguish different instances of the target region, and constrained by the pre-defined textual response formats.

Instance Segmentation Language Modelling +3

BAL: Balancing Diversity and Novelty for Active Learning

1 code implementation • 26 Dec 2023 • Jingyao Li, Pengguang Chen, Shaozuo Yu, Shu Liu, Jiaya Jia

Experimental results demonstrate that, when labeling 80% of the samples, the performance of the current SOTA method declines by 0.74%, whereas our proposed BAL achieves performance comparable to the full dataset.

Active Learning Self-Supervised Learning

MoTCoder: Elevating Large Language Models with Modular of Thought for Challenging Programming Tasks

1 code implementation • 26 Dec 2023 • Jingyao Li, Pengguang Chen, Jiaya Jia

Large Language Models (LLMs) have showcased impressive capabilities in handling straightforward programming tasks.

Ranked #1 on Code Generation on CodeContests (Test Set pass@1 metric)

Code Generation

Prompt Highlighter: Interactive Control for Multi-Modal LLMs

1 code implementation • 7 Dec 2023 • Yuechen Zhang, Shengju Qian, Bohao Peng, Shu Liu, Jiaya Jia

Without tuning on LLaVA-v1.5, our method secured 70.7 in the MMBench test and 1552.5 in MME-perception.

Text Generation

LLaMA-VID: An Image is Worth 2 Tokens in Large Language Models

2 code implementations • 28 Nov 2023 • Yanwei Li, Chengyao Wang, Jiaya Jia

Current VLMs, while proficient in tasks like image captioning and visual question answering, face computational burdens when processing long videos due to the excessive visual tokens.

Ranked #5 on Video-based Generative Performance Benchmarking on VideoInstruct

Image Captioning Video-based Generative Performance Benchmarking +2

832

LLMGA: Multimodal Large Language Model based Generation Assistant

1 code implementation • 27 Nov 2023 • Bin Xia, Shiyin Wang, Yingfan Tao, Yitong Wang, Jiaya Jia

In the first stage, we train the MLLM to grasp the properties of image generation and editing, enabling it to generate detailed prompts.

Image Generation Language Modelling +4

252

Lightweight In-Context Tuning for Multimodal Unified Models

no code implementations • 8 Oct 2023 • Yixin Chen, Shuai Zhang, Boran Han, Jiaya Jia

In-context learning (ICL) involves reasoning from given contextual examples.

Image Captioning In-Context Learning +4

LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models

2 code implementations • 21 Sep 2023 • Yukang Chen, Shengju Qian, Haotian Tang, Xin Lai, Zhijian Liu, Song Han, Jiaya Jia

For example, training with a context length of 8192 requires 16x the computational cost in self-attention layers compared with a context length of 2048.

4k Instruction Following +2

5,673
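The 16x figure quoted in the LongLoRA excerpt above can be checked with a short calculation. This is a minimal sketch, assuming that self-attention cost scales with the square of the sequence length; the function name `attention_cost_ratio` is hypothetical and does not come from the paper or its code.

```python
def attention_cost_ratio(long_len: int, short_len: int) -> float:
    """Relative self-attention cost of two context lengths.

    Assumption: cost grows quadratically with sequence length,
    so the ratio is (long_len / short_len) squared.
    """
    return (long_len / short_len) ** 2


# Context lengths taken from the excerpt: 8192 versus 2048.
ratio = attention_cost_ratio(8192, 2048)
print(ratio)  # 16.0
```

Quadrupling the context length therefore multiplies the assumed attention cost by 4 squared, which matches the 16x stated in the excerpt.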

Mask-Attention-Free Transformer for 3D Instance Segmentation

1 code implementation • ICCV 2023 • Xin Lai, Yuhui Yuan, Ruihang Chu, Yukang Chen, Han Hu, Jiaya Jia

Therefore, we abandon the mask attention design and resort to an auxiliary center regression task instead.

3D Instance Segmentation Position +2

FocalFormer3D: Focusing on Hard Instance for 3D Object Detection

1 code implementation • 8 Aug 2023 • Yilun Chen, Zhiding Yu, Yukang Chen, Shiyi Lan, Animashree Anandkumar, Jiaya Jia, Jose Alvarez

For 3D object detection, we instantiate this method as FocalFormer3D, a simple yet effective detector that excels at excavating difficult objects and improving prediction recall.

Ranked #8 on 3D Object Detection on nuScenes

3D Object Detection Autonomous Driving +2

130

LISA: Reasoning Segmentation via Large Language Model

2 code implementations • 1 Aug 2023 • Xin Lai, Zhuotao Tian, Yukang Chen, Yanwei Li, Yuhui Yuan, Shu Liu, Jiaya Jia

In this work, we propose a new segmentation task -- reasoning segmentation.

Language Modelling Large Language Model +3

1,442

Hierarchical Dense Correlation Distillation for Few-Shot Segmentation-Extended Abstract

no code implementations • 27 Jun 2023 • Bohao Peng, Zhuotao Tian, Xiaoyang Wu, Chengyao Wang, Shu Liu, Jingyong Su, Jiaya Jia

We hope our work can benefit broader industrial applications where novel classes with limited annotations are required to be decently identified.

Few-Shot Semantic Segmentation Segmentation +2

Real-World Image Variation by Aligning Diffusion Inversion Chain

2 code implementations • NeurIPS 2023 • Yuechen Zhang, Jinbo Xing, Eric Lo, Jiaya Jia

Our pipeline enhances the generation quality of image variations by aligning the image generation process to the source image's inversion chain.

Image-Variation Semantic Similarity +2

126

Self-supervised Learning by View Synthesis

no code implementations • 22 Apr 2023 • Shaoteng Liu, Xiangyu Zhang, Tao Hu, Jiaya Jia

In each iteration, the input to VSA is one view (or multiple views) of a 3D object and the output is a synthesized image in another target pose.

3D Classification Self-Supervised Learning

TagCLIP: Improving Discrimination Ability of Open-Vocabulary Semantic Segmentation

no code implementations • 15 Apr 2023 • Jingyao Li, Pengguang Chen, Shengju Qian, Jiaya Jia

However, existing models easily misidentify input pixels from unseen classes, thus confusing novel classes with semantically-similar ones.

Language Modelling Open Vocabulary Semantic Segmentation +2

Point2Pix: Photo-Realistic Point Cloud Rendering via Neural Radiance Fields

no code implementations • CVPR 2023 • Tao Hu, Xiaogang Xu, Shu Liu, Jiaya Jia

Also, we present Point Encoding to build Multi-scale Radiance Fields that provide discriminative 3D point features.

valid

TriVol: Point Cloud Rendering via Triple Volumes

1 code implementation • CVPR 2023 • Tao Hu, Xiaogang Xu, Ruihang Chu, Jiaya Jia

However, artifacts still appear in rendered images, due to the challenges in extracting continuous and discriminative 3D features from point clouds.

Hierarchical Dense Correlation Distillation for Few-Shot Segmentation

1 code implementation • CVPR 2023 • Bohao Peng, Zhuotao Tian, Xiaoyang Wu, Chenyao Wang, Shu Liu, Jingyong Su, Jiaya Jia

Few-shot semantic segmentation (FSS) aims to form class-agnostic models segmenting unseen classes with only a handful of annotations.

Ranked #6 on Few-Shot Semantic Segmentation on COCO-20i (1-shot)

Few-Shot Semantic Segmentation Segmentation +2

Spherical Transformer for LiDAR-based 3D Recognition

2 code implementations • CVPR 2023 • Xin Lai, Yukang Chen, Fanbin Lu, Jianhui Liu, Jiaya Jia

In this work, we study the varying-sparsity distribution of LiDAR points and present SphereFormer to directly aggregate information from dense close points to the sparse distant ones.

Ranked #1 on Semantic Segmentation on KITTI Semantic Segmentation

3D Object Detection 3D Semantic Segmentation +3

273

Learning Context-aware Classifier for Semantic Segmentation

2 code implementations • 21 Mar 2023 • Zhuotao Tian, Jiequan Cui, Li Jiang, Xiaojuan Qi, Xin Lai, Yixin Chen, Shu Liu, Jiaya Jia

Semantic segmentation is still a challenging task for parsing diverse contexts in different scenes, so a fixed classifier may not adequately address varying feature distributions during testing.

Segmentation Semantic Segmentation

1,115

VoxelNeXt: Fully Sparse VoxelNet for 3D Object Detection and Tracking

2 code implementations • CVPR 2023 • Yukang Chen, Jianhui Liu, Xiangyu Zhang, Xiaojuan Qi, Jiaya Jia

Our core insight is to predict objects directly based on sparse voxel features, without relying on hand-crafted proxies.

Ranked #1 on 3D Object Detection on Argoverse2

3D Object Detection Object +1

641

Video-P2P: Video Editing with Cross-attention Control

1 code implementation • 8 Mar 2023 • Shaoteng Liu, Yuechen Zhang, Wenbo Li, Zhe Lin, Jiaya Jia

This paper presents Video-P2P, a novel framework for real-world video editing with cross-attention control.

Image Generation Video Editing +1

331

StraIT: Non-autoregressive Generation with Stratified Image Transformer

no code implementations • 1 Mar 2023 • Shengju Qian, Huiwen Chang, Yuanzhen Li, Zizhao Zhang, Jiaya Jia, Han Zhang

We propose Stratified Image Transformer (StraIT), a pure non-autoregressive (NAR) generative model that demonstrates superiority in high-quality image synthesis over existing autoregressive (AR) and diffusion models (DMs).

Image Generation

Rethinking Out-of-distribution (OOD) Detection: Masked Image Modeling is All You Need

1 code implementation • CVPR 2023 • Jingyao Li, Pengguang Chen, Shaozuo Yu, Zexin He, Shu Liu, Jiaya Jia

The core of out-of-distribution (OOD) detection is to learn the in-distribution (ID) representation, which is distinguishable from OOD samples.

Ranked #12 on Out-of-Distribution Detection on ImageNet-1k vs Places (AUROC metric)

Out-of-Distribution Detection

Understanding Imbalanced Semantic Segmentation Through Neural Collapse

2 code implementations • CVPR 2023 • Zhisheng Zhong, Jiequan Cui, Yibo Yang, Xiaoyang Wu, Xiaojuan Qi, Xiangyu Zhang, Jiaya Jia

Based on our empirical and theoretical analysis, we point out that semantic segmentation naturally brings contextual correlation and imbalanced distribution among classes, which breaks the equiangular and maximally separated structure of neural collapse for both feature centers and classifiers.

3D Semantic Segmentation Segmentation

1,115

High Quality Entity Segmentation

no code implementations • ICCV 2023 • Lu Qi, Jason Kuen, Tiancheng Shen, Jiuxiang Gu, Wenbo Li, Weidong Guo, Jiaya Jia, Zhe Lin, Ming-Hsuan Yang

Given the high-quality and -resolution nature of the dataset, we propose CropFormer which is designed to tackle the intractability of instance-level segmentation on high-resolution images.

Image Segmentation Segmentation +1

End-to-end 3D Tracking with Decoupled Queries

no code implementations • ICCV 2023 • Yanwei Li, Zhiding Yu, Jonah Philion, Anima Anandkumar, Sanja Fidler, Jiaya Jia, Jose Alvarez

In this work, we present an end-to-end framework for camera-based 3D multi-object tracking, called DQTrack.

3D Multi-Object Tracking

FocalFormer3D: Focusing on Hard Instance for 3D Object Detection

1 code implementation • ICCV 2023 • Yilun Chen, Zhiding Yu, Yukang Chen, Shiyi Lan, Anima Anandkumar, Jiaya Jia, Jose M. Alvarez

For 3D object detection, we instantiate this method as FocalFormer3D, a simple yet effective detector that excels at excavating difficult objects and improving prediction recall.

3D Object Detection Autonomous Driving +2

130

Command-Driven Articulated Object Understanding and Manipulation

no code implementations • CVPR 2023 • Ruihang Chu, Zhengzhe Liu, Xiaoqing Ye, Xiao Tan, Xiaojuan Qi, Chi-Wing Fu, Jiaya Jia

The key of Cart is to utilize the prediction of object structures to connect visual observations with user commands for effective manipulations.

motion prediction Object +1

Removing Anomalies as Noises for Industrial Defect Localization

no code implementations • ICCV 2023 • Fanbin Lu, Xufeng Yao, Chi-Wing Fu, Jiaya Jia

Our denoising model outperforms the state-of-the-art reconstruction-based anomaly detection methods for precise anomaly localization and high-quality normal image reconstruction on the MVTec-AD benchmark.

Denoising Image Reconstruction +1

What Makes for Good Tokenizers in Vision Transformer?

no code implementations • 21 Dec 2022 • Shengju Qian, Yi Zhu, Wenbo Li, Mu Li, Jiaya Jia

The architecture of transformers, which have recently witnessed booming applications in vision tasks, has pivoted against the widespread convolutional paradigm.

General Adversarial Defense Against Black-box Attacks via Pixel Level and Feature Level Distribution Alignments

no code implementations • 11 Dec 2022 • Xiaogang Xu, Hengshuang Zhao, Philip Torr, Jiaya Jia

In this paper, we use Deep Generative Networks (DGNs) with a novel training mechanism to eliminate the distribution gap.

Adversarial Attack Adversarial Defense +4

Ref-NPR: Reference-Based Non-Photorealistic Radiance Fields for Controllable Scene Stylization

1 code implementation • CVPR 2023 • Yuechen Zhang, Zexin He, Jinbo Xing, Xufeng Yao, Jiaya Jia

We propose a ray registration process based on the stylized reference view to obtain pseudo-ray supervision in novel views.

Semantic correspondence

119

Image Inpainting via Iteratively Decoupled Probabilistic Modeling

2 code implementations • 6 Dec 2022 • Wenbo Li, Xin Yu, Kun Zhou, Yibing Song, Zhe Lin, Jiaya Jia

To achieve high-quality results with low computational cost, we present a novel pixel spread model (PSM) that iteratively employs decoupled probabilistic modeling, combining the optimization efficiency of GANs with the prediction tractability of probabilistic models.

Denoising Image Inpainting

High-Quality Entity Segmentation

1 code implementation • 10 Nov 2022 • Lu Qi, Jason Kuen, Weidong Guo, Tiancheng Shen, Jiuxiang Gu, Jiaya Jia, Zhe Lin, Ming-Hsuan Yang

It improves mask prediction by fusing high-res image crops that provide more fine-grained image details and the full image.

Image Segmentation Segmentation +2

662

Generalized Parametric Contrastive Learning

4 code implementations • 26 Sep 2022 • Jiequan Cui, Zhisheng Zhong, Zhuotao Tian, Shu Liu, Bei Yu, Jiaya Jia

Based on theoretical analysis, we observe that supervised contrastive loss tends to bias high-frequency classes and thus increases the difficulty of imbalanced learning.

Ranked #5 on Long-tail Learning on iNaturalist 2018

Contrastive Learning Domain Generalization +3

221

End-to-end View Synthesis via NeRF Attention

no code implementations • 29 Jul 2022 • Zelin Zhao, Jiaya Jia

On the one hand, NeRFA considers the volumetric rendering equation as a soft feature modulation procedure.

Inductive Bias Novel View Synthesis

DecoupleNet: Decoupled Network for Domain Adaptive Semantic Segmentation

1 code implementation • 20 Jul 2022 • Xin Lai, Zhuotao Tian, Xiaogang Xu, Yingcong Chen, Shu Liu, Hengshuang Zhao, LiWei Wang, Jiaya Jia

Unsupervised domain adaptation in semantic segmentation has been raised to alleviate the reliance on expensive pixel-wise annotations.

Segmentation Semantic Segmentation +2

Tracking Objects as Pixel-wise Distributions

1 code implementation • 12 Jul 2022 • Zelin Zhao, Ze Wu, Yueqing Zhuang, Boxun Li, Jiaya Jia

During inference, a pixel-wise association procedure is proposed to recover object connections through frames based on the pixel-wise prediction.

Multi-Object Tracking Object

158

Deep Parametric 3D Filters for Joint Video Denoising and Illumination Enhancement in Video Super Resolution

1 code implementation • 5 Jul 2022 • Xiaogang Xu, RuiXing Wang, Chi-Wing Fu, Jiaya Jia

Despite the quality improvement brought by the recent methods, video super-resolution (SR) is still very challenging, especially for videos that are low-light and noisy.

Denoising Video Denoising +1

Towards Real-World Video Denosing: A Practical Video Denosing Dataset and Network

no code implementations • 4 Jul 2022 • Xiaogang Xu, Yitong Yu, Nianjuan Jiang, Jiangbo Lu, Bei Yu, Jiaya Jia

Moreover, we also propose a new video denoising framework, called Recurrent Video Denoising Transformer (RVDT), which can achieve SOTA performance on PVDD and other current video denoising benchmarks.

Denoising Video Denoising

LargeKernel3D: Scaling up Kernels in 3D Sparse CNNs

2 code implementations • CVPR 2023 • Yukang Chen, Jianhui Liu, Xiangyu Zhang, Xiaojuan Qi, Jiaya Jia

Recent advances in 2D CNNs have revealed that large kernels are important.

3D Object Detection Object +3

359

EfficientNeRF: Efficient Neural Radiance Fields

1 code implementation • 2 Jun 2022 • Tao Hu, Shu Liu, Yilun Chen, Tiancheng Shen, Jiaya Jia

Neural Radiance Fields (NeRF) has been widely applied to various tasks for its high-quality representation of 3D scenes.

valid

149

Unifying Voxel-based Representation with Transformer for 3D Object Detection

1 code implementation • 1 Jun 2022 • Yanwei Li, Yilun Chen, Xiaojuan Qi, Zeming Li, Jian Sun, Jiaya Jia

To this end, the modality-specific space is first designed to represent different inputs in the voxel feature space.

3D Object Detection Object +3

214

Voxel Field Fusion for 3D Object Detection

1 code implementation • CVPR 2022 • Yanwei Li, Xiaojuan Qi, Yukang Chen, LiWei Wang, Zeming Li, Jian Sun, Jiaya Jia

In this work, we present a conceptually simple yet effective framework for cross-modality 3D object detection, named voxel field fusion.

3D Object Detection Data Augmentation +2

Video Frame Interpolation with Transformer

1 code implementation • CVPR 2022 • Liying Lu, Ruizheng Wu, Huaijia Lin, Jiangbo Lu, Jiaya Jia

Video frame interpolation (VFI), which aims to synthesize intermediate frames of a video, has made remarkable progress with development of deep convolutional networks over past years.

Ranked #5 on Video Frame Interpolation on MSU Video Frame Interpolation (VMAF metric)

Video Frame Interpolation

108

Focal Sparse Convolutional Networks for 3D Object Detection

2 code implementations • CVPR 2022 • Yukang Chen, Yanwei Li, Xiangyu Zhang, Jian Sun, Jiaya Jia

In this paper, we introduce two new modules to enhance the capability of Sparse CNNs, both based on making feature sparsity learnable with position-wise importance prediction.

3D Object Detection Object +1

359

DSGN++: Exploiting Visual-Spatial Relation for Stereo-based 3D Detectors

1 code implementation • 6 Apr 2022 • Yilun Chen, Shijia Huang, Shu Liu, Bei Yu, Jiaya Jia

First, to effectively lift the 2D information to stereo volume, we propose depth-wise plane sweeping (DPS) that allows denser connections and extracts depth-guided features.

Ranked #1 on 3D Object Detection From Stereo Images on KITTI Cyclists Moderate

3D Object Detection From Stereo Images Relation

Region Rebalance for Long-Tailed Semantic Segmentation

5 code implementations • 5 Apr 2022 • Jiequan Cui, Yuhui Yuan, Zhisheng Zhong, Zhuotao Tian, Han Hu, Stephen Lin, Jiaya Jia

In this paper, we study the problem of class imbalance in semantic segmentation.

Ranked #21 on Semantic Segmentation on ADE20K

Segmentation Semantic Segmentation

221

Multi-View Transformer for 3D Visual Grounding

1 code implementation • CVPR 2022 • Shijia Huang, Yilun Chen, Jiaya Jia, LiWei Wang

The multi-view space enables the network to learn a more robust multi-modal representation for 3D visual grounding and eliminates the dependence on specific views.

Visual Grounding

MAT: Mask-Aware Transformer for Large Hole Image Inpainting

1 code implementation • CVPR 2022 • Wenbo Li, Zhe Lin, Kun Zhou, Lu Qi, Yi Wang, Jiaya Jia

Recent studies have shown the importance of modeling long-range interactions in the inpainting problem.

Ranked #1 on Image Inpainting on CelebA-HQ

Image Inpainting valid

679

Stratified Transformer for 3D Point Cloud Segmentation

4 code implementations • CVPR 2022 • Xin Lai, Jianhui Liu, Li Jiang, LiWei Wang, Hengshuang Zhao, Shu Liu, Xiaojuan Qi, Jiaya Jia

In this paper, we propose Stratified Transformer that is able to capture long-range contexts and demonstrates strong generalization ability and high performance.

Ranked #14 on Semantic Segmentation on ScanNet

Point Cloud Segmentation Position +1

1,115

Rebalanced Siamese Contrastive Mining for Long-Tailed Recognition

2 code implementations • 22 Mar 2022 • Zhisheng Zhong, Jiequan Cui, Zeming Li, Eric Lo, Jian Sun, Jiaya Jia

Given the promising performance of contrastive learning, we propose Rebalanced Siamese Contrastive Mining (ResCom) to tackle imbalanced recognition.

Ranked #5 on Long-tail Learning on CIFAR-10-LT (ρ=10)

Contrastive Learning Long-tail Learning +1

SEA: Bridging the Gap Between One- and Two-stage Detector Distillation via SEmantic-aware Alignment

no code implementations • 2 Mar 2022 • Yixin Chen, Zhuotao Tian, Pengguang Chen, Shu Liu, Jiaya Jia

We revisit the one- and two-stage detector distillation tasks and present a simple and efficient semantic-aware framework to fill the gap between them.

Instance Segmentation object-detection +2

A Unified Query-based Paradigm for Point Cloud Understanding

1 code implementation • CVPR 2022 • Zetong Yang, Li Jiang, Yanan Sun, Bernt Schiele, Jiaya Jia

This is achieved by introducing an intermediate representation, i.e., Q-representation, in the querying stage to serve as a bridge between the embedding stage and task heads.

Ranked #7 on Semantic Segmentation on S3DIS

Autonomous Driving object-detection +2

118

EfficientNeRF: Efficient Neural Radiance Fields

no code implementations • CVPR 2022 • Tao Hu, Shu Liu, Yilun Chen, Tiancheng Shen, Jiaya Jia

Neural Radiance Fields (NeRF) has been widely applied to various tasks for its high-quality representation of 3D scenes.

valid

SNR-Aware Low-Light Image Enhancement

1 code implementation • CVPR 2022 • Xiaogang Xu, RuiXing Wang, Chi-Wing Fu, Jiaya Jia

They are long-range operations for image regions of extremely low Signal-to-Noise-Ratio (SNR) and short-range operations for other regions.

Ranked #2 on Low-Light Image Enhancement on LIME

Low-Light Image Enhancement

141

TWIST: Two-Way Inter-Label Self-Training for Semi-Supervised 3D Instance Segmentation

no code implementations • CVPR 2022 • Ruihang Chu, Xiaoqing Ye, Zhengzhe Liu, Xiao Tan, Xiaojuan Qi, Chi-Wing Fu, Jiaya Jia

We explore the way to alleviate the label-hungry problem in a semi-supervised setting for 3D instance segmentation.

3D Instance Segmentation Denoising +2

On Efficient Transformer-Based Image Pre-training for Low-Level Vision

1 code implementation • 19 Dec 2021 • Wenbo Li, Xin Lu, Shengju Qian, Jiangbo Lu, Xiangyu Zhang, Jiaya Jia

Pre-training has marked numerous state of the arts in high-level computer vision, while few attempts have ever been made to investigate how pre-training acts in image processing systems.

Ranked #5 on Image Super-Resolution on Set5 - 2x upscaling (using extra training data)

Denoising Image Super-Resolution

119

CA-SSL: Class-Agnostic Semi-Supervised Learning for Detection and Segmentation

1 code implementation • 9 Dec 2021 • Lu Qi, Jason Kuen, Zhe Lin, Jiuxiang Gu, Fengyun Rao, Dian Li, Weidong Guo, Zhen Wen, Ming-Hsuan Yang, Jiaya Jia

To improve instance-level detection/segmentation performance, existing self-supervised and semi-supervised methods extract either task-unrelated or task-specific training signals from unlabeled data.

object-detection Object Detection +2

661

High Quality Segmentation for Ultra High-resolution Images

1 code implementation • CVPR 2022 • Tiancheng Shen, Yuechen Zhang, Lu Qi, Jason Kuen, Xingyu Xie, Jianlong Wu, Zhe Lin, Jiaya Jia

Segmenting 4K or 6K ultra high-resolution images requires extra computational consideration in image segmentation.

4k Image Segmentation +3

661

Blending Anti-Aliasing into Vision Transformer

no code implementations • NeurIPS 2021 • Shengju Qian, Hao Shao, Yi Zhu, Mu Li, Jiaya Jia

In this work, we analyze the uncharted problem of aliasing in vision transformers and explore how to incorporate anti-aliasing properties.

Guided Point Contrastive Learning for Semi-supervised Point Cloud Semantic Segmentation

1 code implementation • ICCV 2021 • Li Jiang, Shaoshuai Shi, Zhuotao Tian, Xin Lai, Shu Liu, Chi-Wing Fu, Jiaya Jia

To address the high cost and challenges of 3D point-level labeling, we present a method for semi-supervised point cloud semantic segmentation to adopt unlabeled point clouds in training to boost the model performance.

3D Semantic Segmentation Contrastive Learning +1

478

PFENet++: Boosting Few-shot Semantic Segmentation with the Noise-filtered Context-aware Prior Mask

1 code implementation • 28 Sep 2021 • Xiaoliu Luo, Zhuotao Tian, Taiping Zhang, Bei Yu, Yuan Yan Tang, Jiaya Jia

In this work, we revisit the prior mask guidance proposed in "Prior Guided Feature Enrichment Network for Few-Shot Segmentation".

Few-Shot Semantic Segmentation Semantic Segmentation

Deep Structured Instance Graph for Distilling Object Detectors

1 code implementation • ICCV 2021 • Yixin Chen, Pengguang Chen, Shu Liu, LiWei Wang, Jiaya Jia

Effectively structuring deep knowledge plays a pivotal role in transfer from teacher to student, especially in semantic vision tasks.

Instance Segmentation Knowledge Distillation +5

Image Synthesis via Semantic Composition

no code implementations • ICCV 2021 • Yi Wang, Lu Qi, Ying-Cong Chen, Xiangyu Zhang, Jiaya Jia

In this paper, we present a novel approach to synthesize realistic images based on their semantic layouts.

Image Generation Semantic Composition

Multi-Scale Aligned Distillation for Low-Resolution Detection

2 code implementations • CVPR 2021 • Lu Qi, Jason Kuen, Jiuxiang Gu, Zhe Lin, Yi Wang, Yukang Chen, Yanwei Li, Jiaya Jia

However, this option has traditionally hurt detection performance considerably.

Knowledge Distillation object-detection +1

129

Exploring and Improving Mobile Level Vision Transformers

no code implementations • 30 Aug 2021 • Pengguang Chen, Yixin Chen, Shu Liu, MingChang Yang, Jiaya Jia

We analyze the reason behind this phenomenon, and propose a novel irregular patch embedding module and adaptive patch fusion module to improve the performance.

Fully Convolutional Networks for Panoptic Segmentation with Point-based Supervision

1 code implementation • 17 Aug 2021 • Yanwei Li, Hengshuang Zhao, Xiaojuan Qi, Yukang Chen, Lu Qi, LiWei Wang, Zeming Li, Jian Sun, Jiaya Jia

In particular, Panoptic FCN encodes each object instance or stuff category with the proposed kernel generator and produces the prediction by convolving the high-resolution feature directly.

Panoptic Segmentation Segmentation +1

388

Conditional Temporal Variational AutoEncoder for Action Video Prediction

no code implementations • 12 Aug 2021 • Xiaogang Xu, Yi Wang, LiWei Wang, Bei Yu, Jiaya Jia

To synthesize a realistic action sequence based on a single human image, it is crucial to model both motion patterns and diversity in the action video.

motion prediction Video Prediction

Paper

Open-World Entity Segmentation

2 code implementations • 29 Jul 2021 • Lu Qi, Jason Kuen, Yi Wang, Jiuxiang Gu, Hengshuang Zhao, Zhe Lin, Philip Torr, Jiaya Jia

By removing the need for class label prediction, models trained for such a task can focus more on improving segmentation quality.

Image Manipulation Image Segmentation +2

661

Paper
Code

Parametric Contrastive Learning

5 code implementations • ICCV 2021 • Jiequan Cui, Zhisheng Zhong, Shu Liu, Bei Yu, Jiaya Jia

In this paper, we propose Parametric Contrastive Learning (PaCo) to tackle long-tailed recognition.

Ranked #12 on Long-tail Learning on iNaturalist 2018

Contrastive Learning Image Classification +1

221

Paper
Code

Semi-supervised Semantic Segmentation with Directional Context-aware Consistency

2 code implementations • CVPR 2021 • Xin Lai, Zhuotao Tian, Li Jiang, Shu Liu, Hengshuang Zhao, LiWei Wang, Jiaya Jia

Semantic segmentation has made tremendous progress in recent years.

Semi-Supervised Semantic Segmentation

177

Paper
Code

Self-Supervised 3D Mesh Reconstruction From Single Images

no code implementations • CVPR 2021 • Tao Hu, LiWei Wang, Xiaogang Xu, Shu Liu, Jiaya Jia

Recent single-view 3D reconstruction methods reconstruct an object's shape and texture from a single image with only 2D image-level annotation.

3D Reconstruction Attribute +2

Paper

MASA-SR: Matching Acceleration and Spatial Adaptation for Reference-Based Image Super-Resolution

1 code implementation • CVPR 2021 • Liying Lu, Wenbo Li, Xin Tao, Jiangbo Lu, Jiaya Jia

Therefore, high-quality correspondence matching is critical.

Image Super-Resolution

153

Paper
Code

LAPAR: Linearly-Assembled Pixel-Adaptive Regression Network for Single Image Super-Resolution and Beyond

2 code implementations • NeurIPS 2020 • Wenbo Li, Kun Zhou, Lu Qi, Nianjuan Jiang, Jiangbo Lu, Jiaya Jia

Single image super-resolution (SISR) deals with a fundamental problem of upsampling a low-resolution (LR) image to its high-resolution (HR) version.

Image Deblocking Image Denoising +2

223

Paper
Code

Distilling Knowledge via Knowledge Review

7 code implementations • CVPR 2021 • Pengguang Chen, Shu Liu, Hengshuang Zhao, Jiaya Jia

Knowledge distillation transfers knowledge from the teacher network to the student one, with the goal of greatly improving the performance of the student network.

Ranked #12 on Knowledge Distillation on CIFAR-100

Instance Segmentation Knowledge Distillation +3

1,265

Paper
Code
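
The snippet above describes knowledge distillation only in general terms. As a minimal sketch of that general idea, the block below computes the classic temperature-softened distillation loss between teacher and student logits. This is the generic formulation, not the knowledge review mechanism of the listed paper; the function names and the default temperature are assumptions chosen for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    Generic distillation objective (hypothetical helper); the T^2 factor
    keeps gradient magnitudes comparable across temperatures.
    """
    p = softmax(teacher_logits, temperature)  # soft targets from the teacher
    q = softmax(student_logits, temperature)  # student predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)
    return temperature ** 2 * kl
```

The loss is zero when the student reproduces the teacher's logits and grows as the two softened distributions diverge. The sketch assumes moderate logit values, so that no student probability underflows to zero.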

Improving Calibration for Long-Tailed Recognition

5 code implementations • CVPR 2021 • Zhisheng Zhong, Jiequan Cui, Shu Liu, Jiaya Jia

Motivated by the fact that predicted probability distributions of classes are highly related to the numbers of class instances, we propose label-aware smoothing to deal with different degrees of over-confidence for classes and improve classifier learning.

Ranked #16 on Long-tail Learning on CIFAR-10-LT (ρ=100)

Long-tail Learning Representation Learning

139

Paper
Code

Jigsaw Clustering for Unsupervised Visual Representation Learning

1 code implementation • CVPR 2021 • Pengguang Chen, Shu Liu, Jiaya Jia

It is even comparable to the contrastive learning methods when only half of the training batches are used.

Clustering Contrastive Learning +1

Paper
Code

Scale-aware Automatic Augmentation for Object Detection

1 code implementation • CVPR 2021 • Yukang Chen, Yanwei Li, Tao Kong, Lu Qi, Ruihang Chu, Lei Li, Jiaya Jia

We propose Scale-aware AutoAug to learn data augmentation policies for object detection.

Data Augmentation Instance Segmentation +5

196

Paper
Code

Best-Buddy GANs for Highly Detailed Image Super-Resolution

2 code implementations • 29 Mar 2021 • Wenbo Li, Kun Zhou, Lu Qi, Liying Lu, Nianjuan Jiang, Jiangbo Lu, Jiaya Jia

We consider the single image super-resolution (SISR) problem, where a high-resolution (HR) image is generated based on a low-resolution (LR) input.

4k Image Super-Resolution

223

Paper
Code

Bidirectional Projection Network for Cross Dimension Scene Understanding

1 code implementation • CVPR 2021 • WenBo Hu, Hengshuang Zhao, Li Jiang, Jiaya Jia, Tien-Tsin Wong

Via the BPM, complementary 2D and 3D information can interact with each other at multiple architectural levels, such that advantages in these two visual domains can be combined for better scene recognition.

Ranked #11 on Semantic Segmentation on ScanNet

2D Semantic Segmentation 3D Semantic Segmentation +3

163

Paper
Code

Video Instance Segmentation with a Propose-Reduce Paradigm

1 code implementation • ICCV 2021 • Huaijia Lin, Ruizheng Wu, Shu Liu, Jiangbo Lu, Jiaya Jia

Video instance segmentation (VIS) aims to segment and associate all instances of predefined classes for each frame in videos.

Ranked #2 on Unsupervised Video Object Segmentation on DAVIS 2017 (val) (using extra training data)

Instance Segmentation Segmentation +3

Paper
Code

ResLT: Residual Learning for Long-tailed Recognition

5 code implementations • 26 Jan 2021 • Jiequan Cui, Shu Liu, Zhuotao Tian, Zhisheng Zhong, Jiaya Jia

From this perspective, the trivial solution that utilizes different branches for the head, medium, and tail classes respectively, and then sums their outputs as the final results, is not feasible.

Ranked #20 on Long-tail Learning on iNaturalist 2018

Long-tail Learning

221

Paper
Code

General Adversarial Defense via Pixel Level and Feature Level Distribution Alignment

no code implementations • 1 Jan 2021 • Xiaogang Xu, Hengshuang Zhao, Philip Torr, Jiaya Jia

Specifically, compared with previous methods, we propose a more efficient pixel-level training constraint to reduce the difficulty of aligning adversarial samples to clean samples, which can thus clearly enhance robustness on adversarial samples.

Adversarial Defense Image Classification +3

Paper

Seeing Dynamic Scene in the Dark: A High-Quality Video Dataset With Mechatronic Alignment

1 code implementation • ICCV 2021 • RuiXing Wang, Xiaogang Xu, Chi-Wing Fu, Jiangbo Lu, Bei Yu, Jiaya Jia

Low-light video enhancement is an important task.

Low-Light Image Enhancement Video Enhancement

Paper
Code

Point Transformer

24 code implementations • ICCV 2021 • Hengshuang Zhao, Li Jiang, Jiaya Jia, Philip Torr, Vladlen Koltun

For example, on the challenging S3DIS dataset for large-scale semantic scene segmentation, the Point Transformer attains an mIoU of 70.4% on Area 5, outperforming the strongest prior model by 3.3 absolute percentage points and crossing the 70% mIoU threshold for the first time.

Ranked #3 on 3D Semantic Segmentation on STPLS3D

3D Part Segmentation 3D Point Cloud Classification +8

1,665

Paper
Code
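
The result quoted above is reported as mIoU (mean intersection over union). As a minimal sketch of how that metric is usually computed, the block below derives per-class IoU from a confusion matrix and averages it. The function name and the toy matrix are assumptions for illustration and do not come from the listed paper.

```python
def mean_iou(confusion):
    """Mean IoU from a square confusion matrix.

    confusion[i][j] counts points whose true class is i and whose
    predicted class is j (hypothetical layout chosen for this sketch).
    """
    num_classes = len(confusion)
    ious = []
    for c in range(num_classes):
        true_pos = confusion[c][c]
        false_neg = sum(confusion[c]) - true_pos
        false_pos = sum(confusion[r][c] for r in range(num_classes)) - true_pos
        denom = true_pos + false_pos + false_neg
        if denom > 0:  # skip classes absent from both labels and predictions
            ious.append(true_pos / denom)
    return sum(ious) / len(ious)


# Toy two-class example with made-up counts.
toy_confusion = [[8, 2],
                 [1, 9]]
toy_score = mean_iou(toy_confusion)
```

For the toy matrix the per-class IoUs are 8/11 and 9/12, so `toy_score` is their average. The sketch assumes at least one class is present; an all-zero matrix would raise a division error.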

GeoNet++: Iterative Geometric Neural Network with Edge-Aware Refinement for Joint Depth and Surface Normal Estimation

2 code implementations • 13 Dec 2020 • Xiaojuan Qi, Zhengzhe Liu, Renjie Liao, Philip H. S. Torr, Raquel Urtasun, Jiaya Jia

Note that GeoNet++ is generic and can be used in other depth/normal prediction frameworks to improve the quality of 3D reconstruction and pixel-wise accuracy of depth and surface normals.

3D Reconstruction Depth Estimation +2

119

Paper
Code

Fully Convolutional Networks for Panoptic Segmentation

6 code implementations • CVPR 2021 • Yanwei Li, Hengshuang Zhao, Xiaojuan Qi, LiWei Wang, Zeming Li, Jian Sun, Jiaya Jia

In this paper, we present a conceptually simple, strong, and efficient framework for panoptic segmentation, called Panoptic FCN.

Ranked #1 on Panoptic Segmentation on COCO minival (SQ metric)

Panoptic Segmentation Segmentation

388

Paper
Code

Learnable Boundary Guided Adversarial Training

3 code implementations • ICCV 2021 • Jiequan Cui, Shu Liu, LiWei Wang, Jiaya Jia

Previous adversarial training raises model robustness at the cost of accuracy on natural data.

Ranked #1 on Adversarial Defense on CIFAR-100

Adversarial Defense

607

Paper
Code

Generalized Few-shot Semantic Segmentation

1 code implementation • CVPR 2022 • Zhuotao Tian, Xin Lai, Li Jiang, Shu Liu, Michelle Shu, Hengshuang Zhao, Jiaya Jia

Then, since context is essential for semantic segmentation, we propose the Context-Aware Prototype Learning (CAPL) that significantly improves performance by 1) leveraging the co-occurrence prior knowledge from support samples, and 2) dynamically enriching contextual information to the classifier, conditioned on the content of each query image.

Ranked #3 on Generalized Few-Shot Semantic Segmentation on COCO-20i (1-shot)

Generalized Few-Shot Semantic Segmentation Segmentation +1

Paper
Code

Prior Guided Feature Enrichment Network for Few-Shot Segmentation

3 code implementations • 4 Aug 2020 • Zhuotao Tian, Hengshuang Zhao, Michelle Shu, Zhicheng Yang, Ruiyu Li, Jiaya Jia

It consists of novel designs of (1) a training-free prior mask generation method that not only retains generalization power but also improves model performance and (2) Feature Enrichment Module (FEM) that overcomes spatial inconsistency by adaptively enriching query features with support features and prior masks.

Ranked #66 on Few-Shot Semantic Segmentation on COCO-20i (1-shot)

Few-Shot Semantic Segmentation Semantic Segmentation

294

Paper
Code

MuCAN: Multi-Correspondence Aggregation Network for Video Super-Resolution

1 code implementation • ECCV 2020 • Wenbo Li, Xin Tao, Taian Guo, Lu Qi, Jiangbo Lu, Jiaya Jia

Motivated by these findings, we propose a temporal multi-correspondence aggregation strategy to leverage similar patches across frames, and a cross-scale nonlocal-correspondence aggregation scheme to explore self-similarity of images across scales.

Optical Flow Estimation Video Super-Resolution

223

Paper
Code

Exploring Self-attention for Image Recognition

1 code implementation • CVPR 2020 • Hengshuang Zhao, Jiaya Jia, Vladlen Koltun

Recent work has shown that self-attention can serve as a basic building block for image recognition models.

746

Paper
Code

Dynamic Scale Training for Object Detection

4 code implementations • 26 Apr 2020 • Yukang Chen, Peizhen Zhang, Zeming Li, Yanwei Li, Xiangyu Zhang, Lu Qi, Jian Sun, Jiaya Jia

We propose a Dynamic Scale Training paradigm (abbreviated as DST) to mitigate the scale variation challenge in object detection.

Instance Segmentation Model Optimization +4

Paper
Code

Attentive Normalization for Conditional Image Generation

1 code implementation • CVPR 2020 • Yi Wang, Ying-Cong Chen, Xiangyu Zhang, Jian Sun, Jiaya Jia

Traditional convolution-based generative adversarial networks synthesize images based on hierarchical local operations, where long-range dependency relation is implicitly modeled with a Markov chain.

Conditional Image Generation Semantic correspondence +2

Paper
Code

PointGroup: Dual-Set Point Grouping for 3D Instance Segmentation

2 code implementations • CVPR 2020 • Li Jiang, Hengshuang Zhao, Shaoshuai Shi, Shu Liu, Chi-Wing Fu, Jiaya Jia

Instance segmentation is an important task for scene understanding.

Ranked #5 on 3D Instance Segmentation on STPLS3D

3D Instance Segmentation Clustering +3

1,115

Paper
Code

VCNet: A Robust Approach to Blind Image Inpainting

2 code implementations • ECCV 2020 • Yi Wang, Ying-Cong Chen, Xin Tao, Jiaya Jia

Blind inpainting is a task to automatically complete visual contents without specifying masks for missing areas in an image.

Image Inpainting

Paper
Code

Dynamic Divide-and-Conquer Adversarial Training for Robust Semantic Segmentation

1 code implementation • ICCV 2021 • Xiaogang Xu, Hengshuang Zhao, Jiaya Jia

Adversarial training is promising for improving robustness of deep neural networks towards adversarial perturbations, especially on the classification task.

Segmentation Semantic Segmentation

Paper
Code

PointINS: Point-based Instance Segmentation

no code implementations • 13 Mar 2020 • Lu Qi, Yi Wang, Yukang Chen, Yingcong Chen, Xiangyu Zhang, Jian Sun, Jiaya Jia

In this paper, we explore the mask representation in instance segmentation with Point-of-Interest (PoI) features.

Instance Segmentation Object Detection +3

Paper

3DSSD: Point-based 3D Single Stage Object Detector

2 code implementations • CVPR 2020 • Zetong Yang, Yanan Sun, Shu Liu, Jiaya Jia

Our method outperforms all state-of-the-art voxel-based single stage methods by a large margin, and has comparable performance to two stage point-based methods as well, with inference speed more than 25 FPS, 2x faster than former state-of-the-art point-based methods.

Object

4,790

Paper
Code

GridMask Data Augmentation

7 code implementations • 13 Jan 2020 • Pengguang Chen, Shu Liu, Hengshuang Zhao, Xingquan Wang, Jiaya Jia

Then we show the limitations of existing information dropping algorithms and propose our structured method, which is simple yet very effective.

Data Augmentation object-detection +4

5,251

Paper
Code

DSGN: Deep Stereo Geometry Network for 3D Object Detection

1 code implementation • CVPR 2020 • Yilun Chen, Shu Liu, Xiaoyong Shen, Jiaya Jia

Most state-of-the-art 3D object detectors heavily rely on LiDAR sensors because there is a large performance gap between image-based and LiDAR-based methods.

Ranked #4 on 3D Object Detection From Stereo Images on KITTI Cyclists Moderate

3D Object Detection From Stereo Images Object +2

318

Paper
Code

Hierarchical Point-Edge Interaction Network for Point Cloud Semantic Segmentation

no code implementations • ICCV 2019 • Li Jiang, Hengshuang Zhao, Shu Liu, Xiaoyong Shen, Chi-Wing Fu, Jiaya Jia

To incorporate point features in the edge branch, we establish a hierarchical graph framework, where the graph is initialized from a coarse layer and gradually enriched along the point decoding process.

Ranked #41 on Semantic Segmentation on S3DIS Area5

Scene Labeling Semantic Segmentation

Paper

Aggregation via Separation: Boosting Facial Landmark Detector with Semi-Supervised Style Translation

1 code implementation • ICCV 2019 • Shengju Qian, Keqiang Sun, Wayne Wu, Chen Qian, Jiaya Jia

Facial landmark detection, or face alignment, is a fundamental task that has been extensively studied.

Ranked #18 on Face Alignment on WFLW

Face Alignment Facial Landmark Detection +1

182

Paper
Code

Fast Point R-CNN

no code implementations • ICCV 2019 • Yilun Chen, Shu Liu, Xiaoyong Shen, Jiaya Jia

We present a unified, efficient and effective framework for point-cloud based 3D object detection.

3D Object Detection object-detection

Paper

STD: Sparse-to-Dense 3D Object Detector for Point Cloud

no code implementations • ICCV 2019 • Zetong Yang, Yanan Sun, Shu Liu, Xiaoyong Shen, Jiaya Jia

We present a new two-stage 3D object detection framework, named sparse-to-dense 3D Object Detector (STD).

Ranked #1 on Birds Eye View Object Detection on KITTI Pedestrians Easy

3D Object Detection Object +1

Paper

Attribute-Driven Spontaneous Motion in Unpaired Image Translation

1 code implementation • ICCV 2019 • Ruizheng Wu, Xin Tao, Xiaodong Gu, Xiaoyong Shen, Jiaya Jia

Current image translation methods, albeit effective at producing high-quality results in various applications, still rarely consider geometric transforms.

Attribute Motion Estimation +1

Paper
Code

Landmark Assisted CycleGAN for Cartoon Face Generation

no code implementations • 2 Jul 2019 • Ruizheng Wu, Xiaodong Gu, Xin Tao, Xiaoyong Shen, Yu-Wing Tai, Jiaya Jia

In this paper, we are interested in generating a cartoon face of a person by using unpaired training data between real faces and cartoon ones.

Face Generation

Paper

Region Refinement Network for Salient Object Detection

no code implementations • 27 Jun 2019 • Zhuotao Tian, Hengshuang Zhao, Michelle Shu, Jiaze Wang, Ruiyu Li, Xiaoyong Shen, Jiaya Jia

Albeit intensively studied, false prediction and unclear boundaries are still major issues of salient object detection.

Object object-detection +5

Paper

Associatively Segmenting Instances and Semantics in Point Clouds

3 code implementations • CVPR 2019 • Xinlong Wang, Shu Liu, Xiaoyong Shen, Chunhua Shen, Jiaya Jia

A 3D point cloud describes the real scene precisely and intuitively. To date, how to segment diversified elements in such an informative 3D scene has rarely been discussed.

Ranked #15 on 3D Instance Segmentation on S3DIS (mRec metric)

3D Instance Segmentation 3D Semantic Segmentation +1

248

Paper
Code

Human Pose Estimation with Spatial Contextual Information

no code implementations • 7 Jan 2019 • Hong Zhang, Hao Ouyang, Shu Liu, Xiaojuan Qi, Xiaoyong Shen, Ruigang Yang, Jiaya Jia

With this principle, we present two conceptually simple yet computationally efficient modules, namely Cascade Prediction Fusion (CPF) and Pose Graph Neural Network (PGNN), to exploit underlying contextual information.

Ranked #10 on Pose Estimation on MPII Human Pose

Pose Estimation

Paper

IPOD: Intensive Point-based Object Detector for Point Cloud

no code implementations • 13 Dec 2018 • Zetong Yang, Yanan Sun, Shu Liu, Xiaoyong Shen, Jiaya Jia

We present a novel 3D object detection framework, named IPOD, based on raw point cloud.

Ranked #1 on 3D Object Detection on KITTI Pedestrians Easy

3D Object Detection Object +1

Paper

Image Inpainting via Generative Multi-column Convolutional Neural Networks

2 code implementations • NeurIPS 2018 • Yi Wang, Xin Tao, Xiaojuan Qi, Xiaoyong Shen, Jiaya Jia

In this paper, we propose a generative multi-column network for image inpainting.

Image Inpainting

417

Paper
Code

Sequential Context Encoding for Duplicate Removal

no code implementations • NeurIPS 2018 • Lu Qi, Shu Liu, Jianping Shi, Jiaya Jia

Duplicate removal is a critical step to obtain a reasonable number of predictions in prevalent proposal-based object detection frameworks.

Object object-detection +1

Paper

GAL: Geometric Adversarial Loss for Single-View 3D-Object Reconstruction

no code implementations • ECCV 2018 • Li Jiang, Shaoshuai Shi, Xiaojuan Qi, Jiaya Jia

We propose to add geometric adversarial loss (GAL).

3D Object Reconstruction

Paper

Compositing-aware Image Search

no code implementations • ECCV 2018 • Hengshuang Zhao, Xiaohui Shen, Zhe Lin, Kalyan Sunkavalli, Brian Price, Jiaya Jia

We present a new image search technique that, given a background image, returns compatible foreground objects for image compositing tasks.

Image Retrieval Object

Paper

PSANet: Point-wise Spatial Attention Network for Scene Parsing

4 code implementations • ECCV 2018 • Hengshuang Zhao, Yi Zhang, Shu Liu, Jianping Shi, Chen Change Loy, Dahua Lin, Jiaya Jia

We notice information flow in convolutional neural networks is restricted inside local neighborhood regions due to the physical design of convolutional filters, which limits the overall understanding of complex scenes.

Ranked #51 on Semantic Segmentation on Cityscapes test

Position Scene Parsing +1

7,387

Paper
Code

SegStereo: Exploiting Semantic Information for Disparity Estimation

no code implementations • ECCV 2018 • Guorun Yang, Hengshuang Zhao, Jianping Shi, Zhidong Deng, Jiaya Jia

Disparity estimation for binocular stereo images finds a wide range of applications.

Ranked #6 on Semantic Segmentation on KITTI Semantic Segmentation

Disparity Estimation Semantic Segmentation

Paper

GeoNet: Geometric Neural Network for Joint Depth and Surface Normal Estimation

1 code implementation • CVPR 2018 • Xiaojuan Qi, Renjie Liao, Zhengzhe Liu, Raquel Urtasun, Jiaya Jia

In this paper, we propose Geometric Neural Network (GeoNet) to jointly predict depth and surface normal maps from a single image.

Depth Estimation Surface Normal Estimation

119

Paper
Code

Referring Image Segmentation via Recurrent Refinement Networks

1 code implementation • CVPR 2018 • Ruiyu Li, Kaican Li, Yi-Chun Kuo, Michelle Shu, Xiaojuan Qi, Xiaoyong Shen, Jiaya Jia

We address the problem of image segmentation from natural language descriptions.

Image Segmentation Referring Expression Segmentation +2

Paper
Code

Semi-parametric Image Synthesis

1 code implementation • CVPR 2018 • Xiaojuan Qi, Qifeng Chen, Jiaya Jia, Vladlen Koltun

We present a semi-parametric approach to photographic image synthesis from semantic layouts.

Ranked #6 on Image-to-Image Translation on ADE20K-Outdoor Labels-to-Photos

Image-to-Image Translation Semantic Segmentation

269

Paper
Code

Facelet-Bank for Fast Portrait Manipulation

no code implementations • CVPR 2018 • Ying-Cong Chen, Huaijia Lin, Michelle Shu, Ruiyu Li, Xin Tao, Yangang Ye, Xiaoyong Shen, Jiaya Jia

Digital face manipulation has become a popular and fascinating way to touch images with the prevalence of smartphones and social networks.

Facial Editing

Paper

Path Aggregation Network for Instance Segmentation

10 code implementations • CVPR 2018 • Shu Liu, Lu Qi, Haifang Qin, Jianping Shi, Jiaya Jia

The way that information propagates in neural networks is of great importance.

Ranked #3 on Object Detection on iSAID

Instance Segmentation object-detection +3

46,832

Paper
Code

Scale-recurrent Network for Deep Image Deblurring

4 code implementations • CVPR 2018 • Xin Tao, Hongyun Gao, Yi Wang, Xiaoyong Shen, Jue Wang, Jiaya Jia

In single image deblurring, the "coarse-to-fine" scheme, i.e., gradually restoring the sharp image on different resolutions in a pyramid, is very successful in both traditional optimization-based methods and recent neural-network-based approaches.

Ranked #3 on Image Deblurring on GoPro (Params (M) metric, using extra training data)

Deblurring Image Deblurring +1

708

Paper
Code

Makeup-Go: Blind Reversion of Portrait Edit

no code implementations • ICCV 2017 • Ying-Cong Chen, Xiaoyong Shen, Jiaya Jia

In this paper, we propose the task of restoring a portrait image from this process.

regression

Paper

3D Graph Neural Networks for RGBD Semantic Segmentation

2 code implementations • ICCV 2017 • Xiaojuan Qi, Renjie Liao, Jiaya Jia, Sanja Fidler, Raquel Urtasun

Each node in the graph corresponds to a set of points and is associated with a hidden representation vector initialized with an appearance feature extracted by a unary CNN from 2D images.

Ranked #30 on Semantic Segmentation on SUN-RGBD (using extra training data)

RGBD Semantic Segmentation Semantic Segmentation

227

Paper
Code

SGN: Sequential Grouping Networks for Instance Segmentation

no code implementations • ICCV 2017 • Shu Liu, Jiaya Jia, Sanja Fidler, Raquel Urtasun

By exploiting two-directional information, the second network groups horizontal and vertical lines into connected components.

Instance Segmentation Object +1

Paper
Add Code

Unsupervised Learning of Stereo Matching

no code implementations • ICCV 2017 • Chao Zhou, Hong Zhang, Xiaoyong Shen, Jiaya Jia

However, due to the limitations of these datasets and the difficulty of collecting new stereo data, current methods fail in real-life cases.

Stereo Matching Stereo Matching Hand

Paper

Situation Recognition with Graph Neural Networks

1 code implementation • ICCV 2017 • Ruiyu Li, Makarand Tapaswi, Renjie Liao, Jiaya Jia, Raquel Urtasun, Sanja Fidler

We address the problem of recognizing situations in images.

Ranked #9 on Situation Recognition on imSitu

Grounded Situation Recognition

Paper
Code

Automatic Real-time Background Cut for Portrait Videos

no code implementations • 28 Apr 2017 • Xiaoyong Shen, RuiXing Wang, Hengshuang Zhao, Jiaya Jia

A spatial-temporal refinement network is developed to further refine the segmentation errors in each frame and ensure temporal coherence in the segmentation map.

Segmentation Semantic Segmentation +2

Paper

ICNet for Real-Time Semantic Segmentation on High-Resolution Images

17 code implementations • ECCV 2018 • Hengshuang Zhao, Xiaojuan Qi, Xiaoyong Shen, Jianping Shi, Jiaya Jia

We focus on the challenging task of real-time semantic segmentation in this paper.

Ranked #10 on Dichotomous Image Segmentation on DIS-TE4

Dichotomous Image Segmentation Real-Time Semantic Segmentation +3

2,917

Paper
Code

Zero-order Reverse Filtering

1 code implementation • ICCV 2017 • Xin Tao, Chao Zhou, Xiaoyong Shen, Jue Wang, Jiaya Jia

In this paper, we study an unconventional but practically meaningful reversibility problem of commonly used image filters.

Paper
Code

Detail-revealing Deep Video Super-resolution

1 code implementation • ICCV 2017 • Xin Tao, Hongyun Gao, Renjie Liao, Jue Wang, Jiaya Jia

In this paper, we show that proper frame alignment and motion compensation are crucial for achieving high-quality results.

Ranked #11 on Video Super-Resolution on Vid4 - 4x upscaling

Image Super-Resolution Motion Compensation +1

259

Paper
Code

High-Quality Correspondence and Segmentation Estimation for Dual-Lens Smart-Phone Portraits

no code implementations • ICCV 2017 • Xiaoyong Shen, Hongyun Gao, Xin Tao, Chao Zhou, Jiaya Jia

Estimating correspondence between two images and extracting the foreground object are two challenges in computer vision.

Paper

Convolutional Neural Pyramid for Image Processing

no code implementations • 7 Apr 2017 • Xiaoyong Shen, Ying-Cong Chen, Xin Tao, Jiaya Jia

We propose a principled convolutional neural pyramid (CNP) framework for general low-level vision and image processing tasks.

Colorization Image Enhancement +2

Paper

Pyramid Scene Parsing Network

67 code implementations • CVPR 2017 • Hengshuang Zhao, Jianping Shi, Xiaojuan Qi, Xiaogang Wang, Jiaya Jia

Scene parsing is challenging for unrestricted open vocabulary and diverse scenes.

Ranked #4 on Video Semantic Segmentation on Cityscapes val

Dichotomous Image Segmentation Image Classification +5

76,588

Paper
Code

Visual Question Answering with Question Representation Update (QRU)

no code implementations • NeurIPS 2016 • Ruiyu Li, Jiaya Jia

Our method aims at reasoning over natural language questions and visual images.

Question Answering Visual Question Answering

Paper

Multi-Scale Patch Aggregation (MPA) for Simultaneous Detection and Segmentation

no code implementations • CVPR 2016 • Shu Liu, Xiaojuan Qi, Jianping Shi, Hong Zhang, Jiaya Jia

Aiming at simultaneous detection and segmentation (SDS), we propose a proposal-free framework, which detects and segments object instances via mid-level patches.

Object Object Proposal Generation +1

Paper

ScribbleSup: Scribble-Supervised Convolutional Networks for Semantic Segmentation

no code implementations • CVPR 2016 • Di Lin, Jifeng Dai, Jiaya Jia, Kaiming He, Jian Sun

Large-scale data is of crucial importance for learning semantic segmentation models, but annotating per-pixel masks is a tedious and inefficient procedure.

Image Segmentation Segmentation +1

Paper

A Closed-Form Solution to Tensor Voting: Theory and Applications

no code implementations • 19 Jan 2016 • Tai-Pang Wu, Sai-Kit Yeung, Jiaya Jia, Chi-Keung Tang, Gerard Medioni

We prove a closed-form solution to tensor voting (CFTV): given a point set in any dimensions, our closed-form solution provides an exact, continuous and efficient algorithm for computing a structure-aware tensor that simultaneously achieves salient structure detection and outlier attenuation.

Stereo Matching Stereo Matching Hand

Paper

Contour Box: Rejecting Object Proposals Without Explicit Closed Contours

no code implementations • ICCV 2015 • Cewu Lu, Shu Liu, Jiaya Jia, Chi-Keung Tang

Closed contour is an important objectness indicator.

Object

Paper

Box Aggregation for Proposal Decimation: Last Mile of Object Detection

no code implementations • ICCV 2015 • Shu Liu, Cewu Lu, Jiaya Jia

Regions-with-convolutional-neural-network (RCNN) is now a commonly employed object detection pipeline.

Object object-detection +1

Paper

Semantic Segmentation With Object Clique Potential

no code implementations • ICCV 2015 • Xiaojuan Qi, Jianping Shi, Shu Liu, Renjie Liao, Jiaya Jia

In this paper, we propose an object clique potential for semantic segmentation.

Object Segmentation +1

Paper

Mutual-Structure for Joint Filtering

no code implementations • ICCV 2015 • Xiaoyong Shen, Chao Zhou, Li Xu, Jiaya Jia

Previous joint/guided filters directly transfer the structural information in the reference image to the target one.

Depth Completion Image Enhancement +3

Paper

Video Super-Resolution via Deep Draft-Ensemble Learning

no code implementations • ICCV 2015 • Renjie Liao, Xin Tao, Ruiyu Li, Ziyang Ma, Jiaya Jia

We propose a new direction for fast video super-resolution (VideoSR) via a SR draft ensemble, which is defined as the set of high-resolution patch candidates before final image deconvolution.

Ensemble Learning Image Deconvolution +1

Paper

ENFT: Efficient Non-Consecutive Feature Tracking for Robust Structure-from-Motion

3 code implementations • 27 Oct 2015 • Guofeng Zhang, Hao-Min Liu, Zilong Dong, Jiaya Jia, Tien-Tsin Wong, Hujun Bao

Our framework consists of steps of solving the feature 'dropout' problem when indistinctive structures, noise or large image distortion exists, and of rapidly recognizing and joining common features located in different subsequences.

250

Paper
Code

Just Noticeable Defocus Blur Detection and Estimation

no code implementations • CVPR 2015 • Jianping Shi, Li Xu, Jiaya Jia

We tackle a fundamental problem to detect and estimate just noticeable blur (JNB) caused by defocus that spans a small number of pixels in images.

Defocus Blur Detection

Paper

Deep LAC: Deep Localization, Alignment and Classification for Fine-Grained Recognition

no code implementations • CVPR 2015 • Di Lin, Xiaoyong Shen, Cewu Lu, Jiaya Jia

Our major contribution is to propose a valve linkage function(VLF) for back-propagation chaining and form our deep localization, alignment and classification (LAC) system.

Classification General Classification

Paper

Handling Motion Blur in Multi-Frame Super-Resolution

no code implementations • CVPR 2015 • Ziyang Ma, Renjie Liao, Xin Tao, Li Xu, Jiaya Jia, Enhua Wu

Ubiquitous motion blur easily causes multi-frame super-resolution (MFSR) to fail.

Image Reconstruction Multi-Frame Super-Resolution

Paper

Bounded-Distortion Metric Learning

no code implementations • 10 May 2015 • Renjie Liao, Jianping Shi, Ziyang Ma, Jun Zhu, Jiaya Jia

Metric learning aims to embed one metric space into another to benefit tasks like classification and clustering.

Clustering General Classification +1

Paper

Understanding and Diagnosing Visual Tracking Systems

no code implementations • ICCV 2015 • Naiyan Wang, Jianping Shi, Dit-Yan Yeung, Jiaya Jia

Surprisingly, our findings are discrepant with some common beliefs in the visual tracking research community.

Visual Tracking

Paper

Deep Convolutional Neural Network for Image Deconvolution

no code implementations • NeurIPS 2014 • Li Xu, Jimmy SJ. Ren, Ce Liu, Jiaya Jia

Many fundamental image-related problems involve deconvolution operators.

Ranked #1 on Image Compression on FER2013

Image Compression Image Deconvolution

Paper

Hierarchical Saliency Detection on Extended CSSD

no code implementations • 11 Aug 2014 • Jianping Shi, Qiong Yan, Li Xu, Jiaya Jia

Complex structures commonly exist in natural images.

Saliency Detection

Paper

Range-Sample Depth Feature for Action Recognition

no code implementations • CVPR 2014 • Cewu Lu, Jiaya Jia, Chi-Keung Tang

We propose binary range-sample feature in depth.

Action Recognition Temporal Action Localization

Paper

Discriminative Blur Detection Features

no code implementations • CVPR 2014 • Jianping Shi, Li Xu, Jiaya Jia

Ubiquitous image blur raises a practically important question: what are effective features to differentiate between blurred and unblurred image regions?

Deblurring

Learning Important Spatial Pooling Regions for Scene Classification

no code implementations • CVPR 2014 • Di Lin, Cewu Lu, Renjie Liao, Jiaya Jia

We address the false response influence problem when learning and applying discriminative parts to construct the mid-level representation in scene classification.

Classification General Classification +1

100+ Times Faster Weighted Median Filter (WMF)

no code implementations • CVPR 2014 • Qi Zhang, Li Xu, Jiaya Jia

Weighted median, in the form of either solver or filter, has been employed in a wide range of computer vision solutions for its beneficial properties in sparsity representation.

2D Semantic Segmentation task 1 (8 classes) Optical Flow Estimation +2

Two-Class Weather Classification

no code implementations • CVPR 2014 • Cewu Lu, Di Lin, Jiaya Jia, Chi-Keung Tang

Given a single outdoor image, this paper proposes a collaborative learning approach for labeling it as either sunny or cloudy.

Classification General Classification +1

L0 Regularized Stationary Time Estimation for Crowd Group Analysis

no code implementations • CVPR 2014 • Shuai Yi, Xiaogang Wang, Cewu Lu, Jiaya Jia

We tackle stationary crowd analysis in this paper, which is as important as modeling mobile groups in crowd scenes and finds many applications in surveillance.

ESSP: An Efficient Approach to Minimizing Dense and Nonsubmodular Energy Functions

no code implementations • 19 May 2014 • Wei Feng, Jiaya Jia, Zhi-Qiang Liu

From our study, we make some reasonable recommendations of combining existing methods that perform the best in different situations for this challenging problem.

Dense Scattering Layer Removal

no code implementations • 13 Oct 2013 • Qiong Yan, Li Xu, Jiaya Jia

We propose a new model, together with advanced optimization, to separate a thick scattering media layer from a single natural image.

Online Robust Dictionary Learning

no code implementations • CVPR 2013 • Cewu Lu, Jianping Shi, Jiaya Jia

Online dictionary learning is particularly useful for processing large-scale and dynamic data in computer vision.

Dictionary Learning

Unnatural L0 Sparse Representation for Natural Image Deblurring

no code implementations • CVPR 2013 • Li Xu, Shicheng Zheng, Jiaya Jia

We show in this paper that the success of previous maximum a posteriori (MAP) based blur removal methods partly stems from their respective intermediate steps, which implicitly or explicitly create an unnatural representation containing salient image structures.

Ranked #13 on Deblurring on RealBlur-R (trained on GoPro) (SSIM (sRGB) metric)

Deblurring Image Deblurring

Hierarchical Saliency Detection

no code implementations • CVPR 2013 • Qiong Yan, Li Xu, Jianping Shi, Jiaya Jia

When dealing with objects with complex structures, saliency detection confronts a critical problem, namely that detection accuracy could be adversely affected if salient foreground or background in an image contains small-scale high-contrast patterns.

Saliency Detection
