Search Results for author: Zhiding Yu

Found 79 papers, 48 papers with code

UFO²: A Unified Framework towards Omni-supervised Object Detection

1 code implementation • ECCV 2020 • Zhongzheng Ren, Zhiding Yu, Xiaodong Yang, Ming-Yu Liu, Alexander G. Schwing, Jan Kautz

Existing work on object detection often relies on a single form of annotation: the model is trained using either accurate yet costly bounding boxes or cheaper but less expressive image-level tags.

Object object-detection +1

358

What is Point Supervision Worth in Video Instance Segmentation?

no code implementations • 1 Apr 2024 • Shuaiyi Huang, De-An Huang, Zhiding Yu, Shiyi Lan, Subhashree Radhakrishnan, Jose M. Alvarez, Abhinav Shrivastava, Anima Anandkumar

Video instance segmentation (VIS) is a challenging vision task that aims to detect, segment, and track objects in videos.

Instance Segmentation Object +2

LITA: Language Instructed Temporal-Localization Assistant

1 code implementation • 27 Mar 2024 • De-An Huang, Shijia Liao, Subhashree Radhakrishnan, Hongxu Yin, Pavlo Molchanov, Zhiding Yu, Jan Kautz

In addition to leveraging existing video datasets with timestamps, we propose a new task, Reasoning Temporal Localization (RTL), along with the dataset, ActivityNet-RTL, for learning and evaluating this task.

Ranked #4 on Video-based Generative Performance Benchmarking on VideoInstruct

Instruction Following Temporal Localization +2

103

Improving Distant 3D Object Detection Using 2D Box Supervision

no code implementations • 14 Mar 2024 • Zetong Yang, Zhiding Yu, Chris Choy, Renhao Wang, Anima Anandkumar, Jose M. Alvarez

This mapping allows the depth estimation of distant objects conditioned on their 2D boxes, making long-range 3D detection with 2D supervision feasible.

3D Object Detection Depth Estimation +2

T-Stitch: Accelerating Sampling in Pre-Trained Diffusion Models with Trajectory Stitching

1 code implementation • 21 Feb 2024 • Zizheng Pan, Bohan Zhuang, De-An Huang, Weili Nie, Zhiding Yu, Chaowei Xiao, Jianfei Cai, Anima Anandkumar

Sampling from diffusion probabilistic models (DPMs) is often expensive for high-quality image generation and typically requires many steps with a large model.

Image Generation

Fully Attentional Networks with Self-emerging Token Labeling

1 code implementation • ICCV 2023 • Bingyin Zhao, Zhiding Yu, Shiyi Lan, Yutao Cheng, Anima Anandkumar, Yingjie Lao, Jose M. Alvarez

With the proposed STL framework, our best model based on FAN-L-Hybrid (77.3M parameters) achieves 84.8% Top-1 accuracy and 42.1% mCE on ImageNet-1K and ImageNet-C, and sets a new state-of-the-art for ImageNet-A (46.1%) and ImageNet-R (56.6%) without using extra data, outperforming the original FAN counterpart by significant margins.

Ranked #16 on Domain Generalization on ImageNet-C

Semantic Segmentation

A Semantic Space is Worth 256 Language Descriptions: Make Stronger Segmentation Models with Descriptive Properties

1 code implementation • 21 Dec 2023 • Junfei Xiao, Ziqi Zhou, Wenxuan Li, Shiyi Lan, Jieru Mei, Zhiding Yu, Alan Yuille, Yuyin Zhou, Cihang Xie

Instead of relying solely on category-specific annotations, ProLab uses descriptive properties grounded in common sense knowledge for supervising segmentation models.

Common Sense Reasoning Descriptive +1

Is Ego Status All You Need for Open-Loop End-to-End Autonomous Driving?

1 code implementation • 5 Dec 2023 • Zhiqi Li, Zhiding Yu, Shiyi Lan, Jiahan Li, Jan Kautz, Tong Lu, Jose M. Alvarez

We initially observed that the nuScenes dataset, characterized by relatively simple driving scenarios, leads to an under-utilization of perception information in end-to-end models incorporating ego status, such as the ego vehicle's velocity.

Autonomous Driving

FocalFormer3D: Focusing on Hard Instance for 3D Object Detection

1 code implementation • 8 Aug 2023 • Yilun Chen, Zhiding Yu, Yukang Chen, Shiyi Lan, Animashree Anandkumar, Jiaya Jia, Jose Alvarez

For 3D object detection, we instantiate this method as FocalFormer3D, a simple yet effective detector that excels at excavating difficult objects and improving prediction recall.

Ranked #8 on 3D Object Detection on nuScenes

3D Object Detection Autonomous Driving +2

130

FB-BEV: BEV Representation from Forward-Backward View Transformations

1 code implementation • ICCV 2023 • Zhiqi Li, Zhiding Yu, Wenhai Wang, Anima Anandkumar, Tong Lu, Jose M. Alvarez

Currently, the two most prominent VTM paradigms are forward projection and backward projection.

537

FB-OCC: 3D Occupancy Prediction based on Forward-Backward View Transformation

1 code implementation • 4 Jul 2023 • Zhiqi Li, Zhiding Yu, David Austin, Mingsheng Fang, Shiyi Lan, Jan Kautz, Jose M. Alvarez

This technical report summarizes the winning solution for the 3D Occupancy Prediction Challenge, held in conjunction with the CVPR 2023 Workshop on End-to-End Autonomous Driving and the CVPR 2023 Workshop on Vision-Centric Autonomous Driving.

Ranked #1 on Prediction Of Occupancy Grid Maps on Occ3D-nuScenes

Autonomous Driving Prediction Of Occupancy Grid Maps

537

Differentially Private Video Activity Recognition

no code implementations • 27 Jun 2023 • Zelun Luo, Yuliang Zou, Yijin Yang, Zane Durante, De-An Huang, Zhiding Yu, Chaowei Xiao, Li Fei-Fei, Animashree Anandkumar

In recent years, differential privacy has seen significant advancements in image classification; however, its application to video activity recognition remains under-explored.

Activity Recognition Classification +2

SSCBench: Monocular 3D Semantic Scene Completion Benchmark in Street Views

1 code implementation • 15 Jun 2023 • Yiming Li, Sihang Li, Xinhao Liu, Moonjun Gong, Kenan Li, Nuo Chen, Zijun Wang, Zhiheng Li, Tao Jiang, Fisher Yu, Yue Wang, Hang Zhao, Zhiding Yu, Chen Feng

Monocular scene understanding is a foundational component of autonomous systems.

3D Semantic Scene Completion 3D Semantic Scene Completion from a single 2D image

149

Real-Time Radiance Fields for Single-Image Portrait View Synthesis

no code implementations • 3 May 2023 • Alex Trevithick, Matthew Chan, Michael Stengel, Eric R. Chan, Chao Liu, Zhiding Yu, Sameh Khamis, Manmohan Chandraker, Ravi Ramamoorthi, Koki Nagano

We present a one-shot method to infer and render a photorealistic 3D representation from a single unposed image (e.g., face portrait) in real time.

Data Augmentation Novel View Synthesis

Prismer: A Vision-Language Model with Multi-Task Experts

2 code implementations • 4 Mar 2023 • Shikun Liu, Linxi Fan, Edward Johns, Zhiding Yu, Chaowei Xiao, Anima Anandkumar

Recent vision-language models have shown impressive multi-modal generation capabilities.

Ranked #1 on Image Captioning on nocaps val

Few-Shot Learning Image Captioning +2

1,287

VoxFormer: Sparse Voxel Transformer for Camera-based 3D Semantic Scene Completion

1 code implementation • CVPR 2023 • Yiming Li, Zhiding Yu, Christopher Choy, Chaowei Xiao, Jose M. Alvarez, Sanja Fidler, Chen Feng, Anima Anandkumar

To enable such capability in AI systems, we propose VoxFormer, a Transformer-based semantic scene completion framework that can output complete 3D volumetric semantics from only 2D images.

Ranked #3 on 3D Semantic Scene Completion from a single RGB image on KITTI-360

3D Semantic Scene Completion from a single RGB image Depth Estimation

962

Re-ViLM: Retrieval-Augmented Visual Language Model for Zero and Few-Shot Image Captioning

no code implementations • 9 Feb 2023 • Zhuolin Yang, Wei Ping, Zihan Liu, Vijay Korthikanti, Weili Nie, De-An Huang, Linxi Fan, Zhiding Yu, Shiyi Lan, Bo Li, Ming-Yu Liu, Yuke Zhu, Mohammad Shoeybi, Bryan Catanzaro, Chaowei Xiao, Anima Anandkumar

Augmenting pretrained language models (LMs) with a vision encoder (e.g., Flamingo) has obtained state-of-the-art results in image-to-text generation.

Few-Shot Learning Image Captioning +3

Vision Transformers Are Good Mask Auto-Labelers

no code implementations • CVPR 2023 • Shiyi Lan, Xitong Yang, Zhiding Yu, Zuxuan Wu, Jose M. Alvarez, Anima Anandkumar

We propose Mask Auto-Labeler (MAL), a high-quality Transformer-based mask auto-labeling framework for instance segmentation using only box annotations.

Instance Segmentation Segmentation +1

End-to-end 3D Tracking with Decoupled Queries

no code implementations • ICCV 2023 • Yanwei Li, Zhiding Yu, Jonah Philion, Anima Anandkumar, Sanja Fidler, Jiaya Jia, Jose Alvarez

In this work, we present an end-to-end framework for camera-based 3D multi-object tracking, called DQTrack.

3D Multi-Object Tracking

FocalFormer3D: Focusing on Hard Instance for 3D Object Detection

1 code implementation • ICCV 2023 • Yilun Chen, Zhiding Yu, Yukang Chen, Shiyi Lan, Anima Anandkumar, Jiaya Jia, Jose M. Alvarez

For 3D object detection, we instantiate this method as FocalFormer3D, a simple yet effective detector that excels at excavating difficult objects and improving prediction recall.

3D Object Detection Autonomous Driving +2

130

1st Place Solution of The Robust Vision Challenge 2022 Semantic Segmentation Track

1 code implementation • 23 Oct 2022 • Junfei Xiao, Zhichao Xu, Shiyi Lan, Zhiding Yu, Alan Yuille, Anima Anandkumar

The model is trained on a composite dataset consisting of images from 9 datasets (ADE20K, Cityscapes, Mapillary Vistas, ScanNet, VIPER, WildDash 2, IDD, BDD, and COCO) with a simple dataset balancing strategy.

Segmentation Semantic Segmentation

Test-Time Prompt Tuning for Zero-Shot Generalization in Vision-Language Models

2 code implementations • 15 Sep 2022 • Manli Shu, Weili Nie, De-An Huang, Zhiding Yu, Tom Goldstein, Anima Anandkumar, Chaowei Xiao

In evaluating cross-dataset generalization with unseen categories, TPT performs on par with the state-of-the-art approaches that use additional training data.

Image Classification Zero-shot Generalization

117

PointDP: Diffusion-driven Purification against Adversarial Attacks on 3D Point Cloud Recognition

no code implementations • 21 Aug 2022 • Jiachen Sun, Weili Nie, Zhiding Yu, Z. Morley Mao, Chaowei Xiao

The 3D point cloud is becoming a critical data representation in many real-world applications like autonomous driving, robotics, and medical imaging.

Autonomous Driving

MinVIS: A Minimal Video Instance Segmentation Framework without Video-based Training

2 code implementations • 3 Aug 2022 • De-An Huang, Zhiding Yu, Anima Anandkumar

By only training a query-based image instance segmentation model, MinVIS outperforms the previous best result on the challenging Occluded VIS dataset by over 10% AP.

Ranked #13 on Video Instance Segmentation on YouTube-VIS validation

Instance Segmentation Segmentation +2

261

How Much More Data Do I Need? Estimating Requirements for Downstream Tasks

no code implementations • CVPR 2022 • Rafid Mahmood, James Lucas, David Acuna, Daiqing Li, Jonah Philion, Jose M. Alvarez, Zhiding Yu, Sanja Fidler, Marc T. Law

Given a small training data set and a learning algorithm, how much more data is necessary to reach a target validation or test performance?

Autonomous Driving

Bongard-HOI: Benchmarking Few-Shot Visual Reasoning for Human-Object Interactions

1 code implementation • CVPR 2022 • Huaizu Jiang, Xiaojian Ma, Weili Nie, Zhiding Yu, Yuke Zhu, Song-Chun Zhu, Anima Anandkumar

A significant gap remains between today's visual pattern recognition models and human-level visual cognition especially when it comes to few-shot learning and compositional reasoning of novel concepts.

Ranked #1 on Few-Shot Image Classification on Bongard-HOI

Benchmarking Few-Shot Image Classification +5

Understanding The Robustness in Vision Transformers

2 code implementations • 26 Apr 2022 • Daquan Zhou, Zhiding Yu, Enze Xie, Chaowei Xiao, Anima Anandkumar, Jiashi Feng, Jose M. Alvarez

Our study is motivated by the intriguing properties of the emerging visual grouping in Vision Transformers, which indicates that self-attention may promote robustness through improved mid-level representations.

Ranked #4 on Domain Generalization on ImageNet-R (using extra training data)

Domain Generalization Image Classification +3

458

RelViT: Concept-guided Vision Transformer for Visual Relational Reasoning

1 code implementation • ICLR 2022 • Xiaojian Ma, Weili Nie, Zhiding Yu, Huaizu Jiang, Chaowei Xiao, Yuke Zhu, Song-Chun Zhu, Anima Anandkumar

This task remains challenging for current deep learning algorithms since it requires addressing three key technical problems jointly: 1) identifying object entities and their properties, 2) inferring semantic relations between pairs of entities, and 3) generalizing to novel object-relation combinations, i.e., systematic generalization.

Ranked #1 on Zero-Shot Human-Object Interaction Detection on HICO

Human-Object Interaction Detection Object +5

M$^2$BEV: Multi-Camera Joint 3D Detection and Segmentation with Unified Birds-Eye View Representation

no code implementations • 11 Apr 2022 • Enze Xie, Zhiding Yu, Daquan Zhou, Jonah Philion, Anima Anandkumar, Sanja Fidler, Ping Luo, Jose M. Alvarez

In this paper, we propose M$^2$BEV, a unified framework that jointly performs 3D object detection and map segmentation in the Bird's-Eye View (BEV) space with multi-camera image inputs.

3D Object Detection object-detection +1

CoordGAN: Self-Supervised Dense Correspondences Emerge from GANs

1 code implementation • CVPR 2022 • Jiteng Mu, Shalini De Mello, Zhiding Yu, Nuno Vasconcelos, Xiaolong Wang, Jan Kautz, Sifei Liu

We represent the correspondence maps of different images as warped coordinate frames transformed from a canonical coordinate frame, i.e., the correspondence map, which describes the structure (e.g., the shape of a face), is controlled via a transformation.

Disentanglement

FreeSOLO: Learning to Segment Objects without Annotations

1 code implementation • CVPR 2022 • Xinlong Wang, Zhiding Yu, Shalini De Mello, Jan Kautz, Anima Anandkumar, Chunhua Shen, Jose M. Alvarez

FreeSOLO further demonstrates superiority as a strong pre-training method, outperforming state-of-the-art self-supervised pre-training methods by +9.8% AP when fine-tuning instance segmentation with only 5% COCO masks.

Instance Segmentation object-detection +4

309

Benchmarking Robustness of 3D Point Cloud Recognition Against Common Corruptions

5 code implementations • 28 Jan 2022 • Jiachen Sun, Qingzhao Zhang, Bhavya Kailkhura, Zhiding Yu, Chaowei Xiao, Z. Morley Mao

Deep neural networks on 3D point cloud data have been widely used in the real world, especially in safety-critical applications.

Ranked #1 on 3D Point Cloud Data Augmentation on ModelNet40-C

3D Point Cloud Classification 3D Point Cloud Data Augmentation +2

201

Adversarially Robust 3D Point Cloud Recognition Using Self-Supervisions

no code implementations • NeurIPS 2021 • Jiachen Sun, Yulong Cao, Christopher B. Choy, Zhiding Yu, Anima Anandkumar, Zhuoqing Morley Mao, Chaowei Xiao

In this paper, we systematically study the impact of various self-supervised learning proxy tasks on different architectures and threat models for 3D point clouds with adversarial training.

Adversarial Robustness Autonomous Driving +1

Coupled Segmentation and Edge Learning via Dynamic Graph Propagation

no code implementations • NeurIPS 2021 • Zhiding Yu, Rui Huang, Wonmin Byeon, Sifei Liu, Guilin Liu, Thomas Breuel, Anima Anandkumar, Jan Kautz

It is therefore interesting to study how these two tasks can be coupled to benefit each other.

Edge Detection Image Segmentation +2

AugMax: Adversarial Composition of Random Augmentations for Robust Training

1 code implementation • NeurIPS 2021 • Haotao Wang, Chaowei Xiao, Jean Kossaifi, Zhiding Yu, Anima Anandkumar, Zhangyang Wang

Diversity and hardness are two complementary dimensions of data augmentation to achieve robustness.

Data Augmentation

123

Scaling Fair Learning to Hundreds of Intersectional Groups

no code implementations • 29 Sep 2021 • Eric Zhao, De-An Huang, Hao Liu, Zhiding Yu, Anqi Liu, Olga Russakovsky, Anima Anandkumar

In real-world applications, however, there are multiple protected attributes yielding a large number of intersectional protected groups.

Attribute Fairness +1

Learning Contrastive Representation for Semantic Correspondence

no code implementations • 22 Sep 2021 • Taihong Xiao, Sifei Liu, Shalini De Mello, Zhiding Yu, Jan Kautz, Ming-Hsuan Yang

Dense correspondence across semantically related images has been extensively studied, but still faces two challenges: 1) large variations in appearance, scale and pose exist even for objects from the same category, and 2) labeling pixel-level dense correspondences is labor intensive and infeasible to scale.

Contrastive Learning Semantic correspondence

Panoptic SegFormer: Delving Deeper into Panoptic Segmentation with Transformers

1 code implementation • CVPR 2022 • Zhiqi Li, Wenhai Wang, Enze Xie, Zhiding Yu, Anima Anandkumar, Jose M. Alvarez, Ping Luo, Tong Lu

Specifically, we supervise the attention modules in the mask decoder in a layer-wise manner.

Ranked #4 on Panoptic Segmentation on COCO test-dev

Instance Segmentation Panoptic Segmentation +1

195

Not All Labels Are Equal: Rationalizing The Labeling Costs for Training Object Detection

no code implementations • CVPR 2022 • Ismail Elezi, Zhiding Yu, Anima Anandkumar, Laura Leal-Taixe, Jose M. Alvarez

Deep neural networks have reached high accuracy on object detection but their success hinges on large amounts of labeled data.

Active Learning object-detection +1

SECANT: Self-Expert Cloning for Zero-Shot Generalization of Visual Policies

1 code implementation • 17 Jun 2021 • Linxi Fan, Guanzhi Wang, De-An Huang, Zhiding Yu, Li Fei-Fei, Yuke Zhu, Anima Anandkumar

A student network then learns to mimic the expert policy by supervised learning with strong augmentations, making its representation more robust against visual variations compared to the expert.

Autonomous Driving Image Augmentation +3

Taxonomy of Machine Learning Safety: A Survey and Primer

no code implementations • 9 Jun 2021 • Sina Mohseni, Haotao Wang, Zhiding Yu, Chaowei Xiao, Zhangyang Wang, Jay Yadawa

The open-world deployment of Machine Learning (ML) algorithms in safety-critical applications such as autonomous vehicles needs to address a variety of ML vulnerabilities such as interpretability, verifiability, and performance limitations.

Autonomous Vehicles BIG-bench Machine Learning +1

SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers

23 code implementations • NeurIPS 2021 • Enze Xie, Wenhai Wang, Zhiding Yu, Anima Anandkumar, Jose M. Alvarez, Ping Luo

We present SegFormer, a simple, efficient yet powerful semantic segmentation framework which unifies Transformers with lightweight multilayer perceptron (MLP) decoders.

Ranked #1 on Semantic Segmentation on COCO-Stuff full

C++ code Semantic Segmentation +1

124,793

DiscoBox: Weakly Supervised Instance Segmentation and Semantic Correspondence from Box Supervision

3 code implementations • ICCV 2021 • Shiyi Lan, Zhiding Yu, Christopher Choy, Subhashree Radhakrishnan, Guilin Liu, Yuke Zhu, Larry S. Davis, Anima Anandkumar

We introduce DiscoBox, a novel framework that jointly learns instance segmentation and semantic correspondence using bounding box supervision.

Ranked #1 on Weakly-supervised instance segmentation on COCO 2017 val

Box-supervised Instance Segmentation Segmentation +2

Image-Level or Object-Level? A Tale of Two Resampling Strategies for Long-Tailed Detection

1 code implementation • 12 Apr 2021 • Nadine Chang, Zhiding Yu, Yu-Xiong Wang, Anima Anandkumar, Sanja Fidler, Jose M. Alvarez

As a result, image resampling alone is not enough to yield a sufficiently balanced distribution at the object level.

Object

Contrastive Syn-to-Real Generalization

2 code implementations • ICLR 2021 • Wuyang Chen, Zhiding Yu, Shalini De Mello, Sifei Liu, Jose M. Alvarez, Zhangyang Wang, Anima Anandkumar

Training on synthetic data can be beneficial for label or data-scarce scenarios.

Domain Generalization Inductive Bias

Transferable Unsupervised Robust Representation Learning

no code implementations • 1 Jan 2021 • De-An Huang, Zhiding Yu, Anima Anandkumar

We upend this view and show that URRL improves both the natural accuracy of unsupervised representation learning and its robustness to corruptions and adversarial noise.

Data Augmentation Representation Learning +1

UFO$^2$: A Unified Framework towards Omni-supervised Object Detection

no code implementations • 21 Oct 2020 • Zhongzheng Ren, Zhiding Yu, Xiaodong Yang, Ming-Yu Liu, Alexander G. Schwing, Jan Kautz

Existing work on object detection often relies on a single form of annotation: the model is trained using either accurate yet costly bounding boxes or cheaper but less expressive image-level tags.

object-detection Object Detection

Learning Calibrated Uncertainties for Domain Shift: A Distributionally Robust Learning Approach

no code implementations • 8 Oct 2020 • Haoxuan Wang, Zhiding Yu, Yisong Yue, Anima Anandkumar, Anqi Liu, Junchi Yan

We propose a framework for learning calibrated uncertainties under domain shifts, where the source (training) distribution differs from the target (test) distribution.

Density Ratio Estimation Unsupervised Domain Adaptation

Bongard-LOGO: A New Benchmark for Human-Level Concept Learning and Reasoning

1 code implementation • NeurIPS 2020 • Weili Nie, Zhiding Yu, Lei Mao, Ankit B. Patel, Yuke Zhu, Animashree Anandkumar

Inspired by the original one hundred BPs, we propose a new benchmark Bongard-LOGO for human-level concept learning and reasoning.

Novel Concepts Representation Learning +1

Distributionally Robust Learning for Unsupervised Domain Adaptation

no code implementations • 28 Sep 2020 • Haoxuan Wang, Anqi Liu, Zhiding Yu, Yisong Yue, Anima Anandkumar

This formulation motivates the use of two neural networks that are jointly trained: a discriminative network between the source and target domains for density-ratio estimation, in addition to the standard classification network.

Density Ratio Estimation Unsupervised Domain Adaptation

Delving Deeper into Anti-aliasing in ConvNets

2 code implementations • 21 Aug 2020 • Xueyan Zou, Fanyi Xiao, Zhiding Yu, Yong Jae Lee

Aliasing refers to the phenomenon that high frequency signals degenerate into completely different ones after sampling.

Instance Segmentation Segmentation +1

186

Joint Disentangling and Adaptation for Cross-Domain Person Re-Identification

1 code implementation • ECCV 2020 • Yang Zou, Xiaodong Yang, Zhiding Yu, B. V. K. Vijaya Kumar, Jan Kautz

To this end, we propose a joint learning framework that disentangles id-related/unrelated features and enforces adaptation to work on the id-related feature space exclusively.

Ranked #6 on Unsupervised Domain Adaptation on Market to MSMT

Person Re-Identification Unsupervised Domain Adaptation

Unsupervised Controllable Generation with Self-Training

no code implementations • 17 Jul 2020 • Grigorios G. Chrysos, Jean Kossaifi, Zhiding Yu, Anima Anandkumar

Instead, we propose an unsupervised framework to learn a distribution of latent codes that control the generator through self-training.

Disentanglement

Neural Networks with Recurrent Generative Feedback

1 code implementation • NeurIPS 2020 • Yujia Huang, James Gornet, Sihui Dai, Zhiding Yu, Tan Nguyen, Doris Y. Tsao, Anima Anandkumar

This mechanism can be interpreted as a form of self-consistency between the maximum a posteriori (MAP) estimation of an internal generative model and the external environment.

Adversarial Robustness

Automated Synthetic-to-Real Generalization

1 code implementation • ICML 2020 • Wuyang Chen, Zhiding Yu, Zhangyang Wang, Anima Anandkumar

Models trained on synthetic images often face degraded generalization to real data.

Domain Adaptation

Transposer: Universal Texture Synthesis Using Feature Maps as Transposed Convolution Filter

no code implementations • 14 Jul 2020 • Guilin Liu, Rohan Taori, Ting-Chun Wang, Zhiding Yu, Shiqiu Liu, Fitsum A. Reda, Karan Sapra, Andrew Tao, Bryan Catanzaro

Specifically, we directly treat the whole encoded feature map of the input texture as transposed convolution filters and the features' self-similarity map, which captures the auto-correlation information, as input to the transposed convolution.

Texture Synthesis

Uncertainty-aware multi-view co-training for semi-supervised medical image segmentation and domain adaptation

no code implementations • 28 Jun 2020 • Yingda Xia, Dong Yang, Zhiding Yu, Fengze Liu, Jinzheng Cai, Lequan Yu, Zhuotun Zhu, Daguang Xu, Alan Yuille, Holger Roth

Experiments on the NIH pancreas segmentation dataset and a multi-organ segmentation dataset show state-of-the-art performance of the proposed framework on semi-supervised medical image segmentation.

Image Segmentation Organ Segmentation +6

Instance-aware, Context-focused, and Memory-efficient Weakly Supervised Object Detection

2 code implementations • CVPR 2020 • Zhongzheng Ren, Zhiding Yu, Xiaodong Yang, Ming-Yu Liu, Yong Jae Lee, Alexander G. Schwing, Jan Kautz

Weakly supervised learning has emerged as a compelling tool for object detection by reducing the need for strong supervision during training.

Ranked #1 on Weakly Supervised Object Detection on COCO test-dev

Object object-detection +3

358

Angular Visual Hardness

no code implementations • ICML 2020 • Beidi Chen, Weiyang Liu, Zhiding Yu, Jan Kautz, Anshumali Shrivastava, Animesh Garg, Anima Anandkumar

We also find that AVH has a statistically significant correlation with human visual hardness.

Domain Generalization

Confidence Regularized Self-Training

2 code implementations • ICCV 2019 • Yang Zou, Zhiding Yu, Xiaofeng Liu, B. V. K. Vijaya Kumar, Jinsong Wang

Recent advances in domain adaptation show that deep self-training presents a powerful means for unsupervised domain adaptation.

Ranked #17 on Domain Adaptation on VisDA2017

Image Classification Semantic Segmentation +2

226

Regularizing Neural Networks via Minimizing Hyperspherical Energy

1 code implementation • CVPR 2020 • Rongmei Lin, Weiyang Liu, Zhen Liu, Chen Feng, Zhiding Yu, James M. Rehg, Li Xiong, Le Song

Inspired by the Thomson problem in physics, where the distribution of multiple mutually repelling electrons on a unit sphere can be modeled via minimizing some potential energy, hyperspherical energy minimization has demonstrated its potential in regularizing neural networks and improving their generalization power.

Joint Discriminative and Generative Learning for Person Re-identification

12 code implementations • CVPR 2019 • Zhedong Zheng, Xiaodong Yang, Zhiding Yu, Liang Zheng, Yi Yang, Jan Kautz

To this end, we propose a joint learning framework that couples re-id learning and data generation end-to-end.

Ranked #1 on Person Re-Identification on UAV-Human

Image-to-Image Translation Unsupervised Domain Adaptation +1

3,948

Partial Convolution based Padding

4 code implementations • 28 Nov 2018 • Guilin Liu, Kevin J. Shih, Ting-Chun Wang, Fitsum A. Reda, Karan Sapra, Zhiding Yu, Andrew Tao, Bryan Catanzaro

In this paper, we present a simple yet effective padding scheme that can be used as a drop-in module for existing convolutional neural networks.

General Classification Semantic Segmentation

1,198

Domain Adaptation for Semantic Segmentation via Class-Balanced Self-Training

1 code implementation • 18 Oct 2018 • Yang Zou, Zhiding Yu, B. V. K. Vijaya Kumar, Jinsong Wang

In this paper, we propose a novel UDA framework based on an iterative self-training procedure, where the problem is formulated as latent variable loss minimization and can be solved by alternately generating pseudo labels on target data and re-training the model with these labels.

Ranked #54 on Synthetic-to-Real Translation on GTAV-to-Cityscapes Labels

Pseudo Label Semantic Segmentation +2

184

Unsupervised Domain Adaptation for Semantic Segmentation via Class-Balanced Self-Training

1 code implementation • ECCV 2018 • Yang Zou, Zhiding Yu, B. V. K. Vijaya Kumar, Jinsong Wang

Recent deep networks have achieved state-of-the-art performance on a variety of semantic segmentation tasks.

Ranked #6 on Semi-Supervised Semantic Segmentation on SemanticKITTI

Image-to-Image Translation Pseudo Label +2

184

Simultaneous Edge Alignment and Learning

3 code implementations • ECCV 2018 • Zhiding Yu, Weiyang Liu, Yang Zou, Chen Feng, Srikumar Ramalingam, B. V. K. Vijaya Kumar, Jan Kautz

Edge detection is among the most fundamental vision problems for its role in perceptual grouping and its wide applications.

Edge Detection Representation Learning

126

Learning towards Minimum Hyperspherical Energy

4 code implementations • NeurIPS 2018 • Weiyang Liu, Rongmei Lin, Zhen Liu, Lixin Liu, Zhiding Yu, Bo Dai, Le Song

In light of this intuition, we reduce the redundancy regularization problem to generic energy minimization, and propose a minimum hyperspherical energy (MHE) objective as generic regularization for neural networks.

148

Decoupled Networks

1 code implementation • CVPR 2018 • Weiyang Liu, Zhen Liu, Zhiding Yu, Bo Dai, Rongmei Lin, Yisen Wang, James M. Rehg, Le Song

Inner product-based convolution has been a central component of convolutional neural networks (CNNs) and the key to learning visual representations.

Learning Strict Identity Mappings in Deep Residual Networks

1 code implementation • CVPR 2018 • Xin Yu, Zhiding Yu, Srikumar Ramalingam

A family of super deep networks, referred to as residual networks or ResNet, achieved record-beating performance in various visual tasks such as image recognition, object detection, and semantic segmentation.

object-detection Object Detection +1

6,299

Deep Hyperspherical Learning

no code implementations • NeurIPS 2017 • Weiyang Liu, Yan-Ming Zhang, Xingguo Li, Zhiding Yu, Bo Dai, Tuo Zhao, Le Song

In light of such challenges, we propose hyperspherical convolution (SphereConv), a novel learning framework that gives angular representations on hyperspheres.

Representation Learning

CASENet: Deep Category-Aware Semantic Edge Detection

11 code implementations • CVPR 2017 • Zhiding Yu, Chen Feng, Ming-Yu Liu, Srikumar Ramalingam

To this end, we propose a novel end-to-end deep semantic edge learning architecture based on ResNet and a new skip-layer architecture where category-wise edge activations at the top convolution layer share and are fused with the same set of bottom layer features.

Ranked #1 on Edge Detection on Cityscapes test

Edge Detection Object Proposal Generation +1

212

SphereFace: Deep Hypersphere Embedding for Face Recognition

21 code implementations • CVPR 2017 • Weiyang Liu, Yandong Wen, Zhiding Yu, Ming Li, Bhiksha Raj, Le Song

This paper addresses the deep face recognition (FR) problem under the open-set protocol, where ideal face features are expected to have smaller maximal intra-class distance than minimal inter-class distance under a suitably chosen metric space.

Ranked #1 on Face Verification on CK+

Face Identification Face Recognition +1

1,574

Large-Margin Softmax Loss for Convolutional Neural Networks

2 code implementations • 7 Dec 2016 • Weiyang Liu, Yandong Wen, Zhiding Yu, Meng Yang

Cross-entropy loss together with softmax is arguably one of the most commonly used supervision components in convolutional neural networks (CNNs).

General Classification

346

Jointly Learning Non-negative Projection and Dictionary with Discriminative Graph Constraints for Classification

no code implementations • 14 Nov 2015 • Weiyang Liu, Zhiding Yu, Yandong Wen, Rongmei Lin, Meng Yang

Sparse coding with dictionary learning (DL) has shown excellent classification performance.

Dictionary Learning General Classification

Structured Hough Voting for Vision-based Highway Border Detection

no code implementations • 18 Nov 2014 • Zhiding Yu, Wende Zhang, B. V. K. Vijaya Kumar, Dan Levi

We propose a vision-based highway border detection algorithm using structured Hough voting.

KCRC-LCD: Discriminative Kernel Collaborative Representation with Locality Constrained Dictionary for Visual Categorization

no code implementations • 17 Oct 2014 • Weiyang Liu, Zhiding Yu, Lijia Lu, Yandong Wen, Hui Li, Yuexian Zou

The LCD similarity measure can be kernelized under KCRC, which theoretically links CRC and LCD under the kernel method.

Classification General Classification +1

Transitive Distance Clustering with K-Means Duality

no code implementations • CVPR 2014 • Zhiding Yu, Chunjing Xu, Deyu Meng, Zhuo Hui, Fanyi Xiao, Wenbo Liu, Jianzhuang Liu

We propose a very intuitive and simple approximation for the conventional spectral clustering methods.

Clustering Image Segmentation +1

Multi-Task Regularization with Covariance Dictionary for Linear Classifiers

no code implementations • 21 Oct 2013 • Fanyi Xiao, Ruikun Luo, Zhiding Yu

In this paper we propose a multi-task linear classifier learning problem called D-SVM (Dictionary SVM).

Transfer Learning valid

Constructing the L2-Graph for Robust Subspace Learning and Subspace Clustering

no code implementations • 5 Sep 2012 • Xi Peng, Zhiding Yu, Huajin Tang, Zhang Yi

Under the framework of graph-based learning, the key to robust subspace clustering and subspace learning is to obtain a good similarity graph that eliminates the effects of errors and retains only connections between the data points from the same subspace (i.e., intra-subspace data points).

Clustering Image Clustering +1
