Search Results for author: Yukun Zhu

Found 27 papers, 14 papers with code

Rethinking Deep Face Restoration

no code implementations 29 Sep 2021 Yang Zhao, Yu-Chuan Su, Chun-Te Chu, Yandong Li, Marius Renn, Yukun Zhu, Changyou Chen, Xuhui Jia

While existing approaches for face restoration make significant progress in generating high-quality faces, they often fail to preserve facial features and cannot authentically reconstruct the faces.

Face Generation Face Reconstruction

Federated Multi-Target Domain Adaptation

no code implementations 17 Aug 2021 Chun-Han Yao, Boqing Gong, Yin Cui, Hang Qi, Yukun Zhu, Ming-Hsuan Yang

We further take the server-client and inter-client domain shifts into account and pose a domain adaptation problem with one source (centralized server data) and multiple targets (distributed client data).

Domain Adaptation Federated Learning +3

DeepLab2: A TensorFlow Library for Deep Labeling

1 code implementation 17 Jun 2021 Mark Weber, Huiyu Wang, Siyuan Qiao, Jun Xie, Maxwell D. Collins, Yukun Zhu, Liangzhe Yuan, Dahun Kim, Qihang Yu, Daniel Cremers, Laura Leal-Taixe, Alan L. Yuille, Florian Schroff, Hartwig Adam, Liang-Chieh Chen

DeepLab2 is a TensorFlow library for deep labeling, aiming to provide a state-of-the-art and easy-to-use TensorFlow codebase for general dense pixel prediction problems in computer vision.

BasisNet: Two-stage Model Synthesis for Efficient Inference

no code implementations 7 May 2021 Mingda Zhang, Chun-Te Chu, Andrey Zhmoginov, Andrew Howard, Brendan Jou, Yukun Zhu, Li Zhang, Rebecca Hwa, Adriana Kovashka

With early termination, the average cost can be further reduced to 198M MAdds while maintaining accuracy of 80.0% on ImageNet.

Joint Representation Learning and Novel Category Discovery on Single- and Multi-modal Data

no code implementations ICCV 2021 Xuhui Jia, Kai Han, Yukun Zhu, Bradley Green

This paper studies the problem of novel category discovery on single- and multi-modal data with labels from different but relevant categories.

Contrastive Learning Representation Learning

A Flexible Framework for Discovering Novel Categories with Contrastive Learning

no code implementations 1 Jan 2021 Xuhui Jia, Kai Han, Yukun Zhu, Bradley Green

This paper studies the problem of novel category discovery on single- and multi-modal data with labels from different but relevant categories.

Contrastive Learning Representation Learning

ViP-DeepLab: Learning Visual Perception with Depth-aware Video Panoptic Segmentation

1 code implementation CVPR 2021 Siyuan Qiao, Yukun Zhu, Hartwig Adam, Alan Yuille, Liang-Chieh Chen

We name this joint task as Depth-aware Video Panoptic Segmentation, and propose a new evaluation metric along with two derived datasets for it, which will be made available to the public.

Monocular Depth Estimation Panoptic Segmentation

MaX-DeepLab: End-to-End Panoptic Segmentation with Mask Transformers

2 code implementations CVPR 2021 Huiyu Wang, Yukun Zhu, Hartwig Adam, Alan Yuille, Liang-Chieh Chen

As a result, MaX-DeepLab shows a significant 7.1% PQ gain in the box-free regime on the challenging COCO dataset, closing the gap between box-based and box-free methods for the first time.

Panoptic Segmentation

Ranking Neural Checkpoints

no code implementations CVPR 2021 Yandong Li, Xuhui Jia, Ruoxin Sang, Yukun Zhu, Bradley Green, Liqiang Wang, Boqing Gong

This paper is concerned with ranking many pre-trained deep neural networks (DNNs), called checkpoints, for the transfer learning to a downstream task.

Transfer Learning

Boosting Image-based Mutual Gaze Detection using Pseudo 3D Gaze

no code implementations 15 Oct 2020 Bardia Doosti, Ching-Hui Chen, Raviteja Vemulapalli, Xuhui Jia, Yukun Zhu, Bradley Green

In this work, we focus on the task of image-based mutual gaze detection, and propose a simple and effective approach to boost the performance by using an auxiliary 3D gaze estimation task during the training phase.

Gaze Estimation Mutual Gaze

Panoptic-DeepLab: A Simple, Strong, and Fast Baseline for Bottom-Up Panoptic Segmentation

6 code implementations CVPR 2020 Bowen Cheng, Maxwell D. Collins, Yukun Zhu, Ting Liu, Thomas S. Huang, Hartwig Adam, Liang-Chieh Chen

In this work, we introduce Panoptic-DeepLab, a simple, strong, and fast system for panoptic segmentation, aiming to establish a solid baseline for bottom-up methods that can achieve comparable performance of two-stage methods while yielding fast inference speed.

Ranked #3 on Instance Segmentation on Cityscapes test (using extra training data)

Instance Segmentation Panoptic Segmentation

Search to Distill: Pearls are Everywhere but not the Eyes

no code implementations CVPR 2020 Yu Liu, Xuhui Jia, Mingxing Tan, Raviteja Vemulapalli, Yukun Zhu, Bradley Green, Xiaogang Wang

Standard Knowledge Distillation (KD) approaches distill the knowledge of a cumbersome teacher model into the parameters of a student model with a pre-defined architecture.

Ensemble Learning Face Recognition +3


2 code implementations 10 Oct 2019 Bowen Cheng, Maxwell D. Collins, Yukun Zhu, Ting Liu, Thomas S. Huang, Hartwig Adam, Liang-Chieh Chen

The semantic segmentation branch is the same as the typical design of any semantic segmentation model (e.g., DeepLab), while the instance segmentation branch is class-agnostic, involving a simple instance center regression.

Instance Segmentation Panoptic Segmentation

SPGNet: Semantic Prediction Guidance for Scene Parsing

no code implementations ICCV 2019 Bowen Cheng, Liang-Chieh Chen, Yunchao Wei, Yukun Zhu, Zilong Huang, JinJun Xiong, Thomas Huang, Wen-mei Hwu, Honghui Shi

The multi-scale context module refers to the operations to aggregate feature responses from a large spatial extent, while the single-stage encoder-decoder structure encodes the high-level semantic information in the encoder path and recovers the boundary information in the decoder path.

Pose Estimation Scene Parsing +1

Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation

66 code implementations ECCV 2018 Liang-Chieh Chen, Yukun Zhu, George Papandreou, Florian Schroff, Hartwig Adam

The former networks are able to encode multi-scale contextual information by probing the incoming features with filters or pooling operations at multiple rates and multiple effective fields-of-view, while the latter networks can capture sharper object boundaries by gradually recovering the spatial information.

Image Classification Lesion Segmentation +1

Spatially Adaptive Computation Time for Residual Networks

1 code implementation CVPR 2017 Michael Figurnov, Maxwell D. Collins, Yukun Zhu, Li Zhang, Jonathan Huang, Dmitry Vetrov, Ruslan Salakhutdinov

This paper proposes a deep learning architecture based on Residual Network that dynamically adjusts the number of executed layers for the regions of the image.

Classification General Classification +3

Skip-Thought Vectors

16 code implementations NeurIPS 2015 Ryan Kiros, Yukun Zhu, Ruslan Salakhutdinov, Richard S. Zemel, Antonio Torralba, Raquel Urtasun, Sanja Fidler

The end result is an off-the-shelf encoder that can produce highly generic sentence representations that are robust and perform well in practice.

Aligning Books and Movies: Towards Story-like Visual Explanations by Watching Movies and Reading Books

3 code implementations ICCV 2015 Yukun Zhu, Ryan Kiros, Richard Zemel, Ruslan Salakhutdinov, Raquel Urtasun, Antonio Torralba, Sanja Fidler

Books are a rich source of both fine-grained information, how a character, an object or a scene looks like, as well as high-level semantics, what someone is thinking, feeling and how these states evolve through a story.

Sentence Embedding Sentence-Embedding
