Search Results for author: Ming-Yu Liu

Found 56 papers, 27 papers with code

UFO²: A Unified Framework towards Omni-supervised Object Detection

1 code implementation ECCV 2020 Zhongzheng Ren, Zhiding Yu, Xiaodong Yang, Ming-Yu Liu, Alexander G. Schwing, Jan Kautz

Existing work on object detection often relies on a single form of annotation: the model is trained using either accurate yet costly bounding boxes or cheaper but less expressive image-level tags.

object-detection Object Detection

Generating Long Videos of Dynamic Scenes

no code implementations 7 Jun 2022 Tim Brooks, Janne Hellsten, Miika Aittala, Ting-Chun Wang, Timo Aila, Jaakko Lehtinen, Ming-Yu Liu, Alexei A. Efros, Tero Karras

Existing video generation methods often fail to produce new content as a function of time while maintaining consistencies expected in real environments, such as plausible dynamics and object persistence.

Video Generation

Multimodal Conditional Image Synthesis with Product-of-Experts GANs

no code implementations 9 Dec 2021 Xun Huang, Arun Mallya, Ting-Chun Wang, Ming-Yu Liu

Existing conditional image synthesis frameworks generate images based on user inputs in a single modality, such as text, segmentation, sketch, or style reference.

Image-to-Image Translation

Deep Marching Tetrahedra: a Hybrid Representation for High-Resolution 3D Shape Synthesis

no code implementations NeurIPS 2021 Tianchang Shen, Jun Gao, Kangxue Yin, Ming-Yu Liu, Sanja Fidler

The core of DMTet includes a deformable tetrahedral grid that encodes a discretized signed distance function and a differentiable marching tetrahedra layer that converts the implicit signed distance representation to the explicit surface mesh representation.

Low-Precision Training in Logarithmic Number System using Multiplicative Weight Update

no code implementations 26 Jun 2021 Jiawei Zhao, Steve Dai, Rangharajan Venkatesan, Ming-Yu Liu, Brucek Khailany, Bill Dally, Anima Anandkumar

Representing deep neural networks (DNNs) in low-precision is a promising approach to enable efficient acceleration and memory reduction.

Computer Vision Quantization

GANcraft: Unsupervised 3D Neural Rendering of Minecraft Worlds

no code implementations ICCV 2021 Zekun Hao, Arun Mallya, Serge Belongie, Ming-Yu Liu

We represent the world as a continuous volumetric function and train our model to render view-consistent photorealistic images for a user-controlled camera.

Neural Rendering

One-Shot Free-View Neural Talking-Head Synthesis for Video Conferencing

1 code implementation CVPR 2021 Ting-Chun Wang, Arun Mallya, Ming-Yu Liu

We propose a neural talking-head video synthesis model and demonstrate its application to video conferencing.

UFO²: A Unified Framework towards Omni-supervised Object Detection

no code implementations 21 Oct 2020 Zhongzheng Ren, Zhiding Yu, Xiaodong Yang, Ming-Yu Liu, Alexander G. Schwing, Jan Kautz

Existing work on object detection often relies on a single form of annotation: the model is trained using either accurate yet costly bounding boxes or cheaper but less expressive image-level tags.

object-detection Object Detection

Generative Adversarial Networks for Image and Video Synthesis: Algorithms and Applications

no code implementations 6 Aug 2020 Ming-Yu Liu, Xun Huang, Jiahui Yu, Ting-Chun Wang, Arun Mallya

The generative adversarial network (GAN) framework has emerged as a powerful tool for various image and video synthesis tasks, allowing the synthesis of visual content in an unconditional or input-conditional manner.

Neural Rendering Translation

World-Consistent Video-to-Video Synthesis

no code implementations ECCV 2020 Arun Mallya, Ting-Chun Wang, Karan Sapra, Ming-Yu Liu

This is because they lack knowledge of the 3D world being rendered and generate each frame only based on the past few frames.

Video-to-Video Synthesis

COCO-FUNIT: Few-Shot Unsupervised Image Translation with a Content Conditioned Style Encoder

1 code implementation ECCV 2020 Kuniaki Saito, Kate Saenko, Ming-Yu Liu

Unsupervised image-to-image translation intends to learn a mapping of an image in a given domain to an analogous image in a different domain, without explicit supervision of the mapping.

Translation Unsupervised Image-To-Image Translation

Learning compositional functions via multiplicative weight updates

1 code implementation NeurIPS 2020 Jeremy Bernstein, Jia-Wei Zhao, Markus Meister, Ming-Yu Liu, Anima Anandkumar, Yisong Yue

This paper proves that multiplicative weight updates satisfy a descent lemma tailored to compositional functions.

Style Example-Guided Text Generation using Generative Adversarial Transformers

no code implementations 2 Mar 2020 Kuo-Hao Zeng, Mohammad Shoeybi, Ming-Yu Liu

The style encoder extracts a style code from the reference example, and the text decoder generates texts based on the style code and the context.

Text Generation

Learning to Generate Multiple Style Transfer Outputs for an Input Sentence

no code implementations WS 2020 Kevin Lin, Ming-Yu Liu, Ming-Ting Sun, Jan Kautz

Specifically, we decompose the latent representation of the input sentence to a style code that captures the language style variation and a content code that encodes the language style-independent content.

Style Transfer Text Style Transfer

On the distance between two neural networks and the stability of learning

1 code implementation NeurIPS 2020 Jeremy Bernstein, Arash Vahdat, Yisong Yue, Ming-Yu Liu

This paper relates parameter distance to gradient breakdown for a broad class of nonlinear compositional functions.

UNAS: Differentiable Architecture Search Meets Reinforcement Learning

1 code implementation CVPR 2020 Arash Vahdat, Arun Mallya, Ming-Yu Liu, Jan Kautz

Our framework brings the best of both worlds, and it enables us to search for architectures with both differentiable and non-differentiable criteria in one unified framework while maintaining a low search cost.

Neural Architecture Search reinforcement-learning

Dancing to Music

1 code implementation NeurIPS 2019 Hsin-Ying Lee, Xiaodong Yang, Ming-Yu Liu, Ting-Chun Wang, Yu-Ding Lu, Ming-Hsuan Yang, Jan Kautz

In the analysis phase, we decompose a dance into a series of basic dance units, through which the model learns how to move.

Few-shot Video-to-Video Synthesis

6 code implementations NeurIPS 2019 Ting-Chun Wang, Ming-Yu Liu, Andrew Tao, Guilin Liu, Jan Kautz, Bryan Catanzaro

To address the limitations, we propose a few-shot vid2vid framework, which learns to synthesize videos of previously unseen subjects or scenes by leveraging few example images of the target at test time.

Video-to-Video Synthesis

Neural Turtle Graphics for Modeling City Road Layouts

no code implementations ICCV 2019 Hang Chu, Daiqing Li, David Acuna, Amlan Kar, Maria Shugrina, Xinkai Wei, Ming-Yu Liu, Antonio Torralba, Sanja Fidler

We propose Neural Turtle Graphics (NTG), a novel generative model for spatial graphs, and demonstrate its applications in modeling city road layouts.

PointFlow: 3D Point Cloud Generation with Continuous Normalizing Flows

11 code implementations ICCV 2019 Guandao Yang, Xun Huang, Zekun Hao, Ming-Yu Liu, Serge Belongie, Bharath Hariharan

Specifically, we learn a two-level hierarchy of distributions where the first level is the distribution of shapes and the second level is the distribution of points given a shape.

Point Cloud Generation Variational Inference

Few-Shot Unsupervised Image-to-Image Translation

9 code implementations ICCV 2019 Ming-Yu Liu, Xun Huang, Arun Mallya, Tero Karras, Timo Aila, Jaakko Lehtinen, Jan Kautz

Unsupervised image-to-image translation methods learn to map images in a given class to an analogous image in a different class, drawing on unstructured (non-registered) datasets of images.

Translation Unsupervised Image-To-Image Translation

Meta-Sim: Learning to Generate Synthetic Datasets

no code implementations ICCV 2019 Amlan Kar, Aayush Prakash, Ming-Yu Liu, Eric Cameracci, Justin Yuan, Matt Rusiniak, David Acuna, Antonio Torralba, Sanja Fidler

Training models to high-end performance requires availability of large labeled datasets, which are expensive to get.

STEP: Spatio-Temporal Progressive Learning for Video Action Detection

1 code implementation CVPR 2019 Xitong Yang, Xiaodong Yang, Ming-Yu Liu, Fanyi Xiao, Larry Davis, Jan Kautz

In this paper, we propose Spatio-TEmporal Progressive (STEP) action detector, a progressive learning framework for spatio-temporal action detection in videos.

Action Detection Action Recognition

Semantic Image Synthesis with Spatially-Adaptive Normalization

25 code implementations CVPR 2019 Taesung Park, Ming-Yu Liu, Ting-Chun Wang, Jun-Yan Zhu

Previous methods directly feed the semantic layout as input to the deep network, which is then processed through stacks of convolution, normalization, and nonlinearity layers.

Image-to-Image Translation Sketch-to-Image Translation

Context-Aware Synthesis and Placement of Object Instances

1 code implementation NeurIPS 2018 Donghoon Lee, Sifei Liu, Jinwei Gu, Ming-Yu Liu, Ming-Hsuan Yang, Jan Kautz

Learning to insert an object instance into an image in a semantically coherent manner is a challenging and interesting problem.

Scene Parsing

Models Matter, So Does Training: An Empirical Study of CNNs for Optical Flow Estimation

1 code implementation 14 Sep 2018 Deqing Sun, Xiaodong Yang, Ming-Yu Liu, Jan Kautz

We investigate two crucial and closely related aspects of CNNs for optical flow estimation: models and training.

Optical Flow Estimation

Unsupervised Stylish Image Description Generation via Domain Layer Norm

no code implementations 11 Sep 2018 Cheng Kuan Chen, Zhu Feng Pan, Min Sun, Ming-Yu Liu

It can learn to generate stylish image descriptions that are more related to image content and can be trained with the arbitrary monolingual corpus without collecting new paired image and stylish descriptions.

Video-to-Video Synthesis

11 code implementations NeurIPS 2018 Ting-Chun Wang, Ming-Yu Liu, Jun-Yan Zhu, Guilin Liu, Andrew Tao, Jan Kautz, Bryan Catanzaro

We study the problem of video-to-video synthesis, whose goal is to learn a mapping function from an input source video (e.g., a sequence of semantic segmentation masks) to an output photorealistic video that precisely depicts the content of the source video.

Semantic Segmentation Video Prediction +1

Superpixel Sampling Networks

2 code implementations ECCV 2018 Varun Jampani, Deqing Sun, Ming-Yu Liu, Ming-Hsuan Yang, Jan Kautz

Superpixels provide an efficient low/mid-level representation of image data, which greatly reduces the number of image primitives for subsequent vision tasks.

Superpixels

Domain Stylization: A Strong, Simple Baseline for Synthetic to Real Image Domain Adaptation

no code implementations 24 Jul 2018 Aysegul Dundar, Ming-Yu Liu, Ting-Chun Wang, John Zedlewski, Jan Kautz

Deep neural networks have largely failed to effectively utilize synthetic data when applied to real images due to the covariate shift problem.

Domain Adaptation object-detection +4

A Closed-form Solution to Photorealistic Image Stylization

12 code implementations ECCV 2018 Yijun Li, Ming-Yu Liu, Xueting Li, Ming-Hsuan Yang, Jan Kautz

Photorealistic image stylization concerns transferring style of a reference photo to a content photo with the constraint that the stylized photo should remain photorealistic.

Image Stylization

Localization-Aware Active Learning for Object Detection

no code implementations 16 Jan 2018 Chieh-Chi Kao, Teng-Yok Lee, Pradeep Sen, Ming-Yu Liu

Active learning - a class of algorithms that iteratively searches for the most informative samples to include in a training dataset - has been shown to be effective at annotating data for image classification.

Active Learning Classification +6

Reblur2Deblur: Deblurring Videos via Self-Supervised Learning

no code implementations 16 Jan 2018 Huaijin Chen, Jinwei Gu, Orazio Gallo, Ming-Yu Liu, Ashok Veeraraghavan, Jan Kautz

Motion blur is a fundamental problem in computer vision as it impacts image quality and hinders inference.

Computer Vision Deblurring +2

Learning Binary Residual Representations for Domain-specific Video Streaming

no code implementations 14 Dec 2017 Yi-Hsuan Tsai, Ming-Yu Liu, Deqing Sun, Ming-Hsuan Yang, Jan Kautz

Specifically, we target a streaming setting where the videos to be streamed from a server to a client are all in the same domain and they have to be compressed to a small size for low-latency transmission.

Video Compression

High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs

16 code implementations CVPR 2018 Ting-Chun Wang, Ming-Yu Liu, Jun-Yan Zhu, Andrew Tao, Jan Kautz, Bryan Catanzaro

We present a new method for synthesizing high-resolution photo-realistic images from semantic label maps using conditional generative adversarial networks (conditional GANs).

Conditional Image Generation Fundus to Angiography Generation +3

Detecting Adversarial Attacks on Neural Network Policies with Visual Foresight

1 code implementation 2 Oct 2017 Yen-Chen Lin, Ming-Yu Liu, Min Sun, Jia-Bin Huang

Our core idea is that the adversarial examples targeting at a neural network-based policy are not effective for the frame prediction model.

Autonomous Vehicles Decision Making +1

Deep 360 Pilot: Learning a Deep Agent for Piloting Through 360° Sports Videos

no code implementations CVPR 2017 Hou-Ning Hu, Yen-Chen Lin, Ming-Yu Liu, Hsien-Tzu Cheng, Yung-Ju Chang, Min Sun

Given the main object and previously selected viewing angles, our method regresses a shift in viewing angle to move to the next one.

CASENet: Deep Category-Aware Semantic Edge Detection

11 code implementations CVPR 2017 Zhiding Yu, Chen Feng, Ming-Yu Liu, Srikumar Ramalingam

To this end, we propose a novel end-to-end deep semantic edge learning architecture based on ResNet and a new skip-layer architecture where category-wise edge activations at the top convolution layer share and are fused with the same set of bottom layer features.

Edge Detection Object Proposal Generation +1

Deep 360 Pilot: Learning a Deep Agent for Piloting through 360° Sports Video

1 code implementation CVPR 2017 Hou-Ning Hu, Yen-Chen Lin, Ming-Yu Liu, Hsien-Tzu Cheng, Yung-Ju Chang, Min Sun

Watching a 360° sports video requires a viewer to continuously select a viewing angle, either through a sequence of mouse clicks or head movements.

Tactics of Adversarial Attack on Deep Reinforcement Learning Agents

no code implementations 8 Mar 2017 Yen-Chen Lin, Zhang-Wei Hong, Yuan-Hong Liao, Meng-Li Shih, Ming-Yu Liu, Min Sun

In the strategically-timed attack, the adversary aims at minimizing the agent's reward by only attacking the agent at a small subset of time steps in an episode.

Adversarial Attack Atari Games +1

Unsupervised Image-to-Image Translation Networks

9 code implementations NeurIPS 2017 Ming-Yu Liu, Thomas Breuel, Jan Kautz

Unsupervised image-to-image translation aims at learning a joint distribution of images in different domains by using images from the marginal distributions in individual domains.

Domain Adaptation Multimodal Unsupervised Image-To-Image Translation +2

Attentional Network for Visual Object Detection

no code implementations 6 Feb 2017 Kota Hara, Ming-Yu Liu, Oncel Tuzel, Amir-Massoud Farahmand

We propose augmenting deep neural networks with an attention mechanism for the visual object detection task.

object-detection Object Detection

Gaussian Conditional Random Field Network for Semantic Segmentation

no code implementations CVPR 2016 Raviteja Vemulapalli, Oncel Tuzel, Ming-Yu Liu, Rama Chellappa

In contrast to the existing approaches that use discrete Conditional Random Field (CRF) models, we propose to use a Gaussian CRF model for the task of semantic segmentation.

Semantic Segmentation

Learning to Remove Multipath Distortions in Time-of-Flight Range Images for a Robotic Arm Setup

no code implementations 8 Jan 2016 Kilho Son, Ming-Yu Liu, Yuichi Taguchi

We use the robotic arm to automatically collect a large amount of ToF range images containing various multipath distortions.

Layered Interpretation of Street View Images

no code implementations 15 Jun 2015 Ming-Yu Liu, Shuoxin Lin, Srikumar Ramalingam, Oncel Tuzel

We propose a layered street view model to encode both depth and semantic information on street view images for autonomous driving.

Autonomous Driving Scene Labeling +1

Unsupervised Network Pretraining via Encoding Human Design

no code implementations 19 Feb 2015 Ming-Yu Liu, Arun Mallya, Oncel C. Tuzel, Xi Chen

Our idea is to pretrain the network through the task of replicating the process of hand-designed feature extraction.

Computer Vision Object Recognition

Recursive Context Propagation Network for Semantic Scene Labeling

no code implementations NeurIPS 2014 Abhishek Sharma, Oncel Tuzel, Ming-Yu Liu

Then a top-down propagation of the aggregated information takes place that enhances the contextual information of each local feature.

Scene Labeling

Joint Geodesic Upsampling of Depth Images

no code implementations CVPR 2013 Ming-Yu Liu, Oncel Tuzel, Yuichi Taguchi

We propose an algorithm utilizing geodesic distances to upsample a low resolution depth image using a registered high resolution color image.
