1 code implementation • ECCV 2020 • Zhongzheng Ren, Zhiding Yu, Xiaodong Yang, Ming-Yu Liu, Alexander G. Schwing, Jan Kautz
Existing work on object detection often relies on a single form of annotation: the model is trained using either accurate yet costly bounding boxes or cheaper but less expressive image-level tags.
3 code implementations • 7 Jan 2025 • NVIDIA: Niket Agarwal, Arslan Ali, Maciej Bala, Yogesh Balaji, Erik Barker, Tiffany Cai, Prithvijit Chattopadhyay, Yongxin Chen, Yin Cui, Yifan Ding, Daniel Dworakowski, Jiaojiao Fan, Michele Fenzi, Francesco Ferroni, Sanja Fidler, Dieter Fox, Songwei Ge, Yunhao Ge, Jinwei Gu, Siddharth Gururani, Ethan He, Jiahui Huang, Jacob Huffman, Pooya Jannaty, Jingyi Jin, Seung Wook Kim, Gergely Klár, Grace Lam, Shiyi Lan, Laura Leal-Taixe, Anqi Li, Zhaoshuo Li, Chen-Hsuan Lin, Tsung-Yi Lin, Huan Ling, Ming-Yu Liu, Xian Liu, Alice Luo, Qianli Ma, Hanzi Mao, Kaichun Mo, Arsalan Mousavian, Seungjun Nah, Sriharsha Niverty, David Page, Despoina Paschalidou, Zeeshan Patel, Lindsey Pavao, Morteza Ramezanali, Fitsum Reda, Xiaowei Ren, Vasanth Rao Naik Sabavat, Ed Schmerling, Stella Shi, Bartosz Stefaniak, Shitao Tang, Lyne Tchapmi, Przemek Tredak, Wei-Cheng Tseng, Jibin Varghese, Hao Wang, Haoxiang Wang, Heng Wang, Ting-Chun Wang, Fangyin Wei, Xinyue Wei, Jay Zhangjie Wu, Jiashu Xu, Wei Yang, Lin Yen-Chen, Xiaohui Zeng, Yu Zeng, Jing Zhang, Qinsheng Zhang, Yuxuan Zhang, Qingqing Zhao, Artur Zolkowski
We position a world foundation model as a general-purpose world model that can be fine-tuned into customized world models for downstream applications.
no code implementations • 12 Dec 2024 • Zekun Hao, David W. Romero, Tsung-Yi Lin, Ming-Yu Liu
Hence, scaling both the maximum face count and vertex coordinate resolution is crucial to producing high-quality meshes of realistic, complex 3D objects.
no code implementations • 11 Nov 2024 • NVIDIA: Yuval Atzmon, Maciej Bala, Yogesh Balaji, Tiffany Cai, Yin Cui, Jiaojiao Fan, Yunhao Ge, Siddharth Gururani, Jacob Huffman, Ronald Isaac, Pooya Jannaty, Tero Karras, Grace Lam, J. P. Lewis, Aaron Licata, Yen-Chen Lin, Ming-Yu Liu, Qianli Ma, Arun Mallya, Ashlee Martino-Tarr, Doug Mendez, Seungjun Nah, Chris Pruett, Fitsum Reda, Jiaming Song, Ting-Chun Wang, Fangyin Wei, Xiaohui Zeng, Yu Zeng, Qinsheng Zhang
We introduce Edify Image, a family of diffusion models capable of generating photorealistic image content with pixel-perfect accuracy.
no code implementations • 11 Nov 2024 • NVIDIA: Maciej Bala, Yin Cui, Yifan Ding, Yunhao Ge, Zekun Hao, Jon Hasselgren, Jacob Huffman, Jingyi Jin, J. P. Lewis, Zhaoshuo Li, Chen-Hsuan Lin, Yen-Chen Lin, Tsung-Yi Lin, Ming-Yu Liu, Alice Luo, Qianli Ma, Jacob Munkberg, Stella Shi, Fangyin Wei, Donglai Xiang, Jiashu Xu, Xiaohui Zeng, Qinsheng Zhang
We introduce Edify 3D, an advanced solution designed for high-quality 3D asset generation.
no code implementations • 28 Oct 2024 • Zhendong Wang, Zhaoshuo Li, Ajay Mandlekar, Zhenjia Xu, Jiaojiao Fan, Yashraj Narang, Linxi Fan, Yuke Zhu, Yogesh Balaji, Mingyuan Zhou, Ming-Yu Liu, Yu Zeng
Diffusion models, praised for their success in generative tasks, are increasingly being applied to robotics, demonstrating exceptional performance in behavior cloning.
no code implementations • 26 Sep 2024 • Jiaxiang Tang, Zhaoshuo Li, Zekun Hao, Xian Liu, Gang Zeng, Ming-Yu Liu, Qinsheng Zhang
Current auto-regressive mesh generation methods suffer from issues such as incompleteness, insufficient detail, and poor generalization.
no code implementations • 4 Sep 2024 • Kaiwen Zheng, Yongxin Chen, Hanzi Mao, Ming-Yu Liu, Jun Zhu, Qinsheng Zhang
Masked diffusion models (MDMs) have emerged as a popular research topic for generative modeling of discrete data thanks to their superior performance over other discrete diffusion models, and they now rival auto-regressive models (ARMs) on language modeling tasks.
no code implementations • 26 Jul 2024 • Boyi Li, Ligeng Zhu, Ran Tian, Shuhan Tan, Yuxiao Chen, Yao Lu, Yin Cui, Sushant Veer, Max Ehrlich, Jonah Philion, Xinshuo Weng, Fuzhao Xue, Andrew Tao, Ming-Yu Liu, Sanja Fidler, Boris Ivanovic, Trevor Darrell, Jitendra Malik, Song Han, Marco Pavone
Finally, we establish a benchmark for video captioning and introduce a leaderboard, aiming to accelerate advancements in video understanding, captioning, and data alignment.
no code implementations • CVPR 2024 • Yu Zeng, Vishal M. Patel, Haochen Wang, Xun Huang, Ting-Chun Wang, Ming-Yu Liu, Yogesh Balaji
Personalized text-to-image generation models enable users to create images that depict their individual possessions in diverse scenes, finding applications in various domains.
no code implementations • CVPR 2024 • Yunhao Ge, Xiaohui Zeng, Jacob Samuel Huffman, Tsung-Yi Lin, Ming-Yu Liu, Yin Cui
VFC consists of three steps: 1) proposal, where image-to-text captioning models propose multiple initial captions; 2) verification, where a large language model (LLM) uses tools such as object detection and VQA models to fact-check the proposed captions; 3) captioning, where an LLM generates the final caption by summarizing the caption proposals and the fact-checking results.
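The control flow of such a pipeline is simple to express. Below is a minimal sketch; `propose_captions`, `fact_check`, and `summarize` are hypothetical stand-ins for the underlying models, not the paper's actual API:

```python
# Minimal sketch of a propose / verify / summarize captioning loop.
# The three callables are placeholders, not the paper's actual API.
from typing import Callable, List, Tuple

def verified_caption(
    image,
    propose_captions: Callable,  # image -> list of candidate captions
    fact_check: Callable,        # (image, caption) -> list of (claim, verdict)
    summarize: Callable,         # (captions, checks) -> final caption
) -> str:
    # 1) Proposal: captioning models propose several initial captions.
    candidates: List[str] = propose_captions(image)
    # 2) Verification: an LLM checks each caption's claims with tools
    #    such as an object detector or a VQA model.
    checks: List[List[Tuple[str, bool]]] = [fact_check(image, c) for c in candidates]
    # 3) Captioning: an LLM summarizes proposals and verdicts into one caption.
    return summarize(candidates, checks)
```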
no code implementations • CVPR 2024 • Han Cai, Muyang Li, Zhuoyang Zhang, Qinsheng Zhang, Ming-Yu Liu, Song Han
In parallel to prior conditional control methods, CAN controls the image generation process by dynamically manipulating the weight of the neural network.
2 code implementations • CVPR 2024 • Muyang Li, Tianle Cai, Jiaxin Cao, Qinsheng Zhang, Han Cai, Junjie Bai, Yangqing Jia, Ming-Yu Liu, Kai Li, Song Han
To overcome this dilemma, we observe the high similarity between the inputs of adjacent diffusion steps and propose displaced patch parallelism, which exploits the sequential nature of the diffusion process by reusing the pre-computed feature maps from the previous timestep to provide context for the current step.
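A toy sketch of the idea, with an illustrative stand-in denoiser and shapes (not the paper's implementation): each patch is updated using the other patches' activations from the previous timestep, so devices need not synchronize within a step.

```python
# Toy sketch of displaced patch parallelism: context for each patch comes
# from the *stale* (previous-step) activations of the other patches.
import torch

def denoise_patch(patch, context):
    # Stand-in for one network evaluation on one device.
    return patch - 0.1 * (patch - context.mean())

def displaced_patch_loop(x, num_steps, num_patches=4):
    patches = list(x.chunk(num_patches, dim=-1))
    stale = [p.clone() for p in patches]  # activations from step t-1
    for _ in range(num_steps):
        fresh = []
        for i, p in enumerate(patches):
            # Cross-patch context is read from the previous step, so this
            # communication can overlap with the current step's compute.
            ctx = torch.cat([stale[j] for j in range(num_patches) if j != i], dim=-1)
            fresh.append(denoise_patch(p, ctx))
        stale = [f.clone() for f in fresh]
        patches = fresh
    return torch.cat(patches, dim=-1)

out = displaced_patch_loop(torch.randn(8, 64), num_steps=10)
```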
no code implementations • ICCV 2023 • Jonathan Lorraine, Kevin Xie, Xiaohui Zeng, Chen-Hsuan Lin, Towaki Takikawa, Nicholas Sharp, Tsung-Yi Lin, Ming-Yu Liu, Sanja Fidler, James Lucas
Text-to-3D modelling has seen exciting progress by combining generative text-to-image models with image-to-3D methods like Neural Radiance Fields.
1 code implementation • CVPR 2023 • Zhaoshuo Li, Thomas Müller, Alex Evans, Russell H. Taylor, Mathias Unberath, Ming-Yu Liu, Chen-Hsuan Lin
Neural surface reconstruction has been shown to be powerful for recovering dense 3D surfaces via image-based neural rendering.
no code implementations • ICCV 2023 • Songwei Ge, Seungjun Nah, Guilin Liu, Tyler Poon, Andrew Tao, Bryan Catanzaro, David Jacobs, Jia-Bin Huang, Ming-Yu Liu, Yogesh Balaji
Despite tremendous progress in generating high-quality images using diffusion models, synthesizing a sequence of animated frames that are both photorealistic and temporally coherent is still in its infancy.
Ranked #7 on Text-to-Video Generation on UCF-101
no code implementations • CVPR 2023 • Qinsheng Zhang, Jiaming Song, Xun Huang, Yongxin Chen, Ming-Yu Liu
We present DiffCollage, a compositional diffusion model that can generate large content by leveraging diffusion models trained on generating pieces of the large content.
no code implementations • 9 Feb 2023 • Zhuolin Yang, Wei Ping, Zihan Liu, Vijay Korthikanti, Weili Nie, De-An Huang, Linxi Fan, Zhiding Yu, Shiyi Lan, Bo Li, Ming-Yu Liu, Yuke Zhu, Mohammad Shoeybi, Bryan Catanzaro, Chaowei Xiao, Anima Anandkumar
Augmenting pretrained language models (LMs) with a vision encoder (e.g., Flamingo) has achieved state-of-the-art results in image-to-text generation.
1 code implementation • CVPR 2023 • Chen-Hsuan Lin, Jun Gao, Luming Tang, Towaki Takikawa, Xiaohui Zeng, Xun Huang, Karsten Kreis, Sanja Fidler, Ming-Yu Liu, Tsung-Yi Lin
DreamFusion has recently demonstrated the utility of a pre-trained text-to-image diffusion model to optimize Neural Radiance Fields (NeRF), achieving remarkable text-to-3D synthesis results.
Ranked #2 on Text to 3D on T³Bench
no code implementations • ICCV 2023 • Siddharth Gururani, Arun Mallya, Ting-Chun Wang, Rafael Valle, Ming-Yu Liu
It uses a multi-stage approach, combining the controllability of facial landmarks with the high-quality synthesis power of a pretrained face generator.
2 code implementations • 2 Nov 2022 • Yogesh Balaji, Seungjun Nah, Xun Huang, Arash Vahdat, Jiaming Song, Qinsheng Zhang, Karsten Kreis, Miika Aittala, Timo Aila, Samuli Laine, Bryan Catanzaro, Tero Karras, Ming-Yu Liu
Therefore, in contrast to existing works, we propose to train an ensemble of text-to-image diffusion models specialized for different synthesis stages.
Ranked #14 on Text-to-Image Generation on MS COCO
no code implementations • 4 Oct 2022 • Arun Mallya, Ting-Chun Wang, Ming-Yu Liu
We present a new implicit warping framework for image animation using sets of source images through the transfer of the motion of a driving video.
no code implementations • 21 Sep 2022 • Yu-Ying Yeh, Koki Nagano, Sameh Khamis, Jan Kautz, Ming-Yu Liu, Ting-Chun Wang
An effective approach is to supervise the training of deep neural networks with a high-fidelity dataset of desired input-output pairs, captured with a light stage.
1 code implementation • 7 Jun 2022 • Tim Brooks, Janne Hellsten, Miika Aittala, Ting-Chun Wang, Timo Aila, Jaakko Lehtinen, Ming-Yu Liu, Alexei A. Efros, Tero Karras
Existing video generation methods often fail to produce new content as a function of time while maintaining consistencies expected in real environments, such as plausible dynamics and object persistence.
no code implementations • 9 Dec 2021 • Xun Huang, Arun Mallya, Ting-Chun Wang, Ming-Yu Liu
Existing conditional image synthesis frameworks generate images based on user inputs in a single modality, such as text, segmentation, sketch, or style reference.
no code implementations • NeurIPS 2021 • Tianchang Shen, Jun Gao, Kangxue Yin, Ming-Yu Liu, Sanja Fidler
The core of DMTet includes a deformable tetrahedral grid that encodes a discretized signed distance function and a differentiable marching tetrahedra layer that converts the implicit signed distance representation to the explicit surface mesh representation.
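The differentiable conversion reduces, per tetrahedron edge, to a linear zero-crossing interpolation; a minimal sketch of that step (illustrative, not the paper's code):

```python
# Sketch of the step at the heart of differentiable marching tetrahedra:
# a surface vertex is placed on each edge whose endpoint SDF values change
# sign, by linear interpolation. Differentiable in positions and SDFs.
import torch

def edge_crossing(p_a, p_b, s_a, s_b):
    """Vertex on edge (p_a, p_b) where the interpolated SDF is zero.

    p_a, p_b: (..., 3) endpoint positions; s_a, s_b: (...,) SDF values
    with opposite signs, so the weight w lies in (0, 1).
    """
    w = s_a / (s_a - s_b)
    return p_a + w.unsqueeze(-1) * (p_b - p_a)

# Example: endpoints at z = 0 and z = 1 with SDF -0.25 and +0.75
# place the vertex at z = 0.25.
v = edge_crossing(torch.tensor([0., 0., 0.]), torch.tensor([0., 0., 1.]),
                  torch.tensor(-0.25), torch.tensor(0.75))
```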
no code implementations • 26 Jun 2021 • Jiawei Zhao, Steve Dai, Rangharajan Venkatesan, Brian Zimmer, Mustafa Ali, Ming-Yu Liu, Brucek Khailany, Bill Dally, Anima Anandkumar
Representing deep neural networks (DNNs) in low-precision is a promising approach to enable efficient acceleration and memory reduction.
no code implementations • ICCV 2021 • Zekun Hao, Arun Mallya, Serge Belongie, Ming-Yu Liu
We represent the world as a continuous volumetric function and train our model to render view-consistent photorealistic images for a user-controlled camera.
2 code implementations • CVPR 2021 • Ting-Chun Wang, Arun Mallya, Ming-Yu Liu
We propose a neural talking-head video synthesis model and demonstrate its application to video conferencing.
no code implementations • 6 Aug 2020 • Ming-Yu Liu, Xun Huang, Jiahui Yu, Ting-Chun Wang, Arun Mallya
The generative adversarial network (GAN) framework has emerged as a powerful tool for various image and video synthesis tasks, allowing the synthesis of visual content in an unconditional or input-conditional manner.
no code implementations • ECCV 2020 • Arun Mallya, Ting-Chun Wang, Karan Sapra, Ming-Yu Liu
This is because they lack knowledge of the 3D world being rendered and generate each frame only based on the past few frames.
1 code implementation • ECCV 2020 • Kuniaki Saito, Kate Saenko, Ming-Yu Liu
Unsupervised image-to-image translation intends to learn a mapping of an image in a given domain to an analogous image in a different domain, without explicit supervision of the mapping.
1 code implementation • NeurIPS 2020 • Jeremy Bernstein, Jia-Wei Zhao, Markus Meister, Ming-Yu Liu, Anima Anandkumar, Yisong Yue
This paper proves that multiplicative weight updates satisfy a descent lemma tailored to compositional functions.
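For intuition, a multiplicative update rescales each weight rather than translating it. One generic form (an illustrative rule, not necessarily the paper's exact update) is

$$ w_i \leftarrow w_i \, \exp(-\eta\, \hat g_i), $$

where $\hat g_i$ is a normalized gradient signal for weight $w_i$. Each step then changes every weight by a bounded relative factor, the kind of perturbation to which a layered, compositional function responds predictably, which is what a descent lemma needs in order to guarantee progress.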
2 code implementations • CVPR 2020 • Zhongzheng Ren, Zhiding Yu, Xiaodong Yang, Ming-Yu Liu, Yong Jae Lee, Alexander G. Schwing, Jan Kautz
Weakly supervised learning has emerged as a compelling tool for object detection by reducing the need for strong supervision during training.
Ranked #1 on Weakly Supervised Object Detection on COCO test-dev
no code implementations • 2 Mar 2020 • Kuo-Hao Zeng, Mohammad Shoeybi, Ming-Yu Liu
The style encoder extracts a style code from the reference example, and the text decoder generates texts based on the style code and the context.
no code implementations • WS 2020 • Kevin Lin, Ming-Yu Liu, Ming-Ting Sun, Jan Kautz
Specifically, we decompose the latent representation of the input sentence to a style code that captures the language style variation and a content code that encodes the language style-independent content.
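A minimal sketch of that decomposition, assuming a GRU autoencoder whose hidden state is split into a style slice and a content slice (the architecture and dimensions are illustrative, not the paper's):

```python
# Sequence autoencoder whose latent splits into style and content slices.
import torch
import torch.nn as nn

class DisentangledSeqAE(nn.Module):
    def __init__(self, vocab=1000, dim=256, style_dim=32):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        self.encoder = nn.GRU(dim, dim, batch_first=True)
        self.decoder = nn.GRU(dim, dim, batch_first=True)
        self.out = nn.Linear(dim, vocab)
        self.style_dim = style_dim

    def encode(self, tokens):                    # tokens: (B, T) token ids
        _, h = self.encoder(self.embed(tokens))  # h: (1, B, dim)
        h = h.squeeze(0)
        return h[:, :self.style_dim], h[:, self.style_dim:]  # style, content

    def decode(self, style, content, tokens):
        h0 = torch.cat([style, content], dim=-1).unsqueeze(0)
        y, _ = self.decoder(self.embed(tokens), h0)
        return self.out(y)                       # (B, T, vocab) logits

# Style transfer: encode two sentences, swap their style slices, decode.
```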
2 code implementations • NeurIPS 2020 • Jeremy Bernstein, Arash Vahdat, Yisong Yue, Ming-Yu Liu
This paper relates parameter distance to gradient breakdown for a broad class of nonlinear compositional functions.
1 code implementation • CVPR 2020 • Arash Vahdat, Arun Mallya, Ming-Yu Liu, Jan Kautz
Our framework brings the best of both worlds, and it enables us to search for architectures with both differentiable and non-differentiable criteria in one unified framework while maintaining a low search cost.
2 code implementations • NeurIPS 2019 • Hsin-Ying Lee, Xiaodong Yang, Ming-Yu Liu, Ting-Chun Wang, Yu-Ding Lu, Ming-Hsuan Yang, Jan Kautz
In the analysis phase, we decompose a dance into a series of basic dance units, through which the model learns how to move.
Ranked #3 on Motion Synthesis on BRACE
6 code implementations • NeurIPS 2019 • Ting-Chun Wang, Ming-Yu Liu, Andrew Tao, Guilin Liu, Jan Kautz, Bryan Catanzaro
To address the limitations, we propose a few-shot vid2vid framework, which learns to synthesize videos of previously unseen subjects or scenes by leveraging few example images of the target at test time.
Ranked #1 on Video-to-Video Synthesis on YouTube Dancing
no code implementations • ICCV 2019 • Hang Chu, Daiqing Li, David Acuna, Amlan Kar, Maria Shugrina, Xinkai Wei, Ming-Yu Liu, Antonio Torralba, Sanja Fidler
We propose Neural Turtle Graphics (NTG), a novel generative model for spatial graphs, and demonstrate its applications in modeling city road layouts.
12 code implementations • ICCV 2019 • Guandao Yang, Xun Huang, Zekun Hao, Ming-Yu Liu, Serge Belongie, Bharath Hariharan
Specifically, we learn a two-level hierarchy of distributions where the first level is the distribution of shapes and the second level is the distribution of points given a shape.
Ranked #4 on Point Cloud Generation on ShapeNet Car
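The two-level sampling process can be sketched in a few lines; the flows below are identity placeholders standing in for invertible networks (illustrative, not the paper's models):

```python
# Two-level sampling sketch: draw a shape latent from a shape-level flow,
# then draw points from a point-level flow conditioned on that latent.
import torch

def shape_flow(z):
    return z  # placeholder: shape-level normalizing flow

def point_flow(y, shape_latent):
    return y  # placeholder: point-level flow conditioned on the shape

def sample_point_cloud(n_points=2048, latent_dim=128):
    z = shape_flow(torch.randn(latent_dim))        # level 1: sample a shape
    pts = point_flow(torch.randn(n_points, 3), z)  # level 2: sample its points
    return pts
```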
10 code implementations • ICCV 2019 • Ming-Yu Liu, Xun Huang, Arun Mallya, Tero Karras, Timo Aila, Jaakko Lehtinen, Jan Kautz
Unsupervised image-to-image translation methods learn to map images in a given class to an analogous image in a different class, drawing on unstructured (non-registered) datasets of images.
no code implementations • ICCV 2019 • Amlan Kar, Aayush Prakash, Ming-Yu Liu, Eric Cameracci, Justin Yuan, Matt Rusiniak, David Acuna, Antonio Torralba, Sanja Fidler
Training models to high-end performance requires the availability of large labeled datasets, which are expensive to acquire.
1 code implementation • CVPR 2019 • Xitong Yang, Xiaodong Yang, Ming-Yu Liu, Fanyi Xiao, Larry Davis, Jan Kautz
In this paper, we propose the Spatio-TEmporal Progressive (STEP) action detector, a progressive learning framework for spatio-temporal action detection in videos.
Ranked #7 on Action Detection on UCF101-24
no code implementations • CVPR 2019 • Zheng Tang, Milind Naphade, Ming-Yu Liu, Xiaodong Yang, Stan Birchfield, Shuo Wang, Ratnesh Kumar, David Anastasiu, Jenq-Neng Hwang
Urban traffic optimization using traffic cameras as sensors is driving the need to advance state-of-the-art multi-target multi-camera (MTMC) tracking.
25 code implementations • CVPR 2019 • Taesung Park, Ming-Yu Liu, Ting-Chun Wang, Jun-Yan Zhu
Previous methods directly feed the semantic layout as input to the deep network, which is then processed through stacks of convolution, normalization, and nonlinearity layers.
Ranked #3 on Sketch-to-Image Translation on COCO-Stuff
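The spatially-adaptive normalization proposed here instead injects the layout through the normalization layers themselves: the segmentation map predicts spatially-varying scale and bias that modulate the normalized activations. A simplified sketch (layer choices and dimensions are illustrative):

```python
# Simplified spatially-adaptive normalization: the segmentation map predicts
# per-pixel scale and bias applied to parameter-free normalized activations.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SPADE(nn.Module):
    def __init__(self, channels, label_channels, hidden=128):
        super().__init__()
        self.norm = nn.BatchNorm2d(channels, affine=False)  # no learned affine
        self.shared = nn.Sequential(
            nn.Conv2d(label_channels, hidden, 3, padding=1), nn.ReLU())
        self.gamma = nn.Conv2d(hidden, channels, 3, padding=1)
        self.beta = nn.Conv2d(hidden, channels, 3, padding=1)

    def forward(self, x, segmap):
        seg = F.interpolate(segmap, size=x.shape[2:], mode="nearest")
        h = self.shared(seg)
        return self.norm(x) * (1 + self.gamma(h)) + self.beta(h)
```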
2 code implementations • NeurIPS 2018 • Donghoon Lee, Sifei Liu, Jinwei Gu, Ming-Yu Liu, Ming-Hsuan Yang, Jan Kautz
Learning to insert an object instance into an image in a semantically coherent manner is a challenging and interesting problem.
2 code implementations • 14 Sep 2018 • Deqing Sun, Xiaodong Yang, Ming-Yu Liu, Jan Kautz
We investigate two crucial and closely related aspects of CNNs for optical flow estimation: models and training.
Ranked #7 on Optical Flow Estimation on KITTI 2012
no code implementations • 11 Sep 2018 • Cheng Kuan Chen, Zhu Feng Pan, Min Sun, Ming-Yu Liu
It can learn to generate stylish image descriptions that are more related to image content, and it can be trained with an arbitrary monolingual corpus without collecting new pairs of images and stylish descriptions.
10 code implementations • NeurIPS 2018 • Ting-Chun Wang, Ming-Yu Liu, Jun-Yan Zhu, Guilin Liu, Andrew Tao, Jan Kautz, Bryan Catanzaro
We study the problem of video-to-video synthesis, whose goal is to learn a mapping function from an input source video (e.g., a sequence of semantic segmentation masks) to an output photorealistic video that precisely depicts the content of the source video.
Ranked #4 on Video Deraining on Video Waterdrop Removal Dataset
2 code implementations • ECCV 2018 • Varun Jampani, Deqing Sun, Ming-Yu Liu, Ming-Hsuan Yang, Jan Kautz
Superpixels provide an efficient low/mid-level representation of image data, which greatly reduces the number of image primitives for subsequent vision tasks.
no code implementations • 24 Jul 2018 • Aysegul Dundar, Ming-Yu Liu, Ting-Chun Wang, John Zedlewski, Jan Kautz
Deep neural networks have largely failed to effectively utilize synthetic data when applied to real images due to the covariate shift problem.
no code implementations • CVPR 2018 • Wei-Chih Tu, Ming-Yu Liu, Varun Jampani, Deqing Sun, Shao-Yi Chien, Ming-Hsuan Yang, Jan Kautz
Specifically, we propose a new loss function that takes the segmentation error into account for affinity learning.
14 code implementations • ECCV 2018 • Xun Huang, Ming-Yu Liu, Serge Belongie, Jan Kautz
To translate an image to another domain, we recombine its content code with a random style code sampled from the style space of the target domain.
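The translation step itself is just a recombination; a minimal sketch with placeholder encoder/decoder callables (illustrative, not the paper's architecture):

```python
# Translate an image to domain B by pairing its content code with a style
# code sampled from domain B's style prior.
import torch

def translate_to_b(x, content_encoder, decoder_b, style_dim=8):
    c = content_encoder(x)                   # content code of the input image
    s_b = torch.randn(x.size(0), style_dim)  # random style from domain B's prior
    return decoder_b(c, s_b)                 # translated image in domain B
```

Sampling several style codes for the same content code yields multimodal outputs, which is the point of the decomposition.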
12 code implementations • ECCV 2018 • Yijun Li, Ming-Yu Liu, Xueting Li, Ming-Hsuan Yang, Jan Kautz
Photorealistic image stylization concerns transferring style of a reference photo to a content photo with the constraint that the stylized photo should remain photorealistic.
no code implementations • 16 Jan 2018 • Chieh-Chi Kao, Teng-Yok Lee, Pradeep Sen, Ming-Yu Liu
Active learning, a class of algorithms that iteratively searches for the most informative samples to include in a training dataset, has been shown to be effective at annotating data for image classification.
no code implementations • 16 Jan 2018 • Huaijin Chen, Jinwei Gu, Orazio Gallo, Ming-Yu Liu, Ashok Veeraraghavan, Jan Kautz
Motion blur is a fundamental problem in computer vision as it impacts image quality and hinders inference.
no code implementations • 14 Dec 2017 • Yi-Hsuan Tsai, Ming-Yu Liu, Deqing Sun, Ming-Hsuan Yang, Jan Kautz
Specifically, we target a streaming setting where the videos to be streamed from a server to a client are all in the same domain and they have to be compressed to a small size for low-latency transmission.
21 code implementations • CVPR 2018 • Ting-Chun Wang, Ming-Yu Liu, Jun-Yan Zhu, Andrew Tao, Jan Kautz, Bryan Catanzaro
We present a new method for synthesizing high-resolution photo-realistic images from semantic label maps using conditional generative adversarial networks (conditional GANs).
Ranked #2 on Sketch-to-Image Translation on COCO-Stuff
2 code implementations • 2 Oct 2017 • Yen-Chen Lin, Ming-Yu Liu, Min Sun, Jia-Bin Huang
Our core idea is that adversarial examples targeting a neural-network-based policy are not effective against the frame prediction model.
21 code implementations • CVPR 2018 • Deqing Sun, Xiaodong Yang, Ming-Yu Liu, Jan Kautz
It then uses the warped features and features of the first image to construct a cost volume, which is processed by a CNN to estimate the optical flow.
Ranked #3 on Dense Pixel Correspondence Estimation on HPatches
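A simplified cost-volume construction, correlating first-image features against already-warped second-image features over a small displacement window (illustrative; production implementations typically use custom correlation kernels):

```python
# Build a cost volume of feature correlations over a (2d+1)^2 window.
import torch
import torch.nn.functional as F

def cost_volume(f1, f2_warped, max_disp=4):
    """f1, f2_warped: (B, C, H, W). Returns (B, (2d+1)^2, H, W)."""
    B, C, H, W = f1.shape
    f2p = F.pad(f2_warped, [max_disp] * 4)  # pad H and W on both sides
    vols = []
    for dy in range(2 * max_disp + 1):
        for dx in range(2 * max_disp + 1):
            shifted = f2p[:, :, dy:dy + H, dx:dx + W]
            vols.append((f1 * shifted).mean(dim=1))  # per-pixel correlation
    return torch.stack(vols, dim=1)
```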
5 code implementations • CVPR 2018 • Sergey Tulyakov, Ming-Yu Liu, Xiaodong Yang, Jan Kautz
The proposed framework generates a video by mapping a sequence of random vectors to a sequence of video frames.
no code implementations • CVPR 2017 • Hou-Ning Hu, Yen-Chen Lin, Ming-Yu Liu, Hsien-Tzu Cheng, Yung-Ju Chang, Min Sun
Given the main object and previously selected viewing angles, our method regresses a shift in viewing angle to move to the next one.
11 code implementations • CVPR 2017 • Zhiding Yu, Chen Feng, Ming-Yu Liu, Srikumar Ramalingam
To this end, we propose a novel end-to-end deep semantic edge learning architecture based on ResNet and a new skip-layer architecture where category-wise edge activations at the top convolution layer share and are fused with the same set of bottom layer features.
Ranked #1 on Edge Detection on SBD
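The shared-fusion idea can be sketched as a grouped 1x1 convolution over per-category stacks that all reuse the same bottom-layer features (a simplified sketch under that reading of the abstract, not the paper's exact code):

```python
# Fuse K category-wise top activations with one shared set of low-level
# side features, one fusion filter per category via a grouped 1x1 conv.
import torch
import torch.nn as nn

class SharedFusion(nn.Module):
    def __init__(self, num_classes, num_sides=3):
        super().__init__()
        self.fuse = nn.Conv2d(num_classes * (1 + num_sides), num_classes,
                              kernel_size=1, groups=num_classes)

    def forward(self, top, sides):
        # top: (B, K, H, W) category-wise edge activations (top layer).
        # sides: (B, S, H, W) shared bottom-layer features.
        B, K, H, W = top.shape
        per_class = [torch.cat([top[:, k:k + 1], sides], dim=1) for k in range(K)]
        return self.fuse(torch.cat(per_class, dim=1))  # (B, K, H, W)

out = SharedFusion(num_classes=20)(torch.randn(2, 20, 32, 32),
                                   torch.randn(2, 3, 32, 32))
```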
1 code implementation • CVPR 2017 • Hou-Ning Hu, Yen-Chen Lin, Ming-Yu Liu, Hsien-Tzu Cheng, Yung-Ju Chang, Min Sun
Watching a 360° sports video requires a viewer to continuously select a viewing angle, either through a sequence of mouse clicks or head movements.
no code implementations • 8 Mar 2017 • Yen-Chen Lin, Zhang-Wei Hong, Yuan-Hong Liao, Meng-Li Shih, Ming-Yu Liu, Min Sun
In the strategically-timed attack, the adversary aims at minimizing the agent's reward by only attacking the agent at a small subset of time steps in an episode.
8 code implementations • NeurIPS 2017 • Ming-Yu Liu, Thomas Breuel, Jan Kautz
Unsupervised image-to-image translation aims at learning a joint distribution of images in different domains by using images from the marginal distributions in individual domains.
no code implementations • 6 Feb 2017 • Kota Hara, Ming-Yu Liu, Oncel Tuzel, Amir-Massoud Farahmand
We propose augmenting deep neural networks with an attention mechanism for the visual object detection task.
4 code implementations • NeurIPS 2016 • Ming-Yu Liu, Oncel Tuzel
We propose coupled generative adversarial network (CoGAN) for learning a joint distribution of multi-domain images.
no code implementations • CVPR 2016 • Raviteja Vemulapalli, Oncel Tuzel, Ming-Yu Liu, Rama Chellappa
In contrast to the existing approaches that use discrete Conditional Random Field (CRF) models, we propose to use a Gaussian CRF model for the task of semantic segmentation.
no code implementations • 8 Jan 2016 • Kilho Son, Ming-Yu Liu, Yuichi Taguchi
We use the robotic arm to automatically collect a large amount of ToF range images containing various multipath distortions.
no code implementations • CVPR 2016 • Raviteja Vemulapalli, Oncel Tuzel, Ming-Yu Liu
We propose a novel deep network architecture for image denoising based on a Gaussian Conditional Random Field (GCRF) model.
no code implementations • 15 Jun 2015 • Ming-Yu Liu, Shuoxin Lin, Srikumar Ramalingam, Oncel Tuzel
We propose a layered street view model to encode both depth and semantic information on street view images for autonomous driving.
no code implementations • 19 Feb 2015 • Ming-Yu Liu, Arun Mallya, Oncel C. Tuzel, Xi Chen
Our idea is to pretrain the network through the task of replicating the process of hand-designed feature extraction.
no code implementations • NeurIPS 2014 • Abhishek Sharma, Oncel Tuzel, Ming-Yu Liu
Then a top-down propagation of the aggregated information takes place that enhances the contextual information of each local feature.
no code implementations • CVPR 2013 • Ming-Yu Liu, Oncel Tuzel, Yuichi Taguchi
We propose an algorithm utilizing geodesic distances to upsample a low resolution depth image using a registered high resolution color image.