Search Results for author: Jan Kautz

Found 145 papers, 64 papers with code

UFO²: A Unified Framework towards Omni-supervised Object Detection

1 code implementation ECCV 2020 Zhongzheng Ren, Zhiding Yu, Xiaodong Yang, Ming-Yu Liu, Alexander G. Schwing, Jan Kautz

Existing work on object detection often relies on a single form of annotation: the model is trained using either accurate yet costly bounding boxes or cheaper but less expressive image-level tags.

object-detection Object Detection

Global Context Vision Transformers

1 code implementation 20 Jun 2022 Ali Hatamizadeh, Hongxu Yin, Jan Kautz, Pavlo Molchanov

We propose global context vision transformer (GC ViT), a novel architecture that enhances parameter and compute utilization.

Image Classification Inductive Bias +4

RTMV: A Ray-Traced Multi-View Synthetic Dataset for Novel View Synthesis

no code implementations 14 May 2022 Jonathan Tremblay, Moustafa Meshry, Alex Evans, Jan Kautz, Alexander Keller, Sameh Khamis, Charles Loop, Nathan Morrical, Koki Nagano, Towaki Takikawa, Stan Birchfield

We present a large-scale synthetic dataset for novel view synthesis consisting of ~300k images rendered from nearly 2000 complex scenes using high-quality ray tracing at high resolution (1600 x 1600 pixels).

Novel View Synthesis

CoordGAN: Self-Supervised Dense Correspondences Emerge from GANs

no code implementations CVPR 2022 Jiteng Mu, Shalini De Mello, Zhiding Yu, Nuno Vasconcelos, Xiaolong Wang, Jan Kautz, Sifei Liu

We represent the correspondence maps of different images as warped coordinate frames transformed from a canonical coordinate frame, i.e., the correspondence map, which describes the structure (e.g., the shape of a face), is controlled via a transformation.

Disentanglement

DRaCoN -- Differentiable Rasterization Conditioned Neural Radiance Fields for Articulated Avatars

no code implementations 29 Mar 2022 Amit Raj, Umar Iqbal, Koki Nagano, Sameh Khamis, Pavlo Molchanov, James Hays, Jan Kautz

In this work, we present DRaCoN, a framework for learning full-body volumetric avatars which exploits the advantages of both 2D and 3D neural rendering techniques.

Neural Rendering

GradViT: Gradient Inversion of Vision Transformers

no code implementations CVPR 2022 Ali Hatamizadeh, Hongxu Yin, Holger Roth, Wenqi Li, Jan Kautz, Daguang Xu, Pavlo Molchanov

In this work we demonstrate the vulnerability of vision transformers (ViTs) to gradient-based inversion attacks.

Physics Informed RNN-DCT Networks for Time-Dependent Partial Differential Equations

no code implementations 24 Feb 2022 Benjamin Wu, Oliver Hennigh, Jan Kautz, Sanjay Choudhry, Wonmin Byeon

This efficiently and flexibly produces a compressed representation which is used for additional conditioning of physics-informed models.

FreeSOLO: Learning to Segment Objects without Annotations

1 code implementation CVPR 2022 Xinlong Wang, Zhiding Yu, Shalini De Mello, Jan Kautz, Anima Anandkumar, Chunhua Shen, Jose M. Alvarez

FreeSOLO further demonstrates superiority as a strong pre-training method, outperforming state-of-the-art self-supervised pre-training methods by +9.8% AP when fine-tuning instance segmentation with only 5% COCO masks.

Instance Segmentation object-detection +2

GroupViT: Semantic Segmentation Emerges from Text Supervision

1 code implementation CVPR 2022 Jiarui Xu, Shalini De Mello, Sifei Liu, Wonmin Byeon, Thomas Breuel, Jan Kautz, Xiaolong Wang

With only text supervision and without any pixel-level annotations, GroupViT learns to group together semantic regions and successfully transfers to the task of semantic segmentation in a zero-shot manner, i.e., without any further fine-tuning.

object-detection Object Detection +3

Federated Learning with Heterogeneous Architectures using Graph HyperNetworks

no code implementations 20 Jan 2022 Or Litany, Haggai Maron, David Acuna, Jan Kautz, Gal Chechik, Sanja Fidler

Standard Federated Learning (FL) techniques are limited to clients with identical network architectures.

Federated Learning

A-ViT: Adaptive Tokens for Efficient Vision Transformer

no code implementations CVPR 2022 Hongxu Yin, Arash Vahdat, Jose Alvarez, Arun Mallya, Jan Kautz, Pavlo Molchanov

A-ViT achieves this by automatically reducing the number of tokens in vision transformers that are processed in the network as inference proceeds.

Image Classification

GLAMR: Global Occlusion-Aware Human Mesh Recovery with Dynamic Cameras

1 code implementation CVPR 2022 Ye Yuan, Umar Iqbal, Pavlo Molchanov, Kris Kitani, Jan Kautz

Since the joint reconstruction of human motions and camera poses is underconstrained, we propose a global trajectory predictor that generates global human trajectories based on local body movements.

3D Human Pose Estimation Human Mesh Recovery

Learning Continuous Environment Fields via Implicit Functions

no code implementations ICLR 2022 Xueting Li, Shalini De Mello, Xiaolong Wang, Ming-Hsuan Yang, Jan Kautz, Sifei Liu

We propose a novel scene representation that encodes reaching distance -- the distance from any position in the scene to a goal along a feasible trajectory.

Trajectory Prediction

NViT: Vision Transformer Compression and Parameter Redistribution

no code implementations 10 Oct 2021 Huanrui Yang, Hongxu Yin, Pavlo Molchanov, Hai Li, Jan Kautz

On ImageNet-1K, we prune the DEIT-Base (Touvron et al., 2021) model to a 2.6x FLOPs reduction, 5.1x parameter reduction, and 1.9x run-time speedup with only 0.07% loss in accuracy.

Hardware-Aware Network Transformation

no code implementations 29 Sep 2021 Pavlo Molchanov, Jimmy Hall, Hongxu Yin, Jan Kautz, Nicolo Fusi, Arash Vahdat

In the second phase, it solves the combinatorial selection of efficient operations using a novel constrained integer linear optimization approach.

Neural Architecture Search

Learning Contrastive Representation for Semantic Correspondence

no code implementations 22 Sep 2021 Taihong Xiao, Sifei Liu, Shalini De Mello, Zhiding Yu, Jan Kautz, Ming-Hsuan Yang

Dense correspondence across semantically related images has been extensively studied, but still faces two challenges: 1) large variations in appearance, scale and pose exist even for objects from the same category, and 2) labeling pixel-level dense correspondences is labor intensive and infeasible to scale.

Contrastive Learning Semantic correspondence

Learning Indoor Inverse Rendering with 3D Spatially-Varying Lighting

no code implementations ICCV 2021 Zian Wang, Jonah Philion, Sanja Fidler, Jan Kautz

In this paper, we propose a unified, learning-based inverse rendering framework that formulates 3D spatially-varying lighting.

Image-to-Image Translation Translation

Deep Neural Networks are Surprisingly Reversible: A Baseline for Zero-Shot Inversion

no code implementations 13 Jul 2021 Xin Dong, Hongxu Yin, Jose M. Alvarez, Jan Kautz, Pavlo Molchanov

Understanding the behavior and vulnerability of pre-trained deep neural networks (DNNs) can help to improve them.

LANA: Latency Aware Network Acceleration

no code implementations 12 Jul 2021 Pavlo Molchanov, Jimmy Hall, Hongxu Yin, Jan Kautz, Nicolo Fusi, Arash Vahdat

We analyze three popular network architectures: EfficientNetV1, EfficientNetV2 and ResNeST, and achieve accuracy improvement for all models (up to 3.0%) when compressing larger models to the latency level of smaller models.

Neural Architecture Search Quantization

Self-Supervised Learning on 3D Point Clouds by Learning Discrete Generative Models

no code implementations CVPR 2021 Benjamin Eckart, Wentao Yuan, Chao Liu, Jan Kautz

In this work, we introduce a general method for 3D self-supervised representation learning that 1) remains agnostic to the underlying neural network architecture, and 2) specifically leverages the geometric nature of 3D point cloud data.

Point Cloud Segmentation Representation Learning +3

Score-based Generative Modeling in Latent Space

1 code implementation NeurIPS 2021 Arash Vahdat, Karsten Kreis, Jan Kautz

Moving from data to latent space allows us to train more expressive generative models, apply SGMs to non-continuous data, and learn smoother SGMs in a smaller space, resulting in fewer network evaluations and faster sampling.

Image Generation

Adversarial Motion Modelling helps Semi-supervised Hand Pose Estimation

no code implementations 10 Jun 2021 Adrian Spurr, Pavlo Molchanov, Umar Iqbal, Jan Kautz, Otmar Hilliges

Hand pose estimation is difficult due to different environmental conditions, object- and self-occlusion as well as diversity in hand shape and appearance.

Hand Pose Estimation

Weakly-Supervised Physically Unconstrained Gaze Estimation

1 code implementation CVPR 2021 Rakshit Kothari, Shalini De Mello, Umar Iqbal, Wonmin Byeon, Seonwook Park, Jan Kautz

A major challenge for physically unconstrained gaze estimation is acquiring training data with 3D gaze annotations for in-the-wild and outdoor scenarios.

Domain Generalization Gaze Estimation

KAMA: 3D Keypoint Aware Body Mesh Articulation

no code implementations 27 Apr 2021 Umar Iqbal, Kevin Xie, Yunrong Guo, Jan Kautz, Pavlo Molchanov

We present KAMA, a 3D Keypoint Aware Mesh Articulation approach that allows us to estimate a human body mesh from the positions of 3D body keypoints.

3D Human Pose Estimation 3D Human Shape Estimation +1

See through Gradients: Image Batch Recovery via GradInversion

2 code implementations CVPR 2021 Hongxu Yin, Arun Mallya, Arash Vahdat, Jose M. Alvarez, Jan Kautz, Pavlo Molchanov

In this work, we introduce GradInversion, using which input images from a larger batch (8 - 48 images) can also be recovered for large networks such as ResNets (50 layers), on complex datasets such as ImageNet (1000 classes, 224x224 px).

Federated Learning Inference Attack +1

Learning to Track Instances without Video Annotations

no code implementations CVPR 2021 Yang Fu, Sifei Liu, Umar Iqbal, Shalini De Mello, Humphrey Shi, Jan Kautz

Tracking segmentation masks of multiple instances has been intensively studied, but still faces two fundamental challenges: 1) the requirement of large-scale, frame-wise annotation, and 2) the complexity of two-stage approaches.

Instance Segmentation Pose Estimation +1

Neural 3D Clothes Retargeting from a Single Image

no code implementations 29 Jan 2021 Jae Shin Yoon, Kihwan Kim, Jan Kautz, Hyun Soo Park

In this paper, we present a method of clothes retargeting: generating the potential poses and deformations of a given 3D clothing template model to fit onto a person in a single RGB image.

Binary TTC: A Temporal Geofence for Autonomous Navigation

1 code implementation CVPR 2021 Abhishek Badki, Orazio Gallo, Jan Kautz, Pradeep Sen

Time-to-contact (TTC), the time for an object to collide with the observer's plane, is a powerful tool for path planning: it is potentially more informative than the depth, velocity, and acceleration of objects in the scene -- even for humans.

Autonomous Navigation Quantization
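The definition of time-to-contact quoted above can be illustrated with a small sketch. It assumes a constant closing speed along the camera axis, which is a simplification chosen here for illustration and not the method of the listed paper; the function names `time_to_contact` and `crosses_geofence` are hypothetical.

```python
def time_to_contact(depth_m: float, closing_speed_mps: float) -> float:
    """Seconds until an object reaches the observer's plane.

    Assumption (not from the paper): the object approaches at a
    constant speed along the camera axis.
    """
    if closing_speed_mps <= 0:
        # Stationary or receding objects never reach the plane.
        return float("inf")
    return depth_m / closing_speed_mps


def crosses_geofence(depth_m: float, closing_speed_mps: float, threshold_s: float) -> bool:
    """Binary question: will contact occur within threshold_s seconds?"""
    return time_to_contact(depth_m, closing_speed_mps) < threshold_s
```

Under this assumption, an object 10 m away closing at 2 m/s has a time-to-contact of 5 s, so it falls inside a 6 s threshold but outside a 4 s one.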

Parameter Efficient Multimodal Transformers for Video Representation Learning

no code implementations ICLR 2021 Sangho Lee, Youngjae Yu, Gunhee Kim, Thomas Breuel, Jan Kautz, Yale Song

The recent success of Transformers in the language domain has motivated adapting it to a multimodal setting, where a new visual model is trained in tandem with an already pretrained language model.

Language Modelling Representation Learning

Online Adaptation for Consistent Mesh Reconstruction in the Wild

no code implementations NeurIPS 2020 Xueting Li, Sifei Liu, Shalini De Mello, Kihwan Kim, Xiaolong Wang, Ming-Hsuan Yang, Jan Kautz

This paper presents an algorithm to reconstruct temporally consistent 3D meshes of deformable object instances from videos in the wild.

3D Reconstruction

UFO²: A Unified Framework towards Omni-supervised Object Detection

no code implementations 21 Oct 2020 Zhongzheng Ren, Zhiding Yu, Xiaodong Yang, Ming-Yu Liu, Alexander G. Schwing, Jan Kautz

Existing work on object detection often relies on a single form of annotation: the model is trained using either accurate yet costly bounding boxes or cheaper but less expressive image-level tags.

object-detection Object Detection

A Contrastive Learning Approach for Training Variational Autoencoder Priors

no code implementations NeurIPS 2021 Jyoti Aneja, Alexander Schwing, Jan Kautz, Arash Vahdat

To tackle this issue, we propose an energy-based prior defined by the product of a base prior distribution and a reweighting factor, designed to bring the base closer to the aggregate posterior.

Ranked #2 on Image Generation on CelebA 256x256 (FID metric)

Contrastive Learning Image Generation
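The product form described in this abstract can be written compactly. This is a sketch only: the symbols $p(z)$ for the base prior, $r(z)$ for the reweighting factor, and $Z$ for the normalizing constant are notation assumed here, not taken from the listing.

```latex
p_{\mathrm{EBM}}(z) = \frac{1}{Z}\, r(z)\, p(z),
\qquad
Z = \int r(z)\, p(z)\, \mathrm{d}z
```

Under this reading, $r(z) > 1$ raises the density where the base prior places too little mass relative to the aggregate posterior, and $r(z) < 1$ lowers it elsewhere.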

VAEBM: A Symbiosis between Variational Autoencoders and Energy-based Models

1 code implementation ICLR 2021 Zhisheng Xiao, Karsten Kreis, Jan Kautz, Arash Vahdat

VAEBM captures the overall mode structure of the data distribution using a state-of-the-art VAE and it relies on its EBM component to explicitly exclude non-data-like regions from the model and refine the image samples.

Image Generation Out-of-Distribution Detection

NCP-VAE: Variational Autoencoders with Noise Contrastive Priors

no code implementations 28 Sep 2020 Jyoti Aneja, Alex Schwing, Jan Kautz, Arash Vahdat

To tackle this issue, we propose an energy-based prior defined by the product of a base prior distribution and a reweighting factor, designed to bring the base closer to the aggregate posterior.

Improving Deep Stereo Network Generalization with Geometric Priors

no code implementations 25 Aug 2020 Jialiang Wang, Varun Jampani, Deqing Sun, Charles Loop, Stan Birchfield, Jan Kautz

End-to-end deep learning methods have advanced stereo vision in recent years and obtained excellent results when the training and test data are similar.

Joint Disentangling and Adaptation for Cross-Domain Person Re-Identification

1 code implementation ECCV 2020 Yang Zou, Xiaodong Yang, Zhiding Yu, B. V. K. Vijaya Kumar, Jan Kautz

To this end, we propose a joint learning framework that disentangles id-related/unrelated features and enforces adaptation to work on the id-related feature space exclusively.

Person Re-Identification Unsupervised Domain Adaptation

Hierarchical Contrastive Motion Learning for Video Action Recognition

no code implementations 20 Jul 2020 Xitong Yang, Xiaodong Yang, Sifei Liu, Deqing Sun, Larry Davis, Jan Kautz

Thus, the motion features at higher levels are trained to gradually capture semantic dynamics and become more discriminative for action recognition.

Action Recognition Contrastive Learning +1

NVAE: A Deep Hierarchical Variational Autoencoder

7 code implementations NeurIPS 2020 Arash Vahdat, Jan Kautz

For example, on CIFAR-10, NVAE pushes the state-of-the-art from 2.98 to 2.91 bits per dimension, and it produces high-quality images on CelebA HQ.

Ranked #3 on Image Generation on FFHQ 256 x 256 (bits/dimension metric)

Image Generation

Contrastive Learning for Weakly Supervised Phrase Grounding

1 code implementation ECCV 2020 Tanmay Gupta, Arash Vahdat, Gal Chechik, Xiaodong Yang, Jan Kautz, Derek Hoiem

Given pairs of images and captions, we maximize compatibility of the attention-weighted regions and the words in the corresponding caption, compared to non-corresponding pairs of images and captions.

Contrastive Learning Language Modelling +1

Bi3D: Stereo Depth Estimation via Binary Classifications

1 code implementation CVPR 2020 Abhishek Badki, Alejandro Troccoli, Kihwan Kim, Jan Kautz, Pradeep Sen, Orazio Gallo

Given a strict time budget, Bi3D can detect objects closer than a given distance in as little as a few milliseconds, or estimate depth with arbitrarily coarse quantization, with complexity linear with the number of quantization levels.

Autonomous Navigation Quantization +1

How to Close Sim-Real Gap? Transfer with Segmentation!

no code implementations 14 May 2020 Mengyuan Yan, Qingyun Sun, Iuri Frosio, Stephen Tyree, Jan Kautz

Combining the control policy learned from simulation with the perception model, we achieve an impressive 88% success rate in grasping a tiny sphere with a real robot.

Robotics

Novel View Synthesis of Dynamic Scenes with Globally Coherent Depths from a Monocular Camera

no code implementations CVPR 2020 Jae Shin Yoon, Kihwan Kim, Orazio Gallo, Hyun Soo Park, Jan Kautz

Our insight is that although its scale and quality are inconsistent with other views, the depth estimation from a single view can be used to reason about the globally coherent geometry of dynamic contents.

Depth Estimation Novel View Synthesis

Two-shot Spatially-varying BRDF and Shape Estimation

1 code implementation CVPR 2020 Mark Boss, Varun Jampani, Kihwan Kim, Hendrik P. A. Lensch, Jan Kautz

Extensive experiments on both synthetic and real-world datasets show that our network trained on a synthetic dataset can generalize well to real-world images.

Self-supervised Single-view 3D Reconstruction via Semantic Consistency

no code implementations ECCV 2020 Xueting Li, Sifei Liu, Kihwan Kim, Shalini De Mello, Varun Jampani, Ming-Hsuan Yang, Jan Kautz

To the best of our knowledge, we are the first to try and solve the single-view reconstruction problem without a category-specific template mesh or semantic keypoints.

3D Reconstruction Single-View 3D Reconstruction

Convolutional Tensor-Train LSTM for Spatio-temporal Learning

2 code implementations NeurIPS 2020 Jiahao Su, Wonmin Byeon, Jean Kossaifi, Furong Huang, Jan Kautz, Animashree Anandkumar

Learning from spatio-temporal data has numerous applications such as human-behavior analysis, object tracking, video compression, and physics simulation. However, existing methods still perform poorly on challenging video tasks such as long-term forecasting.

Ranked #1 on Video Prediction on KTH (Cond metric)

Activity Recognition Video Compression +1

Learning to Generate Multiple Style Transfer Outputs for an Input Sentence

no code implementations WS 2020 Kevin Lin, Ming-Yu Liu, Ming-Ting Sun, Jan Kautz

Specifically, we decompose the latent representation of the input sentence to a style code that captures the language style variation and a content code that encodes the language style-independent content.

Style Transfer Text Style Transfer

Exploiting Semantics for Face Image Deblurring

no code implementations 19 Jan 2020 Ziyi Shen, Wei-Sheng Lai, Tingfa Xu, Jan Kautz, Ming-Hsuan Yang

Specifically, we first use a coarse deblurring network to reduce the motion blur on the input face image.

Deblurring Face Recognition +1

Discovering Nonlinear Relations with Minimum Predictive Information Regularization

1 code implementation 7 Jan 2020 Tailin Wu, Thomas Breuel, Michael Skuhersky, Jan Kautz

Identifying the underlying directional relations from observational time series with nonlinear interactions and complex relational structures is key to a wide range of applications, yet remains a hard problem.

Time Series

Meshlet Priors for 3D Mesh Reconstruction

1 code implementation CVPR 2020 Abhishek Badki, Orazio Gallo, Jan Kautz, Pradeep Sen

Meshlets act as a dictionary of local features and thus allow learned priors to be used to reconstruct object meshes in any pose and from unseen classes, even when the noise is large and the samples sparse.

UNAS: Differentiable Architecture Search Meets Reinforcement Learning

1 code implementation CVPR 2020 Arash Vahdat, Arun Mallya, Ming-Yu Liu, Jan Kautz

Our framework brings the best of both worlds, and it enables us to search for architectures with both differentiable and non-differentiable criteria in one unified framework while maintaining a low search cost.

Neural Architecture Search reinforcement-learning

Dancing to Music

1 code implementation NeurIPS 2019 Hsin-Ying Lee, Xiaodong Yang, Ming-Yu Liu, Ting-Chun Wang, Yu-Ding Lu, Ming-Hsuan Yang, Jan Kautz

In the analysis phase, we decompose a dance into a series of basic dance units, through which the model learns how to move.

Few-shot Video-to-Video Synthesis

6 code implementations NeurIPS 2019 Ting-Chun Wang, Ming-Yu Liu, Andrew Tao, Guilin Liu, Jan Kautz, Bryan Catanzaro

To address the limitations, we propose a few-shot vid2vid framework, which learns to synthesize videos of previously unseen subjects or scenes by leveraging a few example images of the target at test time.

Video-to-Video Synthesis

SENSE: a Shared Encoder Network for Scene-flow Estimation

1 code implementation ICCV 2019 Huaizu Jiang, Deqing Sun, Varun Jampani, Zhaoyang Lv, Erik Learned-Miller, Jan Kautz

We introduce a compact network for holistic scene flow estimation, called SENSE, which shares common encoder features among four closely-related tasks: optical flow estimation, disparity estimation from stereo, occlusion estimation, and semantic segmentation.

Disparity Estimation Occlusion Estimation +3

Joint-task Self-supervised Learning for Temporal Correspondence

2 code implementations NeurIPS 2019 Xueting Li, Sifei Liu, Shalini De Mello, Xiaolong Wang, Jan Kautz, Ming-Hsuan Yang

Our learning process integrates two highly related tasks: tracking large image regions and establishing fine-grained pixel-level associations between consecutive video frames.

Object Tracking Self-Supervised Learning +2

Convolutional Tensor-Train LSTM for Long-Term Video Prediction

no code implementations 25 Sep 2019 Jiahao Su, Wonmin Byeon, Furong Huang, Jan Kautz, Animashree Anandkumar

Long-term video prediction is highly challenging since it entails simultaneously capturing spatial and temporal information across a long range of image frames. Standard recurrent models are ineffective since they are prone to error propagation and cannot effectively capture higher-order correlations.

Video Prediction

Learning Propagation for Arbitrarily-structured Data

no code implementations ICCV 2019 Sifei Liu, Xueting Li, Varun Jampani, Shalini De Mello, Jan Kautz

We experiment with semantic segmentation networks, where we use our propagation module to jointly train on different data -- images, superpixels and point clouds.

Point Cloud Segmentation Semantic Segmentation +1

Long History Short-Term Memory for Long-Term Video Prediction

no code implementations 25 Sep 2019 Wonmin Byeon, Jan Kautz

While video prediction approaches have advanced considerably in recent years, learning to predict the long-term future is challenging: an ambiguous future or error propagation over time yields blurry predictions.

Video Prediction

Video Stitching for Linear Camera Arrays

no code implementations 31 Jul 2019 Wei-Sheng Lai, Orazio Gallo, Jinwei Gu, Deqing Sun, Ming-Hsuan Yang, Jan Kautz

Despite the long history of image and video stitching research, existing academic and commercial solutions still produce strong artifacts.

Autonomous Driving Spatial Interpolation

Importance Estimation for Neural Network Pruning

3 code implementations CVPR 2019 Pavlo Molchanov, Arun Mallya, Stephen Tyree, Iuri Frosio, Jan Kautz

On ResNet-101, we achieve a 40% FLOPS reduction by removing 30% of the parameters, with a loss of 0.02% in the top-1 accuracy on ImageNet.

Network Pruning

Unsupervised Video Interpolation Using Cycle Consistency

1 code implementation ICCV 2019 Fitsum A. Reda, Deqing Sun, Aysegul Dundar, Mohammad Shoeybi, Guilin Liu, Kevin J. Shih, Andrew Tao, Jan Kautz, Bryan Catanzaro

We further introduce a pseudo supervised loss term that enforces the interpolated frames to be consistent with predictions of a pre-trained interpolation model.

Ranked #1 on Video Frame Interpolation on UCF101 (PSNR (sRGB) metric)

Video Frame Interpolation

Few-Shot Viewpoint Estimation

no code implementations 13 May 2019 Hung-Yu Tseng, Shalini De Mello, Jonathan Tremblay, Sifei Liu, Stan Birchfield, Ming-Hsuan Yang, Jan Kautz

Through extensive experimentation on the ObjectNet3D and Pascal3D+ benchmark datasets, we demonstrate that our framework, which we call MetaView, significantly outperforms fine-tuning the state-of-the-art models with few examples, and that the specific architectural innovations of our method are crucial to achieving good performance.

Meta-Learning Viewpoint Estimation

Few-Shot Adaptive Gaze Estimation

1 code implementation ICCV 2019 Seonwook Park, Shalini De Mello, Pavlo Molchanov, Umar Iqbal, Otmar Hilliges, Jan Kautz

Inter-personal anatomical differences limit the accuracy of person-independent gaze estimation networks.

Ranked #1 on Gaze Estimation on MPII Gaze (using extra training data)

Gaze Estimation Meta-Learning

Few-Shot Unsupervised Image-to-Image Translation

9 code implementations ICCV 2019 Ming-Yu Liu, Xun Huang, Arun Mallya, Tero Karras, Timo Aila, Jaakko Lehtinen, Jan Kautz

Unsupervised image-to-image translation methods learn to map images in a given class to an analogous image in a different class, drawing on unstructured (non-registered) datasets of images.

Translation Unsupervised Image-To-Image Translation

SCOPS: Self-Supervised Co-Part Segmentation

1 code implementation CVPR 2019 Wei-Chih Hung, Varun Jampani, Sifei Liu, Pavlo Molchanov, Ming-Hsuan Yang, Jan Kautz

Parts provide a good intermediate representation of objects that is robust with respect to the camera, pose and appearance variations.

Neural Causal Discovery with Learnable Input Noise

no code implementations ICLR 2019 Tailin Wu, Thomas Breuel, Jan Kautz

Learning causal relations from observational time series with nonlinear interactions and complex causal structures is a key component of human intelligence, and has a wide range of applications.

Causal Discovery EEG +1

STEP: Spatio-Temporal Progressive Learning for Video Action Detection

1 code implementation CVPR 2019 Xitong Yang, Xiaodong Yang, Ming-Yu Liu, Fanyi Xiao, Larry Davis, Jan Kautz

In this paper, we propose Spatio-TEmporal Progressive (STEP) action detector, a progressive learning framework for spatio-temporal action detection in videos.

Action Detection Action Recognition

Pixel-Adaptive Convolutional Neural Networks

2 code implementations CVPR 2019 Hang Su, Varun Jampani, Deqing Sun, Orazio Gallo, Erik Learned-Miller, Jan Kautz

In addition, we also demonstrate that PAC can be used as a drop-in replacement for convolution layers in pre-trained networks, resulting in consistent performance improvements.

Towards annotation-efficient segmentation via image-to-image translation

no code implementations 2 Apr 2019 Eugene Vorontsov, Pavlo Molchanov, Christopher Beckham, Jan Kautz, Samuel Kadoury

Specifically, we propose a semi-supervised framework that employs unpaired image-to-image translation between two domains, presence vs. absence of cancer, as the unsupervised objective.

Brain Tumor Segmentation Image-to-Image Translation +2

Putting Humans in a Scene: Learning Affordance in 3D Indoor Environments

no code implementations CVPR 2019 Xueting Li, Sifei Liu, Kihwan Kim, Xiaolong Wang, Ming-Hsuan Yang, Jan Kautz

In order to predict valid affordances and learn possible 3D human poses in indoor scenes, we need to understand the semantic and geometric structure of a scene as well as its potential interactions with a human.

NRMVS: Non-Rigid Multi-View Stereo

no code implementations 12 Jan 2019 Matthias Innmann, Kihwan Kim, Jinwei Gu, Matthias Niessner, Charles Loop, Marc Stamminger, Jan Kautz

We show that creating a dense 4D structure from a few RGB images with non-rigid changes is possible, and demonstrate that our method can be used to interpolate novel deformed scenes from various combinations of these deformation estimates derived from the sparse views.

3D Reconstruction Depth Estimation

Neural Inverse Rendering of an Indoor Scene from a Single Image

no code implementations ICCV 2019 Soumyadip Sengupta, Jinwei Gu, Kihwan Kim, Guilin Liu, David W. Jacobs, Jan Kautz

Inverse rendering aims to estimate physical attributes of a scene, e.g., reflectance, geometry, and lighting, from image(s).

Self-Supervised Learning

Extreme View Synthesis

1 code implementation ICCV 2019 Inchang Choi, Orazio Gallo, Alejandro Troccoli, Min H. Kim, Jan Kautz

We present Extreme View Synthesis, a solution for novel view extrapolation that works even when the number of input images is small, as few as two.

PlaneRCNN: 3D Plane Detection and Reconstruction from a Single Image

2 code implementations CVPR 2019 Chen Liu, Kihwan Kim, Jinwei Gu, Yasutaka Furukawa, Jan Kautz

This paper proposes a deep neural architecture, PlaneRCNN, that detects and reconstructs piecewise planar surfaces from a single RGB image.

3D Plane Detection 3D Reconstruction

Context-Aware Synthesis and Placement of Object Instances

1 code implementation NeurIPS 2018 Donghoon Lee, Sifei Liu, Jinwei Gu, Ming-Yu Liu, Ming-Hsuan Yang, Jan Kautz

Learning to insert an object instance into an image in a semantically coherent manner is a challenging and interesting problem.

Scene Parsing

A Fusion Approach for Multi-Frame Optical Flow Estimation

2 code implementations 23 Oct 2018 Zhile Ren, Orazio Gallo, Deqing Sun, Ming-Hsuan Yang, Erik B. Sudderth, Jan Kautz

To date, top-performing optical flow estimation methods only take pairs of consecutive frames into account.

Optical Flow Estimation

Models Matter, So Does Training: An Empirical Study of CNNs for Optical Flow Estimation

1 code implementation 14 Sep 2018 Deqing Sun, Xiaodong Yang, Ming-Yu Liu, Jan Kautz

We investigate two crucial and closely related aspects of CNNs for optical flow estimation: models and training.

Optical Flow Estimation

Video-to-Video Synthesis

11 code implementations NeurIPS 2018 Ting-Chun Wang, Ming-Yu Liu, Jun-Yan Zhu, Guilin Liu, Andrew Tao, Jan Kautz, Bryan Catanzaro

We study the problem of video-to-video synthesis, whose goal is to learn a mapping function from an input source video (e.g., a sequence of semantic segmentation masks) to an output photorealistic video that precisely depicts the content of the source video.

Semantic Segmentation Video Prediction +1

Learning Linear Transformations for Fast Arbitrary Style Transfer

1 code implementation 14 Aug 2018 Xueting Li, Sifei Liu, Jan Kautz, Ming-Hsuan Yang

Recent arbitrary style transfer methods transfer second-order statistics from a reference image onto a content image via a multiplication between content image features and a transformation matrix, which is computed from features with a pre-determined algorithm.

Domain Adaptation Style Transfer

EOE: Expected Overlap Estimation over Unstructured Point Cloud Data

no code implementations 6 Aug 2018 Ben Eckart, Kihwan Kim, Jan Kautz

We present an iterative overlap estimation technique to augment existing point cloud registration algorithms that can achieve high performance in difficult real-world situations where large pose displacement and non-overlapping geometry would otherwise cause traditional methods to fail.

Point Cloud Registration

Simultaneous Edge Alignment and Learning

3 code implementations ECCV 2018 Zhiding Yu, Weiyang Liu, Yang Zou, Chen Feng, Srikumar Ramalingam, B. V. K. Vijaya Kumar, Jan Kautz

Edge detection is among the most fundamental vision problems for its role in perceptual grouping and its wide applications.

Edge Detection Representation Learning

Superpixel Sampling Networks

2 code implementations ECCV 2018 Varun Jampani, Deqing Sun, Ming-Yu Liu, Ming-Hsuan Yang, Jan Kautz

Superpixels provide an efficient low/mid-level representation of image data, which greatly reduces the number of image primitives for subsequent vision tasks.

Superpixels

Tackling 3D ToF Artifacts Through Learning and the FLAT Dataset

no code implementations ECCV 2018 Qi Guo, Iuri Frosio, Orazio Gallo, Todd Zickler, Jan Kautz

Scene motion, multiple reflections, and sensor noise introduce artifacts in the depth reconstruction performed by time-of-flight cameras.

Domain Stylization: A Strong, Simple Baseline for Synthetic to Real Image Domain Adaptation

no code implementations 24 Jul 2018 Aysegul Dundar, Ming-Yu Liu, Ting-Chun Wang, John Zedlewski, Jan Kautz

Deep neural networks have largely failed to effectively utilize synthetic data when applied to real images due to the covariate shift problem.

Domain Adaptation object-detection +4

Fast and Accurate Point Cloud Registration using Trees of Gaussian Mixtures

1 code implementation 6 Jul 2018 Ben Eckart, Kihwan Kim, Jan Kautz

Point cloud registration sits at the core of many important and challenging 3D perception problems including autonomous navigation, SLAM, object/scene recognition, and augmented reality.

Autonomous Navigation Point Cloud Registration +1

Making Convolutional Networks Recurrent for Visual Sequence Learning

no code implementations CVPR 2018 Xiaodong Yang, Pavlo Molchanov, Jan Kautz

Recurrent neural networks (RNNs) have emerged as a powerful model for a broad range of machine learning problems that involve sequential data.

Action Recognition Face Alignment +3

Learning Superpixels With Segmentation-Aware Affinity Loss

no code implementations CVPR 2018 Wei-Chih Tu, Ming-Yu Liu, Varun Jampani, Deqing Sun, Shao-Yi Chien, Ming-Hsuan Yang, Jan Kautz

Specifically, we propose a new loss function that takes the segmentation error into account for affinity learning.

Superpixels

Synthetically Trained Neural Networks for Learning Human-Readable Plans from Real-World Demonstrations

1 code implementation 18 May 2018 Jonathan Tremblay, Thang To, Artem Molchanov, Stephen Tyree, Jan Kautz, Stan Birchfield

We present a system to infer and execute a human-readable program from a real-world demonstration.

Robotics

Light-weight Head Pose Invariant Gaze Tracking

no code implementations 23 Apr 2018 Rajeev Ranjan, Shalini De Mello, Jan Kautz

Unconstrained remote gaze tracking using off-the-shelf cameras is a challenging problem.

Gaze Estimation Transfer Learning +1

Switchable Temporal Propagation Network

1 code implementation ECCV 2018 Sifei Liu, Guangyu Zhong, Shalini De Mello, Jinwei Gu, Varun Jampani, Ming-Hsuan Yang, Jan Kautz

Our approach is based on a temporal propagation network (TPN), which models the transition-related affinity between a pair of frames in a purely data-driven manner.

Video Compression

Deep Semantic Face Deblurring

no code implementations CVPR 2018 Ziyi Shen, Wei-Sheng Lai, Tingfa Xu, Jan Kautz, Ming-Hsuan Yang

In this paper, we present an effective and efficient face deblurring algorithm by exploiting semantic cues via deep convolutional neural networks (CNNs).

Deblurring Face Recognition

SPLATNet: Sparse Lattice Networks for Point Cloud Processing

2 code implementations CVPR 2018 Hang Su, Varun Jampani, Deqing Sun, Subhransu Maji, Evangelos Kalogerakis, Ming-Hsuan Yang, Jan Kautz

We present a network architecture for processing point clouds that directly operates on a collection of points represented as a sparse set of samples in a high-dimensional lattice.

3D Part Segmentation 3D Semantic Segmentation

A Closed-form Solution to Photorealistic Image Stylization

12 code implementations ECCV 2018 Yijun Li, Ming-Yu Liu, Xueting Li, Ming-Hsuan Yang, Jan Kautz

Photorealistic image stylization concerns transferring style of a reference photo to a content photo with the constraint that the stylized photo should remain photorealistic.

Image Stylization

Learning Binary Residual Representations for Domain-specific Video Streaming

no code implementations 14 Dec 2017 Yi-Hsuan Tsai, Ming-Yu Liu, Deqing Sun, Ming-Hsuan Yang, Jan Kautz

Specifically, we target a streaming setting where the videos to be streamed from a server to a client are all in the same domain and they have to be compressed to a small size for low-latency transmission.

Video Compression

Geometry-Aware Learning of Maps for Camera Localization

1 code implementation CVPR 2018 Samarth Brahmbhatt, Jinwei Gu, Kihwan Kim, James Hays, Jan Kautz

Maps are a key component in image-based camera localization and visual SLAM systems: they are used to establish geometric constraints between images, correct drift in relative pose estimation, and relocalize cameras after lost tracking.

Camera Localization

Separating Reflection and Transmission Images in the Wild

no code implementations ECCV 2018 Patrick Wieschollek, Orazio Gallo, Jinwei Gu, Jan Kautz

The reflections caused by common semi-reflectors, such as glass windows, can impact the performance of computer vision algorithms.

Synthetic Data Generation

High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs

15 code implementations CVPR 2018 Ting-Chun Wang, Ming-Yu Liu, Jun-Yan Zhu, Andrew Tao, Jan Kautz, Bryan Catanzaro

We present a new method for synthesizing high-resolution photo-realistic images from semantic label maps using conditional generative adversarial networks (conditional GANs).

Conditional Image Generation Fundus to Angiography Generation +3

Budget-Aware Activity Detection with A Recurrent Policy Network

no code implementations 30 Nov 2017 Behrooz Mahasseni, Xiaodong Yang, Pavlo Molchanov, Jan Kautz

In this paper, we address the challenging problem of efficient temporal activity detection in untrimmed long videos.

Action Detection Activity Detection

On Nearest Neighbors in Non Local Means Denoising

no code implementations 20 Nov 2017 Iuri Frosio, Jan Kautz

To denoise a reference patch, the Non-Local-Means denoising filter processes a set of neighbor patches.

Denoising

Multiframe Scene Flow with Piecewise Rigid Motion

no code implementations 5 Oct 2017 Vladislav Golyanik, Kihwan Kim, Robert Maier, Matthias Nießner, Didier Stricker, Jan Kautz

We introduce a novel multiframe scene flow approach that jointly optimizes the consistency of the patch appearances and their local rigid motions from RGB-D image sequences.

Scene Flow Estimation

Learning Affinity via Spatial Propagation Network

no code implementations 3 Oct 2017 Sifei Liu, Shalini De Mello, Jinwei Gu, Guangyu Zhong, Ming-Hsuan Yang, Jan Kautz

Specifically, we develop a three-way connection for the linear propagation model, which (a) formulates a sparse transformation matrix, where all elements can be the output from a deep CNN, but (b) results in a dense affinity matrix that effectively models any task-specific pairwise similarity matrix.

Colorization Face Parsing +2

Learning Affinity via Spatial Propagation Networks

no code implementations NeurIPS 2017 Sifei Liu, Shalini De Mello, Jinwei Gu, Guangyu Zhong, Ming-Hsuan Yang, Jan Kautz

Specifically, we develop a three-way connection for the linear propagation model, which (a) formulates a sparse transformation matrix, where all elements can be the output from a deep CNN, but (b) results in a dense affinity matrix that effectively models any task-specific pairwise similarity matrix.

Colorization Face Parsing +2

Learning to Segment Instances in Videos with Spatial Propagation Network

no code implementations 14 Sep 2017 Jingchun Cheng, Sifei Liu, Yi-Hsuan Tsai, Wei-Chih Hung, Shalini De Mello, Jinwei Gu, Jan Kautz, Shengjin Wang, Ming-Hsuan Yang

In addition, we apply a filter on the refined score map that aims to recognize the best connected region using spatial and temporal consistencies in the video.

Semantic Segmentation

Improving Landmark Localization with Semi-Supervised Learning

no code implementations CVPR 2018 Sina Honari, Pavlo Molchanov, Stephen Tyree, Pascal Vincent, Christopher Pal, Jan Kautz

First, we propose the framework of sequential multitasking and explore it here through an architecture for landmark localization where training with class labels acts as an auxiliary signal to guide the landmark localization on unlabeled data.

Small Data Image Classification

Cascaded Scene Flow Prediction using Semantic Segmentation

no code implementations 26 Jul 2017 Zhile Ren, Deqing Sun, Jan Kautz, Erik B. Sudderth

Given two consecutive frames from a pair of stereo cameras, 3D scene flow methods simultaneously estimate the 3D geometry and motion of the observed scene.

Autonomous Driving General Classification +3

Polarimetric Multi-View Stereo

no code implementations CVPR 2017 Zhaopeng Cui, Jinwei Gu, Boxin Shi, Ping Tan, Jan Kautz

Multi-view stereo relies on feature correspondences for 3D reconstruction, and thus is fundamentally flawed in dealing with featureless scenes.

3D Reconstruction

Dynamic Facial Analysis: From Bayesian Filtering to Recurrent Neural Network

no code implementations CVPR 2017 Jinwei Gu, Xiaodong Yang, Shalini De Mello, Jan Kautz

We are inspired by the fact that the computation performed in an RNN bears resemblance to Bayesian filters, which have been used for tracking in many previous methods for facial analysis from videos.

Face Alignment Feature Engineering +3

A Lightweight Approach for On-the-Fly Reflectance Estimation

no code implementations ICCV 2017 Kihwan Kim, Jinwei Gu, Stephen Tyree, Pavlo Molchanov, Matthias Nießner, Jan Kautz

In addition, we have created a large synthetic dataset, SynBRDF, which comprises a total of 500K RGBD images rendered with a physically-based ray tracer under a variety of natural illumination, covering 5000 materials and 5000 shapes.

Color Constancy

Unsupervised Image-to-Image Translation Networks

9 code implementations NeurIPS 2017 Ming-Yu Liu, Thomas Breuel, Jan Kautz

Unsupervised image-to-image translation aims at learning a joint distribution of images in different domains by using images from the marginal distributions in individual domains.

Domain Adaptation Multimodal Unsupervised Image-To-Image Translation +2

Pruning Convolutional Neural Networks for Resource Efficient Inference

9 code implementations 19 Nov 2016 Pavlo Molchanov, Stephen Tyree, Tero Karras, Timo Aila, Jan Kautz

We propose a new criterion based on Taylor expansion that approximates the change in the cost function induced by pruning network parameters.

Transfer Learning

Reinforcement Learning through Asynchronous Advantage Actor-Critic on a GPU

3 code implementations 18 Nov 2016 Mohammad Babaeizadeh, Iuri Frosio, Stephen Tyree, Jason Clemons, Jan Kautz

We introduce a hybrid CPU/GPU version of the Asynchronous Advantage Actor-Critic (A3C) algorithm, currently the state-of-the-art method in reinforcement learning for various gaming tasks.

reinforcement-learning

Learning Adaptive Parameter Tuning for Image Processing

no code implementations 28 Oct 2016 Jingming Dong, Iuri Frosio, Jan Kautz

The non-stationary nature of image characteristics calls for adaptive processing, based on the local image content.

Deblurring Demosaicking +1

Online Detection and Classification of Dynamic Hand Gestures With Recurrent 3D Convolutional Neural Network

no code implementations CVPR 2016 Pavlo Molchanov, Xiaodong Yang, Shalini Gupta, Kihwan Kim, Stephen Tyree, Jan Kautz

Automatic detection and classification of dynamic hand gestures in real-world systems intended for human computer interaction is challenging as: 1) there is a large diversity in how people perform gestures, making detection and classification difficult; 2) the system must work online in order to avoid noticeable lag between performing a gesture and its classification; in fact, a negative lag (classification before the gesture is finished) is desirable, as feedback to the user can then be truly instantaneous.

Classification General Classification +1

Accelerated Generative Models for 3D Point Cloud Data

no code implementations CVPR 2016 Benjamin Eckart, Kihwan Kim, Alejandro Troccoli, Alonzo Kelly, Jan Kautz

In this paper we introduce a method for constructing compact generative representations of PCD at multiple levels of detail.

Loss Functions for Neural Networks for Image Processing

2 code implementations 28 Nov 2015 Hang Zhao, Orazio Gallo, Iuri Frosio, Jan Kautz

Neural networks are becoming central in several areas of computer vision and image processing and different architectures have been proposed to solve specific problems.

Image Restoration

Modeling Object Appearance Using Context-Conditioned Component Analysis

no code implementations CVPR 2015 Daniyar Turmukhambetov, Neill D. F. Campbell, Simon J. D. Prince, Jan Kautz

In this work we remove the image space alignment limitations of existing subspace models by conditioning the models on a shape dependent context that allows for the complex, non-linear structure of the appearance of the visual object to be captured and shared.

Hierarchical Subquery Evaluation for Active Learning on a Graph

no code implementations CVPR 2014 Oisin Mac Aodha, Neill D. F. Campbell, Jan Kautz, Gabriel J. Brostow

Under some specific circumstances, Expected Error Reduction has been one of the strongest-performing informativeness criteria for active learning.

Active Learning graph construction +1
