Search Results for author: Jan Kautz

Found 175 papers, 83 papers with code

GroupViT: Semantic Segmentation Emerges from Text Supervision

2 code implementations CVPR 2022 Jiarui Xu, Shalini De Mello, Sifei Liu, Wonmin Byeon, Thomas Breuel, Jan Kautz, Xiaolong Wang

With only text supervision and without any pixel-level annotations, GroupViT learns to group together semantic regions and successfully transfers to the task of semantic segmentation in a zero-shot manner, i. e., without any further fine-tuning.

Object Detection Scene Understanding +3

Global Context Vision Transformers

8 code implementations20 Jun 2022 Ali Hatamizadeh, Hongxu Yin, Greg Heinrich, Jan Kautz, Pavlo Molchanov

Pre-trained GC ViT backbones in downstream tasks of object detection, instance segmentation, and semantic segmentation using MS COCO and ADE20K datasets outperform prior work consistently.

Image Classification Inductive Bias +4

Unsupervised Image-to-Image Translation Networks

8 code implementations NeurIPS 2017 Ming-Yu Liu, Thomas Breuel, Jan Kautz

Unsupervised image-to-image translation aims at learning a joint distribution of images in different domains by using images from the marginal distributions in individual domains.

Domain Adaptation Multimodal Unsupervised Image-To-Image Translation +2

A Closed-form Solution to Photorealistic Image Stylization

12 code implementations ECCV 2018 Yijun Li, Ming-Yu Liu, Xueting Li, Ming-Hsuan Yang, Jan Kautz

Photorealistic image stylization concerns transferring style of a reference photo to a content photo with the constraint that the stylized photo should remain photorealistic.

Image Stylization

Loss Functions for Neural Networks for Image Processing

2 code implementations28 Nov 2015 Hang Zhao, Orazio Gallo, Iuri Frosio, Jan Kautz

Neural networks are becoming central in several areas of computer vision and image processing and different architectures have been proposed to solve specific problems.

Image Restoration

Video-to-Video Synthesis

11 code implementations NeurIPS 2018 Ting-Chun Wang, Ming-Yu Liu, Jun-Yan Zhu, Guilin Liu, Andrew Tao, Jan Kautz, Bryan Catanzaro

We study the problem of video-to-video synthesis, whose goal is to learn a mapping function from an input source video (e. g., a sequence of semantic segmentation masks) to an output photorealistic video that precisely depicts the content of the source video.

2k Semantic Segmentation +2

High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs

20 code implementations CVPR 2018 Ting-Chun Wang, Ming-Yu Liu, Jun-Yan Zhu, Andrew Tao, Jan Kautz, Bryan Catanzaro

We present a new method for synthesizing high-resolution photo-realistic images from semantic label maps using conditional generative adversarial networks (conditional GANs).

Conditional Image Generation Fundus to Angiography Generation +5

Few-shot Video-to-Video Synthesis

6 code implementations NeurIPS 2019 Ting-Chun Wang, Ming-Yu Liu, Andrew Tao, Guilin Liu, Jan Kautz, Bryan Catanzaro

To address the limitations, we propose a few-shot vid2vid framework, which learns to synthesize videos of previously unseen subjects or scenes by leveraging few example images of the target at test time.

Video-to-Video Synthesis

Models Matter, So Does Training: An Empirical Study of CNNs for Optical Flow Estimation

2 code implementations14 Sep 2018 Deqing Sun, Xiaodong Yang, Ming-Yu Liu, Jan Kautz

We investigate two crucial and closely related aspects of CNNs for optical flow estimation: models and training.

Optical Flow Estimation

A Fusion Approach for Multi-Frame Optical Flow Estimation

2 code implementations23 Oct 2018 Zhile Ren, Orazio Gallo, Deqing Sun, Ming-Hsuan Yang, Erik B. Sudderth, Jan Kautz

To date, top-performing optical flow estimation methods only take pairs of consecutive frames into account.

Optical Flow Estimation

Few-Shot Unsupervised Image-to-Image Translation

10 code implementations ICCV 2019 Ming-Yu Liu, Xun Huang, Arun Mallya, Tero Karras, Timo Aila, Jaakko Lehtinen, Jan Kautz

Unsupervised image-to-image translation methods learn to map images in a given class to an analogous image in a different class, drawing on unstructured (non-registered) datasets of images.

Translation Unsupervised Image-To-Image Translation

NVAE: A Deep Hierarchical Variational Autoencoder

8 code implementations NeurIPS 2020 Arash Vahdat, Jan Kautz

For example, on CIFAR-10, NVAE pushes the state-of-the-art from 2. 98 to 2. 91 bits per dimension, and it produces high-quality images on CelebA HQ.

Ranked #3 on Image Generation on FFHQ 256 x 256 (bits/dimension metric)

Image Generation

BundleSDF: Neural 6-DoF Tracking and 3D Reconstruction of Unknown Objects

1 code implementation CVPR 2023 Bowen Wen, Jonathan Tremblay, Valts Blukis, Stephen Tyree, Thomas Muller, Alex Evans, Dieter Fox, Jan Kautz, Stan Birchfield

We present a near real-time method for 6-DoF tracking of an unknown object from a monocular RGBD video sequence, while simultaneously performing neural 3D reconstruction of the object.

3D Object Tracking 3D Reconstruction +5

FoundationPose: Unified 6D Pose Estimation and Tracking of Novel Objects

1 code implementation13 Dec 2023 Bowen Wen, Wei Yang, Jan Kautz, Stan Birchfield

We present FoundationPose, a unified foundation model for 6D object pose estimation and tracking, supporting both model-based and model-free setups.

3D Object Detection 3D Object Tracking +7

Fast and Accurate Point Cloud Registration using Trees of Gaussian Mixtures

1 code implementation6 Jul 2018 Ben Eckart, Kihwan Kim, Jan Kautz

Point cloud registration sits at the core of many important and challenging 3D perception problems including autonomous navigation, SLAM, object/scene recognition, and augmented reality.

Autonomous Navigation Point Cloud Registration +1

Reinforcement Learning through Asynchronous Advantage Actor-Critic on a GPU

3 code implementations18 Nov 2016 Mohammad Babaeizadeh, Iuri Frosio, Stephen Tyree, Jason Clemons, Jan Kautz

We introduce a hybrid CPU/GPU version of the Asynchronous Advantage Actor-Critic (A3C) algorithm, currently the state-of-the-art method in reinforcement learning for various gaming tasks.

reinforcement-learning Reinforcement Learning (RL) +1

Synthetically Trained Neural Networks for Learning Human-Readable Plans from Real-World Demonstrations

1 code implementation18 May 2018 Jonathan Tremblay, Thang To, Artem Molchanov, Stephen Tyree, Jan Kautz, Stan Birchfield

We present a system to infer and execute a human-readable program from a real-world demonstration.

Robotics

FB-OCC: 3D Occupancy Prediction based on Forward-Backward View Transformation

1 code implementation4 Jul 2023 Zhiqi Li, Zhiding Yu, David Austin, Mingsheng Fang, Shiyi Lan, Jan Kautz, Jose M. Alvarez

This technical report summarizes the winning solution for the 3D Occupancy Prediction Challenge, which is held in conjunction with the CVPR 2023 Workshop on End-to-End Autonomous Driving and CVPR 23 Workshop on Vision-Centric Autonomous Driving Workshop.

Autonomous Driving Prediction Of Occupancy Grid Maps

PlaneRCNN: 3D Plane Detection and Reconstruction from a Single Image

2 code implementations CVPR 2019 Chen Liu, Kihwan Kim, Jinwei Gu, Yasutaka Furukawa, Jan Kautz

This paper proposes a deep neural architecture, PlaneRCNN, that detects and reconstructs piecewise planar surfaces from a single RGB image.

3D Plane Detection 3D Reconstruction +1

Dancing to Music

2 code implementations NeurIPS 2019 Hsin-Ying Lee, Xiaodong Yang, Ming-Yu Liu, Ting-Chun Wang, Yu-Ding Lu, Ming-Hsuan Yang, Jan Kautz

In the analysis phase, we decompose a dance into a series of basic dance units, through which the model learns how to move.

Motion Synthesis Pose Estimation

Pixel-Adaptive Convolutional Neural Networks

2 code implementations CVPR 2019 Hang Su, Varun Jampani, Deqing Sun, Orazio Gallo, Erik Learned-Miller, Jan Kautz

In addition, we also demonstrate that PAC can be used as a drop-in replacement for convolution layers in pre-trained networks, resulting in consistent performance improvements.

Learning Linear Transformations for Fast Arbitrary Style Transfer

1 code implementation14 Aug 2018 Xueting Li, Sifei Liu, Jan Kautz, Ming-Hsuan Yang

Recent arbitrary style transfer methods transfer second order statistics from reference image onto content image via a multiplication between content image features and a transformation matrix, which is computed from features with a pre-determined algorithm.

Domain Adaptation Style Transfer

UFO²: A Unified Framework towards Omni-supervised Object Detection

1 code implementation ECCV 2020 Zhongzheng Ren, Zhiding Yu, Xiaodong Yang, Ming-Yu Liu, Alexander G. Schwing, Jan Kautz

Existing work on object detection often relies on a single form of annotation: the model is trained using either accurate yet costly bounding boxes or cheaper but less expressive image-level tags.

Object object-detection +1

Superpixel Sampling Networks

2 code implementations ECCV 2018 Varun Jampani, Deqing Sun, Ming-Yu Liu, Ming-Hsuan Yang, Jan Kautz

Superpixels provide an efficient low/mid-level representation of image data, which greatly reduces the number of image primitives for subsequent vision tasks.

Segmentation Superpixels

Geometry-Aware Learning of Maps for Camera Localization

1 code implementation CVPR 2018 Samarth Brahmbhatt, Jinwei Gu, Kihwan Kim, James Hays, Jan Kautz

Maps are a key component in image-based camera localization and visual SLAM systems: they are used to establish geometric constraints between images, correct drift in relative pose estimation, and relocalize cameras after lost tracking.

Camera Localization Visual Localization

GLAMR: Global Occlusion-Aware Human Mesh Recovery with Dynamic Cameras

1 code implementation CVPR 2022 Ye Yuan, Umar Iqbal, Pavlo Molchanov, Kris Kitani, Jan Kautz

Since the joint reconstruction of human motions and camera poses is underconstrained, we propose a global trajectory predictor that generates global human trajectories based on local body movements.

Global 3D Human Pose Estimation Human Mesh Recovery

Score-based Generative Modeling in Latent Space

1 code implementation NeurIPS 2021 Arash Vahdat, Karsten Kreis, Jan Kautz

Moving from data to latent space allows us to train more expressive generative models, apply SGMs to non-continuous data, and learn smoother SGMs in a smaller space, resulting in fewer network evaluations and faster sampling.

Ranked #3 on Image Generation on CIFAR-10 (FD metric)

Image Generation

See through Gradients: Image Batch Recovery via GradInversion

2 code implementations CVPR 2021 Hongxu Yin, Arun Mallya, Arash Vahdat, Jose M. Alvarez, Jan Kautz, Pavlo Molchanov

In this work, we introduce GradInversion, using which input images from a larger batch (8 - 48 images) can also be recovered for large networks such as ResNets (50 layers), on complex datasets such as ImageNet (1000 classes, 224x224 px).

Federated Learning Inference Attack +1

DiffiT: Diffusion Vision Transformers for Image Generation

1 code implementation4 Dec 2023 Ali Hatamizadeh, Jiaming Song, Guilin Liu, Jan Kautz, Arash Vahdat

In this paper, we study the effectiveness of ViTs in diffusion-based generative learning and propose a new model denoted as Diffusion Vision Transformers (DiffiT).

Denoising Image Generation

FreeSOLO: Learning to Segment Objects without Annotations

1 code implementation CVPR 2022 Xinlong Wang, Zhiding Yu, Shalini De Mello, Jan Kautz, Anima Anandkumar, Chunhua Shen, Jose M. Alvarez

FreeSOLO further demonstrates superiority as a strong pre-training method, outperforming state-of-the-art self-supervised pre-training methods by +9. 8% AP when fine-tuning instance segmentation with only 5% COCO masks.

Instance Segmentation object-detection +4

Few-Shot Adaptive Gaze Estimation

1 code implementation ICCV 2019 Seonwook Park, Shalini De Mello, Pavlo Molchanov, Umar Iqbal, Otmar Hilliges, Jan Kautz

Inter-personal anatomical differences limit the accuracy of person-independent gaze estimation networks.

 Ranked #1 on Gaze Estimation on MPII Gaze (using extra training data)

Gaze Estimation Meta-Learning

Importance Estimation for Neural Network Pruning

3 code implementations CVPR 2019 Pavlo Molchanov, Arun Mallya, Stephen Tyree, Iuri Frosio, Jan Kautz

On ResNet-101, we achieve a 40% FLOPS reduction by removing 30% of the parameters, with a loss of 0. 02% in the top-1 accuracy on ImageNet.

Network Pruning

SPLATNet: Sparse Lattice Networks for Point Cloud Processing

2 code implementations CVPR 2018 Hang Su, Varun Jampani, Deqing Sun, Subhransu Maji, Evangelos Kalogerakis, Ming-Hsuan Yang, Jan Kautz

We present a network architecture for processing point clouds that directly operates on a collection of points represented as a sparse set of samples in a high-dimensional lattice.

3D Part Segmentation 3D Semantic Segmentation

STEP: Spatio-Temporal Progressive Learning for Video Action Detection

1 code implementation CVPR 2019 Xitong Yang, Xiaodong Yang, Ming-Yu Liu, Fanyi Xiao, Larry Davis, Jan Kautz

In this paper, we propose Spatio-TEmporal Progressive (STEP) action detector---a progressive learning framework for spatio-temporal action detection in videos.

Action Detection Action Recognition

Self-supervised Single-view 3D Reconstruction via Semantic Consistency

1 code implementation ECCV 2020 Xueting Li, Sifei Liu, Kihwan Kim, Shalini De Mello, Varun Jampani, Ming-Hsuan Yang, Jan Kautz

To the best of our knowledge, we are the first to try and solve the single-view reconstruction problem without a category-specific template mesh or semantic keypoints.

3D Reconstruction Object +1

SCOPS: Self-Supervised Co-Part Segmentation

1 code implementation CVPR 2019 Wei-Chih Hung, Varun Jampani, Sifei Liu, Pavlo Molchanov, Ming-Hsuan Yang, Jan Kautz

Parts provide a good intermediate representation of objects that is robust with respect to the camera, pose and appearance variations.

Object Segmentation +4

Switchable Temporal Propagation Network

1 code implementation ECCV 2018 Sifei Liu, Guangyu Zhong, Shalini De Mello, Jinwei Gu, Varun Jampani, Ming-Hsuan Yang, Jan Kautz

Our approach is based on a temporal propagation network (TPN), which models the transition-related affinity between a pair of frames in a purely data-driven manner.

Video Compression

Joint-task Self-supervised Learning for Temporal Correspondence

2 code implementations NeurIPS 2019 Xueting Li, Sifei Liu, Shalini De Mello, Xiaolong Wang, Jan Kautz, Ming-Hsuan Yang

Our learning process integrates two highly related tasks: tracking large image regions \emph{and} establishing fine-grained pixel-level associations between consecutive video frames.

Object Tracking Self-Supervised Learning +2

Bi3D: Stereo Depth Estimation via Binary Classifications

1 code implementation CVPR 2020 Abhishek Badki, Alejandro Troccoli, Kihwan Kim, Jan Kautz, Pradeep Sen, Orazio Gallo

Given a strict time budget, Bi3D can detect objects closer than a given distance in as little as a few milliseconds, or estimate depth with arbitrarily coarse quantization, with complexity linear with the number of quantization levels.

Autonomous Navigation Quantization +1

Pruning Convolutional Neural Networks for Resource Efficient Inference

9 code implementations19 Nov 2016 Pavlo Molchanov, Stephen Tyree, Tero Karras, Timo Aila, Jan Kautz

We propose a new criterion based on Taylor expansion that approximates the change in the cost function induced by pruning network parameters.

Transfer Learning

AdaViT: Adaptive Tokens for Efficient Vision Transformer

1 code implementation CVPR 2022 Hongxu Yin, Arash Vahdat, Jose Alvarez, Arun Mallya, Jan Kautz, Pavlo Molchanov

A-ViT achieves this by automatically reducing the number of tokens in vision transformers that are processed in the network as inference proceeds.

Efficient ViTs Token Reduction

Simultaneous Edge Alignment and Learning

3 code implementations ECCV 2018 Zhiding Yu, Weiyang Liu, Yang Zou, Chen Feng, Srikumar Ramalingam, B. V. K. Vijaya Kumar, Jan Kautz

Edge detection is among the most fundamental vision problems for its role in perceptual grouping and its wide applications.

Edge Detection Representation Learning

Convolutional Tensor-Train LSTM for Spatio-temporal Learning

2 code implementations NeurIPS 2020 Jiahao Su, Wonmin Byeon, Jean Kossaifi, Furong Huang, Jan Kautz, Animashree Anandkumar

Learning from spatio-temporal data has numerous applications such as human-behavior analysis, object tracking, video compression, and physics simulation. However, existing methods still perform poorly on challenging video tasks such as long-term forecasting.

 Ranked #1 on Video Prediction on KTH (Cond metric)

Activity Recognition Video Compression +1

Unsupervised Video Interpolation Using Cycle Consistency

1 code implementation ICCV 2019 Fitsum A. Reda, Deqing Sun, Aysegul Dundar, Mohammad Shoeybi, Guilin Liu, Kevin J. Shih, Andrew Tao, Jan Kautz, Bryan Catanzaro

We further introduce a pseudo supervised loss term that enforces the interpolated frames to be consistent with predictions of a pre-trained interpolation model.

 Ranked #1 on Video Frame Interpolation on UCF101 (PSNR (sRGB) metric)

Video Frame Interpolation

LITA: Language Instructed Temporal-Localization Assistant

1 code implementation27 Mar 2024 De-An Huang, Shijia Liao, Subhashree Radhakrishnan, Hongxu Yin, Pavlo Molchanov, Zhiding Yu, Jan Kautz

In addition to leveraging existing video datasets with timestamps, we propose a new task, Reasoning Temporal Localization (RTL), along with the dataset, ActivityNet-RTL, for learning and evaluating this task.

Instruction Following Temporal Localization +2

Is Ego Status All You Need for Open-Loop End-to-End Autonomous Driving?

1 code implementation5 Dec 2023 Zhiqi Li, Zhiding Yu, Shiyi Lan, Jiahan Li, Jan Kautz, Tong Lu, Jose M. Alvarez

We initially observed that the nuScenes dataset, characterized by relatively simple driving scenarios, leads to an under-utilization of perception information in end-to-end models incorporating ego status, such as the ego vehicle's velocity.

Autonomous Driving

Context-Aware Synthesis and Placement of Object Instances

2 code implementations NeurIPS 2018 Donghoon Lee, Sifei Liu, Jinwei Gu, Ming-Yu Liu, Ming-Hsuan Yang, Jan Kautz

Learning to insert an object instance into an image in a semantically coherent manner is a challenging and interesting problem.

Object Scene Parsing

Joint Disentangling and Adaptation for Cross-Domain Person Re-Identification

1 code implementation ECCV 2020 Yang Zou, Xiaodong Yang, Zhiding Yu, B. V. K. Vijaya Kumar, Jan Kautz

To this end, we propose a joint learning framework that disentangles id-related/unrelated features and enforces adaptation to work on the id-related feature space exclusively.

Person Re-Identification Unsupervised Domain Adaptation

Two-shot Spatially-varying BRDF and Shape Estimation

1 code implementation CVPR 2020 Mark Boss, Varun Jampani, Kihwan Kim, Hendrik P. A. Lensch, Jan Kautz

Extensive experiments on both synthetic and real-world datasets show that our network trained on a synthetic dataset can generalize well to real-world images.

Vocal Bursts Valence Prediction

SENSE: a Shared Encoder Network for Scene-flow Estimation

1 code implementation ICCV 2019 Huaizu Jiang, Deqing Sun, Varun Jampani, Zhaoyang Lv, Erik Learned-Miller, Jan Kautz

We introduce a compact network for holistic scene flow estimation, called SENSE, which shares common encoder features among four closely-related tasks: optical flow estimation, disparity estimation from stereo, occlusion estimation, and semantic segmentation.

Disparity Estimation Occlusion Estimation +3

Extreme View Synthesis

1 code implementation ICCV 2019 Inchang Choi, Orazio Gallo, Alejandro Troccoli, Min H. Kim, Jan Kautz

We present Extreme View Synthesis, a solution for novel view extrapolation that works even when the number of input images is small--as few as two.

AM-RADIO: Agglomerative Vision Foundation Model -- Reduce All Domains Into One

1 code implementation10 Dec 2023 Mike Ranzinger, Greg Heinrich, Jan Kautz, Pavlo Molchanov

A handful of visual foundation models (VFMs) have recently emerged as the backbones for numerous downstream tasks.

Benchmarking object-detection +2

Contrastive Learning for Weakly Supervised Phrase Grounding

1 code implementation ECCV 2020 Tanmay Gupta, Arash Vahdat, Gal Chechik, Xiaodong Yang, Jan Kautz, Derek Hoiem

Given pairs of images and captions, we maximize compatibility of the attention-weighted regions and the words in the corresponding caption, compared to non-corresponding pairs of images and captions.

Contrastive Learning Language Modelling +1

UNAS: Differentiable Architecture Search Meets Reinforcement Learning

1 code implementation CVPR 2020 Arash Vahdat, Arun Mallya, Ming-Yu Liu, Jan Kautz

Our framework brings the best of both worlds, and it enables us to search for architectures with both differentiable and non-differentiable criteria in one unified framework while maintaining a low search cost.

Neural Architecture Search reinforcement-learning +1

VAEBM: A Symbiosis between Variational Autoencoders and Energy-based Models

1 code implementation ICLR 2021 Zhisheng Xiao, Karsten Kreis, Jan Kautz, Arash Vahdat

VAEBM captures the overall mode structure of the data distribution using a state-of-the-art VAE and it relies on its EBM component to explicitly exclude non-data-like regions from the model and refine the image samples.

Image Generation Out-of-Distribution Detection

Binary TTC: A Temporal Geofence for Autonomous Navigation

1 code implementation CVPR 2021 Abhishek Badki, Orazio Gallo, Jan Kautz, Pradeep Sen

Time-to-contact (TTC), the time for an object to collide with the observer's plane, is a powerful tool for path planning: it is potentially more informative than the depth, velocity, and acceleration of objects in the scene -- even for humans.

Autonomous Navigation Quantization

A Variational Perspective on Solving Inverse Problems with Diffusion Models

1 code implementation7 May 2023 Morteza Mardani, Jiaming Song, Jan Kautz, Arash Vahdat

To cope with this challenge, we propose a variational approach that by design seeks to approximate the true posterior distribution.

Denoising Image Restoration +1

Meshlet Priors for 3D Mesh Reconstruction

1 code implementation CVPR 2020 Abhishek Badki, Orazio Gallo, Jan Kautz, Pradeep Sen

Meshlets act as a dictionary of local features and thus allow to use learned priors to reconstruct object meshes in any pose and from unseen classes, even when the noise is large and the samples sparse.

Object

Recurrence without Recurrence: Stable Video Landmark Detection with Deep Equilibrium Models

1 code implementation CVPR 2023 Paul Micaelli, Arash Vahdat, Hongxu Yin, Jan Kautz, Pavlo Molchanov

Our Landmark DEQ (LDEQ) achieves state-of-the-art performance on the challenging WFLW facial landmark dataset, reaching $3. 92$ NME with fewer parameters and a training memory cost of $\mathcal{O}(1)$ in the number of recurrent modules.

Face Alignment

Weakly-Supervised Physically Unconstrained Gaze Estimation

1 code implementation CVPR 2021 Rakshit Kothari, Shalini De Mello, Umar Iqbal, Wonmin Byeon, Seonwook Park, Jan Kautz

A major challenge for physically unconstrained gaze estimation is acquiring training data with 3D gaze annotations for in-the-wild and outdoor scenarios.

Domain Generalization Gaze Estimation

The Best Defense is a Good Offense: Adversarial Augmentation against Adversarial Attacks

1 code implementation CVPR 2023 Iuri Frosio, Jan Kautz

Many defenses against adversarial attacks (\eg robust classifiers, randomization, or image purification) use countermeasures put to work only after the attack has been crafted.

CoordGAN: Self-Supervised Dense Correspondences Emerge from GANs

1 code implementation CVPR 2022 Jiteng Mu, Shalini De Mello, Zhiding Yu, Nuno Vasconcelos, Xiaolong Wang, Jan Kautz, Sifei Liu

We represent the correspondence maps of different images as warped coordinate frames transformed from a canonical coordinate frame, i. e., the correspondence map, which describes the structure (e. g., the shape of a face), is controlled via a transformation.

Disentanglement

Discovering Nonlinear Relations with Minimum Predictive Information Regularization

1 code implementation7 Jan 2020 Tailin Wu, Thomas Breuel, Michael Skuhersky, Jan Kautz

Identifying the underlying directional relations from observational time series with nonlinear interactions and complex relational structures is key to a wide range of applications, yet remains a hard problem.

Time Series Time Series Analysis

SMRD: SURE-based Robust MRI Reconstruction with Diffusion Models

2 code implementations3 Oct 2023 Batu Ozturkler, Chao Liu, Benjamin Eckart, Morteza Mardani, Jiaming Song, Jan Kautz

However, diffusion models require careful tuning of inference hyperparameters on a validation set and are still sensitive to distribution shifts during testing.

MRI Reconstruction

ViR: Towards Efficient Vision Retention Backbones

1 code implementation30 Oct 2023 Ali Hatamizadeh, Michael Ranzinger, Shiyi Lan, Jose M. Alvarez, Sanja Fidler, Jan Kautz

Inspired by this trend, we propose a new class of computer vision models, dubbed Vision Retention Networks (ViR), with dual parallel and recurrent formulations, which strike an optimal balance between fast inference and parallel training with competitive performance.

Zero-shot Pose Transfer for Unrigged Stylized 3D Characters

1 code implementation CVPR 2023 Jiashun Wang, Xueting Li, Sifei Liu, Shalini De Mello, Orazio Gallo, Xiaolong Wang, Jan Kautz

We present a zero-shot approach that requires only the widely available deformed non-stylized avatars in training, and deforms stylized characters of significantly different shapes at inference.

Pose Transfer

Global Vision Transformer Pruning with Hessian-Aware Saliency

1 code implementation CVPR 2023 Huanrui Yang, Hongxu Yin, Maying Shen, Pavlo Molchanov, Hai Li, Jan Kautz

This work aims on challenging the common design philosophy of the Vision Transformer (ViT) model with uniform dimension across all the stacked blocks in a model stage, where we redistribute the parameters both across transformer blocks and between different structures within the block via the first systematic attempt on global structural pruning.

Efficient ViTs Philosophy

Improving Landmark Localization with Semi-Supervised Learning

no code implementations CVPR 2018 Sina Honari, Pavlo Molchanov, Stephen Tyree, Pascal Vincent, Christopher Pal, Jan Kautz

First, we propose the framework of sequential multitasking and explore it here through an architecture for landmark localization where training with class labels acts as an auxiliary signal to guide the landmark localization on unlabeled data.

Face Alignment Small Data Image Classification

Budget-Aware Activity Detection with A Recurrent Policy Network

no code implementations30 Nov 2017 Behrooz Mahasseni, Xiaodong Yang, Pavlo Molchanov, Jan Kautz

In this paper, we address the challenging problem of efficient temporal activity detection in untrimmed long videos.

Action Detection Activity Detection

Light-weight Head Pose Invariant Gaze Tracking

no code implementations23 Apr 2018 Rajeev Ranjan, Shalini De Mello, Jan Kautz

Unconstrained remote gaze tracking using off-the-shelf cameras is a challenging problem.

Gaze Estimation Transfer Learning +1

A Lightweight Approach for On-the-Fly Reflectance Estimation

no code implementations ICCV 2017 Kihwan Kim, Jinwei Gu, Stephen Tyree, Pavlo Molchanov, Matthias Nießner, Jan Kautz

In addition, we have created a large synthetic dataset, SynBRDF, which comprises a total of $500$K RGBD images rendered with a physically-based ray tracer under a variety of natural illumination, covering $5000$ materials and $5000$ shapes.

Color Constancy

Deep Semantic Face Deblurring

no code implementations CVPR 2018 Ziyi Shen, Wei-Sheng Lai, Tingfa Xu, Jan Kautz, Ming-Hsuan Yang

In this paper, we present an effective and efficient face deblurring algorithm by exploiting semantic cues via deep convolutional neural networks (CNNs).

Deblurring Face Recognition

Learning Adaptive Parameter Tuning for Image Processing

no code implementations28 Oct 2016 Jingming Dong, Iuri Frosio, Jan Kautz

The non-stationary nature of image characteristics calls for adaptive processing, based on the local image content.

Deblurring Demosaicking +1

Learning Binary Residual Representations for Domain-specific Video Streaming

no code implementations14 Dec 2017 Yi-Hsuan Tsai, Ming-Yu Liu, Deqing Sun, Ming-Hsuan Yang, Jan Kautz

Specifically, we target a streaming setting where the videos to be streamed from a server to a client are all in the same domain and they have to be compressed to a small size for low-latency transmission.

Video Compression

Separating Reflection and Transmission Images in the Wild

no code implementations ECCV 2018 Patrick Wieschollek, Orazio Gallo, Jinwei Gu, Jan Kautz

The reflections caused by common semi-reflectors, such as glass windows, can impact the performance of computer vision algorithms.

Synthetic Data Generation

On Nearest Neighbors in Non Local Means Denoising

no code implementations20 Nov 2017 Iuri Frosio, Jan Kautz

To denoise a reference patch, the Non-Local-Means denoising filter processes a set of neighbor patches.

Denoising

Cascaded Scene Flow Prediction using Semantic Segmentation

no code implementations26 Jul 2017 Zhile Ren, Deqing Sun, Jan Kautz, Erik B. Sudderth

Given two consecutive frames from a pair of stereo cameras, 3D scene flow methods simultaneously estimate the 3D geometry and motion of the observed scene.

Autonomous Driving General Classification +3

Multiframe Scene Flow with Piecewise Rigid Motion

no code implementations5 Oct 2017 Vladislav Golyanik, Kihwan Kim, Robert Maier, Matthias Nießner, Didier Stricker, Jan Kautz

We introduce a novel multiframe scene flow approach that jointly optimizes the consistency of the patch appearances and their local rigid motions from RGB-D image sequences.

Scene Flow Estimation

Learning Affinity via Spatial Propagation Networks

no code implementations NeurIPS 2017 Sifei Liu, Shalini De Mello, Jinwei Gu, Guangyu Zhong, Ming-Hsuan Yang, Jan Kautz

Specifically, we develop a three-way connection for the linear propagation model, which (a) formulates a sparse transformation matrix, where all elements can be the output from a deep CNN, but (b) results in a dense affinity matrix that effectively models any task-specific pairwise similarity matrix.

Colorization Face Parsing +4

Learning to Segment Instances in Videos with Spatial Propagation Network

no code implementations14 Sep 2017 Jingchun Cheng, Sifei Liu, Yi-Hsuan Tsai, Wei-Chih Hung, Shalini De Mello, Jinwei Gu, Jan Kautz, Shengjin Wang, Ming-Hsuan Yang

In addition, we apply a filter on the refined score map that aims to recognize the best connected region using spatial and temporal consistencies in the video.

Object Segmentation +1

Hierarchical Subquery Evaluation for Active Learning on a Graph

no code implementations CVPR 2014 Oisin Mac Aodha, Neill D. F. Campbell, Jan Kautz, Gabriel J. Brostow

Under some specific circumstances, Expected Error Reduction has been one of the strongest-performing informativeness criteria for active learning.

Active Learning graph construction +1

Domain Stylization: A Strong, Simple Baseline for Synthetic to Real Image Domain Adaptation

no code implementations24 Jul 2018 Aysegul Dundar, Ming-Yu Liu, Ting-Chun Wang, John Zedlewski, Jan Kautz

Deep neural networks have largely failed to effectively utilize synthetic data when applied to real images due to the covariate shift problem.

Domain Adaptation object-detection +5

Tackling 3D ToF Artifacts Through Learning and the FLAT Dataset

no code implementations ECCV 2018 Qi Guo, Iuri Frosio, Orazio Gallo, Todd Zickler, Jan Kautz

Scene motion, multiple reflections, and sensor noise introduce artifacts in the depth reconstruction performed by time-of-flight cameras.

EOE: Expected Overlap Estimation over Unstructured Point Cloud Data

no code implementations6 Aug 2018 Ben Eckart, Kihwan Kim, Jan Kautz

We present an iterative overlap estimation technique to augment existing point cloud registration algorithms that can achieve high performance in difficult real-world situations where large pose displacement and non-overlapping geometry would otherwise cause traditional methods to fail.

Point Cloud Registration

Making Convolutional Networks Recurrent for Visual Sequence Learning

no code implementations CVPR 2018 Xiaodong Yang, Pavlo Molchanov, Jan Kautz

Recurrent neural networks (RNNs) have emerged as a powerful model for a broad range of machine learning problems that involve sequential data.

Action Recognition Face Alignment +6

Neural Causal Discovery with Learnable Input Noise

no code implementations ICLR 2019 Tailin Wu, Thomas Breuel, Jan Kautz

Learning causal relations from observational time series with nonlinear interactions and complex causal structures is a key component of human intelligence, and has a wide range of applications.

Causal Discovery EEG +2

NRMVS: Non-Rigid Multi-View Stereo

no code implementations12 Jan 2019 Matthias Innmann, Kihwan Kim, Jinwei Gu, Matthias Niessner, Charles Loop, Marc Stamminger, Jan Kautz

We show that creating a dense 4D structure from a few RGB images with non-rigid changes is possible, and demonstrate that our method can be used to interpolate novel deformed scenes from various combinations of these deformation estimates derived from the sparse views.

3D Reconstruction Depth Estimation

Modeling Object Appearance Using Context-Conditioned Component Analysis

no code implementations CVPR 2015 Daniyar Turmukhambetov, Neill D. F. Campbell, Simon J. D. Prince, Jan Kautz

In this work we remove the image space alignment limitations of existing subspace models by conditioning the models on a shape dependent context that allows for the complex, non-linear structure of the appearance of the visual object to be captured and shared.

Object

Online Detection and Classification of Dynamic Hand Gestures With Recurrent 3D Convolutional Neural Network

no code implementations CVPR 2016 Pavlo Molchanov, Xiaodong Yang, Shalini Gupta, Kihwan Kim, Stephen Tyree, Jan Kautz

Automatic detection and classification of dynamic hand gestures in real-world systems intended for human computer interaction is challenging as: 1) there is a large diversity in how people perform gestures, making detection and classification difficult; 2) the system must work online in order to avoid noticeable lag between performing a gesture and its classification; in fact, a negative lag (classification before the gesture is finished) is desirable, as feedback to the user can then be truly instantaneous.

Classification General Classification +1

Accelerated Generative Models for 3D Point Cloud Data

no code implementations CVPR 2016 Benjamin Eckart, Kihwan Kim, Alejandro Troccoli, Alonzo Kelly, Jan Kautz

In this paper we introduce a method for constructing compact generative representations of PCD at multiple levels of detail.

Dynamic Facial Analysis: From Bayesian Filtering to Recurrent Neural Network

no code implementations CVPR 2017 Jinwei Gu, Xiaodong Yang, Shalini De Mello, Jan Kautz

We are inspired by the fact that the computation performed in an RNN bears resemblance to Bayesian filters, which have been used for tracking in many previous methods for facial analysis from videos.

 Ranked #1 on Head Pose Estimation on BIWI (MAE (trained with BIWI data) metric, using extra training data)

Face Alignment Feature Engineering +3

Polarimetric Multi-View Stereo

no code implementations CVPR 2017 Zhaopeng Cui, Jinwei Gu, Boxin Shi, Ping Tan, Jan Kautz

Multi-view stereo relies on feature correspondences for 3D reconstruction, and thus is fundamentally flawed in dealing with featureless scenes.

3D Reconstruction

Putting Humans in a Scene: Learning Affordance in 3D Indoor Environments

no code implementations CVPR 2019 Xueting Li, Sifei Liu, Kihwan Kim, Xiaolong Wang, Ming-Hsuan Yang, Jan Kautz

In order to predict valid affordances and learn possible 3D human poses in indoor scenes, we need to understand the semantic and geometric structure of a scene as well as its potential interactions with a human.

valid

Towards annotation-efficient segmentation via image-to-image translation

no code implementations2 Apr 2019 Eugene Vorontsov, Pavlo Molchanov, Christopher Beckham, Jan Kautz, Samuel Kadoury

Specifically, we propose a semi-supervised framework that employs unpaired image-to-image translation between two domains, presence vs. absence of cancer, as the unsupervised objective.

Brain Tumor Segmentation Image-to-Image Translation +3

Few-Shot Viewpoint Estimation

no code implementations13 May 2019 Hung-Yu Tseng, Shalini De Mello, Jonathan Tremblay, Sifei Liu, Stan Birchfield, Ming-Hsuan Yang, Jan Kautz

Through extensive experimentation on the ObjectNet3D and Pascal3D+ benchmark datasets, we demonstrate that our framework, which we call MetaView, significantly outperforms fine-tuning the state-of-the-art models with few examples, and that the specific architectural innovations of our method are crucial to achieving good performance.

Meta-Learning Viewpoint Estimation

Video Stitching for Linear Camera Arrays

no code implementations31 Jul 2019 Wei-Sheng Lai, Orazio Gallo, Jinwei Gu, Deqing Sun, Ming-Hsuan Yang, Jan Kautz

Despite the long history of image and video stitching research, existing academic and commercial solutions still produce strong artifacts.

Autonomous Driving Spatial Interpolation

Learning Propagation for Arbitrarily-structured Data

no code implementations ICCV 2019 Sifei Liu, Xueting Li, Varun Jampani, Shalini De Mello, Jan Kautz

We experiment with semantic segmentation networks, where we use our propagation module to jointly train on different data -- images, superpixels and point clouds.

Point Cloud Segmentation Segmentation +2

Exploiting Semantics for Face Image Deblurring

no code implementations19 Jan 2020 Ziyi Shen, Wei-Sheng Lai, Tingfa Xu, Jan Kautz, Ming-Hsuan Yang

Specifically, we first use a coarse deblurring network to reduce the motion blur on the input face image.

Deblurring Face Recognition +1

Learning to Generate Multiple Style Transfer Outputs for an Input Sentence

no code implementations WS 2020 Kevin Lin, Ming-Yu Liu, Ming-Ting Sun, Jan Kautz

Specifically, we decompose the latent representation of the input sentence to a style code that captures the language style variation and a content code that encodes the language style-independent content.

Sentence Style Transfer +1

Novel View Synthesis of Dynamic Scenes with Globally Coherent Depths from a Monocular Camera

no code implementations CVPR 2020 Jae Shin Yoon, Kihwan Kim, Orazio Gallo, Hyun Soo Park, Jan Kautz

Our insight is that although its scale and quality are inconsistent with other views, the depth estimation from a single view can be used to reason about the globally coherent geometry of dynamic contents.

Depth Estimation Novel View Synthesis

How to Close Sim-Real Gap? Transfer with Segmentation!

no code implementations14 May 2020 Mengyuan Yan, Qingyun Sun, Iuri Frosio, Stephen Tyree, Jan Kautz

Combining the control policy learned from simulation with the perception model, we achieve an impressive $\bf{88\%}$ success rate in grasping a tiny sphere with a real robot.

Robotics

Hierarchical Contrastive Motion Learning for Video Action Recognition

no code implementations20 Jul 2020 Xitong Yang, Xiaodong Yang, Sifei Liu, Deqing Sun, Larry Davis, Jan Kautz

Thus, the motion features at higher levels are trained to gradually capture semantic dynamics and evolve more discriminative for action recognition.

Action Recognition Contrastive Learning +2

Improving Deep Stereo Network Generalization with Geometric Priors

no code implementations25 Aug 2020 Jialiang Wang, Varun Jampani, Deqing Sun, Charles Loop, Stan Birchfield, Jan Kautz

End-to-end deep learning methods have advanced stereo vision in recent years and obtained excellent results when the training and test data are similar.

A Contrastive Learning Approach for Training Variational Autoencoder Priors

no code implementations NeurIPS 2021 Jyoti Aneja, Alexander Schwing, Jan Kautz, Arash Vahdat

To tackle this issue, we propose an energy-based prior defined by the product of a base prior distribution and a reweighting factor, designed to bring the base closer to the aggregate posterior.

Ranked #6 on Image Generation on CelebA 256x256 (FID metric)

Contrastive Learning Image Generation

UFO$^2$: A Unified Framework towards Omni-supervised Object Detection

no code implementations21 Oct 2020 Zhongzheng Ren, Zhiding Yu, Xiaodong Yang, Ming-Yu Liu, Alexander G. Schwing, Jan Kautz

Existing work on object detection often relies on a single form of annotation: the model is trained using either accurate yet costly bounding boxes or cheaper but less expressive image-level tags.

object-detection Object Detection

Online Adaptation for Consistent Mesh Reconstruction in the Wild

no code implementations NeurIPS 2020 Xueting Li, Sifei Liu, Shalini De Mello, Kihwan Kim, Xiaolong Wang, Ming-Hsuan Yang, Jan Kautz

This paper presents an algorithm to reconstruct temporally consistent 3D meshes of deformable object instances from videos in the wild.

3D Reconstruction

Parameter Efficient Multimodal Transformers for Video Representation Learning

no code implementations ICLR 2021 Sangho Lee, Youngjae Yu, Gunhee Kim, Thomas Breuel, Jan Kautz, Yale Song

The recent success of Transformers in the language domain has motivated adapting it to a multimodal setting, where a new visual model is trained in tandem with an already pretrained language model.

Language Modelling Representation Learning

Neural 3D Clothes Retargeting from a Single Image

no code implementations29 Jan 2021 Jae Shin Yoon, Kihwan Kim, Jan Kautz, Hyun Soo Park

In this paper, we present a method of clothes retargeting; generating the potential poses and deformations of a given 3D clothing template model to fit onto a person in a single RGB image.

Learning Affinity via Spatial Propagation Network

no code implementations3 Oct 2017 Sifei Liu, Shalini De Mello, Jinwei Gu, Guangyu Zhong, Ming-Hsuan Yang, Jan Kautz

Specifically, we develop a three-way connection for the linear propagation model, which (a) formulates a sparse transformation matrix, where all elements can be the output from a deep CNN, but (b) results in a dense affinity matrix that effectively models any task-specific pairwise similarity matrix.

Colorization Face Parsing +4

Learning to Track Instances without Video Annotations

no code implementations CVPR 2021 Yang Fu, Sifei Liu, Umar Iqbal, Shalini De Mello, Humphrey Shi, Jan Kautz

Tracking segmentation masks of multiple instances has been intensively studied, but still faces two fundamental challenges: 1) the requirement of large-scale, frame-wise annotation, and 2) the complexity of two-stage approaches.

Instance Segmentation Pose Estimation +1

KAMA: 3D Keypoint Aware Body Mesh Articulation

no code implementations27 Apr 2021 Umar Iqbal, Kevin Xie, Yunrong Guo, Jan Kautz, Pavlo Molchanov

We present KAMA, a 3D Keypoint Aware Mesh Articulation approach that allows us to estimate a human body mesh from the positions of 3D body keypoints.

3D Human Pose Estimation 3D Human Shape Estimation +1

Adversarial Motion Modelling helps Semi-supervised Hand Pose Estimation

no code implementations10 Jun 2021 Adrian Spurr, Pavlo Molchanov, Umar Iqbal, Jan Kautz, Otmar Hilliges

Hand pose estimation is difficult due to different environmental conditions, object- and self-occlusion as well as diversity in hand shape and appearance.

Hand Pose Estimation valid

Self-Supervised Learning on 3D Point Clouds by Learning Discrete Generative Models

no code implementations CVPR 2021 Benjamin Eckart, Wentao Yuan, Chao Liu, Jan Kautz

In this work, we introduce a general method for 3D self-supervised representation learning that 1) remains agnostic to the underlying neural network architecture, and 2) specifically leverages the geometric nature of 3D point cloud data.

Point Cloud Segmentation Representation Learning +4

Privacy Vulnerability of Split Computing to Data-Free Model Inversion Attacks

no code implementations13 Jul 2021 Xin Dong, Hongxu Yin, Jose M. Alvarez, Jan Kautz, Pavlo Molchanov, H. T. Kung

Prior works usually assume that SC offers privacy benefits as only intermediate features, instead of private data, are shared from devices to the cloud.

LANA: Latency Aware Network Acceleration

no code implementations12 Jul 2021 Pavlo Molchanov, Jimmy Hall, Hongxu Yin, Jan Kautz, Nicolo Fusi, Arash Vahdat

We analyze three popular network architectures: EfficientNetV1, EfficientNetV2 and ResNeST, and achieve accuracy improvement for all models (up to $3. 0\%$) when compressing larger models to the latency level of smaller models.

Neural Architecture Search Quantization

Learning Contrastive Representation for Semantic Correspondence

no code implementations22 Sep 2021 Taihong Xiao, Sifei Liu, Shalini De Mello, Zhiding Yu, Jan Kautz, Ming-Hsuan Yang

Dense correspondence across semantically related images has been extensively studied, but still faces two challenges: 1) large variations in appearance, scale and pose exist even for objects from the same category, and 2) labeling pixel-level dense correspondences is labor intensive and infeasible to scale.

Contrastive Learning Semantic correspondence

Hardware-Aware Network Transformation

no code implementations29 Sep 2021 Pavlo Molchanov, Jimmy Hall, Hongxu Yin, Jan Kautz, Nicolo Fusi, Arash Vahdat

In the second phase, it solves the combinatorial selection of efficient operations using a novel constrained integer linear optimization approach.

Neural Architecture Search

Convolutional Tensor-Train LSTM for Long-Term Video Prediction

no code implementations25 Sep 2019 Jiahao Su, Wonmin Byeon, Furong Huang, Jan Kautz, Animashree Anandkumar

Long-term video prediction is highly challenging since it entails simultaneously capturing spatial and temporal information across a long range of image frames. Standard recurrent models are ineffective since they are prone to error propagation and cannot effectively capture higher-order correlations.

Video Prediction

Long History Short-Term Memory for Long-Term Video Prediction

no code implementations25 Sep 2019 Wonmin Byeon, Jan Kautz

While video prediction approaches have advanced considerably in recent years, learning to predict long-term future is challenging — ambiguous future or error propagation over time yield blurry predictions.

Video Prediction

NCP-VAE: Variational Autoencoders with Noise Contrastive Priors

no code implementations28 Sep 2020 Jyoti Aneja, Alex Schwing, Jan Kautz, Arash Vahdat

To tackle this issue, we propose an energy-based prior defined by the product of a base prior distribution and a reweighting factor, designed to bring the base closer to the aggregate posterior.

Learning Continuous Environment Fields via Implicit Functions

no code implementations ICLR 2022 Xueting Li, Shalini De Mello, Xiaolong Wang, Ming-Hsuan Yang, Jan Kautz, Sifei Liu

We propose a novel scene representation that encodes reaching distance -- the distance between any position in the scene to a goal along a feasible trajectory.

Position Trajectory Prediction

Federated Learning with Heterogeneous Architectures using Graph HyperNetworks

no code implementations20 Jan 2022 Or Litany, Haggai Maron, David Acuna, Jan Kautz, Gal Chechik, Sanja Fidler

Standard Federated Learning (FL) techniques are limited to clients with identical network architectures.

Federated Learning

Physics Informed RNN-DCT Networks for Time-Dependent Partial Differential Equations

no code implementations24 Feb 2022 Benjamin Wu, Oliver Hennigh, Jan Kautz, Sanjay Choudhry, Wonmin Byeon

This efficiently and flexibly produces a compressed representation which is used for additional conditioning of physics-informed models.

GradViT: Gradient Inversion of Vision Transformers

no code implementations CVPR 2022 Ali Hatamizadeh, Hongxu Yin, Holger Roth, Wenqi Li, Jan Kautz, Daguang Xu, Pavlo Molchanov

In this work we demonstrate the vulnerability of vision transformers (ViTs) to gradient-based inversion attacks.

Scheduling

DRaCoN -- Differentiable Rasterization Conditioned Neural Radiance Fields for Articulated Avatars

no code implementations29 Mar 2022 Amit Raj, Umar Iqbal, Koki Nagano, Sameh Khamis, Pavlo Molchanov, James Hays, Jan Kautz

In this work, we present, DRaCoN, a framework for learning full-body volumetric avatars which exploits the advantages of both the 2D and 3D neural rendering techniques.

Neural Rendering

RTMV: A Ray-Traced Multi-View Synthetic Dataset for Novel View Synthesis

no code implementations14 May 2022 Jonathan Tremblay, Moustafa Meshry, Alex Evans, Jan Kautz, Alexander Keller, Sameh Khamis, Thomas Müller, Charles Loop, Nathan Morrical, Koki Nagano, Towaki Takikawa, Stan Birchfield

We present a large-scale synthetic dataset for novel view synthesis consisting of ~300k images rendered from nearly 2000 complex scenes using high-quality ray tracing at high resolution (1600 x 1600 pixels).

Novel View Synthesis

Neural Light Field Estimation for Street Scenes with Differentiable Virtual Object Insertion

no code implementations19 Aug 2022 Zian Wang, Wenzheng Chen, David Acuna, Jan Kautz, Sanja Fidler

In this work, we propose a neural approach that estimates the 5D HDR light field from a single image, and a differentiable object insertion formulation that enables end-to-end training with image-based losses that encourage realism.

Autonomous Driving Lighting Estimation +1

Learning to Relight Portrait Images via a Virtual Light Stage and Synthetic-to-Real Adaptation

no code implementations21 Sep 2022 Yu-Ying Yeh, Koki Nagano, Sameh Khamis, Jan Kautz, Ming-Yu Liu, Ting-Chun Wang

An effective approach is to supervise the training of deep neural networks with a high-fidelity dataset of desired input-output pairs, captured with a light stage.

PhysDiff: Physics-Guided Human Motion Diffusion Model

no code implementations ICCV 2023 Ye Yuan, Jiaming Song, Umar Iqbal, Arash Vahdat, Jan Kautz

Specifically, we propose a physics-based motion projection module that uses motion imitation in a physics simulator to project the denoised motion of a diffusion step to a physically-plausible motion.

Denoising

RANA: Relightable Articulated Neural Avatars

no code implementations ICCV 2023 Umar Iqbal, Akin Caliskan, Koki Nagano, Sameh Khamis, Pavlo Molchanov, Jan Kautz

We propose RANA, a relightable and articulated neural avatar for the photorealistic synthesis of humans under arbitrary viewpoints, body poses, and lighting.

Disentanglement Image Generation

Score-based Diffusion Models in Function Space

no code implementations14 Feb 2023 Jae Hyun Lim, Nikola B. Kovachki, Ricardo Baptista, Christopher Beckham, Kamyar Azizzadenesheli, Jean Kossaifi, Vikram Voleti, Jiaming Song, Karsten Kreis, Jan Kautz, Christopher Pal, Arash Vahdat, Anima Anandkumar

They consist of a forward process that perturbs input data with Gaussian white noise and a reverse process that learns a score function to generate samples by denoising.

Denoising

Single-Shot Implicit Morphable Faces with Consistent Texture Parameterization

no code implementations4 May 2023 Connor Z. Lin, Koki Nagano, Jan Kautz, Eric R. Chan, Umar Iqbal, Leonidas Guibas, Gordon Wetzstein, Sameh Khamis

To tackle this problem, we propose a novel method for constructing implicit 3D morphable face models that are both generalizable and intuitive for editing.

Face Model Face Reconstruction

Generalizable One-shot Neural Head Avatar

no code implementations14 Jun 2023 Xueting Li, Shalini De Mello, Sifei Liu, Koki Nagano, Umar Iqbal, Jan Kautz

We present a method that reconstructs and animates a 3D head avatar from a single-view portrait image.

Super-Resolution

Heterogeneous Continual Learning

no code implementations CVPR 2023 Divyam Madaan, Hongxu Yin, Wonmin Byeon, Jan Kautz, Pavlo Molchanov

We propose a novel framework and a solution to tackle the continual learning (CL) problem with changing network architectures.

Continual Learning Knowledge Distillation +1

Online Overexposed Pixels Hallucination in Videos with Adaptive Reference Frame Selection

no code implementations29 Aug 2023 Yazhou Xing, Amrita Mazumdar, Anjul Patney, Chao Liu, Hongxu Yin, Qifeng Chen, Jan Kautz, Iuri Frosio

We present a learning-based system to reduce these artifacts without resorting to complex acquisition mechanisms like alternating exposures or costly processing that are typical of high dynamic range (HDR) imaging.

Hallucination

3D Reconstruction with Generalizable Neural Fields using Scene Priors

no code implementations26 Sep 2023 Yang Fu, Shalini De Mello, Xueting Li, Amey Kulkarni, Jan Kautz, Xiaolong Wang, Sifei Liu

NFP not only demonstrates SOTA scene reconstruction performance and efficiency, but it also supports single-image novel-view synthesis, which is underexplored in neural fields.

3D Reconstruction 3D Scene Reconstruction +1

PACE: Human and Camera Motion Estimation from in-the-wild Videos

no code implementations20 Oct 2023 Muhammed Kocabas, Ye Yuan, Pavlo Molchanov, Yunrong Guo, Michael J. Black, Otmar Hilliges, Jan Kautz, Umar Iqbal

This design combines the strengths of SLAM and motion priors, which leads to significant improvements in human and camera motion estimation.

Motion Estimation

COLMAP-Free 3D Gaussian Splatting

no code implementations12 Dec 2023 Yang Fu, Sifei Liu, Amey Kulkarni, Jan Kautz, Alexei A. Efros, Xiaolong Wang

While neural rendering has led to impressive advances in scene reconstruction and novel view synthesis, it relies heavily on accurately pre-computed camera poses.

Neural Rendering Novel View Synthesis +1

GAvatar: Animatable 3D Gaussian Avatars with Implicit Mesh Learning

no code implementations18 Dec 2023 Ye Yuan, Xueting Li, Yangyi Huang, Shalini De Mello, Koki Nagano, Jan Kautz, Umar Iqbal

Gaussian splatting has emerged as a powerful 3D representation that harnesses the advantages of both explicit (mesh) and implicit (NeRF) 3D representations.

FoVA-Depth: Field-of-View Agnostic Depth Estimation for Cross-Dataset Generalization

no code implementations24 Jan 2024 Daniel Lichy, Hang Su, Abhishek Badki, Jan Kautz, Orazio Gallo

Unfortunately, most of the GT data is for pinhole cameras, making it impossible to properly train depth estimation models for large-FoV cameras.

Stereo Depth Estimation

Cannot find the paper you are looking for? You can Submit a new open access paper.