Search Results for author: Jan Kautz

Found 175 papers, 83 papers with code

GroupViT: Semantic Segmentation Emerges from Text Supervision

2 code implementations • CVPR 2022 • Jiarui Xu, Shalini De Mello, Sifei Liu, Wonmin Byeon, Thomas Breuel, Jan Kautz, Xiaolong Wang

With only text supervision and without any pixel-level annotations, GroupViT learns to group together semantic regions and successfully transfers to the task of semantic segmentation in a zero-shot manner, i.e., without any further fine-tuning.

Ranked #3 on Unsupervised Semantic Segmentation with Language-image Pre-training on PascalVOC-20

Object Detection Scene Understanding +3

124,889

Global Context Vision Transformers

8 code implementations • 20 Jun 2022 • Ali Hatamizadeh, Hongxu Yin, Greg Heinrich, Jan Kautz, Pavlo Molchanov

Pre-trained GC ViT backbones in downstream tasks of object detection, instance segmentation, and semantic segmentation using MS COCO and ADE20K datasets outperform prior work consistently.

Ranked #132 on Semantic Segmentation on ADE20K

Image Classification Inductive Bias +4

29,735

Multimodal Unsupervised Image-to-Image Translation

14 code implementations • ECCV 2018 • Xun Huang, Ming-Yu Liu, Serge Belongie, Jan Kautz

To translate an image to another domain, we recombine its content code with a random style code sampled from the style space of the target domain.

Ranked #1 on Multimodal Unsupervised Image-To-Image Translation on Edge-to-Handbags

Multimodal Unsupervised Image-To-Image Translation Translation +1

15,701

Unsupervised Image-to-Image Translation Networks

8 code implementations • NeurIPS 2017 • Ming-Yu Liu, Thomas Breuel, Jan Kautz

Unsupervised image-to-image translation aims at learning a joint distribution of images in different domains by using images from the marginal distributions in individual domains.

Ranked #2 on Multimodal Unsupervised Image-To-Image Translation on Cats-and-Dogs

Domain Adaptation Multimodal Unsupervised Image-To-Image Translation +2

15,701

A Closed-form Solution to Photorealistic Image Stylization

12 code implementations • ECCV 2018 • Yijun Li, Ming-Yu Liu, Xueting Li, Ming-Hsuan Yang, Jan Kautz

Photorealistic image stylization concerns transferring style of a reference photo to a content photo with the constraint that the stylized photo should remain photorealistic.

Image Stylization

11,093

Loss Functions for Neural Networks for Image Processing

2 code implementations • 28 Nov 2015 • Hang Zhao, Orazio Gallo, Iuri Frosio, Jan Kautz

Neural networks are becoming central in several areas of computer vision and image processing and different architectures have been proposed to solve specific problems.

Image Restoration

9,370

Video-to-Video Synthesis

11 code implementations • NeurIPS 2018 • Ting-Chun Wang, Ming-Yu Liu, Jun-Yan Zhu, Guilin Liu, Andrew Tao, Jan Kautz, Bryan Catanzaro

We study the problem of video-to-video synthesis, whose goal is to learn a mapping function from an input source video (e.g., a sequence of semantic segmentation masks) to an output photorealistic video that precisely depicts the content of the source video.

2k Semantic Segmentation +2

8,496

High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs

20 code implementations • CVPR 2018 • Ting-Chun Wang, Ming-Yu Liu, Jun-Yan Zhu, Andrew Tao, Jan Kautz, Bryan Catanzaro

We present a new method for synthesizing high-resolution photo-realistic images from semantic label maps using conditional generative adversarial networks (conditional GANs).

Ranked #2 on Sketch-to-Image Translation on COCO-Stuff

Conditional Image Generation Fundus to Angiography Generation +5

6,524

Joint Discriminative and Generative Learning for Person Re-identification

12 code implementations • CVPR 2019 • Zhedong Zheng, Xiaodong Yang, Zhiding Yu, Liang Zheng, Yi Yang, Jan Kautz

To this end, we propose a joint learning framework that couples re-id learning and data generation end-to-end.

Ranked #1 on Person Re-Identification on UAV-Human

Image-to-Image Translation Unsupervised Domain Adaptation +1

3,949

Super SloMo: High Quality Estimation of Multiple Intermediate Frames for Video Interpolation

5 code implementations • CVPR 2018 • Huaizu Jiang, Deqing Sun, Varun Jampani, Ming-Hsuan Yang, Erik Learned-Miller, Jan Kautz

Finally, the two input images are warped and linearly fused to form each intermediate frame.

Ranked #3 on Video Frame Interpolation on MSU Video Frame Interpolation

Optical Flow Estimation Video Frame Interpolation +1

2,970

VILA: On Pre-training for Visual Language Models

2 code implementations • 12 Dec 2023 • Ji Lin, Hongxu Yin, Wei Ping, Yao Lu, Pavlo Molchanov, Andrew Tao, Huizi Mao, Jan Kautz, Mohammad Shoeybi, Song Han

Visual language models (VLMs) rapidly progressed with the recent success of large language models.

Ranked #21 on Visual Question Answering on MM-Vet

In-Context Learning Language Modelling +2

1,784

Few-shot Video-to-Video Synthesis

6 code implementations • NeurIPS 2019 • Ting-Chun Wang, Ming-Yu Liu, Andrew Tao, Guilin Liu, Jan Kautz, Bryan Catanzaro

To address the limitations, we propose a few-shot vid2vid framework, which learns to synthesize videos of previously unseen subjects or scenes by leveraging few example images of the target at test time.

Ranked #1 on Video-to-Video Synthesis on YouTube Dancing

Video-to-Video Synthesis

1,781

PWC-Net: CNNs for Optical Flow Using Pyramid, Warping, and Cost Volume

21 code implementations • CVPR 2018 • Deqing Sun, Xiaodong Yang, Ming-Yu Liu, Jan Kautz

It then uses the warped features and features of the first image to construct a cost volume, which is processed by a CNN to estimate the optical flow.

Ranked #3 on Dense Pixel Correspondence Estimation on HPatches

Dense Pixel Correspondence Estimation Optical Flow Estimation

1,585

Models Matter, So Does Training: An Empirical Study of CNNs for Optical Flow Estimation

2 code implementations • 14 Sep 2018 • Deqing Sun, Xiaodong Yang, Ming-Yu Liu, Jan Kautz

We investigate two crucial and closely related aspects of CNNs for optical flow estimation: models and training.

Ranked #7 on Optical Flow Estimation on KITTI 2012

Optical Flow Estimation

1,585

A Fusion Approach for Multi-Frame Optical Flow Estimation

2 code implementations • 23 Oct 2018 • Zhile Ren, Orazio Gallo, Deqing Sun, Ming-Hsuan Yang, Erik B. Sudderth, Jan Kautz

To date, top-performing optical flow estimation methods only take pairs of consecutive frames into account.

Optical Flow Estimation

1,585

Few-Shot Unsupervised Image-to-Image Translation

10 code implementations • ICCV 2019 • Ming-Yu Liu, Xun Huang, Arun Mallya, Tero Karras, Timo Aila, Jaakko Lehtinen, Jan Kautz

Unsupervised image-to-image translation methods learn to map images in a given class to an analogous image in a different class, drawing on unstructured (non-registered) datasets of images.

Translation Unsupervised Image-To-Image Translation

1,562

NVAE: A Deep Hierarchical Variational Autoencoder

8 code implementations • NeurIPS 2020 • Arash Vahdat, Jan Kautz

For example, on CIFAR-10, NVAE pushes the state-of-the-art from 2.98 to 2.91 bits per dimension, and it produces high-quality images on CelebA HQ.

Ranked #3 on Image Generation on FFHQ 256 x 256 (bits/dimension metric)

Image Generation

973

BundleSDF: Neural 6-DoF Tracking and 3D Reconstruction of Unknown Objects

1 code implementation • CVPR 2023 • Bowen Wen, Jonathan Tremblay, Valts Blukis, Stephen Tyree, Thomas Muller, Alex Evans, Dieter Fox, Jan Kautz, Stan Birchfield

We present a near real-time method for 6-DoF tracking of an unknown object from a monocular RGBD video sequence, while simultaneously performing neural 3D reconstruction of the object.

3D Object Tracking 3D Reconstruction +5

890

FoundationPose: Unified 6D Pose Estimation and Tracking of Novel Objects

1 code implementation • 13 Dec 2023 • Bowen Wen, Wei Yang, Jan Kautz, Stan Birchfield

We present FoundationPose, a unified foundation model for 6D object pose estimation and tracking, supporting both model-based and model-free setups.

3D Object Detection 3D Object Tracking +7

865

Fast and Accurate Point Cloud Registration using Trees of Gaussian Mixtures

1 code implementation • 6 Jul 2018 • Ben Eckart, Kihwan Kim, Jan Kautz

Point cloud registration sits at the core of many important and challenging 3D perception problems including autonomous navigation, SLAM, object/scene recognition, and augmented reality.

Autonomous Navigation Point Cloud Registration +1

783

FasterViT: Fast Vision Transformers with Hierarchical Attention

2 code implementations • 9 Jun 2023 • Ali Hatamizadeh, Greg Heinrich, Hongxu Yin, Andrew Tao, Jose M. Alvarez, Jan Kautz, Pavlo Molchanov

At a high level, global self-attention enables efficient cross-window communication at lower cost.

object-detection Object Detection +1

668

Reinforcement Learning through Asynchronous Advantage Actor-Critic on a GPU

3 code implementations • 18 Nov 2016 • Mohammad Babaeizadeh, Iuri Frosio, Stephen Tyree, Jason Clemons, Jan Kautz

We introduce a hybrid CPU/GPU version of the Asynchronous Advantage Actor-Critic (A3C) algorithm, currently the state-of-the-art method in reinforcement learning for various gaming tasks.

reinforcement-learning Reinforcement Learning (RL) +1

646

DeepGMR: Learning Latent Gaussian Mixture Models for Registration

2 code implementations • ECCV 2020 • Wentao Yuan, Ben Eckart, Kihwan Kim, Varun Jampani, Dieter Fox, Jan Kautz

Point cloud registration is a fundamental problem in 3D computer vision, graphics and robotics.

Point Cloud Registration

646

MoCoGAN: Decomposing Motion and Content for Video Generation

5 code implementations • CVPR 2018 • Sergey Tulyakov, Ming-Yu Liu, Xiaodong Yang, Jan Kautz

The proposed framework generates a video by mapping a sequence of random vectors to a sequence of video frames.

Ranked #4 on Video Generation on UCF-101 16 frames, Unconditional, Single GPU

Generative Adversarial Network Video Generation

560

Synthetically Trained Neural Networks for Learning Human-Readable Plans from Real-World Demonstrations

1 code implementation • 18 May 2018 • Jonathan Tremblay, Thang To, Artem Molchanov, Stephen Tyree, Jan Kautz, Stan Birchfield

We present a system to infer and execute a human-readable program from a real-world demonstration.

Robotics

554

FB-OCC: 3D Occupancy Prediction based on Forward-Backward View Transformation

1 code implementation • 4 Jul 2023 • Zhiqi Li, Zhiding Yu, David Austin, Mingsheng Fang, Shiyi Lan, Jan Kautz, Jose M. Alvarez

This technical report summarizes the winning solution for the 3D Occupancy Prediction Challenge, which was held in conjunction with the CVPR 2023 Workshop on End-to-End Autonomous Driving and the CVPR 2023 Workshop on Vision-Centric Autonomous Driving.

Ranked #1 on Prediction Of Occupancy Grid Maps on Occ3D-nuScenes

Autonomous Driving Prediction Of Occupancy Grid Maps

540

PlaneRCNN: 3D Plane Detection and Reconstruction from a Single Image

2 code implementations • CVPR 2019 • Chen Liu, Kihwan Kim, Jinwei Gu, Yasutaka Furukawa, Jan Kautz

This paper proposes a deep neural architecture, PlaneRCNN, that detects and reconstructs piecewise planar surfaces from a single RGB image.

3D Plane Detection 3D Reconstruction +1

538

Dancing to Music

2 code implementations • NeurIPS 2019 • Hsin-Ying Lee, Xiaodong Yang, Ming-Yu Liu, Ting-Chun Wang, Yu-Ding Lu, Ming-Hsuan Yang, Jan Kautz

In the analysis phase, we decompose a dance into a series of basic dance units, through which the model learns how to move.

Ranked #3 on Motion Synthesis on BRACE

Motion Synthesis Pose Estimation

521

Pixel-Adaptive Convolutional Neural Networks

2 code implementations • CVPR 2019 • Hang Su, Varun Jampani, Deqing Sun, Orazio Gallo, Erik Learned-Miller, Jan Kautz

In addition, we also demonstrate that PAC can be used as a drop-in replacement for convolution layers in pre-trained networks, resulting in consistent performance improvements.

507

Dreaming to Distill: Data-free Knowledge Transfer via DeepInversion

2 code implementations • CVPR 2020 • Hongxu Yin, Pavlo Molchanov, Zhizhong Li, Jose M. Alvarez, Arun Mallya, Derek Hoiem, Niraj K. Jha, Jan Kautz

We introduce DeepInversion, a new method for synthesizing images from the image distribution used to train a deep neural network.

Continual Learning Network Pruning +1

474

Intrinsic3D: High-Quality 3D Reconstruction by Joint Appearance and Geometry Optimization with Spatially-Varying Lighting

1 code implementation • ICCV 2017 • Robert Maier, Kihwan Kim, Daniel Cremers, Jan Kautz, Matthias Nießner

We introduce a novel method to obtain high-quality 3D reconstructions from consumer RGB-D sensors.

3D Reconstruction Surface Reconstruction

443

Learning Linear Transformations for Fast Arbitrary Style Transfer

1 code implementation • 14 Aug 2018 • Xueting Li, Sifei Liu, Jan Kautz, Ming-Hsuan Yang

Recent arbitrary style transfer methods transfer second-order statistics from a reference image onto a content image via a multiplication between the content image features and a transformation matrix, which is computed from features with a pre-determined algorithm.

Domain Adaptation Style Transfer

376

Depth-Based 3D Hand Pose Estimation: From Current Achievements to Future Goals

1 code implementation • CVPR 2018 • Shanxin Yuan, Guillermo Garcia-Hernando, Bjorn Stenger, Gyeongsik Moon, Ju Yong Chang, Kyoung Mu Lee, Pavlo Molchanov, Jan Kautz, Sina Honari, Liuhao Ge, Junsong Yuan, Xinghao Chen, Guijin Wang, Fan Yang, Kai Akiyama, Yang Wu, Qingfu Wan, Meysam Madadi, Sergio Escalera, Shile Li, Dongheui Lee, Iason Oikonomidis, Antonis Argyros, Tae-Kyun Kim

Official Torch7 implementation of "V2V-PoseNet: Voxel-to-Voxel Prediction Network for Accurate 3D Hand and Human Pose Estimation from a Single Depth Map", CVPR 2018

Ranked #5 on Hand Pose Estimation on HANDS 2017

3D Hand Pose Estimation 3D Pose Estimation

373

Instance-aware, Context-focused, and Memory-efficient Weakly Supervised Object Detection

2 code implementations • CVPR 2020 • Zhongzheng Ren, Zhiding Yu, Xiaodong Yang, Ming-Yu Liu, Yong Jae Lee, Alexander G. Schwing, Jan Kautz

Weakly supervised learning has emerged as a compelling tool for object detection by reducing the need for strong supervision during training.

Ranked #1 on Weakly Supervised Object Detection on COCO test-dev

Object object-detection +3

359

UFO²: A Unified Framework towards Omni-supervised Object Detection

1 code implementation • ECCV 2020 • Zhongzheng Ren, Zhiding Yu, Xiaodong Yang, Ming-Yu Liu, Alexander G. Schwing, Jan Kautz

Existing work on object detection often relies on a single form of annotation: the model is trained using either accurate yet costly bounding boxes or cheaper but less expressive image-level tags.

Object object-detection +1

359

Superpixel Sampling Networks

2 code implementations • ECCV 2018 • Varun Jampani, Deqing Sun, Ming-Yu Liu, Ming-Hsuan Yang, Jan Kautz

Superpixels provide an efficient low/mid-level representation of image data, which greatly reduces the number of image primitives for subsequent vision tasks.

Segmentation Superpixels

343

Geometry-Aware Learning of Maps for Camera Localization

1 code implementation • CVPR 2018 • Samarth Brahmbhatt, Jinwei Gu, Kihwan Kim, James Hays, Jan Kautz

Maps are a key component in image-based camera localization and visual SLAM systems: they are used to establish geometric constraints between images, correct drift in relative pose estimation, and relocalize cameras after lost tracking.

Ranked #5 on Visual Localization on Oxford RobotCar Full

Camera Localization Visual Localization

340

GLAMR: Global Occlusion-Aware Human Mesh Recovery with Dynamic Cameras

1 code implementation • CVPR 2022 • Ye Yuan, Umar Iqbal, Pavlo Molchanov, Kris Kitani, Jan Kautz

Since the joint reconstruction of human motions and camera poses is underconstrained, we propose a global trajectory predictor that generates global human trajectories based on local body movements.

Ranked #1 on Global 3D Human Pose Estimation on EMDB

Global 3D Human Pose Estimation Human Mesh Recovery

340

Score-based Generative Modeling in Latent Space

1 code implementation • NeurIPS 2021 • Arash Vahdat, Karsten Kreis, Jan Kautz

Moving from data to latent space allows us to train more expressive generative models, apply SGMs to non-continuous data, and learn smoother SGMs in a smaller space, resulting in fewer network evaluations and faster sampling.

Ranked #3 on Image Generation on CIFAR-10 (FD metric)

Image Generation

336

See through Gradients: Image Batch Recovery via GradInversion

2 code implementations • CVPR 2021 • Hongxu Yin, Arun Mallya, Arash Vahdat, Jose M. Alvarez, Jan Kautz, Pavlo Molchanov

In this work, we introduce GradInversion, with which input images from a larger batch (8-48 images) can also be recovered for large networks such as ResNets (50 layers), on complex datasets such as ImageNet (1000 classes, 224x224 px).

Federated Learning Inference Attack +1

323

DiffiT: Diffusion Vision Transformers for Image Generation

1 code implementation • 4 Dec 2023 • Ali Hatamizadeh, Jiaming Song, Guilin Liu, Jan Kautz, Arash Vahdat

In this paper, we study the effectiveness of ViTs in diffusion-based generative learning and propose a new model denoted as Diffusion Vision Transformers (DiffiT).

Ranked #4 on Image Generation on ImageNet 256x256

Denoising Image Generation

318

FreeSOLO: Learning to Segment Objects without Annotations

1 code implementation • CVPR 2022 • Xinlong Wang, Zhiding Yu, Shalini De Mello, Jan Kautz, Anima Anandkumar, Chunhua Shen, Jose M. Alvarez

FreeSOLO further demonstrates superiority as a strong pre-training method, outperforming state-of-the-art self-supervised pre-training methods by +9.8% AP when fine-tuning instance segmentation with only 5% COCO masks.

Instance Segmentation object-detection +4

309

Few-Shot Adaptive Gaze Estimation

1 code implementation • ICCV 2019 • Seonwook Park, Shalini De Mello, Pavlo Molchanov, Umar Iqbal, Otmar Hilliges, Jan Kautz

Inter-personal anatomical differences limit the accuracy of person-independent gaze estimation networks.

Ranked #1 on Gaze Estimation on MPII Gaze (using extra training data)

Gaze Estimation Meta-Learning

306

Importance Estimation for Neural Network Pruning

3 code implementations • CVPR 2019 • Pavlo Molchanov, Arun Mallya, Stephen Tyree, Iuri Frosio, Jan Kautz

On ResNet-101, we achieve a 40% FLOPS reduction by removing 30% of the parameters, with a loss of 0.02% in the top-1 accuracy on ImageNet.

Network Pruning

303

Neural RGB->D Sensing: Depth and Uncertainty from a Video Camera

1 code implementation • 9 Jan 2019 • Chao Liu, Jinwei Gu, Kihwan Kim, Srinivasa Narasimhan, Jan Kautz

Depth sensing is crucial for 3D reconstruction and scene understanding.

3D Reconstruction 3D Scene Reconstruction +1

297

SPLATNet: Sparse Lattice Networks for Point Cloud Processing

2 code implementations • CVPR 2018 • Hang Su, Varun Jampani, Deqing Sun, Subhransu Maji, Evangelos Kalogerakis, Ming-Hsuan Yang, Jan Kautz

We present a network architecture for processing point clouds that directly operates on a collection of points represented as a sparse set of samples in a high-dimensional lattice.

Ranked #29 on Semantic Segmentation on ScanNet

3D Part Segmentation 3D Semantic Segmentation

266

STEP: Spatio-Temporal Progressive Learning for Video Action Detection

1 code implementation • CVPR 2019 • Xitong Yang, Xiaodong Yang, Ming-Yu Liu, Fanyi Xiao, Larry Davis, Jan Kautz

In this paper, we propose the Spatio-TEmporal Progressive (STEP) action detector, a progressive learning framework for spatio-temporal action detection in videos.

Ranked #7 on Action Detection on UCF101-24

Action Detection Action Recognition

245

Self-supervised Single-view 3D Reconstruction via Semantic Consistency

1 code implementation • ECCV 2020 • Xueting Li, Sifei Liu, Kihwan Kim, Shalini De Mello, Varun Jampani, Ming-Hsuan Yang, Jan Kautz

To the best of our knowledge, we are the first to try and solve the single-view reconstruction problem without a category-specific template mesh or semantic keypoints.

3D Reconstruction Object +1

226

SCOPS: Self-Supervised Co-Part Segmentation

1 code implementation • CVPR 2019 • Wei-Chih Hung, Varun Jampani, Sifei Liu, Pavlo Molchanov, Ming-Hsuan Yang, Jan Kautz

Parts provide a good intermediate representation of objects that is robust with respect to the camera, pose and appearance variations.

Ranked #4 on Unsupervised Keypoint Estimation on CUB

Object Segmentation +4

218

Self-Supervised Viewpoint Learning From Image Collections

2 code implementations • CVPR 2020 • Siva Karthik Mustikovela, Varun Jampani, Shalini De Mello, Sifei Liu, Umar Iqbal, Carsten Rother, Jan Kautz

Training deep neural networks to estimate the viewpoint of objects requires large labeled training datasets.

Object Viewpoint Estimation

214

Switchable Temporal Propagation Network

1 code implementation • ECCV 2018 • Sifei Liu, Guangyu Zhong, Shalini De Mello, Jinwei Gu, Varun Jampani, Ming-Hsuan Yang, Jan Kautz

Our approach is based on a temporal propagation network (TPN), which models the transition-related affinity between a pair of frames in a purely data-driven manner.

Video Compression

176

Joint-task Self-supervised Learning for Temporal Correspondence

2 code implementations • NeurIPS 2019 • Xueting Li, Sifei Liu, Shalini De Mello, Xiaolong Wang, Jan Kautz, Ming-Hsuan Yang

Our learning process integrates two highly related tasks: tracking large image regions and establishing fine-grained pixel-level associations between consecutive video frames.

Ranked #73 on Semi-Supervised Video Object Segmentation on DAVIS 2017 (val)

Object Tracking Self-Supervised Learning +2

176

Bi3D: Stereo Depth Estimation via Binary Classifications

1 code implementation • CVPR 2020 • Abhishek Badki, Alejandro Troccoli, Kihwan Kim, Jan Kautz, Pradeep Sen, Orazio Gallo

Given a strict time budget, Bi3D can detect objects closer than a given distance in as little as a few milliseconds, or estimate depth with arbitrarily coarse quantization, with complexity linear with the number of quantization levels.

Autonomous Navigation Quantization +1

158

Learning Rigidity in Dynamic Scenes with a Moving Camera for 3D Motion Field Estimation

1 code implementation • ECCV 2018 • Zhaoyang Lv, Kihwan Kim, Alejandro Troccoli, Deqing Sun, James M. Rehg, Jan Kautz

Estimation of 3D motion in a dynamic scene from a temporal pair of images is a core task in many scene understanding problems.

Optical Flow Estimation Scene Flow Estimation +1

146

Pruning Convolutional Neural Networks for Resource Efficient Inference

9 code implementations • 19 Nov 2016 • Pavlo Molchanov, Stephen Tyree, Tero Karras, Timo Aila, Jan Kautz

We propose a new criterion based on Taylor expansion that approximates the change in the cost function induced by pruning network parameters.

Transfer Learning

141

DexYCB: A Benchmark for Capturing Hand Grasping of Objects

2 code implementations • CVPR 2021 • Yu-Wei Chao, Wei Yang, Yu Xiang, Pavlo Molchanov, Ankur Handa, Jonathan Tremblay, Yashraj S. Narang, Karl Van Wyk, Umar Iqbal, Stan Birchfield, Jan Kautz, Dieter Fox

We introduce DexYCB, a new dataset for capturing hand grasping of objects.

3D Hand Pose Estimation 6D Pose Estimation using RGB +2

141

AdaViT: Adaptive Tokens for Efficient Vision Transformer

1 code implementation • CVPR 2022 • Hongxu Yin, Arash Vahdat, Jose Alvarez, Arun Mallya, Jan Kautz, Pavlo Molchanov

A-ViT achieves this by automatically reducing the number of tokens in vision transformers that are processed in the network as inference proceeds.

Ranked #34 on Efficient ViTs on ImageNet-1K (with DeiT-S)

Efficient ViTs Token Reduction

132

Simultaneous Edge Alignment and Learning

3 code implementations • ECCV 2018 • Zhiding Yu, Weiyang Liu, Yang Zou, Chen Feng, Srikumar Ramalingam, B. V. K. Vijaya Kumar, Jan Kautz

Edge detection is among the most fundamental vision problems for its role in perceptual grouping and its wide applications.

Edge Detection Representation Learning

126

Convolutional Tensor-Train LSTM for Spatio-temporal Learning

2 code implementations • NeurIPS 2020 • Jiahao Su, Wonmin Byeon, Jean Kossaifi, Furong Huang, Jan Kautz, Animashree Anandkumar

Learning from spatio-temporal data has numerous applications such as human-behavior analysis, object tracking, video compression, and physics simulation. However, existing methods still perform poorly on challenging video tasks such as long-term forecasting.

Ranked #1 on Video Prediction on KTH (Cond metric)

Activity Recognition Video Compression +1

121

Unsupervised Video Interpolation Using Cycle Consistency

1 code implementation • ICCV 2019 • Fitsum A. Reda, Deqing Sun, Aysegul Dundar, Mohammad Shoeybi, Guilin Liu, Kevin J. Shih, Andrew Tao, Jan Kautz, Bryan Catanzaro

We further introduce a pseudo supervised loss term that enforces the interpolated frames to be consistent with predictions of a pre-trained interpolation model.

Ranked #1 on Video Frame Interpolation on UCF101 (PSNR (sRGB) metric)

Video Frame Interpolation

107

LITA: Language Instructed Temporal-Localization Assistant

1 code implementation • 27 Mar 2024 • De-An Huang, Shijia Liao, Subhashree Radhakrishnan, Hongxu Yin, Pavlo Molchanov, Zhiding Yu, Jan Kautz

In addition to leveraging existing video datasets with timestamps, we propose a new task, Reasoning Temporal Localization (RTL), along with the dataset, ActivityNet-RTL, for learning and evaluating this task.

Ranked #4 on Video-based Generative Performance Benchmarking on VideoInstruct

Instruction Following Temporal Localization +2

103

Is Ego Status All You Need for Open-Loop End-to-End Autonomous Driving?

1 code implementation • 5 Dec 2023 • Zhiqi Li, Zhiding Yu, Shiyi Lan, Jiahan Li, Jan Kautz, Tong Lu, Jose M. Alvarez

We initially observed that the nuScenes dataset, characterized by relatively simple driving scenarios, leads to an under-utilization of perception information in end-to-end models incorporating ego status, such as the ego vehicle's velocity.

Autonomous Driving

Context-Aware Synthesis and Placement of Object Instances

2 code implementations • NeurIPS 2018 • Donghoon Lee, Sifei Liu, Jinwei Gu, Ming-Yu Liu, Ming-Hsuan Yang, Jan Kautz

Learning to insert an object instance into an image in a semantically coherent manner is a challenging and interesting problem.

Object Scene Parsing

Joint Disentangling and Adaptation for Cross-Domain Person Re-Identification

1 code implementation • ECCV 2020 • Yang Zou, Xiaodong Yang, Zhiding Yu, B. V. K. Vijaya Kumar, Jan Kautz

To this end, we propose a joint learning framework that disentangles id-related/unrelated features and enforces adaptation to work on the id-related feature space exclusively.

Ranked #6 on Unsupervised Domain Adaptation on Market to MSMT

Person Re-Identification Unsupervised Domain Adaptation

Two-shot Spatially-varying BRDF and Shape Estimation

1 code implementation • CVPR 2020 • Mark Boss, Varun Jampani, Kihwan Kim, Hendrik P. A. Lensch, Jan Kautz

Extensive experiments on both synthetic and real-world datasets show that our network trained on a synthetic dataset can generalize well to real-world images.

Vocal Bursts Valence Prediction

SENSE: a Shared Encoder Network for Scene-flow Estimation

1 code implementation • ICCV 2019 • Huaizu Jiang, Deqing Sun, Varun Jampani, Zhaoyang Lv, Erik Learned-Miller, Jan Kautz

We introduce a compact network for holistic scene flow estimation, called SENSE, which shares common encoder features among four closely-related tasks: optical flow estimation, disparity estimation from stereo, occlusion estimation, and semantic segmentation.

Disparity Estimation Occlusion Estimation +3

Extreme View Synthesis

1 code implementation • ICCV 2019 • Inchang Choi, Orazio Gallo, Alejandro Troccoli, Min H. Kim, Jan Kautz

We present Extreme View Synthesis, a solution for novel view extrapolation that works even when the number of input images is small, as few as two.

AM-RADIO: Agglomerative Vision Foundation Model -- Reduce All Domains Into One

1 code implementation • 10 Dec 2023 • Mike Ranzinger, Greg Heinrich, Jan Kautz, Pavlo Molchanov

A handful of visual foundation models (VFMs) have recently emerged as the backbones for numerous downstream tasks.

Benchmarking object-detection +2

Contrastive Learning for Weakly Supervised Phrase Grounding

1 code implementation • ECCV 2020 • Tanmay Gupta, Arash Vahdat, Gal Chechik, Xiaodong Yang, Jan Kautz, Derek Hoiem

Given pairs of images and captions, we maximize compatibility of the attention-weighted regions and the words in the corresponding caption, compared to non-corresponding pairs of images and captions.

Contrastive Learning Language Modelling +1

UNAS: Differentiable Architecture Search Meets Reinforcement Learning

1 code implementation • CVPR 2020 • Arash Vahdat, Arun Mallya, Ming-Yu Liu, Jan Kautz

Our framework brings the best of both worlds, and it enables us to search for architectures with both differentiable and non-differentiable criteria in one unified framework while maintaining a low search cost.

Neural Architecture Search reinforcement-learning +1

VAEBM: A Symbiosis between Variational Autoencoders and Energy-based Models

1 code implementation • ICLR 2021 • Zhisheng Xiao, Karsten Kreis, Jan Kautz, Arash Vahdat

VAEBM captures the overall mode structure of the data distribution using a state-of-the-art VAE and it relies on its EBM component to explicitly exclude non-data-like regions from the model and refine the image samples.

Ranked #1 on Image Generation on Stacked MNIST

Image Generation Out-of-Distribution Detection

Binary TTC: A Temporal Geofence for Autonomous Navigation

1 code implementation • CVPR 2021 • Abhishek Badki, Orazio Gallo, Jan Kautz, Pradeep Sen

Time-to-contact (TTC), the time for an object to collide with the observer's plane, is a powerful tool for path planning: it is potentially more informative than the depth, velocity, and acceleration of objects in the scene, even for humans.

Autonomous Navigation Quantization

A Variational Perspective on Solving Inverse Problems with Diffusion Models

1 code implementation • 7 May 2023 • Morteza Mardani, Jiaming Song, Jan Kautz, Arash Vahdat

To cope with this challenge, we propose a variational approach that by design seeks to approximate the true posterior distribution.

Denoising Image Restoration +1

Meshlet Priors for 3D Mesh Reconstruction

1 code implementation • CVPR 2020 • Abhishek Badki, Orazio Gallo, Jan Kautz, Pradeep Sen

Meshlets act as a dictionary of local features and thus allow learned priors to be used to reconstruct object meshes in any pose and from unseen classes, even when the noise is large and the samples are sparse.

Object

Recurrence without Recurrence: Stable Video Landmark Detection with Deep Equilibrium Models

1 code implementation • CVPR 2023 • Paul Micaelli, Arash Vahdat, Hongxu Yin, Jan Kautz, Pavlo Molchanov

Our Landmark DEQ (LDEQ) achieves state-of-the-art performance on the challenging WFLW facial landmark dataset, reaching $3.92$ NME with fewer parameters and a training memory cost of $\mathcal{O}(1)$ in the number of recurrent modules.

Ranked #2 on Face Alignment on WFLW

Face Alignment

Paper
Code

Weakly-Supervised Physically Unconstrained Gaze Estimation

1 code implementation • CVPR 2021 • Rakshit Kothari, Shalini De Mello, Umar Iqbal, Wonmin Byeon, Seonwook Park, Jan Kautz

A major challenge for physically unconstrained gaze estimation is acquiring training data with 3D gaze annotations for in-the-wild and outdoor scenarios.

Ranked #3 on Gaze Estimation on Gaze360

Domain Generalization Gaze Estimation

Paper
Code

The Best Defense is a Good Offense: Adversarial Augmentation against Adversarial Attacks

1 code implementation • CVPR 2023 • Iuri Frosio, Jan Kautz

Many defenses against adversarial attacks (e.g., robust classifiers, randomization, or image purification) use countermeasures put to work only after the attack has been crafted.

Paper
Code

CoordGAN: Self-Supervised Dense Correspondences Emerge from GANs

1 code implementation • CVPR 2022 • Jiteng Mu, Shalini De Mello, Zhiding Yu, Nuno Vasconcelos, Xiaolong Wang, Jan Kautz, Sifei Liu

We represent the correspondence maps of different images as warped coordinate frames transformed from a canonical coordinate frame, i.e., the correspondence map, which describes the structure (e.g., the shape of a face), is controlled via a transformation.

Disentanglement

Paper
Code

Discovering Nonlinear Relations with Minimum Predictive Information Regularization

1 code implementation • 7 Jan 2020 • Tailin Wu, Thomas Breuel, Michael Skuhersky, Jan Kautz

Identifying the underlying directional relations from observational time series with nonlinear interactions and complex relational structures is key to a wide range of applications, yet remains a hard problem.

Time Series Time Series Analysis

Paper
Code

SMRD: SURE-based Robust MRI Reconstruction with Diffusion Models

2 code implementations • 3 Oct 2023 • Batu Ozturkler, Chao Liu, Benjamin Eckart, Morteza Mardani, Jiaming Song, Jan Kautz

However, diffusion models require careful tuning of inference hyperparameters on a validation set and are still sensitive to distribution shifts during testing.

MRI Reconstruction

Paper
Code

ViR: Towards Efficient Vision Retention Backbones

1 code implementation • 30 Oct 2023 • Ali Hatamizadeh, Michael Ranzinger, Shiyi Lan, Jose M. Alvarez, Sanja Fidler, Jan Kautz

Inspired by this trend, we propose a new class of computer vision models, dubbed Vision Retention Networks (ViR), with dual parallel and recurrent formulations, which strike an optimal balance between fast inference and parallel training with competitive performance.

Paper
Code

Zero-shot Pose Transfer for Unrigged Stylized 3D Characters

1 code implementation • CVPR 2023 • Jiashun Wang, Xueting Li, Sifei Liu, Shalini De Mello, Orazio Gallo, Xiaolong Wang, Jan Kautz

We present a zero-shot approach that requires only the widely available deformed non-stylized avatars in training, and deforms stylized characters of significantly different shapes at inference.

Pose Transfer

Paper
Code

Global Vision Transformer Pruning with Hessian-Aware Saliency

1 code implementation • CVPR 2023 • Huanrui Yang, Hongxu Yin, Maying Shen, Pavlo Molchanov, Hai Li, Jan Kautz

This work aims to challenge the common design philosophy of the Vision Transformer (ViT) model with uniform dimension across all the stacked blocks in a model stage, where we redistribute the parameters both across transformer blocks and between different structures within the block via the first systematic attempt at global structural pruning.

Efficient ViTs Philosophy

Paper
Code

Improving Landmark Localization with Semi-Supervised Learning

no code implementations • CVPR 2018 • Sina Honari, Pavlo Molchanov, Stephen Tyree, Pascal Vincent, Christopher Pal, Jan Kautz

First, we propose the framework of sequential multitasking and explore it here through an architecture for landmark localization where training with class labels acts as an auxiliary signal to guide the landmark localization on unlabeled data.

Ranked #41 on Face Alignment on 300W

Face Alignment Small Data Image Classification

Paper
Add Code

Budget-Aware Activity Detection with A Recurrent Policy Network

no code implementations • 30 Nov 2017 • Behrooz Mahasseni, Xiaodong Yang, Pavlo Molchanov, Jan Kautz

In this paper, we address the challenging problem of efficient temporal activity detection in untrimmed long videos.

Action Detection Activity Detection

Paper
Add Code

IamNN: Iterative and Adaptive Mobile Neural Network for Efficient Image Classification

no code implementations • 26 Apr 2018 • Sam Leroux, Pavlo Molchanov, Pieter Simoens, Bart Dhoedt, Thomas Breuel, Jan Kautz

Deep residual networks (ResNets) made a recent breakthrough in deep learning.

General Classification Image Classification

Paper
Add Code

Hand Pose Estimation via Latent 2.5D Heatmap Regression

no code implementations • ECCV 2018 • Umar Iqbal, Pavlo Molchanov, Thomas Breuel, Juergen Gall, Jan Kautz

Estimating the 3D pose of a hand is an essential part of human-computer interaction.

3D Hand Pose Estimation regression

Paper
Add Code

Light-weight Head Pose Invariant Gaze Tracking

no code implementations • 23 Apr 2018 • Rajeev Ranjan, Shalini De Mello, Jan Kautz

Unconstrained remote gaze tracking using off-the-shelf cameras is a challenging problem.

Gaze Estimation Transfer Learning +1

Paper
Add Code

A Lightweight Approach for On-the-Fly Reflectance Estimation

no code implementations • ICCV 2017 • Kihwan Kim, Jinwei Gu, Stephen Tyree, Pavlo Molchanov, Matthias Nießner, Jan Kautz

In addition, we have created a large synthetic dataset, SynBRDF, which comprises a total of $500$K RGBD images rendered with a physically-based ray tracer under a variety of natural illumination, covering $5000$ materials and $5000$ shapes.

Color Constancy

Paper
Add Code

Deep Semantic Face Deblurring

no code implementations • CVPR 2018 • Ziyi Shen, Wei-Sheng Lai, Tingfa Xu, Jan Kautz, Ming-Hsuan Yang

In this paper, we present an effective and efficient face deblurring algorithm by exploiting semantic cues via deep convolutional neural networks (CNNs).

Deblurring Face Recognition

Paper
Add Code

Reblur2Deblur: Deblurring Videos via Self-Supervised Learning

no code implementations • 16 Jan 2018 • Huaijin Chen, Jinwei Gu, Orazio Gallo, Ming-Yu Liu, Ashok Veeraraghavan, Jan Kautz

Motion blur is a fundamental problem in computer vision as it impacts image quality and hinders inference.

Deblurring Optical Flow Estimation +1

Paper
Add Code

Learning Adaptive Parameter Tuning for Image Processing

no code implementations • 28 Oct 2016 • Jingming Dong, Iuri Frosio, Jan Kautz

The non-stationary nature of image characteristics calls for adaptive processing, based on the local image content.

Deblurring Demosaicking +1

Paper
Add Code

Learning Binary Residual Representations for Domain-specific Video Streaming

no code implementations • 14 Dec 2017 • Yi-Hsuan Tsai, Ming-Yu Liu, Deqing Sun, Ming-Hsuan Yang, Jan Kautz

Specifically, we target a streaming setting where the videos to be streamed from a server to a client are all in the same domain and they have to be compressed to a small size for low-latency transmission.

Video Compression

Paper
Add Code

Separating Reflection and Transmission Images in the Wild

no code implementations • ECCV 2018 • Patrick Wieschollek, Orazio Gallo, Jinwei Gu, Jan Kautz

The reflections caused by common semi-reflectors, such as glass windows, can impact the performance of computer vision algorithms.

Synthetic Data Generation

Paper
Add Code

On Nearest Neighbors in Non Local Means Denoising

no code implementations • 20 Nov 2017 • Iuri Frosio, Jan Kautz

To denoise a reference patch, the Non-Local-Means denoising filter processes a set of neighbor patches.

Denoising

Paper
Add Code

Cascaded Scene Flow Prediction using Semantic Segmentation

no code implementations • 26 Jul 2017 • Zhile Ren, Deqing Sun, Jan Kautz, Erik B. Sudderth

Given two consecutive frames from a pair of stereo cameras, 3D scene flow methods simultaneously estimate the 3D geometry and motion of the observed scene.

Autonomous Driving General Classification +3

Paper
Add Code

Multiframe Scene Flow with Piecewise Rigid Motion

no code implementations • 5 Oct 2017 • Vladislav Golyanik, Kihwan Kim, Robert Maier, Matthias Nießner, Didier Stricker, Jan Kautz

We introduce a novel multiframe scene flow approach that jointly optimizes the consistency of the patch appearances and their local rigid motions from RGB-D image sequences.

Scene Flow Estimation

Paper
Add Code

Learning Affinity via Spatial Propagation Networks

no code implementations • NeurIPS 2017 • Sifei Liu, Shalini De Mello, Jinwei Gu, Guangyu Zhong, Ming-Hsuan Yang, Jan Kautz

Specifically, we develop a three-way connection for the linear propagation model, which (a) formulates a sparse transformation matrix, where all elements can be the output from a deep CNN, but (b) results in a dense affinity matrix that effectively models any task-specific pairwise similarity matrix.

Colorization Face Parsing +4

Paper
Add Code

Learning to Segment Instances in Videos with Spatial Propagation Network

no code implementations • 14 Sep 2017 • Jingchun Cheng, Sifei Liu, Yi-Hsuan Tsai, Wei-Chih Hung, Shalini De Mello, Jinwei Gu, Jan Kautz, Shengjin Wang, Ming-Hsuan Yang

In addition, we apply a filter on the refined score map that aims to recognize the best connected region using spatial and temporal consistencies in the video.

Object Segmentation +1

Paper
Add Code

Deep Learning with Energy-efficient Binary Gradient Cameras

no code implementations • 3 Dec 2016 • Suren Jayasuriya, Orazio Gallo, Jinwei Gu, Jan Kautz

Power consumption is a critical factor for the deployment of embedded computer vision systems.

Face Detection Gesture Recognition +1

Paper
Add Code

Locally Non-rigid Registration for Mobile HDR Photography

no code implementations • 7 Apr 2015 • Orazio Gallo, Alejandro Troccoli, Jun Hu, Kari Pulli, Jan Kautz

Image registration for stack-based HDR photography is challenging.

Image Registration

Paper
Add Code

Hierarchical Subquery Evaluation for Active Learning on a Graph

no code implementations • CVPR 2014 • Oisin Mac Aodha, Neill D. F. Campbell, Jan Kautz, Gabriel J. Brostow

Under some specific circumstances, Expected Error Reduction has been one of the strongest-performing informativeness criteria for active learning.

Active Learning graph construction +1

Paper
Add Code

Domain Stylization: A Strong, Simple Baseline for Synthetic to Real Image Domain Adaptation

no code implementations • 24 Jul 2018 • Aysegul Dundar, Ming-Yu Liu, Ting-Chun Wang, John Zedlewski, Jan Kautz

Deep neural networks have largely failed to effectively utilize synthetic data when applied to real images due to the covariate shift problem.

Domain Adaptation object-detection +5

Paper
Add Code

Tackling 3D ToF Artifacts Through Learning and the FLAT Dataset

no code implementations • ECCV 2018 • Qi Guo, Iuri Frosio, Orazio Gallo, Todd Zickler, Jan Kautz

Scene motion, multiple reflections, and sensor noise introduce artifacts in the depth reconstruction performed by time-of-flight cameras.

Paper
Add Code

EOE: Expected Overlap Estimation over Unstructured Point Cloud Data

no code implementations • 6 Aug 2018 • Ben Eckart, Kihwan Kim, Jan Kautz

We present an iterative overlap estimation technique to augment existing point cloud registration algorithms that can achieve high performance in difficult real-world situations where large pose displacement and non-overlapping geometry would otherwise cause traditional methods to fail.

Point Cloud Registration

Paper
Add Code

Learning Superpixels With Segmentation-Aware Affinity Loss

no code implementations • CVPR 2018 • Wei-Chih Tu, Ming-Yu Liu, Varun Jampani, Deqing Sun, Shao-Yi Chien, Ming-Hsuan Yang, Jan Kautz

Specifically, we propose a new loss function that takes the segmentation error into account for affinity learning.

Segmentation Superpixels

Paper
Add Code

Making Convolutional Networks Recurrent for Visual Sequence Learning

no code implementations • CVPR 2018 • Xiaodong Yang, Pavlo Molchanov, Jan Kautz

Recurrent neural networks (RNNs) have emerged as a powerful model for a broad range of machine learning problems that involve sequential data.

Action Recognition Face Alignment +6

Paper
Add Code

Neural Causal Discovery with Learnable Input Noise

no code implementations • ICLR 2019 • Tailin Wu, Thomas Breuel, Jan Kautz

Learning causal relations from observational time series with nonlinear interactions and complex causal structures is a key component of human intelligence, and has a wide range of applications.

Causal Discovery EEG +2

Paper
Add Code

Neural Inverse Rendering of an Indoor Scene from a Single Image

no code implementations • ICCV 2019 • Soumyadip Sengupta, Jinwei Gu, Kihwan Kim, Guilin Liu, David W. Jacobs, Jan Kautz

Inverse rendering aims to estimate physical attributes of a scene, e.g., reflectance, geometry, and lighting, from image(s).

Inverse Rendering Self-Supervised Learning

Paper
Add Code

NRMVS: Non-Rigid Multi-View Stereo

no code implementations • 12 Jan 2019 • Matthias Innmann, Kihwan Kim, Jinwei Gu, Matthias Niessner, Charles Loop, Marc Stamminger, Jan Kautz

We show that creating a dense 4D structure from a few RGB images with non-rigid changes is possible, and demonstrate that our method can be used to interpolate novel deformed scenes from various combinations of these deformation estimates derived from the sparse views.

3D Reconstruction Depth Estimation

Paper
Add Code

Fully-Connected CRFs with Non-Parametric Pairwise Potential

no code implementations • CVPR 2013 • Neill D. F. Campbell, Kartic Subr, Jan Kautz

Conditional Random Fields (CRFs) are used for diverse tasks, ranging from image denoising to object recognition.

Density Estimation Image Denoising +1

Paper
Add Code

Modeling Object Appearance Using Context-Conditioned Component Analysis

no code implementations • CVPR 2015 • Daniyar Turmukhambetov, Neill D. F. Campbell, Simon J. D. Prince, Jan Kautz

In this work we remove the image space alignment limitations of existing subspace models by conditioning the models on a shape dependent context that allows for the complex, non-linear structure of the appearance of the visual object to be captured and shared.

Object

Paper
Add Code

Online Detection and Classification of Dynamic Hand Gestures With Recurrent 3D Convolutional Neural Network

no code implementations • CVPR 2016 • Pavlo Molchanov, Xiaodong Yang, Shalini Gupta, Kihwan Kim, Stephen Tyree, Jan Kautz

Automatic detection and classification of dynamic hand gestures in real-world systems intended for human computer interaction is challenging as: 1) there is a large diversity in how people perform gestures, making detection and classification difficult; 2) the system must work online in order to avoid noticeable lag between performing a gesture and its classification; in fact, a negative lag (classification before the gesture is finished) is desirable, as feedback to the user can then be truly instantaneous.

Classification General Classification +1

Paper
Add Code

Accelerated Generative Models for 3D Point Cloud Data

no code implementations • CVPR 2016 • Benjamin Eckart, Kihwan Kim, Alejandro Troccoli, Alonzo Kelly, Jan Kautz

In this paper we introduce a method for constructing compact generative representations of PCD at multiple levels of detail.

Paper
Add Code

Dynamic Facial Analysis: From Bayesian Filtering to Recurrent Neural Network

no code implementations • CVPR 2017 • Jinwei Gu, Xiaodong Yang, Shalini De Mello, Jan Kautz

We are inspired by the fact that the computation performed in an RNN bears resemblance to Bayesian filters, which have been used for tracking in many previous methods for facial analysis from videos.

Ranked #1 on Head Pose Estimation on BIWI (MAE (trained with BIWI data) metric, using extra training data)

Face Alignment Feature Engineering +3

Paper
Add Code

Polarimetric Multi-View Stereo

no code implementations • CVPR 2017 • Zhaopeng Cui, Jinwei Gu, Boxin Shi, Ping Tan, Jan Kautz

Multi-view stereo relies on feature correspondences for 3D reconstruction, and thus is fundamentally flawed in dealing with featureless scenes.

3D Reconstruction

Paper
Add Code

Robust Model-Based 3D Head Pose Estimation

no code implementations • ICCV 2015 • Gregory P. Meyer, Shalini Gupta, Iuri Frosio, Dikpal Reddy, Jan Kautz

We introduce a method for accurate three dimensional head pose estimation using a commodity depth camera.

Face Model Head Pose Estimation

Paper
Add Code

Putting Humans in a Scene: Learning Affordance in 3D Indoor Environments

no code implementations • CVPR 2019 • Xueting Li, Sifei Liu, Kihwan Kim, Xiaolong Wang, Ming-Hsuan Yang, Jan Kautz

In order to predict valid affordances and learn possible 3D human poses in indoor scenes, we need to understand the semantic and geometric structure of a scene as well as its potential interactions with a human.

valid

Paper
Add Code

Towards annotation-efficient segmentation via image-to-image translation

no code implementations • 2 Apr 2019 • Eugene Vorontsov, Pavlo Molchanov, Christopher Beckham, Jan Kautz, Samuel Kadoury

Specifically, we propose a semi-supervised framework that employs unpaired image-to-image translation between two domains, presence vs. absence of cancer, as the unsupervised objective.

Brain Tumor Segmentation Image-to-Image Translation +3

Paper
Add Code

Few-Shot Viewpoint Estimation

no code implementations • 13 May 2019 • Hung-Yu Tseng, Shalini De Mello, Jonathan Tremblay, Sifei Liu, Stan Birchfield, Ming-Hsuan Yang, Jan Kautz

Through extensive experimentation on the ObjectNet3D and Pascal3D+ benchmark datasets, we demonstrate that our framework, which we call MetaView, significantly outperforms fine-tuning the state-of-the-art models with few examples, and that the specific architectural innovations of our method are crucial to achieving good performance.

Meta-Learning Viewpoint Estimation

Paper
Add Code

Video Stitching for Linear Camera Arrays

no code implementations • 31 Jul 2019 • Wei-Sheng Lai, Orazio Gallo, Jinwei Gu, Deqing Sun, Ming-Hsuan Yang, Jan Kautz

Despite the long history of image and video stitching research, existing academic and commercial solutions still produce strong artifacts.

Autonomous Driving Spatial Interpolation

Paper
Add Code

Learning Propagation for Arbitrarily-structured Data

no code implementations • ICCV 2019 • Sifei Liu, Xueting Li, Varun Jampani, Shalini De Mello, Jan Kautz

We experiment with semantic segmentation networks, where we use our propagation module to jointly train on different data -- images, superpixels and point clouds.

Point Cloud Segmentation Segmentation +2

Paper
Add Code

Angular Visual Hardness

no code implementations • ICML 2020 • Beidi Chen, Weiyang Liu, Zhiding Yu, Jan Kautz, Anshumali Shrivastava, Animesh Garg, Anima Anandkumar

We also find that AVH has a statistically significant correlation with human visual hardness.

Domain Generalization

Paper
Add Code

Exploiting Semantics for Face Image Deblurring

no code implementations • 19 Jan 2020 • Ziyi Shen, Wei-Sheng Lai, Tingfa Xu, Jan Kautz, Ming-Hsuan Yang

Specifically, we first use a coarse deblurring network to reduce the motion blur on the input face image.

Deblurring Face Recognition +1

Paper
Add Code

Learning to Generate Multiple Style Transfer Outputs for an Input Sentence

no code implementations • WS 2020 • Kevin Lin, Ming-Yu Liu, Ming-Ting Sun, Jan Kautz

Specifically, we decompose the latent representation of the input sentence to a style code that captures the language style variation and a content code that encodes the language style-independent content.

Sentence Style Transfer +1

Paper
Add Code

Weakly-Supervised 3D Human Pose Learning via Multi-view Images in the Wild

no code implementations • CVPR 2020 • Umar Iqbal, Pavlo Molchanov, Jan Kautz

One major challenge for monocular 3D human pose estimation in-the-wild is the acquisition of training data that contains unconstrained images annotated with accurate 3D poses.

Ranked #1 on Weakly-supervised 3D Human Pose Estimation on MPI-INF-3DHP

Monocular 3D Human Pose Estimation Weakly-supervised 3D Human Pose Estimation +1

Paper
Add Code

Gaze-Sensing LEDs for Head Mounted Displays

no code implementations • 18 Mar 2020 • Kaan Akşit, Jan Kautz, David Luebke

We introduce a new gaze tracker for Head Mounted Displays (HMDs).

Dimensionality Reduction Gaze Estimation +2

Paper
Add Code

Weakly Supervised 3D Hand Pose Estimation via Biomechanical Constraints

no code implementations • ECCV 2020 • Adrian Spurr, Umar Iqbal, Pavlo Molchanov, Otmar Hilliges, Jan Kautz

Estimating 3D hand pose from 2D images is a difficult, inverse problem due to the inherent scale and depth ambiguities.

Ranked #10 on 3D Hand Pose Estimation on DexYCB

3D Hand Pose Estimation Open-Ended Question Answering +1

Paper
Add Code

Novel View Synthesis of Dynamic Scenes with Globally Coherent Depths from a Monocular Camera

no code implementations • CVPR 2020 • Jae Shin Yoon, Kihwan Kim, Orazio Gallo, Hyun Soo Park, Jan Kautz

Our insight is that although its scale and quality are inconsistent with other views, the depth estimation from a single view can be used to reason about the globally coherent geometry of dynamic contents.

Depth Estimation Novel View Synthesis

Paper
Add Code

How to Close Sim-Real Gap? Transfer with Segmentation!

no code implementations • 14 May 2020 • Mengyuan Yan, Qingyun Sun, Iuri Frosio, Stephen Tyree, Jan Kautz

Combining the control policy learned from simulation with the perception model, we achieve an impressive $88\%$ success rate in grasping a tiny sphere with a real robot.

Robotics

Paper
Add Code

Hierarchical Contrastive Motion Learning for Video Action Recognition

no code implementations • 20 Jul 2020 • Xitong Yang, Xiaodong Yang, Sifei Liu, Deqing Sun, Larry Davis, Jan Kautz

Thus, the motion features at higher levels are trained to gradually capture semantic dynamics and become more discriminative for action recognition.

Action Recognition Contrastive Learning +2

Paper
Add Code

Improving Deep Stereo Network Generalization with Geometric Priors

no code implementations • 25 Aug 2020 • Jialiang Wang, Varun Jampani, Deqing Sun, Charles Loop, Stan Birchfield, Jan Kautz

End-to-end deep learning methods have advanced stereo vision in recent years and obtained excellent results when the training and test data are similar.

Paper
Add Code

A Contrastive Learning Approach for Training Variational Autoencoder Priors

no code implementations • NeurIPS 2021 • Jyoti Aneja, Alexander Schwing, Jan Kautz, Arash Vahdat

To tackle this issue, we propose an energy-based prior defined by the product of a base prior distribution and a reweighting factor, designed to bring the base closer to the aggregate posterior.

Ranked #6 on Image Generation on CelebA 256x256 (FID metric)

Contrastive Learning Image Generation

Paper
Add Code
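
The energy-based prior described in the entry above, a product of a base prior and a reweighting factor, can be written out as follows. The symbols $p_{\text{base}}$, $r$, $p_{\text{EBM}}$, and $Z$ are notation assumed here for illustration and are not taken from the listing.

```latex
p_{\text{EBM}}(z) \;=\; \frac{1}{Z}\, r(z)\, p_{\text{base}}(z),
\qquad
Z \;=\; \int r(z)\, p_{\text{base}}(z)\, \mathrm{d}z
```

Under this reading, $r(z) > 1$ raises the density of latent codes the base prior under-weights, $r(z) < 1$ lowers it, and the normalizer $Z$ keeps the product a valid distribution.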

UFO$^2$: A Unified Framework towards Omni-supervised Object Detection

no code implementations • 21 Oct 2020 • Zhongzheng Ren, Zhiding Yu, Xiaodong Yang, Ming-Yu Liu, Alexander G. Schwing, Jan Kautz

Existing work on object detection often relies on a single form of annotation: the model is trained using either accurate yet costly bounding boxes or cheaper but less expressive image-level tags.

object-detection Object Detection

Paper
Add Code

Displacement-Invariant Cost Computation for Efficient Stereo Matching

no code implementations • 1 Dec 2020 • Yiran Zhong, Charles Loop, Wonmin Byeon, Stan Birchfield, Yuchao Dai, Kaihao Zhang, Alexey Kamenev, Thomas Breuel, Hongdong Li, Jan Kautz

A common way to speed up the computation is to downsample the feature volume, but this loses high-frequency details.

Autonomous Driving Stereo Matching

Paper
Add Code

Online Adaptation for Consistent Mesh Reconstruction in the Wild

no code implementations • NeurIPS 2020 • Xueting Li, Sifei Liu, Shalini De Mello, Kihwan Kim, Xiaolong Wang, Ming-Hsuan Yang, Jan Kautz

This paper presents an algorithm to reconstruct temporally consistent 3D meshes of deformable object instances from videos in the wild.

3D Reconstruction

Paper
Add Code

Parameter Efficient Multimodal Transformers for Video Representation Learning

no code implementations • ICLR 2021 • Sangho Lee, Youngjae Yu, Gunhee Kim, Thomas Breuel, Jan Kautz, Yale Song

The recent success of Transformers in the language domain has motivated adapting it to a multimodal setting, where a new visual model is trained in tandem with an already pretrained language model.

Language Modelling Representation Learning

Paper
Add Code

Neural 3D Clothes Retargeting from a Single Image

no code implementations • 29 Jan 2021 • Jae Shin Yoon, Kihwan Kim, Jan Kautz, Hyun Soo Park

In this paper, we present a method of clothes retargeting: generating the potential poses and deformations of a given 3D clothing template model to fit onto a person in a single RGB image.

Paper
Add Code

Learning Affinity via Spatial Propagation Network

no code implementations • 3 Oct 2017 • Sifei Liu, Shalini De Mello, Jinwei Gu, Guangyu Zhong, Ming-Hsuan Yang, Jan Kautz

Colorization Face Parsing +4

Paper
Add Code

Learning to Track Instances without Video Annotations

no code implementations • CVPR 2021 • Yang Fu, Sifei Liu, Umar Iqbal, Shalini De Mello, Humphrey Shi, Jan Kautz

Tracking segmentation masks of multiple instances has been intensively studied, but still faces two fundamental challenges: 1) the requirement of large-scale, frame-wise annotation, and 2) the complexity of two-stage approaches.

Instance Segmentation Pose Estimation +1

Paper
Add Code

KAMA: 3D Keypoint Aware Body Mesh Articulation

no code implementations • 27 Apr 2021 • Umar Iqbal, Kevin Xie, Yunrong Guo, Jan Kautz, Pavlo Molchanov

We present KAMA, a 3D Keypoint Aware Mesh Articulation approach that allows us to estimate a human body mesh from the positions of 3D body keypoints.

Ranked #47 on 3D Human Pose Estimation on 3DPW

3D Human Pose Estimation 3D Human Shape Estimation +1

Paper
Add Code

Adversarial Motion Modelling helps Semi-supervised Hand Pose Estimation

no code implementations • 10 Jun 2021 • Adrian Spurr, Pavlo Molchanov, Umar Iqbal, Jan Kautz, Otmar Hilliges

Hand pose estimation is difficult due to different environmental conditions, object- and self-occlusion as well as diversity in hand shape and appearance.

Hand Pose Estimation valid

Paper
Add Code

Self-Supervised Learning on 3D Point Clouds by Learning Discrete Generative Models

no code implementations • CVPR 2021 • Benjamin Eckart, Wentao Yuan, Chao Liu, Jan Kautz

In this work, we introduce a general method for 3D self-supervised representation learning that 1) remains agnostic to the underlying neural network architecture, and 2) specifically leverages the geometric nature of 3D point cloud data.

Point Cloud Segmentation Representation Learning +4

Paper
Add Code

Privacy Vulnerability of Split Computing to Data-Free Model Inversion Attacks

no code implementations • 13 Jul 2021 • Xin Dong, Hongxu Yin, Jose M. Alvarez, Jan Kautz, Pavlo Molchanov, H. T. Kung

Prior works usually assume that SC offers privacy benefits as only intermediate features, instead of private data, are shared from devices to the cloud.

Paper
Add Code

LANA: Latency Aware Network Acceleration

no code implementations • 12 Jul 2021 • Pavlo Molchanov, Jimmy Hall, Hongxu Yin, Jan Kautz, Nicolo Fusi, Arash Vahdat

We analyze three popular network architectures: EfficientNetV1, EfficientNetV2 and ResNeST, and achieve accuracy improvement for all models (up to $3.0\%$) when compressing larger models to the latency level of smaller models.

Neural Architecture Search Quantization

Paper
Add Code

Learning Indoor Inverse Rendering with 3D Spatially-Varying Lighting

no code implementations • ICCV 2021 • Zian Wang, Jonah Philion, Sanja Fidler, Jan Kautz

In this paper, we propose a unified, learning-based inverse rendering framework that formulates 3D spatially-varying lighting.

Image-to-Image Translation Inverse Rendering +1

Paper
Add Code

Learning Contrastive Representation for Semantic Correspondence

no code implementations • 22 Sep 2021 • Taihong Xiao, Sifei Liu, Shalini De Mello, Zhiding Yu, Jan Kautz, Ming-Hsuan Yang

Dense correspondence across semantically related images has been extensively studied, but still faces two challenges: 1) large variations in appearance, scale and pose exist even for objects from the same category, and 2) labeling pixel-level dense correspondences is labor intensive and infeasible to scale.

Contrastive Learning Semantic correspondence

Paper
Add Code

Hardware-Aware Network Transformation

no code implementations • 29 Sep 2021 • Pavlo Molchanov, Jimmy Hall, Hongxu Yin, Jan Kautz, Nicolo Fusi, Arash Vahdat

In the second phase, it solves the combinatorial selection of efficient operations using a novel constrained integer linear optimization approach.

Neural Architecture Search

Paper
Add Code

Self-Supervised Object Detection via Generative Image Synthesis

no code implementations • ICCV 2021 • Siva Karthik Mustikovela, Shalini De Mello, Aayush Prakash, Umar Iqbal, Sifei Liu, Thu Nguyen-Phuoc, Carsten Rother, Jan Kautz

We present SSOD, the first end-to-end analysis-by-synthesis framework with controllable GANs for the task of self-supervised object detection.

Image Generation Object +2

Paper
Add Code

Coupled Segmentation and Edge Learning via Dynamic Graph Propagation

no code implementations • NeurIPS 2021 • Zhiding Yu, Rui Huang, Wonmin Byeon, Sifei Liu, Guilin Liu, Thomas Breuel, Anima Anandkumar, Jan Kautz

It is therefore interesting to study how these two tasks can be coupled to benefit each other.

Edge Detection Image Segmentation +2

Paper
Add Code

Convolutional Tensor-Train LSTM for Long-Term Video Prediction

no code implementations • 25 Sep 2019 • Jiahao Su, Wonmin Byeon, Furong Huang, Jan Kautz, Animashree Anandkumar

Long-term video prediction is highly challenging since it entails simultaneously capturing spatial and temporal information across a long range of image frames. Standard recurrent models are ineffective since they are prone to error propagation and cannot effectively capture higher-order correlations.

Video Prediction

Paper
Add Code

Long History Short-Term Memory for Long-Term Video Prediction

no code implementations • 25 Sep 2019 • Wonmin Byeon, Jan Kautz

While video prediction approaches have advanced considerably in recent years, learning to predict the long-term future remains challenging: an ambiguous future or error propagation over time yields blurry predictions.

Video Prediction

Paper
Add Code

NCP-VAE: Variational Autoencoders with Noise Contrastive Priors

no code implementations • 28 Sep 2020 • Jyoti Aneja, Alex Schwing, Jan Kautz, Arash Vahdat

To tackle this issue, we propose an energy-based prior defined by the product of a base prior distribution and a reweighting factor, designed to bring the base closer to the aggregate posterior.

Paper
Add Code

Learning Continuous Environment Fields via Implicit Functions

no code implementations • ICLR 2022 • Xueting Li, Shalini De Mello, Xiaolong Wang, Ming-Hsuan Yang, Jan Kautz, Sifei Liu

We propose a novel scene representation that encodes reaching distance -- the distance between any position in the scene to a goal along a feasible trajectory.

Position Trajectory Prediction

Federated Learning with Heterogeneous Architectures using Graph HyperNetworks

no code implementations • 20 Jan 2022 • Or Litany, Haggai Maron, David Acuna, Jan Kautz, Gal Chechik, Sanja Fidler

Standard Federated Learning (FL) techniques are limited to clients with identical network architectures.

Federated Learning

Do Gradient Inversion Attacks Make Federated Learning Unsafe?

no code implementations • 14 Feb 2022 • Ali Hatamizadeh, Hongxu Yin, Pavlo Molchanov, Andriy Myronenko, Wenqi Li, Prerna Dogra, Andrew Feng, Mona G. Flores, Jan Kautz, Daguang Xu, Holger R. Roth

Federated learning (FL) allows the collaborative training of AI models without needing to share raw data.

Federated Learning Privacy Preserving

Physics Informed RNN-DCT Networks for Time-Dependent Partial Differential Equations

no code implementations • 24 Feb 2022 • Benjamin Wu, Oliver Hennigh, Jan Kautz, Sanjay Choudhry, Wonmin Byeon

This efficiently and flexibly produces a compressed representation which is used for additional conditioning of physics-informed models.

GradViT: Gradient Inversion of Vision Transformers

no code implementations • CVPR 2022 • Ali Hatamizadeh, Hongxu Yin, Holger Roth, Wenqi Li, Jan Kautz, Daguang Xu, Pavlo Molchanov

In this work we demonstrate the vulnerability of vision transformers (ViTs) to gradient-based inversion attacks.

Scheduling

DRaCoN -- Differentiable Rasterization Conditioned Neural Radiance Fields for Articulated Avatars

no code implementations • 29 Mar 2022 • Amit Raj, Umar Iqbal, Koki Nagano, Sameh Khamis, Pavlo Molchanov, James Hays, Jan Kautz

In this work, we present DRaCoN, a framework for learning full-body volumetric avatars that exploits the advantages of both 2D and 3D neural rendering techniques.

Neural Rendering

RTMV: A Ray-Traced Multi-View Synthetic Dataset for Novel View Synthesis

no code implementations • 14 May 2022 • Jonathan Tremblay, Moustafa Meshry, Alex Evans, Jan Kautz, Alexander Keller, Sameh Khamis, Thomas Müller, Charles Loop, Nathan Morrical, Koki Nagano, Towaki Takikawa, Stan Birchfield

We present a large-scale synthetic dataset for novel view synthesis consisting of ~300k images rendered from nearly 2000 complex scenes using high-quality ray tracing at high resolution (1600 x 1600 pixels).

Ranked #1 on Novel View Synthesis on RTMV

Novel View Synthesis

Neural Light Field Estimation for Street Scenes with Differentiable Virtual Object Insertion

no code implementations • 19 Aug 2022 • Zian Wang, Wenzheng Chen, David Acuna, Jan Kautz, Sanja Fidler

In this work, we propose a neural approach that estimates the 5D HDR light field from a single image, and a differentiable object insertion formulation that enables end-to-end training with image-based losses that encourage realism.

Autonomous Driving Lighting Estimation +1

Learning to Relight Portrait Images via a Virtual Light Stage and Synthetic-to-Real Adaptation

no code implementations • 21 Sep 2022 • Yu-Ying Yeh, Koki Nagano, Sameh Khamis, Jan Kautz, Ming-Yu Liu, Ting-Chun Wang

An effective approach is to supervise the training of deep neural networks with a high-fidelity dataset of desired input-output pairs, captured with a light stage.

PhysDiff: Physics-Guided Human Motion Diffusion Model

no code implementations • ICCV 2023 • Ye Yuan, Jiaming Song, Umar Iqbal, Arash Vahdat, Jan Kautz

Specifically, we propose a physics-based motion projection module that uses motion imitation in a physics simulator to project the denoised motion of a diffusion step to a physically-plausible motion.

Denoising

RANA: Relightable Articulated Neural Avatars

no code implementations • ICCV 2023 • Umar Iqbal, Akin Caliskan, Koki Nagano, Sameh Khamis, Pavlo Molchanov, Jan Kautz

We propose RANA, a relightable and articulated neural avatar for the photorealistic synthesis of humans under arbitrary viewpoints, body poses, and lighting.

Disentanglement Image Generation

Score-based Diffusion Models in Function Space

no code implementations • 14 Feb 2023 • Jae Hyun Lim, Nikola B. Kovachki, Ricardo Baptista, Christopher Beckham, Kamyar Azizzadenesheli, Jean Kossaifi, Vikram Voleti, Jiaming Song, Karsten Kreis, Jan Kautz, Christopher Pal, Arash Vahdat, Anima Anandkumar

They consist of a forward process that perturbs input data with Gaussian white noise and a reverse process that learns a score function to generate samples by denoising.

Denoising

Single-Shot Implicit Morphable Faces with Consistent Texture Parameterization

no code implementations • 4 May 2023 • Connor Z. Lin, Koki Nagano, Jan Kautz, Eric R. Chan, Umar Iqbal, Leonidas Guibas, Gordon Wetzstein, Sameh Khamis

To tackle this problem, we propose a novel method for constructing implicit 3D morphable face models that are both generalizable and intuitive for editing.

Face Model Face Reconstruction

Generalizable One-shot Neural Head Avatar

no code implementations • 14 Jun 2023 • Xueting Li, Shalini De Mello, Sifei Liu, Koki Nagano, Umar Iqbal, Jan Kautz

We present a method that reconstructs and animates a 3D head avatar from a single-view portrait image.

Super-Resolution

Heterogeneous Continual Learning

no code implementations • CVPR 2023 • Divyam Madaan, Hongxu Yin, Wonmin Byeon, Jan Kautz, Pavlo Molchanov

We propose a novel framework and a solution to tackle the continual learning (CL) problem with changing network architectures.

Continual Learning Knowledge Distillation +1

Online Overexposed Pixels Hallucination in Videos with Adaptive Reference Frame Selection

no code implementations • 29 Aug 2023 • Yazhou Xing, Amrita Mazumdar, Anjul Patney, Chao Liu, Hongxu Yin, Qifeng Chen, Jan Kautz, Iuri Frosio

We present a learning-based system to reduce these artifacts without resorting to complex acquisition mechanisms like alternating exposures or costly processing that are typical of high dynamic range (HDR) imaging.

Hallucination

3D Reconstruction with Generalizable Neural Fields using Scene Priors

no code implementations • 26 Sep 2023 • Yang Fu, Shalini De Mello, Xueting Li, Amey Kulkarni, Jan Kautz, Xiaolong Wang, Sifei Liu

NFP not only demonstrates SOTA scene reconstruction performance and efficiency, but it also supports single-image novel-view synthesis, which is underexplored in neural fields.

3D Reconstruction 3D Scene Reconstruction +1

Residual Diffusion Modeling for Km-scale Atmospheric Downscaling

no code implementations • 24 Sep 2023 • Morteza Mardani, Noah Brenowitz, Yair Cohen, Jaideep Pathak, Chieh-Yu Chen, Cheng-Chin Liu, Arash Vahdat, Karthik Kashinath, Jan Kautz, Mike Pritchard

Predictions of weather hazard require expensive km-scale simulations driven by coarser global inputs.

PACE: Human and Camera Motion Estimation from in-the-wild Videos

no code implementations • 20 Oct 2023 • Muhammed Kocabas, Ye Yuan, Pavlo Molchanov, Yunrong Guo, Michael J. Black, Otmar Hilliges, Jan Kautz, Umar Iqbal

This design combines the strengths of SLAM and motion priors, which leads to significant improvements in human and camera motion estimation.

Motion Estimation

COLMAP-Free 3D Gaussian Splatting

no code implementations • 12 Dec 2023 • Yang Fu, Sifei Liu, Amey Kulkarni, Jan Kautz, Alexei A. Efros, Xiaolong Wang

While neural rendering has led to impressive advances in scene reconstruction and novel view synthesis, it relies heavily on accurately pre-computed camera poses.

Neural Rendering Novel View Synthesis +1

GAvatar: Animatable 3D Gaussian Avatars with Implicit Mesh Learning

no code implementations • 18 Dec 2023 • Ye Yuan, Xueting Li, Yangyi Huang, Shalini De Mello, Koki Nagano, Jan Kautz, Umar Iqbal

Gaussian splatting has emerged as a powerful 3D representation that harnesses the advantages of both explicit (mesh) and implicit (NeRF) 3D representations.

FoVA-Depth: Field-of-View Agnostic Depth Estimation for Cross-Dataset Generalization

no code implementations • 24 Jan 2024 • Daniel Lichy, Hang Su, Abhishek Badki, Jan Kautz, Orazio Gallo

Unfortunately, most of the GT data is for pinhole cameras, making it impossible to properly train depth estimation models for large-FoV cameras.

Stereo Depth Estimation

