Search Results for author: Aayush Bansal

Found 22 papers, 7 papers with code

VR-NeRF: High-Fidelity Virtualized Walkable Spaces

no code implementations 5 Nov 2023 Linning Xu, Vasu Agrawal, William Laney, Tony Garcia, Aayush Bansal, Changil Kim, Samuel Rota Bulò, Lorenzo Porzi, Peter Kontschieder, Aljaž Božič, Dahua Lin, Michael Zollhöfer, Christian Richardt

We present an end-to-end system for the high-fidelity capture, model reconstruction, and real-time rendering of walkable spaces in virtual reality using neural radiance fields.

EgoHumans: An Egocentric 3D Multi-Human Benchmark

no code implementations 25 May 2023 Rawal Khirodkar, Aayush Bansal, Lingni Ma, Richard Newcombe, Minh Vo, Kris Kitani

We present EgoHumans, a new multi-view multi-human video benchmark to advance the state-of-the-art of egocentric human 3D pose estimation and tracking.

3D Pose Estimation Human Detection

Ego-Humans: An Ego-Centric 3D Multi-Human Benchmark

no code implementations ICCV 2023 Rawal Khirodkar, Aayush Bansal, Lingni Ma, Richard Newcombe, Minh Vo, Kris Kitani

We present EgoHumans, a new multi-view multi-human video benchmark to advance the state-of-the-art of egocentric human 3D pose estimation and tracking.

3D Pose Estimation Human Detection

Neural Pixel Composition for 3D-4D View Synthesis From Multi-Views

no code implementations CVPR 2023 Aayush Bansal, Michael Zollhöfer

We present Neural Pixel Composition (NPC), a novel approach for continuous 3D-4D view synthesis given only a discrete set of multi-view observations as input.

Neural Pixel Composition: 3D-4D View Synthesis from Multi-Views

no code implementations 21 Jul 2022 Aayush Bansal, Michael Zollhoefer

We present Neural Pixel Composition (NPC), a novel approach for continuous 3D-4D view synthesis given only a discrete set of multi-view observations as input.

3D Reconstruction

KeypointNeRF: Generalizing Image-based Volumetric Avatars using Relative Spatial Encoding of Keypoints

1 code implementation 10 May 2022 Marko Mihajlovic, Aayush Bansal, Michael Zollhoefer, Siyu Tang, Shunsuke Saito

In this work, we investigate common issues with existing spatial encodings and propose a simple yet highly effective approach to modeling high-fidelity volumetric humans from sparse views.

3D Face Reconstruction 3D Human Reconstruction +2

Stereo Radiance Fields (SRF): Learning View Synthesis for Sparse Views of Novel Scenes

1 code implementation CVPR 2021 Julian Chibane, Aayush Bansal, Verica Lazova, Gerard Pons-Moll

Recent neural view synthesis methods have achieved impressive quality and realism, surpassing classical pipelines which rely on multi-view reconstruction.

Streaming Self-Training via Domain-Agnostic Unlabeled Images

no code implementations 7 Apr 2021 Zhiqiu Lin, Deva Ramanan, Aayush Bansal

We present streaming self-training (SST) that aims to democratize the process of learning visual recognition models such that a non-expert user can define a new task depending on their needs via a few labeled examples and minimal domain knowledge.

Fine-Grained Image Classification Semantic Segmentation +1

Video-Specific Autoencoders for Exploring, Editing and Transmitting Videos

no code implementations 31 Mar 2021 Kevin Wang, Deva Ramanan, Aayush Bansal

Associating latent codes of a video and manifold projection enables users to make desired edits.

Denoising Super-Resolution

Improving task-specific representation via 1M unlabelled images without any extra knowledge

no code implementations 24 Jun 2020 Aayush Bansal

We improve surface normal estimation on the NYU-v2 depth dataset and semantic segmentation on PASCAL VOC by 4% over the base model.

Segmentation Semantic Segmentation +1

4D Visualization of Dynamic Events from Unconstrained Multi-View Videos

no code implementations CVPR 2020 Aayush Bansal, Minh Vo, Yaser Sheikh, Deva Ramanan, Srinivasa Narasimhan

We present a data-driven approach for 4D space-time visualization of dynamic events from videos captured by multiple hand-held cameras.

Unsupervised Audiovisual Synthesis via Exemplar Autoencoders

1 code implementation ICLR 2021 Kangle Deng, Aayush Bansal, Deva Ramanan

We present an unsupervised approach that converts the input speech of any individual into audiovisual streams of potentially-infinitely many output speakers.

Shapes and Context: In-the-Wild Image Synthesis & Manipulation

no code implementations CVPR 2019 Aayush Bansal, Yaser Sheikh, Deva Ramanan

We introduce a data-driven approach for interactively synthesizing in-the-wild images from semantic label maps.

Image Generation

Recycle-GAN: Unsupervised Video Retargeting

1 code implementation ECCV 2018 Aayush Bansal, Shugao Ma, Deva Ramanan, Yaser Sheikh

We introduce a data-driven approach for unsupervised video retargeting that translates content from one domain to another while preserving the style native to a domain, i.e., if contents of John Oliver's speech were to be transferred to Stephen Colbert, then the generated content/speech should be in Stephen Colbert's style.

Face to Face Translation Translation +1

Patch Correspondences for Interpreting Pixel-level CNNs

no code implementations 29 Nov 2017 Victor Fragoso, Chunhui Liu, Aayush Bansal, Deva Ramanan

We present compositional nearest neighbors (CompNN), a simple approach to visually interpreting distributed representations learned by a convolutional neural network (CNN) for pixel-level tasks (e.g., image synthesis and segmentation).

Image-to-Image Translation Segmentation +2

PixelNN: Example-based Image Synthesis

1 code implementation ICLR 2018 Aayush Bansal, Yaser Sheikh, Deva Ramanan

We present a simple nearest-neighbor (NN) approach that synthesizes high-frequency photorealistic images from an "incomplete" signal such as a low-resolution image, a surface normal map, or edges.

Image Generation

Be Careful What You Backpropagate: A Case For Linear Output Activations & Gradient Boosting

no code implementations 13 Jul 2017 Anders Oland, Aayush Bansal, Roger B. Dannenberg, Bhiksha Raj

To this end, we demonstrate faster convergence and better performance on diverse classification tasks: image classification using CIFAR-10 and ImageNet, and semantic segmentation using PASCAL VOC 2012.

Classification General Classification +2

PixelNet: Representation of the pixels, by the pixels, and for the pixels

1 code implementation 21 Feb 2017 Aayush Bansal, Xinlei Chen, Bryan Russell, Abhinav Gupta, Deva Ramanan

We explore design principles for general pixel-level prediction problems, from low-level edge detection to mid-level surface normal estimation to high-level semantic segmentation.

Edge Detection Segmentation +2

PixelNet: Towards a General Pixel-level Architecture

no code implementations 21 Sep 2016 Aayush Bansal, Xinlei Chen, Bryan Russell, Abhinav Gupta, Deva Ramanan

We explore architectures for general pixel-level prediction problems, from low-level edge detection to mid-level surface normal estimation to high-level semantic segmentation.

Edge Detection Semantic Segmentation +1

Marr Revisited: 2D-3D Alignment via Surface Normal Prediction

no code implementations CVPR 2016 Aayush Bansal, Bryan Russell, Abhinav Gupta

We introduce an approach that leverages surface normal predictions, along with appearance cues, to retrieve 3D models for objects depicted in 2D still images from a large CAD object library.

Object Pose Prediction +1

Mid-level Elements for Object Detection

no code implementations 27 Apr 2015 Aayush Bansal, Abhinav Shrivastava, Carl Doersch, Abhinav Gupta

Building on the success of recent discriminative mid-level elements, we propose a surprisingly simple approach for object detection which performs comparably to the current state-of-the-art approaches on the PASCAL VOC comp-3 detection challenge (no external data).

Object object-detection +1
