Search Results for author: Mukul Khanna

Found 6 papers, 1 paper with code

GOAT-Bench: A Benchmark for Multi-Modal Lifelong Navigation

no code implementations 9 Apr 2024 Mukul Khanna, Ram Ramrakhya, Gunjan Chhablani, Sriram Yenamandra, Theophile Gervet, Matthew Chang, Zsolt Kira, Devendra Singh Chaplot, Dhruv Batra, Roozbeh Mottaghi

The Embodied AI community has made significant strides in visual navigation tasks, exploring targets from 3D coordinates, objects, language descriptions, and images.

Navigate Visual Navigation

Habitat Synthetic Scenes Dataset (HSSD-200): An Analysis of 3D Scene Scale and Realism Tradeoffs for ObjectGoal Navigation

no code implementations 20 Jun 2023 Mukul Khanna, Yongsen Mao, Hanxiao Jiang, Sanjay Haresh, Brennan Shacklett, Dhruv Batra, Alexander Clegg, Eric Undersander, Angel X. Chang, Manolis Savva

Surprisingly, we observe that agents trained on just 122 scenes from our dataset outperform agents trained on 10,000 scenes from the ProcTHOR-10K dataset in terms of zero-shot generalization in real-world scanned environments.

Navigate Zero-shot Generalization

DeepHS-HDRVideo: Deep High Speed High Dynamic Range Video Reconstruction

no code implementations 10 Oct 2022 Zeeshan Khan, Parth Shettiwar, Mukul Khanna, Shanmuganathan Raman

Previous works in high dynamic range (HDR) video reconstruction use a sequence of alternating-exposure LDR frames as input and align the neighbouring frames using optical-flow-based networks.

Optical Flow Estimation Video Frame Interpolation +3

Episodic Memory Question Answering

no code implementations CVPR 2022 Samyak Datta, Sameer Dharur, Vincent Cartillier, Ruta Desai, Mukul Khanna, Dhruv Batra, Devi Parikh

Towards that end, we introduce (1) a new task - Episodic Memory Question Answering (EMQA) wherein an egocentric AI assistant is provided with a video sequence (the tour) and a question as an input and is asked to localize its answer to the question within the tour, (2) a dataset of grounded questions designed to probe the agent's spatio-temporal understanding of the tour, and (3) a model for the task that encodes the scene as an allocentric, top-down semantic feature map and grounds the question into the map to localize the answer.

Question Answering

FHDR: HDR Image Reconstruction from a Single LDR Image using Feedback Network

1 code implementation 24 Dec 2019 Zeeshan Khan, Mukul Khanna, Shanmuganathan Raman

High dynamic range (HDR) image generation from a single exposure low dynamic range (LDR) image has been made possible due to the recent advances in Deep Learning.

Image Generation Image Reconstruction +1
