Search Results for author: Yash Bhalgat

Found 21 papers, 6 papers with code

Reflecting Reality: Enabling Diffusion Models to Produce Faithful Mirror Reflections

no code implementations 23 Sep 2024 Ankit Dhiman, Manan Shah, Rishubh Parihar, Yash Bhalgat, Lokesh R Boregowda, R Venkatesh Babu

To the best of our knowledge, we are the first to successfully tackle the challenging problem of generating controlled and faithful mirror reflections of an object in a scene using diffusion-based models.

Image Inpainting

GSLoc: Efficient Camera Pose Refinement via 3D Gaussian Splatting

no code implementations 20 Aug 2024 Changkun Liu, Shuai Chen, Yash Bhalgat, Siyan Hu, Ming Cheng, ZiRui Wang, Victor Adrian Prisacariu, Tristan Braud

We leverage 3D Gaussian Splatting (3DGS) as a scene representation and propose a novel test-time camera pose refinement framework, GSLoc.

Pose Estimation regression +1

3D-Aware Instance Segmentation and Tracking in Egocentric Videos

no code implementations 19 Aug 2024 Yash Bhalgat, Vadim Tschernezki, Iro Laina, João F. Henriques, Andrea Vedaldi, Andrew Zisserman

Egocentric videos present unique challenges for 3D scene understanding due to rapid camera motion, frequent object occlusions, and limited object visibility.

3D Object Reconstruction Instance Segmentation +5

Reproducibility Study of CDUL: CLIP-Driven Unsupervised Learning for Multi-Label Image Classification

1 code implementation 19 May 2024 Manan Shah, Yash Bhalgat

(3) We try to verify the effectiveness of the gradient-alignment training method specified in the original paper, which is used to update the network parameters and pseudo labels.

Multi-Label Image Classification

N2F2: Hierarchical Scene Understanding with Nested Neural Feature Fields

no code implementations 16 Mar 2024 Yash Bhalgat, Iro Laina, João F. Henriques, Andrew Zisserman, Andrea Vedaldi

To address this, we introduce Nested Neural Feature Fields (N2F2), a novel approach that employs hierarchical supervision to learn a single feature field, wherein different dimensions within the same high-dimensional feature encode scene properties at varying granularities.

Scene Understanding

SiLVR: Scalable Lidar-Visual Reconstruction with Neural Radiance Fields for Robotic Inspection

no code implementations 11 Mar 2024 Yifu Tao, Yash Bhalgat, Lanke Frank Tarimo Fu, Matias Mattamala, Nived Chebrolu, Maurice Fallon

We present a neural-field-based large-scale reconstruction system that fuses lidar and vision data to generate high-quality reconstructions that are geometrically accurate and capture photo-realistic textures.

Contrastive Lift: 3D Object Instance Segmentation by Slow-Fast Contrastive Fusion

1 code implementation NeurIPS 2023 Yash Bhalgat, Iro Laina, João F. Henriques, Andrew Zisserman, Andrea Vedaldi

Our approach outperforms the state-of-the-art on challenging scenes from the ScanNet, Hypersim, and Replica datasets, as well as on our newly created Messy Rooms dataset, demonstrating the effectiveness and scalability of our slow-fast clustering method.

Clustering Instance Segmentation +2

A Light Touch Approach to Teaching Transformers Multi-view Geometry

no code implementations CVPR 2023 Yash Bhalgat, Joao F. Henriques, Andrew Zisserman

Transformers are powerful visual learners, in large part due to their conspicuous lack of manually-specified priors.

Retrieval

A Prompt Array Keeps the Bias Away: Debiasing Vision-Language Models with Adversarial Learning

1 code implementation 22 Mar 2022 Hugo Berg, Siobhan Mackenzie Hall, Yash Bhalgat, Wonsuk Yang, Hannah Rose Kirk, Aleksandar Shtedritski, Max Bain

Vision-language models can encode societal biases and stereotypes, but there are challenges to measuring and mitigating these multimodal harms due to lacking measurement robustness and feature degradation.

Dynamic Iterative Refinement for Efficient 3D Hand Pose Estimation

no code implementations 11 Nov 2021 John Yang, Yash Bhalgat, Simyung Chang, Fatih Porikli, Nojun Kwak

While hand pose estimation is a critical component of most interactive extended reality and gesture recognition systems, contemporary approaches are not optimized for computational and memory efficiency.

3D Hand Pose Estimation Gesture Recognition

Data-driven Weight Initialization with Sylvester Solvers

no code implementations 2 May 2021 Debasmit Das, Yash Bhalgat, Fatih Porikli

The initialization is cast as an optimization problem where we minimize a combination of encoding and decoding losses of the input activations, which is further constrained by a user-defined latent code.

LSQ+: Improving low-bit quantization through learnable offsets and better initialization

4 code implementations 20 Apr 2020 Yash Bhalgat, Jinwon Lee, Markus Nagel, Tijmen Blankevoort, Nojun Kwak

To solve this problem, we propose LSQ+, a natural extension of LSQ, wherein we introduce a general asymmetric quantization scheme with trainable scale and offset parameters that can learn to accommodate the negative activations.

Image Classification Quantization
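
The asymmetric scheme described in the LSQ+ excerpt above can be illustrated with a minimal forward-pass sketch. This is not the authors' implementation: the function name `fake_quantize`, the unsigned integer grid, and the fixed `scale` and `offset` values are assumptions chosen for illustration. In LSQ+ the scale and offset are trained, which this sketch does not attempt.

```python
def fake_quantize(x, scale, offset, bits=4):
    """Quantize x to an unsigned integer grid, then map it back to a real value.

    scale and offset are fixed here for illustration; in a learnable scheme
    they would be parameters updated during training (not shown).
    """
    qmin, qmax = 0, 2 ** bits - 1          # unsigned integer range
    q = round((x - offset) / scale)        # shift by the offset, then scale
    q = max(qmin, min(qmax, q))            # clamp to the representable grid
    return q * scale + offset              # dequantize back to real values


# With a negative offset, small negative activations remain representable.
print(fake_quantize(-0.3, scale=0.1, offset=-0.4))
# With a zero offset, the same input is clamped to zero.
print(fake_quantize(-0.3, scale=0.1, offset=0.0))
```

With `bits=4`, `scale=0.1`, and `offset=-0.4`, the representable range is roughly -0.4 to 1.1, so a negative offset shifts part of the grid below zero.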

Learned Threshold Pruning

no code implementations 28 Feb 2020 Kambiz Azarian, Yash Bhalgat, Jinwon Lee, Tijmen Blankevoort

This is in contrast to other methods that search for per-layer thresholds via a computationally intensive iterative pruning and fine-tuning process.

QKD: Quantization-aware Knowledge Distillation

no code implementations 28 Nov 2019 Jangho Kim, Yash Bhalgat, Jinwon Lee, Chirag Patel, Nojun Kwak

First, the Self-studying (SS) phase fine-tunes a quantized low-precision student network without KD to obtain a good initialization.

Knowledge Distillation Quantization

Teacher-Student Learning Paradigm for Tri-training: An Efficient Method for Unlabeled Data Exploitation

no code implementations 25 Sep 2019 Yash Bhalgat, Zhe Liu, Pritam Gundecha, Jalal Mahmud, Amita Misra

Given that labeled data is expensive to obtain in real-world scenarios, many semi-supervised algorithms have explored the task of exploitation of unlabeled data.

Sentiment Analysis

Annotation-cost Minimization for Medical Image Segmentation using Suggestive Mixed Supervision Fully Convolutional Networks

no code implementations 29 Dec 2018 Yash Bhalgat, Meet Shah, Suyash Awate

For medical image segmentation, most fully convolutional networks (FCNs) need strong supervision through a large sample of high-quality dense segmentations, which is taxing in terms of costs, time and logistics involved.

Image Segmentation Medical Image Segmentation +1

FusedLSTM: Fusing frame-level and video-level features for Content-based Video Relevance Prediction

no code implementations 29 Sep 2018 Yash Bhalgat

The last section gives a complete comparison of all the approaches implemented during this challenge, including the one presented in the baseline paper.

Triplet
