We present a method to control the emotional prosody of Text to Speech (TTS) systems by using phoneme-level intermediate features (pitch, energy, and duration) as levers.
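The idea of using phoneme-level intermediate features as levers can be sketched as simple per-phoneme scaling applied before the decoder. The function name, preset values, and scale factors below are illustrative assumptions, not the paper's actual interface:

```python
import numpy as np

# Hypothetical sketch of prosody "levers": scale per-phoneme pitch,
# energy, and duration predictions before they reach the decoder.
# All names and scale values here are illustrative assumptions.
def apply_emotion_levers(pitch, energy, duration,
                         pitch_scale=1.0, energy_scale=1.0,
                         duration_scale=1.0):
    """Scale phoneme-level intermediate features to shift prosody."""
    pitch = pitch * pitch_scale
    energy = energy * energy_scale
    # Durations are integer frame counts; keep at least one frame.
    duration = np.maximum(1, np.round(duration * duration_scale)).astype(int)
    return pitch, energy, duration

# e.g. a hypothetical "excited" preset: higher pitch and energy, faster speech
pitch, energy, dur = apply_emotion_levers(
    np.array([120.0, 180.0, 150.0]),   # per-phoneme F0 (Hz)
    np.array([0.5, 0.9, 0.7]),         # per-phoneme energy
    np.array([5, 8, 6]),               # per-phoneme frame counts
    pitch_scale=1.2, energy_scale=1.1, duration_scale=0.85)
```

A preset like this changes the emotional color of the utterance without retraining, since only the intermediate features are modified.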
Real-time approaches offer key advantages: they eliminate time-consuming post-production processes and deliver high-quality videos in today's fast-paced digital landscape.
Significant progress has been made in speaker dependent Lip-to-Speech synthesis, which aims to generate speech from silent videos of talking faces.
We present MParrotTTS, a unified multilingual, multi-speaker text-to-speech (TTS) synthesis model that can produce high-quality speech.
We present ParrotTTS, a modularized text-to-speech synthesis model leveraging disentangled self-supervised speech representations.
We investigate the Vision-and-Language Navigation (VLN) problem in the context of autonomous driving in outdoor settings.
We introduce a new dataset, Talk2Car-RegSeg, which extends the existing Talk2Car dataset with segmentation masks for the regions described by the linguistic commands.
We find that the presence of multiple domains incentivizes domain-agnostic learning and is the primary reason for generalization in Traditional DG.
Multi-view Detection (MVD) is highly effective for occlusion reasoning in a crowded environment.
Ranked #1 on Multiview Detection on GMVD
Most of these systems suffer from noise in the range data and from a resolution mismatch between the range sensor and the color cameras, since current range sensors offer much lower resolution than color cameras.
We investigate Referring Image Segmentation (RIS), which outputs a segmentation map corresponding to the natural language description.
Ranked #3 on Referring Expression Segmentation on ReferIt
There has been increasing interest in building deep hierarchy-aware classifiers that aim to quantify and reduce the severity of mistakes, and not just reduce the number of errors.
We also explore a variation of ViNet architecture by augmenting audio features into the decoder.
We present GAZED: eye GAZe-guided EDiting for videos captured by a solitary, static, wide-angle, high-resolution camera.
In this paper, we present a simple baseline for visual grounding in autonomous driving that outperforms state-of-the-art methods while retaining minimal design choices.
Ranked #6 on Referring Expression Comprehension on Talk2Car
In this paper, we investigate a constrained formulation of neural networks where the output is a convex function of the input.
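A minimal sketch of such a constraint, in the spirit of input-convex networks: keeping the weights on the hidden-to-output path non-negative and using convex, non-decreasing activations (e.g. ReLU) makes the scalar output a convex function of the input. The architecture below is a generic illustration, not the paper's exact model:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(v):
    # ReLU is convex and non-decreasing, which the construction relies on.
    return np.maximum(v, 0.0)

# Hypothetical one-hidden-layer sketch:
#   f(x) = wz @ relu(W0 x + b0) + wx @ x + b
# wz >= 0 ensures a non-negative combination of convex functions,
# plus an affine term, so f is convex in x.
W0 = rng.normal(size=(8, 3))        # first layer: any sign allowed
b0 = rng.normal(size=8)
wz = np.abs(rng.normal(size=8))     # constrained non-negative weights
wx = rng.normal(size=3)             # direct affine "passthrough" term
b = 0.1

def f(x):
    return wz @ relu(W0 @ x + b0) + wx @ x + b

# Numerical convexity check along random segments:
for _ in range(100):
    x1, x2 = rng.normal(size=3), rng.normal(size=3)
    assert f((x1 + x2) / 2) <= (f(x1) + f(x2)) / 2 + 1e-9
```

In practice the non-negativity constraint is typically enforced during training, e.g. by projecting or reparameterizing the constrained weights.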
Multi-object tracking has seen a lot of progress recently, albeit with substantial annotation costs for developing better and larger labeled datasets.
In this paper, we present a method to reliably detect such obstacles through a multi-modal framework of sparse LiDAR (VLP-16) and monocular vision.
As a result, we propose two novel end-to-end architectures, SimpleNet and MDNSal, which are neater, more minimal, and more interpretable, and achieve state-of-the-art performance on public saliency benchmarks.
Autonomous camera systems are often subjected to an optimization/filtering operation to smooth and stabilize the rough trajectory estimates.
Recent works have proposed several long-term tracking benchmarks, highlighting the importance of moving toward long-duration tracking to bridge the gap with application requirements.
Monocular head pose estimation requires learning a model that computes the intrinsic Euler angles of pose (yaw, pitch, roll) from an input image of a human face.
Ranked #2 on Head Pose Estimation on AFLW
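The Euler-angle parameterization of pose can be made concrete with the standard conversion between a rotation matrix and intrinsic (yaw, pitch, roll) angles. This is a generic geometric identity, not the paper's learned model; the ZYX convention below is an assumption:

```python
import numpy as np

# Generic conversion sketch for R = Rz(yaw) @ Ry(pitch) @ Rx(roll)
# (intrinsic ZYX convention, assumed for illustration).
def rot(yaw, pitch, roll):
    cy, sy = np.cos(yaw), np.sin(yaw)
    cp, sp = np.cos(pitch), np.sin(pitch)
    cr, sr = np.cos(roll), np.sin(roll)
    Rz = np.array([[cy, -sy, 0], [sy, cy, 0], [0, 0, 1]])
    Ry = np.array([[cp, 0, sp], [0, 1, 0], [-sp, 0, cp]])
    Rx = np.array([[1, 0, 0], [0, cr, -sr], [0, sr, cr]])
    return Rz @ Ry @ Rx

def euler_from_rot(R):
    # Valid away from gimbal lock (|pitch| = 90 degrees).
    yaw = np.arctan2(R[1, 0], R[0, 0])
    pitch = np.arcsin(-R[2, 0])
    roll = np.arctan2(R[2, 1], R[2, 2])
    return yaw, pitch, roll

angles = (0.3, -0.2, 0.5)  # radians
recovered = euler_from_rot(rot(*angles))
```

A pose estimator regresses exactly these three numbers; the round trip above shows they uniquely determine the head orientation away from the gimbal-lock configuration.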
Localizing natural language phrases in images is a challenging problem that requires joint understanding of both the textual and visual modalities.
We present a novel network architecture called MergeNet for discovering small obstacles in on-road scenes in the context of autonomous driving.
The proposed method is fully automatic, in contrast to the current state of the art, which requires manual initialization of point correspondences between the image and the static model.
The prose storyboard language is a formal language for describing movies shot by shot, where each shot is described with a unique sentence.