Search Results for author: Vineet Gandhi

Found 26 papers, 9 papers with code

Empathic Machines: Using Intermediate Features as Levers to Emulate Emotions in Text-To-Speech Systems

no code implementations NAACL 2022 Saiteja Kosgi, Sarath Sivaprasad, Niranjan Pedanekar, Anil Nelakanti, Vineet Gandhi

We present a method to control the emotional prosody of Text to Speech (TTS) systems by using phoneme-level intermediate features (pitch, energy, and duration) as levers.

Ground then Navigate: Language-guided Navigation in Dynamic Scenes

no code implementations 24 Sep 2022 Kanishk Jain, Varun Chhangani, Amogh Tiwari, K. Madhava Krishna, Vineet Gandhi

We investigate the Vision-and-Language Navigation (VLN) problem in the context of autonomous driving in outdoor settings.

Autonomous Driving, Navigate, +1

Grounding Linguistic Commands to Navigable Regions

no code implementations 24 Dec 2021 Nivedita Rufus, Kanishk Jain, Unni Krishnan R Nair, Vineet Gandhi, K Madhava Krishna

We introduce a new dataset, Talk2Car-RegSeg, which extends the existing Talk2car dataset with segmentation masks for the regions described by the linguistic commands.

Autonomous Vehicles, Image Segmentation, +2

Emotional Prosody Control for Speech Generation

no code implementations 7 Nov 2021 Sarath Sivaprasad, Saiteja Kosgi, Vineet Gandhi

The proposed TTS system can generate speech from the text in any speaker's style, with fine control of emotion.

Reappraising Domain Generalization in Neural Networks

no code implementations 15 Oct 2021 Sarath Sivaprasad, Akshay Goindani, Vaibhav Garg, Ritam Basu, Saiteja Kosgi, Vineet Gandhi

We find that the presence of multiple domains incentivizes domain-agnostic learning and is the primary reason for generalization in traditional DG.

Data Augmentation, Domain Generalization

High-Resolution Depth Maps Based on TOF-Stereo Fusion

no code implementations 30 Jul 2021 Vineet Gandhi, Jan Cech, Radu Horaud

Most of these systems suffer from the problems of noise in the range-data and resolution mismatch between the range sensor and the color cameras, since the resolution of current range sensors is much less than the resolution of color cameras.

Robot Navigation, Stereo Matching

No Cost Likelihood Manipulation at Test Time for Making Better Mistakes in Deep Networks

1 code implementation 1 Apr 2021 Shyamgopal Karthik, Ameya Prabhu, Puneet K. Dokania, Vineet Gandhi

There has been increasing interest in building deep hierarchy-aware classifiers that aim to quantify and reduce the severity of mistakes, and not just reduce the number of errors.

Amending Mistakes Post-hoc in Deep Networks by Leveraging Class Hierarchies

no code implementations ICLR 2021 Shyamgopal Karthik, Ameya Prabhu, Puneet K. Dokania, Vineet Gandhi

There has been increasing interest in building deep hierarchy-aware classifiers, aiming to quantify and reduce the severity of mistakes and not just count the number of errors.

GAZED- Gaze-guided Cinematic Editing of Wide-Angle Monocular Video Recordings

no code implementations 22 Oct 2020 K L Bhanu Moorthy, Moneish Kumar, Ramanathan Subramaniam, Vineet Gandhi

We present GAZED- eye GAZe-guided EDiting for videos captured by a solitary, static, wide-angle and high-resolution camera.

Cosine meets Softmax: A tough-to-beat baseline for visual grounding

1 code implementation 13 Sep 2020 Nivedita Rufus, Unni Krishnan R Nair, K. Madhava Krishna, Vineet Gandhi

In this paper, we present a simple baseline for visual grounding for autonomous driving that outperforms state-of-the-art methods while retaining minimal design choices.

Autonomous Driving, Metric Learning, +2

The Curious Case of Convex Neural Networks

no code implementations 9 Jun 2020 Sarath Sivaprasad, Ankur Singh, Naresh Manwani, Vineet Gandhi

In this paper, we investigate a constrained formulation of neural networks where the output is a convex function of the input.

Image Classification

Simple Unsupervised Multi-Object Tracking

no code implementations 4 Jun 2020 Shyamgopal Karthik, Ameya Prabhu, Vineet Gandhi

Multi-object tracking has seen a lot of progress recently, albeit with substantial annotation costs for developing better and larger labeled datasets.

Multi-Object Tracking

LiDAR guided Small obstacle Segmentation

2 code implementations 12 Mar 2020 Aasheesh Singh, Aditya Kamireddypalli, Vineet Gandhi, K. Madhava Krishna

In this paper, we present a method to reliably detect such obstacles through a multi-modal framework of sparse LiDAR (VLP-16) and monocular vision.

Autonomous Driving, Semantic Segmentation

Tidying Deep Saliency Prediction Architectures

1 code implementation 10 Mar 2020 Navyasri Reddy, Samyak Jain, Pradeep Yarlagadda, Vineet Gandhi

As a result, we propose two novel end-to-end architectures called SimpleNet and MDNSal, which are neater, minimal, and more interpretable, and which achieve state-of-the-art performance on public saliency benchmarks.

Saliency Prediction

CineFilter: Unsupervised Filtering for Real Time Autonomous Camera Systems

no code implementations 11 Dec 2019 Sudheer Achary, K L Bhanu Moorthy, Syed Ashar Javed, Nikita Shravan, Vineet Gandhi, Anoop Namboodiri

Autonomous camera systems are often subjected to an optimization/filtering operation to smoothen and stabilize the rough trajectory estimates.

Exploring 3 R's of Long-term Tracking: Re-detection, Recovery and Reliability

no code implementations 27 Oct 2019 Shyamgopal Karthik, Abhinav Moudgil, Vineet Gandhi

Recent works have proposed several long-term tracking benchmarks and highlighted the importance of moving towards long-duration tracking to bridge the gap with application requirements.

Nose, eyes and ears: Head pose estimation by locating facial keypoints

1 code implementation 3 Dec 2018 Aryaman Gupta, Kalpit Thakkar, Vineet Gandhi, P. J. Narayanan

Monocular head pose estimation requires learning a model that computes the intrinsic Euler angles for pose (yaw, pitch, roll) from an input image of a human face.

Head Pose Estimation

Learning Unsupervised Visual Grounding Through Semantic Self-Supervision

no code implementations 17 Mar 2018 Syed Ashar Javed, Shreyas Saxena, Vineet Gandhi

Localizing natural language phrases in images is a challenging problem that requires joint understanding of both the textual and visual modalities.

Visual Grounding

MergeNet: A Deep Net Architecture for Small Obstacle Discovery

no code implementations 17 Mar 2018 Krishnam Gupta, Syed Ashar Javed, Vineet Gandhi, K. Madhava Krishna

We present a novel network architecture called MergeNet for discovering small obstacles in on-road scenes in the context of autonomous driving.

Autonomous Driving

Long-Term Visual Object Tracking Benchmark

1 code implementation 4 Dec 2017 Abhinav Moudgil, Vineet Gandhi

We propose a new long video dataset (called Track Long and Prosper - TLP) and benchmark for single object tracking.

Visual Object Tracking, Visual Tracking

Automated Top View Registration of Broadcast Football Videos

no code implementations 4 Mar 2017 Rahul Anand Sharma, Bharath Bhat, Vineet Gandhi, C. V. Jawahar

The proposed method is fully automatic in contrast to the current state of the art which requires manual initialization of point correspondences between the image and the static model.

Bird View Synthesis, Homography Estimation

The Prose Storyboard Language: A Tool for Annotating and Directing Movies

1 code implementation 30 Aug 2015 Remi Ronfard, Vineet Gandhi, Laurent Boiron, Vaishnavi Ameya Murukutla

The prose storyboard language is a formal language for describing movies shot by shot, where each shot is described with a unique sentence.

Graphics

Detecting and Naming Actors in Movies Using Generative Appearance Models

no code implementations CVPR 2013 Vineet Gandhi, Remi Ronfard

We introduce a generative model for learning person and costume specific detectors from labeled examples.
