Search Results for author: Vineet Gandhi

Found 31 papers, 12 papers with code

Empathic Machines: Using Intermediate Features as Levers to Emulate Emotions in Text-To-Speech Systems

no code implementations NAACL 2022 Saiteja Kosgi, Sarath Sivaprasad, Niranjan Pedanekar, Anil Nelakanti, Vineet Gandhi

We present a method to control the emotional prosody of Text to Speech (TTS) systems by using phoneme-level intermediate features (pitch, energy, and duration) as levers.

Real Time GAZED: Online Shot Selection and Editing of Virtual Cameras from Wide-Angle Monocular Video Recordings

no code implementations 27 Nov 2023 Sudheer Achary, Rohit Girmaji, Adhiraj Anil Deshmukh, Vineet Gandhi

Eliminating time-consuming post-production processes and delivering high-quality videos in today's fast-paced digital landscape are the key advantages of real-time approaches.

Video Editing

RobustL2S: Speaker-Specific Lip-to-Speech Synthesis exploiting Self-Supervised Representations

no code implementations 3 Jul 2023 Neha Sahipjohn, Neil Shah, Vishal Tambrahalli, Vineet Gandhi

Significant progress has been made in speaker dependent Lip-to-Speech synthesis, which aims to generate speech from silent videos of talking faces.

Speaker-Specific Lip to Speech Synthesis Speech Synthesis

MParrotTTS: Multilingual Multi-speaker Text to Speech Synthesis in Low Resource Setting

no code implementations 19 May 2023 Neil Shah, Vishal Tambrahalli, Saiteja Kosgi, Niranjan Pedanekar, Vineet Gandhi

We present MParrotTTS, a unified multilingual, multi-speaker text-to-speech (TTS) synthesis model that can produce high-quality speech.

Speech Synthesis Text-To-Speech Synthesis

Ground then Navigate: Language-guided Navigation in Dynamic Scenes

no code implementations 24 Sep 2022 Kanishk Jain, Varun Chhangani, Amogh Tiwari, K. Madhava Krishna, Vineet Gandhi

We investigate the Vision-and-Language Navigation (VLN) problem in the context of autonomous driving in outdoor settings.

Autonomous Driving Navigate +1

Grounding Linguistic Commands to Navigable Regions

1 code implementation 24 Dec 2021 Nivedita Rufus, Kanishk Jain, Unni Krishnan R Nair, Vineet Gandhi, K Madhava Krishna

We introduce a new dataset, Talk2Car-RegSeg, which extends the existing Talk2car dataset with segmentation masks for the regions described by the linguistic commands.

Autonomous Vehicles Image Segmentation +2

Emotional Prosody Control for Speech Generation

no code implementations 7 Nov 2021 Sarath Sivaprasad, Saiteja Kosgi, Vineet Gandhi

The proposed TTS system can generate speech from the text in any speaker's style, with fine control of emotion.

Reappraising Domain Generalization in Neural Networks

no code implementations 15 Oct 2021 Sarath Sivaprasad, Akshay Goindani, Vaibhav Garg, Ritam Basu, Saiteja Kosgi, Vineet Gandhi

We find that the presence of multiple domains incentivizes domain-agnostic learning and is the primary reason for generalization in traditional DG.

Data Augmentation Domain Generalization

High-Resolution Depth Maps Based on TOF-Stereo Fusion

no code implementations 30 Jul 2021 Vineet Gandhi, Jan Cech, Radu Horaud

Most of these systems suffer from the problems of noise in the range-data and resolution mismatch between the range sensor and the color cameras, since the resolution of current range sensors is much less than the resolution of color cameras.

Robot Navigation Stereo Matching +1

No Cost Likelihood Manipulation at Test Time for Making Better Mistakes in Deep Networks

1 code implementation 1 Apr 2021 Shyamgopal Karthik, Ameya Prabhu, Puneet K. Dokania, Vineet Gandhi

There has been increasing interest in building deep hierarchy-aware classifiers that aim to quantify and reduce the severity of mistakes, and not just reduce the number of errors.


Amending Mistakes Post-hoc in Deep Networks by Leveraging Class Hierarchies

no code implementations ICLR 2021 Shyamgopal Karthik, Ameya Prabhu, Puneet K. Dokania, Vineet Gandhi

There has been increasing interest in building deep hierarchy-aware classifiers, aiming to quantify and reduce the severity of mistakes and not just count the number of errors.

GAZED- Gaze-guided Cinematic Editing of Wide-Angle Monocular Video Recordings

no code implementations 22 Oct 2020 K L Bhanu Moorthy, Moneish Kumar, Ramanathan Subramaniam, Vineet Gandhi

We present GAZED- eye GAZe-guided EDiting for videos captured by a solitary, static, wide-angle and high-resolution camera.

valid Video Editing

Cosine meets Softmax: A tough-to-beat baseline for visual grounding

1 code implementation 13 Sep 2020 Nivedita Rufus, Unni Krishnan R Nair, K. Madhava Krishna, Vineet Gandhi

In this paper, we present a simple baseline for visual grounding for autonomous driving which outperforms state-of-the-art methods while retaining minimal design choices.

Autonomous Driving Metric Learning +2

The Curious Case of Convex Neural Networks

no code implementations 9 Jun 2020 Sarath Sivaprasad, Ankur Singh, Naresh Manwani, Vineet Gandhi

In this paper, we investigate a constrained formulation of neural networks where the output is a convex function of the input.

Image Classification

Simple Unsupervised Multi-Object Tracking

no code implementations 4 Jun 2020 Shyamgopal Karthik, Ameya Prabhu, Vineet Gandhi

Multi-object tracking has seen a lot of progress recently, albeit with substantial annotation costs for developing better and larger labeled datasets.

Multi-Object Tracking

LiDAR guided Small obstacle Segmentation

2 code implementations 12 Mar 2020 Aasheesh Singh, Aditya Kamireddypalli, Vineet Gandhi, K. Madhava Krishna

In this paper, we present a method to reliably detect such obstacles through a multi-modal framework of sparse LiDAR (VLP-16) and monocular vision.

Autonomous Driving Segmentation +1

Tidying Deep Saliency Prediction Architectures

1 code implementation 10 Mar 2020 Navyasri Reddy, Samyak Jain, Pradeep Yarlagadda, Vineet Gandhi

As a result, we propose two novel end-to-end architectures called SimpleNet and MDNSal, which are neater, minimal, more interpretable and achieve state of the art performance on public saliency benchmarks.

Saliency Prediction

CineFilter: Unsupervised Filtering for Real Time Autonomous Camera Systems

no code implementations 11 Dec 2019 Sudheer Achary, K L Bhanu Moorthy, Syed Ashar Javed, Nikita Shravan, Vineet Gandhi, Anoop Namboodiri

Autonomous camera systems are often subjected to an optimization/filtering operation to smoothen and stabilize the rough trajectory estimates.

Exploring 3 R's of Long-term Tracking: Re-detection, Recovery and Reliability

no code implementations 27 Oct 2019 Shyamgopal Karthik, Abhinav Moudgil, Vineet Gandhi

Recent works have proposed several long-term tracking benchmarks and highlighted the importance of moving towards long-duration tracking to bridge the gap with application requirements.


Nose, eyes and ears: Head pose estimation by locating facial keypoints

1 code implementation 3 Dec 2018 Aryaman Gupta, Kalpit Thakkar, Vineet Gandhi, P. J. Narayanan

Monocular head pose estimation requires learning a model that computes the intrinsic Euler angles for pose (yaw, pitch, roll) from an input image of a human face.

Head Pose Estimation

An Iterative Approach for Shadow Removal in Document Images

1 code implementation ICASSP 2018 Vatsal Shah, Vineet Gandhi

Uneven illumination and shadows in document images pose a challenge for digitization applications and automated workflows.

Document Shadow Removal

Learning Unsupervised Visual Grounding Through Semantic Self-Supervision

no code implementations 17 Mar 2018 Syed Ashar Javed, Shreyas Saxena, Vineet Gandhi

Localizing natural language phrases in images is a challenging problem that requires joint understanding of both the textual and visual modalities.

Visual Grounding

MergeNet: A Deep Net Architecture for Small Obstacle Discovery

no code implementations 17 Mar 2018 Krishnam Gupta, Syed Ashar Javed, Vineet Gandhi, K. Madhava Krishna

We present a novel network architecture called MergeNet for discovering small obstacles in on-road scenes in the context of autonomous driving.

Autonomous Driving

Long-Term Visual Object Tracking Benchmark

1 code implementation 4 Dec 2017 Abhinav Moudgil, Vineet Gandhi

We propose a new long video dataset (called Track Long and Prosper - TLP) and benchmark for single object tracking.

Visual Object Tracking Visual Tracking

Automated Top View Registration of Broadcast Football Videos

no code implementations 4 Mar 2017 Rahul Anand Sharma, Bharath Bhat, Vineet Gandhi, C. V. Jawahar

The proposed method is fully automatic, in contrast to the current state of the art, which requires manual initialization of point correspondences between the image and the static model.

Bird View Synthesis Homography Estimation

The Prose Storyboard Language: A Tool for Annotating and Directing Movies

1 code implementation 30 Aug 2015 Remi Ronfard, Vineet Gandhi, Laurent Boiron, Vaishnavi Ameya Murukutla

The prose storyboard language is a formal language for describing movies shot by shot, where each shot is described with a unique sentence.


Detecting and Naming Actors in Movies Using Generative Appearance Models

no code implementations CVPR 2013 Vineet Gandhi, Remi Ronfard

We introduce a generative model for learning person and costume specific detectors from labeled examples.
