Search Results for author: Subhashini Venugopalan

Found 27 papers, 12 papers with code

Speech Intelligibility Classifiers from 550k Disordered Speech Samples

no code implementations 13 Mar 2023 Subhashini Venugopalan, Jimmy Tobin, Samuel J. Yang, Katie Seaver, Richard J. N. Cave, Pan-Pan Jiang, Neil Zeghidour, Rus Heywood, Jordan Green, Michael P. Brenner

We developed dysarthric speech intelligibility classifiers on 551,176 disordered speech samples contributed by a diverse set of 468 speakers, with a range of self-reported speaking disorders and rated for their overall intelligibility on a five-point scale.

Is Attention All That NeRF Needs?

no code implementations 27 Jul 2022 Mukund Varma T, Peihao Wang, Xuxi Chen, Tianlong Chen, Subhashini Venugopalan, Zhangyang Wang

While prior works on NeRFs optimize a scene representation by inverting a handcrafted rendering equation, GNT achieves neural representation and rendering that generalizes across scenes using transformers at two stages.

Inductive Bias SSIM

Context-Aware Abbreviation Expansion Using Large Language Models

no code implementations NAACL 2022 Shanqing Cai, Subhashini Venugopalan, Katrin Tomanek, Ajit Narayanan, Meredith Ringel Morris, Michael P. Brenner

Motivated by the need for accelerating text entry in augmentative and alternative communication (AAC) for people with severe motor impairments, we propose a paradigm in which phrases are abbreviated aggressively as primarily word-initial letters.

TRILLsson: Distilled Universal Paralinguistic Speech Representations

no code implementations 1 Mar 2022 Joel Shor, Subhashini Venugopalan

Our largest distilled model is less than 15% the size of the original model (314MB vs 2.2GB), achieves over 96% the accuracy on 6 of 7 tasks, and is trained on 6.5% the data.

Emotion Recognition Knowledge Distillation

Using a Cross-Task Grid of Linear Probes to Interpret CNN Model Predictions On Retinal Images

no code implementations 23 Jul 2021 Katy Blumer, Subhashini Venugopalan, Michael P. Brenner, Jon Kleinberg

We find that some target tasks are easily predicted irrespective of the source task, and that some other target tasks are more accurately predicted from correlated source tasks than from embeddings trained on the same task.


Guided Integrated Gradients: An Adaptive Path Method for Removing Noise

1 code implementation CVPR 2021 Andrei Kapishnikov, Subhashini Venugopalan, Besim Avci, Ben Wedin, Michael Terry, Tolga Bolukbasi

To minimize the effect of this source of noise, we propose adapting the attribution path itself -- conditioning the path not just on the image but also on the model being explained.

Scaling Symbolic Methods using Gradients for Neural Model Explanation

2 code implementations ICLR 2021 Subham Sekhar Sahoo, Subhashini Venugopalan, Li Li, Rishabh Singh, Patrick Riley

In this work, we propose a technique for combining gradient-based methods with symbolic techniques to scale such analyses and demonstrate its application for model explanation.

Attribution in Scale and Space

1 code implementation CVPR 2020 Shawn Xu, Subhashini Venugopalan, Mukund Sundararajan

Third, it eliminates the need for a 'baseline' parameter for Integrated Gradients [31] for perception tasks.

Object Recognition

Utilizing Large Scale Vision and Text Datasets for Image Segmentation from Referring Expressions

no code implementations 30 Aug 2016 Ronghang Hu, Marcus Rohrbach, Subhashini Venugopalan, Trevor Darrell

Image segmentation from referring expressions is a joint vision and language modeling task, where the input is an image and a textual expression describing a particular region in the image; and the goal is to localize and segment the specific image region based on the given expression.

Image Captioning Image Segmentation +2

Captioning Images with Diverse Objects

1 code implementation CVPR 2017 Subhashini Venugopalan, Lisa Anne Hendricks, Marcus Rohrbach, Raymond Mooney, Trevor Darrell, Kate Saenko

We propose minimizing a joint objective which can learn from these diverse data sources and leverage distributional semantic embeddings, enabling the model to generalize and describe novel objects outside of image-caption datasets.

Object Recognition

Improving LSTM-based Video Description with Linguistic Knowledge Mined from Text

3 code implementations EMNLP 2016 Subhashini Venugopalan, Lisa Anne Hendricks, Raymond Mooney, Kate Saenko

This paper investigates how linguistic knowledge mined from large text corpora can aid the generation of natural language descriptions of videos.

Language Modelling Video Description

Sequence to Sequence - Video to Text

no code implementations ICCV 2015 Subhashini Venugopalan, Marcus Rohrbach, Jeffrey Donahue, Raymond Mooney, Trevor Darrell, Kate Saenko

Our LSTM model is trained on video-sentence pairs and learns to associate a sequence of video frames to a sequence of words in order to generate a description of the event in the video clip.

Language Modelling

Deep Compositional Captioning: Describing Novel Object Categories without Paired Training Data

1 code implementation CVPR 2016 Lisa Anne Hendricks, Subhashini Venugopalan, Marcus Rohrbach, Raymond Mooney, Kate Saenko, Trevor Darrell

Current deep caption models can only describe objects contained in paired image-sentence corpora, despite the fact that they are pre-trained with large object recognition datasets, namely ImageNet.

Image Captioning Novel Concepts +2

A Multi-scale Multiple Instance Video Description Network

no code implementations 21 May 2015 Huijuan Xu, Subhashini Venugopalan, Vasili Ramanishka, Marcus Rohrbach, Kate Saenko

Most state-of-the-art methods for solving this problem borrow existing deep convolutional neural network (CNN) architectures (AlexNet, GoogLeNet) to extract a visual representation of the input video.

Image Segmentation Multiple Instance Learning +2

Sequence to Sequence -- Video to Text

3 code implementations 3 May 2015 Subhashini Venugopalan, Marcus Rohrbach, Jeff Donahue, Raymond Mooney, Trevor Darrell, Kate Saenko

Our LSTM model is trained on video-sentence pairs and learns to associate a sequence of video frames to a sequence of words in order to generate a description of the event in the video clip.

Language Modelling

Long-term Recurrent Convolutional Networks for Visual Recognition and Description

7 code implementations CVPR 2015 Jeff Donahue, Lisa Anne Hendricks, Marcus Rohrbach, Subhashini Venugopalan, Sergio Guadarrama, Kate Saenko, Trevor Darrell

Models based on deep convolutional networks have dominated recent image interpretation tasks; we investigate whether models which are also recurrent, or "temporally deep", are effective for tasks involving sequences, visual and otherwise.

Retrieval Video Recognition
