Search Results for author: Danish Pruthi

Found 21 papers, 15 papers with code

Performance Trade-offs of Watermarking Large Language Models

no code implementations • 16 Nov 2023 • Anirudh Ajith, Sameer Singh, Danish Pruthi

However, implanting such signals alters the model's output distribution and can have unintended effects when watermarked LLMs are used for downstream applications.

Language Modelling · Misinformation +6

Goodhart's Law Applies to NLP's Explanation Benchmarks

no code implementations • 28 Aug 2023 • Jennifer Hsia, Danish Pruthi, Aarti Singh, Zachary C. Lipton

First, we show that we can inflate a model's comprehensiveness and sufficiency scores dramatically without altering its predictions or explanations on in-distribution test inputs.
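The comprehensiveness and sufficiency scores mentioned above are standard ERASER-style faithfulness metrics: the drop in model confidence when a rationale is removed, and when only the rationale is kept. A minimal sketch, assuming a `predict_proba(tokens)` function returning the model's confidence in its predicted class (the toy model and inputs below are illustrative, not from the paper):

```python
# ERASER-style explanation metrics, sketched. `predict_proba` is an assumed
# interface: it maps a token list to the model's probability for its prediction.
def comprehensiveness(predict_proba, tokens, rationale_idx):
    """Confidence drop when the rationale tokens are removed."""
    kept = [t for i, t in enumerate(tokens) if i not in rationale_idx]
    return predict_proba(tokens) - predict_proba(kept)

def sufficiency(predict_proba, tokens, rationale_idx):
    """Confidence drop when only the rationale tokens are kept."""
    only = [t for i, t in enumerate(tokens) if i in rationale_idx]
    return predict_proba(tokens) - predict_proba(only)

# Toy model: confidence is the fraction of "good" tokens present.
def toy_proba(tokens):
    return sum(t == "good" for t in tokens) / len(tokens) if tokens else 0.0

toks = ["good", "good", "bad", "bad"]
print(comprehensiveness(toy_proba, toks, {0, 1}))  # 0.5 - 0.0 = 0.5
print(sufficiency(toy_proba, toks, {0, 1}))        # 0.5 - 1.0 = -0.5
```

A high comprehensiveness and a low (or negative) sufficiency indicate the rationale carries most of the signal, which is exactly what the benchmark-gaming results above show can be inflated without changing predictions.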

Inspecting the Geographical Representativeness of Images from Text-to-Image Models

no code implementations • ICCV 2023 • Abhipsa Basu, R. Venkatesh Babu, Danish Pruthi

Recent progress in generative models has resulted in models that produce images that are both realistic and relevant for most textual inputs.

Data Augmentation · Marketing

Model-tuning Via Prompts Makes NLP Models Adversarially Robust

1 code implementation • 13 Mar 2023 • Mrigank Raman, Pratyush Maini, J. Zico Kolter, Zachary C. Lipton, Danish Pruthi

Across 5 NLP datasets, 4 adversarial attacks, and 3 different models, MVP improves performance against adversarial substitutions by an average of 8% over standard methods and even outperforms adversarial-training-based state-of-the-art defenses by 3.5%.

Adversarial Robustness · Language Modelling +1

Learning the Legibility of Visual Text Perturbations

1 code implementation • 9 Mar 2023 • Dev Seth, Rickard Stureborg, Danish Pruthi, Bhuwan Dhingra

In this work, we address this gap by learning models that predict the legibility of a perturbed string, and rank candidate perturbations based on their legibility.

Assisting Human Decisions in Document Matching

1 code implementation • 16 Feb 2023 • Joon Sik Kim, Valerie Chen, Danish Pruthi, Nihar B. Shah, Ameet Talwalkar

Many practical applications, ranging from paper-reviewer assignment in peer review to job-applicant matching for hiring, require human decision makers to identify relevant matches by combining their expertise with predictions from machine learning models.

Measures of Information Reflect Memorization Patterns

no code implementations • 17 Oct 2022 • Rachit Bansal, Danish Pruthi, Yonatan Belinkov

In this work, we hypothesize -- and subsequently show -- that the diversity in the activation patterns of different neurons is reflective of model generalization and memorization.

Memorization · Model Selection

Learning to Scaffold: Optimizing Model Explanations for Teaching

1 code implementation • 22 Apr 2022 • Patrick Fernandes, Marcos Treviso, Danish Pruthi, André F. T. Martins, Graham Neubig

In this work, leveraging meta-learning techniques, we extend this idea to improve the quality of the explanations themselves, specifically by optimizing explanations such that student models more effectively learn to simulate the original model.


Explain, Edit, and Understand: Rethinking User Study Design for Evaluating Model Explanations

1 code implementation • 17 Dec 2021 • Siddhant Arora, Danish Pruthi, Norman Sadeh, William W. Cohen, Zachary C. Lipton, Graham Neubig

Through our evaluation, we observe that for a linear bag-of-words model, participants with access to the feature coefficients during training are able to cause a larger reduction in model confidence in the testing phase when compared to the no-explanation control.

Deception Detection

Evaluating Explanations: How much do explanations from the teacher aid students?

1 code implementation • 1 Dec 2020 • Danish Pruthi, Rachit Bansal, Bhuwan Dhingra, Livio Baldini Soares, Michael Collins, Zachary C. Lipton, Graham Neubig, William W. Cohen

While many methods purport to explain predictions by highlighting salient features, what aims these explanations serve and how they ought to be evaluated often go unstated.

Question Answering · text-classification +1

Weakly- and Semi-supervised Evidence Extraction

1 code implementation • Findings of the Association for Computational Linguistics 2020 • Danish Pruthi, Bhuwan Dhingra, Graham Neubig, Zachary C. Lipton

For many prediction tasks, stakeholders desire not only predictions but also supporting evidence that a human can use to verify their correctness.

Why and when should you pool? Analyzing Pooling in Recurrent Architectures

1 code implementation • Findings of the Association for Computational Linguistics 2020 • Pratyush Maini, Keshav Kolluru, Danish Pruthi, Mausam

We find that pooling-based architectures substantially differ from their non-pooling equivalents in their learning ability and positional biases, which explains their performance benefits.

Text Classification
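The pooling choice analyzed in this paper can be sketched in a few lines. Given per-timestep hidden states from a recurrent encoder, a non-pooling model reads off only the final state, while a pooling model aggregates over all timesteps; the shapes and random states below are illustrative, not the paper's setup:

```python
# Contrast last-state readout with pooled readouts over a sequence of
# recurrent hidden states. H stands in for an RNN's per-timestep outputs.
import numpy as np

rng = np.random.default_rng(0)
H = rng.standard_normal((7, 64))   # hidden states: (timesteps, hidden_dim)

last_state = H[-1]                 # non-pooling: final timestep only
max_pooled = H.max(axis=0)         # max pooling over time, per dimension
mean_pooled = H.mean(axis=0)       # mean pooling over time

print(last_state.shape, max_pooled.shape, mean_pooled.shape)
```

All three readouts have the same dimensionality, so pooling is a drop-in change to the classifier's input; the paper's point is that this change alters what positions the model attends to and how easily it learns.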

Learning to Deceive with Attention-Based Explanations

3 code implementations • ACL 2020 • Danish Pruthi, Mansi Gupta, Bhuwan Dhingra, Graham Neubig, Zachary C. Lipton

Attention mechanisms are ubiquitous components in neural architectures applied to natural language processing.


Combating Adversarial Misspellings with Robust Word Recognition

3 code implementations • ACL 2019 • Danish Pruthi, Bhuwan Dhingra, Zachary C. Lipton

To combat adversarial spelling mistakes, we propose placing a word recognition model in front of the downstream classifier.

Sentiment Analysis
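The defense described above is a pipeline: a word-recognition step normalizes the input before it reaches the downstream classifier. A minimal sketch of that pattern, with a toy edit-distance corrector standing in for the paper's trained recognition model (the vocabulary, classifier, and correction rule here are illustrative assumptions):

```python
# Place a word-recognition (spell-correction) step in front of the classifier,
# so adversarial misspellings are mapped back to in-vocabulary words.
from difflib import get_close_matches

VOCAB = {"the", "movie", "was", "great", "terrible", "plot"}

def recognize_word(token: str) -> str:
    """Map a possibly misspelled token to the closest in-vocabulary word."""
    if token in VOCAB:
        return token
    matches = get_close_matches(token, VOCAB, n=1, cutoff=0.6)
    return matches[0] if matches else token

def defend(sentence: str) -> str:
    """Normalize adversarial misspellings before classification."""
    return " ".join(recognize_word(t) for t in sentence.lower().split())

def toy_classifier(sentence: str) -> str:
    """Stand-in sentiment classifier operating on the corrected text."""
    return "positive" if "great" in sentence else "negative"

corrected = defend("the movie was graet")
print(corrected)                  # the movie was great
print(toy_classifier(corrected))  # positive
```

The design point is that robustness lives in the front-end recognizer, so the downstream classifier itself needs no adversarial training.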

compare-mt: A Tool for Holistic Comparison of Language Generation Systems

2 code implementations • NAACL 2019 • Graham Neubig, Zi-Yi Dou, Junjie Hu, Paul Michel, Danish Pruthi, Xinyi Wang, John Wieting

In this paper, we describe compare-mt, a tool for holistic analysis and comparison of the results of systems for language generation tasks such as machine translation.

Machine Translation · Text Generation +1

Measuring Density and Similarity of Task Relevant Information in Neural Representations

no code implementations • 27 Sep 2018 • Danish Pruthi, Mansi Gupta, Nitish Kumar Kulkarni, Graham Neubig, Eduard Hovy

Neural models achieve state-of-the-art performance due to their ability to extract salient features useful to downstream tasks.

Transfer Learning

Simple and Effective Semi-Supervised Question Answering

no code implementations • NAACL 2018 • Bhuwan Dhingra, Danish Pruthi, Dheeraj Rajagopal

Recent success of deep learning models for the task of extractive Question Answering (QA) is hinged on the availability of large annotated corpora.

Extractive Question-Answering · Question Answering +1

SPINE: SParse Interpretable Neural Embeddings

2 code implementations • 23 Nov 2017 • Anant Subramanian, Danish Pruthi, Harsh Jhamtani, Taylor Berg-Kirkpatrick, Eduard Hovy

We propose a novel variant of denoising k-sparse autoencoders that generates highly efficient and interpretable distributed word representations (word embeddings), beginning with existing word representations from state-of-the-art methods like GloVe and word2vec.

Denoising · Word Embeddings
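The core mechanism behind SPINE's k-sparse autoencoder can be illustrated compactly: only the k largest hidden activations survive per example, and the rest are zeroed, yielding sparse codes in which each active dimension is easier to interpret. This sketch omits SPINE's denoising setup and its sparsity-inducing losses; the shapes and random weights are illustrative assumptions:

```python
# k-sparse activation: keep the top-k hidden units per example, zero the rest.
import numpy as np

def k_sparse(hidden: np.ndarray, k: int) -> np.ndarray:
    """Zero out all but the k largest activations in each row."""
    out = np.zeros_like(hidden)
    idx = np.argsort(hidden, axis=1)[:, -k:]  # indices of the top-k per row
    np.put_along_axis(out, idx, np.take_along_axis(hidden, idx, axis=1), axis=1)
    return out

rng = np.random.default_rng(0)
W_enc = rng.standard_normal((300, 1000))      # dense dim -> overcomplete sparse dim
x = rng.standard_normal((4, 300))             # batch of dense word vectors
h = np.maximum(x @ W_enc, 0.0)                # ReLU encoder
h_sparse = k_sparse(h, k=50)
print((h_sparse != 0).sum(axis=1))            # at most 50 active units per row
```

Training then reconstructs `x` from `h_sparse`, forcing each word's meaning into a small set of dimensions, which is what makes the resulting embeddings interpretable.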
