Search Results for author: Amit Dhurandhar

Found 53 papers, 15 papers with code

Programming Refusal with Conditional Activation Steering

1 code implementation 6 Sep 2024 Bruce W. Lee, Inkit Padhi, Karthikeyan Natesan Ramamurthy, Erik Miehling, Pierre Dognin, Manish Nagireddy, Amit Dhurandhar

In this paper, we propose Conditional Activation Steering (CAST), which analyzes LLM activation patterns during inference to selectively apply or withhold activation steering based on the input context.

CELL your Model: Contrastive Explanation Methods for Large Language Models

no code implementations 17 Jun 2024 Ronny Luss, Erik Miehling, Amit Dhurandhar

However, in the case of generative AI such as large language models (LLMs), there is no class prediction to explain.

Text Generation

The global landscape of academic guidelines for generative AI and Large Language Models

no code implementations 26 May 2024 Junfeng Jiao, Saleh Afroogh, Kevin Chen, David Atkinson, Amit Dhurandhar

The integration of Generative Artificial Intelligence (GAI) and Large Language Models (LLMs) in academia has spurred a global discourse on their potential pedagogical benefits and ethical considerations.

Misinformation

Deep Generative Sampling in the Dual Divergence Space: A Data-efficient & Interpretative Approach for Generative AI

no code implementations 10 Apr 2024 Sahil Garg, Anderson Schneider, Anant Raj, Kashif Rasul, Yuriy Nevmyvaka, Sneihil Gopal, Amit Dhurandhar, Guillermo Cecchi, Irina Rish

In addition to the data efficiency gained from direct sampling, we propose an algorithm that offers a significant reduction in sample complexity for estimating the divergence of the data distribution with respect to the marginal distribution.

Denoising

Multi-Level Explanations for Generative Language Models

no code implementations 21 Mar 2024 Lucas Monteiro Paes, Dennis Wei, Hyo Jin Do, Hendrik Strobelt, Ronny Luss, Amit Dhurandhar, Manish Nagireddy, Karthikeyan Natesan Ramamurthy, Prasanna Sattigeri, Werner Geyer, Soumya Ghosh

To address the challenges of text as output and long text inputs, we propose a general framework called MExGen that can be instantiated with different attribution algorithms.

Question Answering text-classification +1

NeuroPrune: A Neuro-inspired Topological Sparse Training Algorithm for Large Language Models

no code implementations 28 Feb 2024 Amit Dhurandhar, Tejaswini Pedapati, Ronny Luss, Soham Dan, Aurelie Lozano, Payel Das, Georgios Kollias

Transformer-based Language Models have become ubiquitous in Natural Language Processing (NLP) due to their impressive performance on various tasks.

Machine Translation Natural Language Inference

Trust Regions for Explanations via Black-Box Probabilistic Certification

1 code implementation 17 Feb 2024 Amit Dhurandhar, Swagatam Haldar, Dennis Wei, Karthikeyan Natesan Ramamurthy

fidelity, stability), can we find the largest hypercube (i.e., $\ell_{\infty}$ ball) centered at the example such that when the explanation is applied to all examples within the hypercube, (with high probability) a quality criterion is met (viz.

Spectral Adversarial MixUp for Few-Shot Unsupervised Domain Adaptation

1 code implementation 3 Sep 2023 Jiajin Zhang, Hanqing Chao, Amit Dhurandhar, Pin-Yu Chen, Ali Tajer, Yangyang Xu, Pingkun Yan

To accomplish this challenging task, first, a spectral sensitivity map is introduced to characterize the generalization weaknesses of models in the frequency domain.

Unsupervised Domain Adaptation

Reprogramming Pretrained Language Models for Antibody Sequence Infilling

1 code implementation 5 Oct 2022 Igor Melnyk, Vijil Chenthamarakshan, Pin-Yu Chen, Payel Das, Amit Dhurandhar, Inkit Padhi, Devleena Das

Results on antibody design benchmarks show that our model on low-resourced antibody sequence dataset provides highly diverse CDR sequences, up to more than a two-fold increase of diversity over the baselines, without losing structural integrity and naturalness.

Diversity Language Modelling +2

Anomaly Attribution with Likelihood Compensation

no code implementations 23 Aug 2022 Tsuyoshi Idé, Amit Dhurandhar, Jiří Navrátil, Moninder Singh, Naoki Abe

In either case, one would ideally want to compute a "responsibility score" indicative of the extent to which an input variable is responsible for the anomalous output.

Atomist or Holist? A Diagnosis and Vision for More Productive Interdisciplinary AI Ethics Dialogue

no code implementations 19 Aug 2022 Travis Greene, Amit Dhurandhar, Galit Shmueli

In response to growing recognition of the social impact of new AI-based technologies, major AI and ML conferences and journals now encourage or require papers to include ethics impact statements and undergo ethics reviews.

Ethics Philosophy

Accurate Clinical Toxicity Prediction using Multi-task Deep Neural Nets and Contrastive Molecular Explanations

1 code implementation 13 Apr 2022 Bhanushee Sharma, Vijil Chenthamarakshan, Amit Dhurandhar, Shiranee Pereira, James A. Hendler, Jonathan S. Dordick, Payel Das

Additionally, our multi-task approach is comprehensive in the sense that it is comparable to state-of-the-art approaches for specific endpoints in in vitro, in vivo and clinical platforms.

Transfer Learning

Analogies and Feature Attributions for Model Agnostic Explanation of Similarity Learners

no code implementations 2 Feb 2022 Karthikeyan Natesan Ramamurthy, Amit Dhurandhar, Dennis Wei, Zaid Bin Tariq

We first propose a method that provides feature attributions to explain the similarity between a pair of inputs as determined by a black box similarity learner.

Sentence

Auto-Transfer: Learning to Route Transferrable Representations

1 code implementation 2 Feb 2022 Keerthiram Murugesan, Vijay Sadashivaiah, Ronny Luss, Karthikeyan Shanmugam, Pin-Yu Chen, Amit Dhurandhar

Knowledge transfer between heterogeneous source and target networks and tasks has received a lot of attention in recent times as large amounts of quality labeled data can be difficult to obtain in many applications.

Transfer Learning

CoFrNets: Interpretable Neural Architecture Inspired by Continued Fractions

no code implementations NeurIPS 2021 Isha Puri, Amit Dhurandhar, Tejaswini Pedapati, Karthikeyan Shanmugam, Dennis Wei, Kush R. Varshney

We experiment on nonlinear synthetic functions and are able to accurately model as well as estimate feature attributions and even higher order terms in some cases, which is a testament to the representational power as well as interpretability of such architectures.

Auto-Transfer: Learning to Route Transferable Representations

no code implementations ICLR 2022 Keerthiram Murugesan, Vijay Sadashivaiah, Ronny Luss, Karthikeyan Shanmugam, Pin-Yu Chen, Amit Dhurandhar

Knowledge transfer between heterogeneous source and target networks and tasks has received a lot of attention in recent times as large amounts of quality labelled data can be difficult to obtain in many applications.

Transfer Learning

Locally Invariant Explanations: Towards Causal Explanations through Local Invariant Learning

no code implementations 29 Sep 2021 Amit Dhurandhar, Karthikeyan Natesan Ramamurthy, Kartik Ahuja, Vijay Arya

The locally interpretable model-agnostic explanations (LIME) method is one of the most popular methods used to explain black-box models at a per-example level.

Out-of-Distribution Generalization

Interpreting Reinforcement Policies through Local Behaviors

no code implementations 29 Sep 2021 Ronny Luss, Amit Dhurandhar, Miao Liu

Many works in explainable AI have focused on explaining black-box classification models.

Reinforcement Learning (RL)

Let the CAT out of the bag: Contrastive Attributed explanations for Text

no code implementations 16 Sep 2021 Saneem Chemmengath, Amar Prakash Azad, Ronny Luss, Amit Dhurandhar

Contrastive explanations for understanding the behavior of black-box models have gained a lot of attention recently as they provide potential for recourse.

Attribute Language Modelling

Multihop: Leveraging Complex Models to Learn Accurate Simple Models

no code implementations 14 Sep 2021 Amit Dhurandhar, Tejaswini Pedapati

In this paper, we propose a meta-approach where we transfer information from the complex model to the simple model by dynamically selecting and/or constructing a sequence of intermediate models of decreasing complexity that are less intricate than the original complex model.

Explainable artificial intelligence Knowledge Distillation +2

Towards Better Model Understanding with Path-Sufficient Explanations

no code implementations 13 Sep 2021 Ronny Luss, Amit Dhurandhar

To overcome these limitations, we propose a novel method called Path-Sufficient Explanations Method (PSEM) that outputs a sequence of sufficient explanations for a given input of strictly decreasing size (or value) -- from original input to a minimally sufficient explanation -- which can be thought to trace the local boundary of the model in a smooth manner, thus providing better intuition about the local model behavior for the specific input.

Explainable artificial intelligence Explainable Artificial Intelligence (XAI)

Treatment Effect Estimation using Invariant Risk Minimization

2 code implementations 13 Mar 2021 Abhin Shah, Kartik Ahuja, Karthikeyan Shanmugam, Dennis Wei, Kush Varshney, Amit Dhurandhar

Inferring causal individual treatment effect (ITE) from observational data is a challenging problem whose difficulty is exacerbated by the presence of treatment assignment bias.

Diversity Domain Generalization +1

Learning to Initialize Gradient Descent Using Gradient Descent

no code implementations 22 Dec 2020 Kartik Ahuja, Amit Dhurandhar, Kush R. Varshney

Non-convex optimization problems are challenging to solve; the success and computational expense of a gradient descent algorithm or variant depend heavily on the initialization strategy.

Empirical or Invariant Risk Minimization? A Sample Complexity Perspective

3 code implementations ICLR 2021 Kartik Ahuja, Jun Wang, Amit Dhurandhar, Karthikeyan Shanmugam, Kush R. Varshney

Recently, invariant risk minimization (IRM) was proposed as a promising solution to address out-of-distribution (OOD) generalization.

Linear Regression Games: Convergence Guarantees to Approximate Out-of-Distribution Solutions

3 code implementations 28 Oct 2020 Kartik Ahuja, Karthikeyan Shanmugam, Amit Dhurandhar

In Ahuja et al., it was shown that solving for the Nash equilibria of a new class of "ensemble-games" is equivalent to solving IRM.

regression

Deciding Fast and Slow: The Role of Cognitive Biases in AI-assisted Decision-making

no code implementations 15 Oct 2020 Charvi Rastogi, Yunfeng Zhang, Dennis Wei, Kush R. Varshney, Amit Dhurandhar, Richard Tomsett

We then conduct a second user experiment which shows that our time allocation strategy with explanation can effectively de-anchor the human and improve collaborative performance when the AI model has low confidence and is incorrect.

Decision Making

Model Agnostic Multilevel Explanations

no code implementations NeurIPS 2020 Karthikeyan Natesan Ramamurthy, Bhanukiran Vinzamuri, Yunfeng Zhang, Amit Dhurandhar

The method can also leverage side information, where users can specify points for which they may want the explanations to be similar.

Learning Global Transparent Models Consistent with Local Contrastive Explanations

no code implementations NeurIPS 2020 Tejaswini Pedapati, Avinash Balakrishnan, Karthikeyan Shanmugam, Amit Dhurandhar

Based on a key insight we propose a novel method where we create custom boolean features from sparse local contrastive explanations of the black-box model and then train a globally transparent model on just these, and showcase empirically that such models have higher local consistency compared with other known strategies, while still being close in performance to models that are trained with access to the original data.

counterfactual

Invariant Risk Minimization Games

3 code implementations ICML 2020 Kartik Ahuja, Karthikeyan Shanmugam, Kush R. Varshney, Amit Dhurandhar

The standard risk minimization paradigm of machine learning is brittle when operating in environments whose test distributions are different from the training distribution due to spurious correlations.

BIG-bench Machine Learning Image Classification

Leveraging Simple Model Predictions for Enhancing its Performance

no code implementations 25 Sep 2019 Amit Dhurandhar, Karthikeyan Shanmugam, Ronny Luss

Our method also leverages the per sample hardness estimate of the simple model which is not the case with the prior works which primarily consider the complex model's confidences/predictions and is thus conceptually novel.

Teaching AI to Explain its Decisions Using Embeddings and Multi-Task Learning

no code implementations 5 Jun 2019 Noel C. F. Codella, Michael Hind, Karthikeyan Natesan Ramamurthy, Murray Campbell, Amit Dhurandhar, Kush R. Varshney, Dennis Wei, Aleksandra Mojsilović

Using machine learning in high-stakes applications often requires predictions to be accompanied by explanations comprehensible to the domain user, who has ultimate responsibility for decisions and outcomes.

BIG-bench Machine Learning Multi-Task Learning

Model Agnostic Contrastive Explanations for Structured Data

no code implementations 31 May 2019 Amit Dhurandhar, Tejaswini Pedapati, Avinash Balakrishnan, Pin-Yu Chen, Karthikeyan Shanmugam, Ruchir Puri

Recently, a method [7] was proposed to generate contrastive explanations for differentiable models such as deep neural networks, where one has complete access to the model.

Enhancing Simple Models by Exploiting What They Already Know

no code implementations ICML 2020 Amit Dhurandhar, Karthikeyan Shanmugam, Ronny Luss

Our method also leverages the per sample hardness estimate of the simple model which is not the case with the prior works which primarily consider the complex model's confidences/predictions and is thus conceptually novel.

Small Data Image Classification

Leveraging Latent Features for Local Explanations

2 code implementations 29 May 2019 Ronny Luss, Pin-Yu Chen, Amit Dhurandhar, Prasanna Sattigeri, Yunfeng Zhang, Karthikeyan Shanmugam, Chun-Chen Tu

As the application of deep neural networks proliferates in numerous areas such as medical imaging, video surveillance, and self-driving cars, the need for explaining the decisions of these models has become a hot research topic, both at the global and local level.

General Classification Open-Ended Question Answering +1

TED: Teaching AI to Explain its Decisions

no code implementations 12 Nov 2018 Michael Hind, Dennis Wei, Murray Campbell, Noel C. F. Codella, Amit Dhurandhar, Aleksandra Mojsilović, Karthikeyan Natesan Ramamurthy, Kush R. Varshney

Artificial intelligence systems are being increasingly deployed due to their potential to increase the efficiency, scale, consistency, fairness, and accuracy of decisions.

Fairness

Streaming Methods for Restricted Strongly Convex Functions with Applications to Prototype Selection

no code implementations 21 Jul 2018 Karthik S. Gurumoorthy, Amit Dhurandhar

In this paper, we show that if the optimization function is restricted-strongly-convex (RSC) and restricted-smooth (RSM) -- a rich subclass of weakly submodular functions -- then a streaming algorithm with constant factor approximation guarantee is possible.

Prototype Selection

Improving Simple Models with Confidence Profiles

no code implementations NeurIPS 2018 Amit Dhurandhar, Karthikeyan Shanmugam, Ronny Luss, Peder Olsen

Our transfer method involves a theoretically justified weighting of samples during the training of the simple model using confidence scores of these intermediate layers.

Teaching Meaningful Explanations

no code implementations 29 May 2018 Noel C. F. Codella, Michael Hind, Karthikeyan Natesan Ramamurthy, Murray Campbell, Amit Dhurandhar, Kush R. Varshney, Dennis Wei, Aleksandra Mojsilovic

The adoption of machine learning in high-stakes applications such as healthcare and law has lagged in part because predictions are not accompanied by explanations comprehensible to the domain user, who often holds the ultimate responsibility for decisions and outcomes.

BIG-bench Machine Learning

A Formal Framework to Characterize Interpretability of Procedures

no code implementations 12 Jul 2017 Amit Dhurandhar, Vijay Iyengar, Ronny Luss, Karthikeyan Shanmugam

We provide a novel notion of what it means to be interpretable, looking past the usual association with human understanding.

Efficient Data Representation by Selecting Prototypes with Importance Weights

1 code implementation 5 Jul 2017 Karthik S. Gurumoorthy, Amit Dhurandhar, Guillermo Cecchi, Charu Aggarwal

Prototypical examples that best summarize and compactly represent an underlying complex data distribution communicate meaningful insights to humans in domains where simple explanations are hard to extract.

TIP: Typifying the Interpretability of Procedures

no code implementations 9 Jun 2017 Amit Dhurandhar, Vijay Iyengar, Ronny Luss, Karthikeyan Shanmugam

This leads to the insight that the improvement in the target model is not only a function of the oracle model's performance, but also its relative complexity with respect to the target model.

Knowledge Distillation

Learning with Changing Features

no code implementations 29 Apr 2017 Amit Dhurandhar, Steve Hanneke, Liu Yang

In particular, we propose an approach to provably determine the time instant from which the new/changed features start becoming relevant with respect to an output variable in an agnostic (supervised) learning setting.

Change Point Detection

Uncovering Group Level Insights with Accordant Clustering

no code implementations 7 Apr 2017 Amit Dhurandhar, Margareta Ackerman, Xiang Wang

Clustering is a widely-used data mining tool, which aims to discover partitions of similar items in data.

Clustering

Building an Interpretable Recommender via Loss-Preserving Transformation

no code implementations 19 Jun 2016 Amit Dhurandhar, Sechan Oh, Marek Petrik

We propose a method for building an interpretable recommender system for personalizing online content and promotions.

Classification General Classification +2
