1 code implementation • 6 Sep 2024 • Bruce W. Lee, Inkit Padhi, Karthikeyan Natesan Ramamurthy, Erik Miehling, Pierre Dognin, Manish Nagireddy, Amit Dhurandhar
In this paper, we propose Conditional Activation Steering (CAST), which analyzes LLM activation patterns during inference to selectively apply or withhold activation steering based on the input context.
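A minimal sketch of the conditional-steering idea described in this entry, assuming access to one layer's residual-stream activations; the condition vector, steering vector, threshold, and scale below are illustrative assumptions, not the authors' implementation:

```python
import torch

def conditional_steer(hidden, condition_vec, steering_vec, threshold=0.5, alpha=4.0):
    """Apply a steering vector only when the hidden state matches a condition.

    hidden:        (seq_len, d_model) residual-stream activations at one layer
    condition_vec: (d_model,) direction whose projection defines the condition
    steering_vec:  (d_model,) direction added when the condition fires
    """
    # Cosine similarity between the mean activation and the condition direction
    mean_h = hidden.mean(dim=0)
    sim = torch.nn.functional.cosine_similarity(mean_h, condition_vec, dim=0)

    # Steer only if the input context activates the condition strongly enough
    if sim > threshold:
        hidden = hidden + alpha * steering_vec
    return hidden
```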
no code implementations • 17 Jun 2024 • Ronny Luss, Erik Miehling, Amit Dhurandhar
However, in the case of generative AI such as large language models (LLMs), there is no class prediction to explain.
no code implementations • 26 May 2024 • Junfeng Jiao, Saleh Afroogh, Kevin Chen, David Atkinson, Amit Dhurandhar
The integration of Generative Artificial Intelligence (GAI) and Large Language Models (LLMs) in academia has spurred a global discourse on their potential pedagogical benefits and ethical considerations.
no code implementations • 10 Apr 2024 • Sahil Garg, Anderson Schneider, Anant Raj, Kashif Rasul, Yuriy Nevmyvaka, Sneihil Gopal, Amit Dhurandhar, Guillermo Cecchi, Irina Rish
In addition to the data efficiency gained from direct sampling, we propose an algorithm that offers a significant reduction in sample complexity for estimating the divergence of the data distribution with respect to the marginal distribution.
no code implementations • 21 Mar 2024 • Lucas Monteiro Paes, Dennis Wei, Hyo Jin Do, Hendrik Strobelt, Ronny Luss, Amit Dhurandhar, Manish Nagireddy, Karthikeyan Natesan Ramamurthy, Prasanna Sattigeri, Werner Geyer, Soumya Ghosh
To address the challenges of text as output and long text inputs, we propose a general framework called MExGen that can be instantiated with different attribution algorithms.
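One concrete way to read the long-input challenge is coarse, segment-level perturbation attribution; the segmentation, leave-one-out scheme, and scoring function in this sketch are assumptions for illustration, not the MExGen framework itself:

```python
from typing import Callable, List

def segment_attributions(segments: List[str],
                         score_fn: Callable[[str], float]) -> List[float]:
    """Leave-one-out attribution over coarse text segments.

    segments: the long input split into sentences or paragraphs
    score_fn: maps a prompt string to a scalar (e.g., similarity of the
              generated text to the original response)
    """
    full_score = score_fn(" ".join(segments))
    attributions = []
    for i in range(len(segments)):
        # Drop one segment and measure how much the output score changes
        ablated = " ".join(s for j, s in enumerate(segments) if j != i)
        attributions.append(full_score - score_fn(ablated))
    return attributions
```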
no code implementations • 28 Feb 2024 • Amit Dhurandhar, Tejaswini Pedapati, Ronny Luss, Soham Dan, Aurelie Lozano, Payel Das, Georgios Kollias
Transformer-based Language Models have become ubiquitous in Natural Language Processing (NLP) due to their impressive performance on various tasks.
no code implementations • 21 Feb 2024 • Amit Dhurandhar, Rahul Nair, Moninder Singh, Elizabeth Daly, Karthikeyan Natesan Ramamurthy
and a set of LLMs, we rank them without access to any ground truth or reference responses.
1 code implementation • 17 Feb 2024 • Amit Dhurandhar, Swagatam Haldar, Dennis Wei, Karthikeyan Natesan Ramamurthy
fidelity, stability), can we find the largest hypercube (i.e., $\ell_{\infty}$ ball) centered at the example such that when the explanation is applied to all examples within the hypercube, a quality criterion is met with high probability (viz.
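A rough sketch of the certification question posed above, assuming the quality criterion behaves monotonically in the radius so a bisection search applies; `quality_fn`, the sample count, and the target probability are illustrative assumptions:

```python
import numpy as np

def certify_radius(x, quality_fn, r_max=1.0, n_samples=500,
                   target=0.95, tol=1e-3, rng=None):
    """Estimate the largest l_inf radius around x for which an explanation
    still meets a quality criterion with high probability.

    quality_fn(z) -> bool: whether the explanation applied at point z meets
                           the criterion (e.g., fidelity above a threshold)
    """
    rng = np.random.default_rng() if rng is None else rng
    lo, hi = 0.0, r_max
    while hi - lo > tol:
        r = (lo + hi) / 2
        # Sample points uniformly inside the hypercube of half-width r
        pts = x + rng.uniform(-r, r, size=(n_samples, x.shape[0]))
        ok = np.mean([quality_fn(z) for z in pts])
        if ok >= target:
            lo = r      # criterion holds, try a larger cube
        else:
            hi = r      # criterion fails, shrink the cube
    return lo
```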
1 code implementation • 3 Sep 2023 • Jiajin Zhang, Hanqing Chao, Amit Dhurandhar, Pin-Yu Chen, Ali Tajer, Yangyang Xu, Pingkun Yan
To accomplish this challenging task, first, a spectral sensitivity map is introduced to characterize the generalization weaknesses of models in the frequency domain.
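A naive illustration of probing sensitivity in the frequency domain: perturb one Fourier coefficient at a time and record the loss increase. The perturbation scale and the single-coefficient scheme are assumptions, not the paper's construction:

```python
import numpy as np

def spectral_sensitivity_map(images, labels, loss_fn, eps=0.1):
    """Per-frequency sensitivity: perturb one Fourier coefficient of every
    image and record how much the loss increases (illustrative sketch).

    images: (n, h, w) array; loss_fn(images, labels) -> scalar loss
    """
    h, w = images.shape[-2:]
    base = loss_fn(images, labels)
    sens = np.zeros((h, w))
    spectra = np.fft.fft2(images)
    # Brute-force loop over coefficients; coarse frequency bands would be
    # much cheaper in practice
    for u in range(h):
        for v in range(w):
            perturbed = spectra.copy()
            perturbed[..., u, v] += eps * np.abs(spectra[..., u, v]).max()
            imgs_p = np.real(np.fft.ifft2(perturbed))
            sens[u, v] = loss_fn(imgs_p, labels) - base
    return sens
```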
1 code implementation • 1 Dec 2022 • Jiajin Zhang, Hanqing Chao, Amit Dhurandhar, Pin-Yu Chen, Ali Tajer, Yangyang Xu, Pingkun Yan
Domain generalization (DG) aims to train a model to perform well in unseen domains under different distributions.
no code implementations • 2 Nov 2022 • Dennis Wei, Rahul Nair, Amit Dhurandhar, Kush R. Varshney, Elizabeth M. Daly, Moninder Singh
Interpretable and explainable machine learning has seen a recent surge of interest.
1 code implementation • 5 Oct 2022 • Igor Melnyk, Vijil Chenthamarakshan, Pin-Yu Chen, Payel Das, Amit Dhurandhar, Inkit Padhi, Devleena Das
Results on antibody design benchmarks show that our model, trained on a low-resource antibody sequence dataset, provides highly diverse CDR sequences, with up to more than a two-fold increase in diversity over the baselines, without losing structural integrity and naturalness.
no code implementations • 14 Sep 2022 • Shreyas Fadnavis, Amit Dhurandhar, Raquel Norel, Jenna M Reinen, Carla Agurto, Erica Secchettin, Vittorio Schweiger, Giovanni Perini, Guillermo Cecchi
Chronic pain is a pervasive disorder which is often very disabling and is associated with comorbidities such as depression and anxiety.
no code implementations • 23 Aug 2022 • Tsuyoshi Idé, Amit Dhurandhar, Jiří Navrátil, Moninder Singh, Naoki Abe
In either case, one would ideally want to compute a "responsibility score" indicative of the extent to which an input variable is responsible for the anomalous output.
no code implementations • 19 Aug 2022 • Travis Greene, Amit Dhurandhar, Galit Shmueli
In response to growing recognition of the social impact of new AI-based technologies, major AI and ML conferences and journals now encourage or require papers to include ethics impact statements and undergo ethics reviews.
no code implementations • 22 Jun 2022 • Q. Vera Liao, Yunfeng Zhang, Ronny Luss, Finale Doshi-Velez, Amit Dhurandhar
We argue that one way to close the gap is to develop evaluation methods that account for different user requirements in these usage contexts.
1 code implementation • 13 Apr 2022 • Bhanushee Sharma, Vijil Chenthamarakshan, Amit Dhurandhar, Shiranee Pereira, James A. Hendler, Jonathan S. Dordick, Payel Das
Additionally, our multi-task approach is comprehensive in the sense that it is comparable to state-of-the-art approaches for specific endpoints in in vitro, in vivo and clinical platforms.
no code implementations • 8 Feb 2022 • Ronny Luss, Amit Dhurandhar, Miao Liu
Many works in explainable AI have focused on explaining black-box classification models.
no code implementations • 2 Feb 2022 • Karthikeyan Natesan Ramamurthy, Amit Dhurandhar, Dennis Wei, Zaid Bin Tariq
We first propose a method that provides feature attributions to explain the similarity between a pair of inputs as determined by a black box similarity learner.
1 code implementation • 2 Feb 2022 • Keerthiram Murugesan, Vijay Sadashivaiah, Ronny Luss, Karthikeyan Shanmugam, Pin-Yu Chen, Amit Dhurandhar
Knowledge transfer between heterogeneous source and target networks and tasks has received a lot of attention in recent times as large amounts of quality labeled data can be difficult to obtain in many applications.
no code implementations • NeurIPS 2021 • Isha Puri, Amit Dhurandhar, Tejaswini Pedapati, Karthikeyan Shanmugam, Dennis Wei, Kush R. Varshney
We experiment on nonlinear synthetic functions and are able to accurately model them as well as estimate feature attributions, and in some cases even higher-order terms, which is a testament to both the representational power and the interpretability of such architectures.
no code implementations • ICLR 2022 • Keerthiram Murugesan, Vijay Sadashivaiah, Ronny Luss, Karthikeyan Shanmugam, Pin-Yu Chen, Amit Dhurandhar
Knowledge transfer between heterogeneous source and target networks and tasks has received a lot of attention in recent times as large amounts of quality labelled data can be difficult to obtain in many applications.
no code implementations • 29 Sep 2021 • Amit Dhurandhar, Karthikeyan Natesan Ramamurthy, Kartik Ahuja, Vijay Arya
The Local Interpretable Model-agnostic Explanations (LIME) method is one of the most popular approaches for explaining black-box models at a per-example level.
no code implementations • 29 Sep 2021 • Ronny Luss, Amit Dhurandhar, Miao Liu
Many works in explainable AI have focused on explaining black-box classification models.
no code implementations • 24 Sep 2021 • Vijay Arya, Rachel K. E. Bellamy, Pin-Yu Chen, Amit Dhurandhar, Michael Hind, Samuel C. Hoffman, Stephanie Houde, Q. Vera Liao, Ronny Luss, Aleksandra Mojsilovic, Sami Mourad, Pablo Pedemonte, Ramya Raghavendra, John Richards, Prasanna Sattigeri, Karthikeyan Shanmugam, Moninder Singh, Kush R. Varshney, Dennis Wei, Yunfeng Zhang
As artificial intelligence and machine learning algorithms become increasingly prevalent in society, multiple stakeholders are calling for these algorithms to provide explanations.
no code implementations • 16 Sep 2021 • Saneem Chemmengath, Amar Prakash Azad, Ronny Luss, Amit Dhurandhar
Contrastive explanations for understanding the behavior of black-box models have gained a lot of attention recently, as they provide potential for recourse.
no code implementations • 14 Sep 2021 • Amit Dhurandhar, Tejaswini Pedapati
In this paper, we propose a meta-approach where we transfer information from the complex model to the simple model by dynamically selecting and/or constructing a sequence of intermediate models of decreasing complexity that are less intricate than the original complex model.
no code implementations • 13 Sep 2021 • Ronny Luss, Amit Dhurandhar
To overcome these limitations, we propose a novel method called the Path-Sufficient Explanations Method (PSEM), which outputs a sequence of sufficient explanations of strictly decreasing size (or value) for a given input -- from the original input down to a minimally sufficient explanation. This sequence can be thought of as tracing the local boundary of the model in a smooth manner, thus providing better intuition about the local model behavior for the specific input.
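A greedy sketch of producing a path of progressively smaller sufficient explanations, assuming precomputed per-feature importance scores and a baseline value for dropped features; both are assumptions for illustration, and PSEM itself is defined differently in the paper:

```python
import numpy as np

def path_of_sufficient_masks(x, predict, importance, baseline=0.0):
    """Greedy path of sufficient explanations: starting from the full input,
    repeatedly drop the least important still-included feature as long as
    the prediction on the masked input is unchanged.

    predict(z) -> class label; importance: per-feature scores (assumed given)
    """
    target = predict(x)
    mask = np.ones_like(x, dtype=bool)
    path = [mask.copy()]
    order = np.argsort(importance)          # least important features first
    for i in order:
        trial = mask.copy()
        trial[i] = False
        z = np.where(trial, x, baseline)    # replace dropped features
        if predict(z) == target:            # still sufficient -> keep smaller mask
            mask = trial
            path.append(mask.copy())
    return path
```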
2 code implementations • 13 Mar 2021 • Abhin Shah, Kartik Ahuja, Karthikeyan Shanmugam, Dennis Wei, Kush Varshney, Amit Dhurandhar
Inferring causal individual treatment effect (ITE) from observational data is a challenging problem whose difficulty is exacerbated by the presence of treatment assignment bias.
no code implementations • 22 Dec 2020 • Kartik Ahuja, Amit Dhurandhar, Kush R. Varshney
Non-convex optimization problems are challenging to solve; the success and computational expense of a gradient descent algorithm or variant depend heavily on the initialization strategy.
3 code implementations • ICLR 2021 • Kartik Ahuja, Jun Wang, Amit Dhurandhar, Karthikeyan Shanmugam, Kush R. Varshney
Recently, invariant risk minimization (IRM) was proposed as a promising solution to address out-of-distribution (OOD) generalization.
3 code implementations • 28 Oct 2020 • Kartik Ahuja, Karthikeyan Shanmugam, Amit Dhurandhar
In Ahuja et al., it was shown that solving for the Nash equilibria of a new class of "ensemble-games" is equivalent to solving IRM.
no code implementations • 15 Oct 2020 • Charvi Rastogi, Yunfeng Zhang, Dennis Wei, Kush R. Varshney, Amit Dhurandhar, Richard Tomsett
We, then, conduct a second user experiment which shows that our time allocation strategy with explanation can effectively de-anchor the human and improve collaborative performance when the AI model has low confidence and is incorrect.
no code implementations • NeurIPS 2020 • Karthikeyan Natesan Ramamurthy, Bhanukiran Vinzamuri, Yunfeng Zhang, Amit Dhurandhar
The method can also leverage side information, where users can specify points for which they may want the explanations to be similar.
no code implementations • NeurIPS 2020 • Tejaswini Pedapati, Avinash Balakrishnan, Karthikeyan Shanmugam, Amit Dhurandhar
Based on a key insight, we propose a novel method where we create custom boolean features from sparse local contrastive explanations of the black-box model and then train a globally transparent model on just these features. We show empirically that such models have higher local consistency than other known strategies, while still being close in performance to models trained with access to the original data.
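A minimal sketch of the pipeline described above, assuming the local explanations are already available as sparse attribution vectors and using a shallow decision tree as the transparent model; both choices are illustrative:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def train_transparent_from_explanations(explanations, black_box_preds, max_depth=4):
    """Turn sparse local explanations into boolean features and fit a
    transparent model on them (the binarization rule is an assumption).

    explanations:    (n, d) sparse attribution vectors, one per example
    black_box_preds: (n,) labels predicted by the black-box model
    """
    # Boolean feature = "this input feature appears in the local explanation"
    bool_features = (np.abs(explanations) > 0).astype(int)
    tree = DecisionTreeClassifier(max_depth=max_depth)
    tree.fit(bool_features, black_box_preds)
    return tree
```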
3 code implementations • ICML 2020 • Kartik Ahuja, Karthikeyan Shanmugam, Kush R. Varshney, Amit Dhurandhar
The standard risk minimization paradigm of machine learning is brittle when operating in environments whose test distributions are different from the training distribution due to spurious correlations.
no code implementations • 25 Sep 2019 • Amit Dhurandhar, Karthikeyan Shanmugam, Ronny Luss
Our method also leverages the per-sample hardness estimate of the simple model, which is not the case with prior works that primarily consider the complex model's confidences/predictions, and is thus conceptually novel.
2 code implementations • 6 Sep 2019 • Vijay Arya, Rachel K. E. Bellamy, Pin-Yu Chen, Amit Dhurandhar, Michael Hind, Samuel C. Hoffman, Stephanie Houde, Q. Vera Liao, Ronny Luss, Aleksandra Mojsilović, Sami Mourad, Pablo Pedemonte, Ramya Raghavendra, John Richards, Prasanna Sattigeri, Karthikeyan Shanmugam, Moninder Singh, Kush R. Varshney, Dennis Wei, Yunfeng Zhang
Equally important, we provide a taxonomy to help entities requiring explanations to navigate the space of explanation methods, not only those in the toolkit but also in the broader literature on explainability.
no code implementations • 5 Jun 2019 • Noel C. F. Codella, Michael Hind, Karthikeyan Natesan Ramamurthy, Murray Campbell, Amit Dhurandhar, Kush R. Varshney, Dennis Wei, Aleksandra Mojsilović
Using machine learning in high-stakes applications often requires predictions to be accompanied by explanations comprehensible to the domain user, who has ultimate responsibility for decisions and outcomes.
no code implementations • 31 May 2019 • Amit Dhurandhar, Tejaswini Pedapati, Avinash Balakrishnan, Pin-Yu Chen, Karthikeyan Shanmugam, Ruchir Puri
Recently, a method [7] was proposed to generate contrastive explanations for differentiable models such as deep neural networks, where one has complete access to the model.
no code implementations • ICML 2020 • Amit Dhurandhar, Karthikeyan Shanmugam, Ronny Luss
Our method also leverages the per-sample hardness estimate of the simple model, which is not the case with prior works that primarily consider the complex model's confidences/predictions, and is thus conceptually novel.
2 code implementations • 29 May 2019 • Ronny Luss, Pin-Yu Chen, Amit Dhurandhar, Prasanna Sattigeri, Yunfeng Zhang, Karthikeyan Shanmugam, Chun-Chen Tu
As the application of deep neural networks proliferates in numerous areas such as medical imaging, video surveillance, and self-driving cars, the need for explaining the decisions of these models has become a hot research topic, both at the global and local level.
no code implementations • 12 Nov 2018 • Michael Hind, Dennis Wei, Murray Campbell, Noel C. F. Codella, Amit Dhurandhar, Aleksandra Mojsilović, Karthikeyan Natesan Ramamurthy, Kush R. Varshney
Artificial intelligence systems are being increasingly deployed due to their potential to increase the efficiency, scale, consistency, fairness, and accuracy of decisions.
no code implementations • 21 Jul 2018 • Karthik S. Gurumoorthy, Amit Dhurandhar
In this paper, we show that if the optimization function is restricted-strongly-convex (RSC) and restricted-smooth (RSM) -- a rich subclass of weakly submodular functions -- then a streaming algorithm with constant factor approximation guarantee is possible.
no code implementations • NeurIPS 2018 • Amit Dhurandhar, Karthikeyan Shanmugam, Ronny Luss, Peder Olsen
Our transfer method involves a theoretically justified weighting of samples during the training of the simple model using confidence scores of these intermediate layers.
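An illustrative sketch of confidence-based sample weighting, assuming probes have already been attached to the complex model's intermediate layers; the averaging and flooring rules here are assumptions, not the paper's exact weighting:

```python
import numpy as np

def confidence_weights(probe_confidences, margin=0.5):
    """Average the true-label confidences of probes attached to the complex
    model's intermediate layers and use them to weight the simple model's
    training samples.

    probe_confidences: (n_layers, n_samples) probability assigned to the
                       true label by a probe at each intermediate layer
    """
    w = probe_confidences.mean(axis=0)      # per-sample average confidence
    return np.maximum(w, margin)            # floor to avoid zero weights

# Usage (hypothetical simple model with sklearn-style API):
# simple_model.fit(X, y, sample_weight=confidence_weights(probe_conf))
```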
no code implementations • 29 May 2018 • Noel C. F. Codella, Michael Hind, Karthikeyan Natesan Ramamurthy, Murray Campbell, Amit Dhurandhar, Kush R. Varshney, Dennis Wei, Aleksandra Mojsilovic
The adoption of machine learning in high-stakes applications such as healthcare and law has lagged in part because predictions are not accompanied by explanations comprehensible to the domain user, who often holds the ultimate responsibility for decisions and outcomes.
4 code implementations • NeurIPS 2018 • Amit Dhurandhar, Pin-Yu Chen, Ronny Luss, Chun-Chen Tu, Pai-Shun Ting, Karthikeyan Shanmugam, Payel Das
important object pixels in an image) to justify its classification and analogously what should be minimally and necessarily absent (viz.
no code implementations • 12 Jul 2017 • Amit Dhurandhar, Vijay Iyengar, Ronny Luss, Karthikeyan Shanmugam
We provide a novel notion of what it means to be interpretable, looking past the usual association with human understanding.
1 code implementation • 5 Jul 2017 • Karthik S. Gurumoorthy, Amit Dhurandhar, Guillermo Cecchi, Charu Aggarwal
Prototypical examples that best summarize and compactly represent an underlying complex data distribution communicate meaningful insights to humans in domains where simple explanations are hard to extract.
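A greedy sketch of selecting prototypes that compactly represent a data distribution by reducing a kernel MMD objective; the RBF kernel, uniform prototype weights, and greedy loop are assumptions for illustration rather than the paper's algorithm:

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    # Pairwise RBF kernel between rows of A (n, d) and B (m, d)
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def greedy_prototypes(X, m, gamma=1.0):
    """Repeatedly add the point that most reduces the (squared) MMD between
    the data and the current prototype set, with uniform prototype weights."""
    n = X.shape[0]
    K = rbf_kernel(X, X, gamma)
    data_term = K.mean(axis=0)               # (1/n) sum_i k(x_i, x_j) for each j
    chosen = []
    for _ in range(m):
        best, best_obj = None, np.inf
        for c in range(n):
            if c in chosen:
                continue
            S = chosen + [c]
            cross = data_term[S].mean()       # (1/(n|S|)) sum_{i,j} k(x_i, s_j)
            within = K[np.ix_(S, S)].mean()   # (1/|S|^2) sum_{j,j'} k(s_j, s_j')
            obj = within - 2 * cross          # MMD^2 up to a constant term
            if obj < best_obj:
                best, best_obj = c, obj
        chosen.append(best)
    return chosen
```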
no code implementations • 9 Jun 2017 • Amit Dhurandhar, Vijay Iyengar, Ronny Luss, Karthikeyan Shanmugam
This leads to the insight that the improvement in the target model is not only a function of the oracle model's performance, but also its relative complexity with respect to the target model.
no code implementations • 29 Apr 2017 • Amit Dhurandhar, Steve Hanneke, Liu Yang
In particular, we propose an approach to provably determine the time instant from which the new/changed features start becoming relevant with respect to an output variable in an agnostic (supervised) learning setting.
no code implementations • 7 Apr 2017 • Amit Dhurandhar, Margareta Ackerman, Xiang Wang
Clustering is a widely-used data mining tool, which aims to discover partitions of similar items in data.
no code implementations • 19 Jun 2016 • Amit Dhurandhar, Sechan Oh, Marek Petrik
We propose a method for building an interpretable recommender system for personalizing online content and promotions.