Search Results for author: Sree Harsha Tanneru

Found 4 papers, 1 paper with code

On the Hardness of Faithful Chain-of-Thought Reasoning in Large Language Models

no code implementations • 15 Jun 2024 • Sree Harsha Tanneru, Dan Ley, Chirag Agarwal, Himabindu Lakkaraju

In this work, we explore the promise of three broad approaches commonly employed to steer the behavior of LLMs to enhance the faithfulness of the CoT reasoning generated by LLMs: in-context learning, fine-tuning, and activation editing.

In-Context Learning • Question Answering
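
For the paper above, here is a minimal sketch of the in-context-learning route it names: assembling a few-shot prompt whose demonstrations pair answers with reasoning that explicitly cites the facts it relies on. The demonstration content and prompt format are hypothetical illustrations, not the authors' actual setup; the fine-tuning and activation-editing approaches are not shown.

```python
# Minimal sketch (not the authors' prompts): a few-shot prompt whose
# demonstrations pair answers with chains of thought that explicitly cite
# the evidence they use -- one in-context-learning way to nudge a model
# toward more faithful CoT reasoning.

FAITHFUL_DEMONSTRATIONS = [  # hypothetical demonstration, for illustration only
    {
        "question": "Tom has 3 apples and buys 2 more. How many apples does he have?",
        "reasoning": "The question states Tom starts with 3 apples and buys 2 more; "
                     "3 + 2 = 5, so the answer follows directly from those two facts.",
        "answer": "5",
    },
]

def build_faithful_cot_prompt(query: str) -> str:
    """Assemble a few-shot prompt from the demonstrations plus the new query."""
    blocks = []
    for demo in FAITHFUL_DEMONSTRATIONS:
        blocks.append(
            f"Question: {demo['question']}\n"
            f"Reasoning: {demo['reasoning']}\n"
            f"Answer: {demo['answer']}"
        )
    blocks.append(f"Question: {query}\nReasoning:")
    return "\n\n".join(blocks)

if __name__ == "__main__":
    print(build_faithful_cot_prompt("A train travels 60 miles per hour. How far does it go in 3 hours?"))
```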

Faithfulness vs. Plausibility: On the (Un)Reliability of Explanations from Large Language Models

no code implementations • 7 Feb 2024 • Chirag Agarwal, Sree Harsha Tanneru, Himabindu Lakkaraju

We highlight that the current trend towards increasing the plausibility of explanations, primarily driven by the demand for user-friendly interfaces, may come at the cost of diminishing their faithfulness.

Decision Making

Quantifying Uncertainty in Natural Language Explanations of Large Language Models

1 code implementation • 6 Nov 2023 • Sree Harsha Tanneru, Chirag Agarwal, Himabindu Lakkaraju

In this work, we make one of the first attempts at quantifying the uncertainty in explanations of LLMs.
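As a rough illustration of what an uncertainty signal over explanations might look like, the sketch below uses a simple consistency proxy (not the metrics proposed in the paper): sample several explanations for the same input and measure their pairwise agreement, with low agreement suggesting high uncertainty.

```python
# Illustrative consistency proxy, not the paper's proposed metrics:
# score agreement across several sampled explanations of the same prediction.
from itertools import combinations

def jaccard(a, b):
    """Jaccard similarity between two token sets."""
    return len(a & b) / len(a | b) if a | b else 1.0

def explanation_consistency(explanations):
    """Mean pairwise Jaccard similarity over sampled explanations (lower => more uncertain)."""
    token_sets = [set(e.lower().split()) for e in explanations]
    pairs = list(combinations(token_sets, 2))
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)

if __name__ == "__main__":
    # Hypothetical explanations sampled from an LLM at nonzero temperature.
    samples = [
        "The review is positive because it praises the acting.",
        "The model predicts positive since the acting is praised.",
        "Positive sentiment, driven mainly by praise for the soundtrack.",
    ]
    print(f"consistency = {explanation_consistency(samples):.2f}")
```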

Word-Level Explanations for Analyzing Bias in Text-to-Image Models

no code implementations • 3 Jun 2023 • Alexander Lin, Lucas Monteiro Paes, Sree Harsha Tanneru, Suraj Srinivas, Himabindu Lakkaraju

We introduce a method for computing scores for each word in the prompt; these scores represent its influence on biases in the model's output.

Sentence
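
For the paper above, a minimal leave-one-out sketch of word-level influence scoring, under the assumption that removing a word and re-measuring a downstream bias metric approximates that word's influence. The generation and bias-measurement functions are hypothetical stand-ins, not the method or models used in the paper.

```python
# Leave-one-out sketch (not the paper's method): score each prompt word by how
# much ablating it changes a bias metric computed on the generated images.
# `generate_images` and `bias_metric` are hypothetical stand-ins for a
# text-to-image model and an attribute-based bias measurement.
from typing import Callable, List, Sequence, Tuple

def word_influence_scores(
    prompt: str,
    generate_images: Callable[[str], Sequence],
    bias_metric: Callable[[Sequence], float],
) -> List[Tuple[str, float]]:
    """Return (word, influence) pairs via leave-one-out prompt ablation."""
    words = prompt.split()
    baseline = bias_metric(generate_images(prompt))
    scores = []
    for i, word in enumerate(words):
        ablated = " ".join(words[:i] + words[i + 1:])
        scores.append((word, baseline - bias_metric(generate_images(ablated))))
    return scores

if __name__ == "__main__":
    # Toy stand-ins so the sketch runs end to end; a real setup would call a
    # text-to-image model and an image attribute classifier here.
    fake_generate = lambda p: [p] * 4
    fake_bias = lambda imgs: 0.9 if "doctor" in imgs[0] else 0.5
    for word, score in word_influence_scores("a photo of a doctor at work",
                                             fake_generate, fake_bias):
        print(f"{word:>8}: {score:+.2f}")
```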
