Search Results for author: Ashwin Kalyan

Found 24 papers, 11 papers with code

RLHF Deciphered: A Critical Analysis of Reinforcement Learning from Human Feedback for LLMs

no code implementations12 Apr 2024 Shreyas Chaudhari, Pranjal Aggarwal, Vishvak Murahari, Tanmay Rajpurohit, Ashwin Kalyan, Karthik Narasimhan, Ameet Deshpande, Bruno Castro da Silva

A promising approach is reinforcement learning from human feedback (RLHF), which leverages human feedback to update the model in accordance with human preferences and mitigate issues like toxicity and hallucinations.

Language Modelling reinforcement-learning

GEO: Generative Engine Optimization

no code implementations16 Nov 2023 Pranjal Aggarwal, Vishvak Murahari, Tanmay Rajpurohit, Ashwin Kalyan, Karthik R Narasimhan, Ameet Deshpande

We facilitate systematic evaluation in this new paradigm by introducing GEO-bench, a benchmark of diverse user queries across multiple domains, coupled with sources required to answer these queries.

Bias Runs Deep: Implicit Reasoning Biases in Persona-Assigned LLMs

1 code implementation8 Nov 2023 Shashank Gupta, Vaishnavi Shrivastava, Ameet Deshpande, Ashwin Kalyan, Peter Clark, Ashish Sabharwal, Tushar Khot

Our experiments with ChatGPT-3. 5 show that this bias is ubiquitous - 80% of our personas demonstrate bias; it is significant - some datasets show performance drops of 70%+; and can be especially harmful for certain groups - some personas suffer statistically significant drops on 80%+ of the datasets.

Fairness Math

QualEval: Qualitative Evaluation for Model Improvement

1 code implementation6 Nov 2023 Vishvak Murahari, Ameet Deshpande, Peter Clark, Tanmay Rajpurohit, Ashish Sabharwal, Karthik Narasimhan, Ashwin Kalyan

In this work, we address the shortcomings of quantitative metrics by proposing QualEval, which augments quantitative scalar metrics with automated qualitative evaluation as a vehicle for model improvement.

Estimating Numbers without Regression

no code implementations9 Oct 2023 Avijit Thawani, Jay Pujara, Ashwin Kalyan

Despite recent successes in language models, their ability to represent numbers is insufficient.

Language Modelling regression

Distraction-free Embeddings for Robust VQA

no code implementations31 Aug 2023 Atharvan Dogra, Deeksha Varshney, Ashwin Kalyan, Ameet Deshpande, Neeraj Kumar

The generation of effective latent representations and their subsequent refinement to incorporate precise information is an essential prerequisite for Vision-Language Understanding (VLU) tasks such as Video Question Answering (VQA).

Question Answering Video Question Answering +1

Exploiting Generalization in Offline Reinforcement Learning via Unseen State Augmentations

no code implementations7 Aug 2023 Nirbhay Modhe, Qiaozi Gao, Ashwin Kalyan, Dhruv Batra, Govind Thattai, Gaurav Sukhatme

Offline reinforcement learning (RL) methods strike a balance between exploration and exploitation by conservative value estimation -- penalizing values of unseen states and actions.

Offline RL reinforcement-learning +1

C-STS: Conditional Semantic Textual Similarity

1 code implementation24 May 2023 Ameet Deshpande, Carlos E. Jimenez, Howard Chen, Vishvak Murahari, Victoria Graf, Tanmay Rajpurohit, Ashwin Kalyan, Danqi Chen, Karthik Narasimhan

Semantic textual similarity (STS), a cornerstone task in NLP, measures the degree of similarity between a pair of sentences, and has broad application in fields such as information retrieval and natural language understanding.

Information Retrieval Language Modelling +8

Anthropomorphization of AI: Opportunities and Risks

no code implementations24 May 2023 Ameet Deshpande, Tanmay Rajpurohit, Karthik Narasimhan, Ashwin Kalyan

With widespread adoption of AI systems, and the push from stakeholders to make it human-like through alignment techniques, human voice, and pictorial avatars, the tendency for users to anthropomorphize it increases significantly.


Toxicity in ChatGPT: Analyzing Persona-assigned Language Models

no code implementations11 Apr 2023 Ameet Deshpande, Vishvak Murahari, Tanmay Rajpurohit, Ashwin Kalyan, Karthik Narasimhan

Large language models (LLMs) have shown incredible capabilities and transcended the natural language processing (NLP) community, with adoption throughout many services like healthcare, therapy, education, and customer service.

SPARTAN: Sparse Hierarchical Memory for Parameter-Efficient Transformers

1 code implementation29 Nov 2022 Ameet Deshpande, Md Arafat Sultan, Anthony Ferritto, Ashwin Kalyan, Karthik Narasimhan, Avirup Sil

Fine-tuning pre-trained language models (PLMs) achieves impressive performance on a range of downstream tasks, and their sizes have consequently been getting bigger.

Dynamic Prompt Learning via Policy Gradient for Semi-structured Mathematical Reasoning

2 code implementations29 Sep 2022 Pan Lu, Liang Qiu, Kai-Wei Chang, Ying Nian Wu, Song-Chun Zhu, Tanmay Rajpurohit, Peter Clark, Ashwin Kalyan

However, it is unknown if the models can handle more complex problems that involve math reasoning over heterogeneous information, such as tabular data.

Logical Reasoning Math +1

Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering

1 code implementation20 Sep 2022 Pan Lu, Swaroop Mishra, Tony Xia, Liang Qiu, Kai-Wei Chang, Song-Chun Zhu, Oyvind Tafjord, Peter Clark, Ashwin Kalyan

We further design language models to learn to generate lectures and explanations as the chain of thought (CoT) to mimic the multi-hop reasoning process when answering ScienceQA questions.

Multimodal Deep Learning Multimodal Reasoning +5

Model-Advantage and Value-Aware Models for Model-Based Reinforcement Learning: Bridging the Gap in Theory and Practice

1 code implementation26 Jun 2021 Nirbhay Modhe, Harish Kamath, Dhruv Batra, Ashwin Kalyan

This work shows that value-aware model learning, known for its numerous theoretical benefits, is also practically viable for solving challenging continuous control tasks in prevalent model-based reinforcement learning algorithms.

Continuous Control Model-based Reinforcement Learning

Programming Puzzles

3 code implementations10 Jun 2021 Tal Schuster, Ashwin Kalyan, Oleksandr Polozov, Adam Tauman Kalai

The dataset is comprehensive in that it spans problems of a range of difficulties and domains, ranging from trivial string manipulation problems, to classic programming puzzles (e. g., Tower of Hanoi), to interview/competitive-programming problems (e. g., dynamic programming), to longstanding open problems in algorithms and mathematics (e. g., factoring).

Code Generation Natural Language Understanding +1


no code implementations25 Sep 2019 Ashwin Kalyan, Oleksandr Polozov, Adam Tauman Kalai

Puzzles are objective in that one can easily test the correctness of a given solution x by seeing whether it satisfies f, unlike the most common representations for program synthesis: given input-output pairs or an English problem description, the correctness of a given solution is not determined and is debatable.

Program Synthesis

Learn from Your Neighbor: Learning Multi-modal Mappings from Sparse Annotations

no code implementations ICML 2018 Ashwin Kalyan, Stefan Lee, Anitha Kannan, Dhruv Batra

Many structured prediction problems (particularly in vision and language domains) are ambiguous, with multiple outputs being correct for an input - e. g. there are many ways of describing an image, multiple ways of translating a sentence; however, exhaustively annotating the applicability of all possible outputs is intractable due to exponentially large output spaces (e. g. all English sentences).

Multi-Label Classification Question Generation +3

Neural-Guided Deductive Search for Real-Time Program Synthesis from Examples

no code implementations ICLR 2018 Ashwin Kalyan, Abhishek Mohta, Oleksandr Polozov, Dhruv Batra, Prateek Jain, Sumit Gulwani

In this work, we propose Neural Guided Deductive Search (NGDS), a hybrid synthesis technique that combines the best of both symbolic logic techniques and statistical models.

Program Synthesis

Cannot find the paper you are looking for? You can Submit a new open access paper.