Search Results for author: Ashwin Kalyan

Found 24 papers, 11 papers with code

RLHF Deciphered: A Critical Analysis of Reinforcement Learning from Human Feedback for LLMs

no code implementations • 12 Apr 2024 • Shreyas Chaudhari, Pranjal Aggarwal, Vishvak Murahari, Tanmay Rajpurohit, Ashwin Kalyan, Karthik Narasimhan, Ameet Deshpande, Bruno Castro da Silva

A promising approach is reinforcement learning from human feedback (RLHF), which leverages human feedback to update the model in accordance with human preferences and mitigate issues like toxicity and hallucinations.

Language Modelling reinforcement-learning

GEO: Generative Engine Optimization

no code implementations • 16 Nov 2023 • Pranjal Aggarwal, Vishvak Murahari, Tanmay Rajpurohit, Ashwin Kalyan, Karthik R Narasimhan, Ameet Deshpande

We facilitate systematic evaluation in this new paradigm by introducing GEO-bench, a benchmark of diverse user queries across multiple domains, coupled with sources required to answer these queries.

Bias Runs Deep: Implicit Reasoning Biases in Persona-Assigned LLMs

1 code implementation • 8 Nov 2023 • Shashank Gupta, Vaishnavi Shrivastava, Ameet Deshpande, Ashwin Kalyan, Peter Clark, Ashish Sabharwal, Tushar Khot

Our experiments with ChatGPT-3.5 show that this bias is ubiquitous - 80% of our personas demonstrate bias; it is significant - some datasets show performance drops of 70%+; and can be especially harmful for certain groups - some personas suffer statistically significant drops on 80%+ of the datasets.

Fairness Math

QualEval: Qualitative Evaluation for Model Improvement

1 code implementation • 6 Nov 2023 • Vishvak Murahari, Ameet Deshpande, Peter Clark, Tanmay Rajpurohit, Ashish Sabharwal, Karthik Narasimhan, Ashwin Kalyan

In this work, we address the shortcomings of quantitative metrics by proposing QualEval, which augments quantitative scalar metrics with automated qualitative evaluation as a vehicle for model improvement.

Estimating Numbers without Regression

no code implementations • 9 Oct 2023 • Avijit Thawani, Jay Pujara, Ashwin Kalyan

Despite recent successes in language models, their ability to represent numbers is insufficient.

Language Modelling regression

Distraction-free Embeddings for Robust VQA

no code implementations • 31 Aug 2023 • Atharvan Dogra, Deeksha Varshney, Ashwin Kalyan, Ameet Deshpande, Neeraj Kumar

The generation of effective latent representations and their subsequent refinement to incorporate precise information is an essential prerequisite for Vision-Language Understanding (VLU) tasks such as Video Question Answering (VQA).

Question Answering Video Question Answering +1

Exploiting Generalization in Offline Reinforcement Learning via Unseen State Augmentations

no code implementations • 7 Aug 2023 • Nirbhay Modhe, Qiaozi Gao, Ashwin Kalyan, Dhruv Batra, Govind Thattai, Gaurav Sukhatme

Offline reinforcement learning (RL) methods strike a balance between exploration and exploitation by conservative value estimation -- penalizing values of unseen states and actions.

Offline RL reinforcement-learning +1

C-STS: Conditional Semantic Textual Similarity

1 code implementation • 24 May 2023 • Ameet Deshpande, Carlos E. Jimenez, Howard Chen, Vishvak Murahari, Victoria Graf, Tanmay Rajpurohit, Ashwin Kalyan, Danqi Chen, Karthik Narasimhan

Semantic textual similarity (STS), a cornerstone task in NLP, measures the degree of similarity between a pair of sentences, and has broad application in fields such as information retrieval and natural language understanding.

Information Retrieval Language Modelling +8

Anthropomorphization of AI: Opportunities and Risks

no code implementations • 24 May 2023 • Ameet Deshpande, Tanmay Rajpurohit, Karthik Narasimhan, Ashwin Kalyan

With widespread adoption of AI systems, and the push from stakeholders to make them human-like through alignment techniques, human voice, and pictorial avatars, the tendency for users to anthropomorphize them increases significantly.

Attribute

RL4F: Generating Natural Language Feedback with Reinforcement Learning for Repairing Model Outputs

1 code implementation • 15 May 2023 • Afra Feyza Akyürek, Ekin Akyürek, Aman Madaan, Ashwin Kalyan, Peter Clark, Derry Wijaya, Niket Tandon

Despite their unprecedented success, even the largest language models make mistakes.

reinforcement-learning Retrieval +1

ProKnow: Process Knowledge for Safety Constrained and Explainable Question Generation for Mental Health Diagnostic Assistance

no code implementations • 13 May 2023 • Kaushik Roy, Manas Gaur, Misagh Soltani, Vipula Rawte, Ashwin Kalyan, Amit Sheth

LMs augmented with ProKnow guided method generated 89% safer questions in the depression and anxiety domain.

Question Generation

Toxicity in ChatGPT: Analyzing Persona-assigned Language Models

no code implementations • 11 Apr 2023 • Ameet Deshpande, Vishvak Murahari, Tanmay Rajpurohit, Ashwin Kalyan, Karthik Narasimhan

Large language models (LLMs) have shown incredible capabilities and transcended the natural language processing (NLP) community, with adoption throughout many services like healthcare, therapy, education, and customer service.

SPARTAN: Sparse Hierarchical Memory for Parameter-Efficient Transformers

1 code implementation • 29 Nov 2022 • Ameet Deshpande, Md Arafat Sultan, Anthony Ferritto, Ashwin Kalyan, Karthik Narasimhan, Avirup Sil

Fine-tuning pre-trained language models (PLMs) achieves impressive performance on a range of downstream tasks, and their sizes have consequently been getting bigger.

Lila: A Unified Benchmark for Mathematical Reasoning

1 code implementation • 31 Oct 2022 • Swaroop Mishra, Matthew Finlayson, Pan Lu, Leonard Tang, Sean Welleck, Chitta Baral, Tanmay Rajpurohit, Oyvind Tafjord, Ashish Sabharwal, Peter Clark, Ashwin Kalyan

Mathematical reasoning skills are essential for general-purpose intelligent systems to perform tasks from grocery shopping to climate modeling.

Ranked #1 on Mathematical Reasoning on Lila (OOD)

Mathematical Reasoning Question Answering

Dynamic Prompt Learning via Policy Gradient for Semi-structured Mathematical Reasoning

2 code implementations • 29 Sep 2022 • Pan Lu, Liang Qiu, Kai-Wei Chang, Ying Nian Wu, Song-Chun Zhu, Tanmay Rajpurohit, Peter Clark, Ashwin Kalyan

However, it is unknown if the models can handle more complex problems that involve math reasoning over heterogeneous information, such as tabular data.

Logical Reasoning Math +1

Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering

1 code implementation • 20 Sep 2022 • Pan Lu, Swaroop Mishra, Tony Xia, Liang Qiu, Kai-Wei Chang, Song-Chun Zhu, Oyvind Tafjord, Peter Clark, Ashwin Kalyan

We further design language models to learn to generate lectures and explanations as the chain of thought (CoT) to mimic the multi-hop reasoning process when answering ScienceQA questions.

Ranked #5 on Science Question Answering on ScienceQA

Multimodal Deep Learning Multimodal Reasoning +5

NumGLUE: A Suite of Fundamental yet Challenging Mathematical Reasoning Tasks

no code implementations • ACL 2022 • Swaroop Mishra, Arindam Mitra, Neeraj Varshney, Bhavdeep Sachdeva, Peter Clark, Chitta Baral, Ashwin Kalyan

Given the ubiquitous nature of numbers in text, reasoning with numbers to perform simple calculations is an important skill of AI systems.

Arithmetic Reasoning Mathematical Reasoning +1

How Much Coffee Was Consumed During EMNLP 2019? Fermi Problems: A New Reasoning Challenge for AI

1 code implementation • EMNLP 2021 • Ashwin Kalyan, Abhinav Kumar, Arjun Chandrasekaran, Ashish Sabharwal, Peter Clark

FPs are commonly used in quizzes and interviews to bring out and evaluate the creative reasoning abilities of humans.

Model-Advantage and Value-Aware Models for Model-Based Reinforcement Learning: Bridging the Gap in Theory and Practice

1 code implementation • 26 Jun 2021 • Nirbhay Modhe, Harish Kamath, Dhruv Batra, Ashwin Kalyan

This work shows that value-aware model learning, known for its numerous theoretical benefits, is also practically viable for solving challenging continuous control tasks in prevalent model-based reinforcement learning algorithms.

Continuous Control Model-based Reinforcement Learning

Programming Puzzles

3 code implementations • 10 Jun 2021 • Tal Schuster, Ashwin Kalyan, Oleksandr Polozov, Adam Tauman Kalai

The dataset is comprehensive in that it spans problems of a range of difficulties and domains, ranging from trivial string manipulation problems, to classic programming puzzles (e.g., Tower of Hanoi), to interview/competitive-programming problems (e.g., dynamic programming), to longstanding open problems in algorithms and mathematics (e.g., factoring).

Code Generation Natural Language Understanding +1

Bridging Worlds in Reinforcement Learning with Model-Advantage

no code implementations • ICML Workshop LifelongML 2020 • Nirbhay Modhe, Harish K Kamath, Dhruv Batra, Ashwin Kalyan

Despite the breakthroughs achieved by Reinforcement Learning (RL) in recent years, RL agents often fail to perform well in unseen environments.

reinforcement-learning Reinforcement Learning (RL)

Adaptive Generation of Programming Puzzles

no code implementations • 25 Sep 2019 • Ashwin Kalyan, Oleksandr Polozov, Adam Tauman Kalai

Puzzles are objective in that one can easily test the correctness of a given solution x by seeing whether it satisfies f, unlike the most common representations for program synthesis: given input-output pairs or an English problem description, the correctness of a given solution is not determined and is debatable.
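
The idea of checking a solution x against a puzzle f can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the puzzle itself and the helper name `is_correct` are hypothetical and are not taken from the paper or its dataset.

```python
from typing import Callable, List


def f(x: List[int]) -> bool:
    """Hypothetical puzzle: three distinct positive integers
    that sum to 12 and multiply to 48."""
    return (
        len(x) == 3
        and len(set(x)) == 3           # all values distinct
        and all(v > 0 for v in x)      # all values positive
        and sum(x) == 12
        and x[0] * x[1] * x[2] == 48
    )


def is_correct(puzzle: Callable[[List[int]], bool], candidate: List[int]) -> bool:
    """A candidate is correct exactly when the puzzle accepts it."""
    return puzzle(candidate) is True


if __name__ == "__main__":
    print(is_correct(f, [2, 4, 6]))
    print(is_correct(f, [3, 4, 5]))
```

Because the puzzle is itself the verifier, no reference answer or human judgment is needed to decide whether a candidate is right.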

Program Synthesis

Learn from Your Neighbor: Learning Multi-modal Mappings from Sparse Annotations

no code implementations • ICML 2018 • Ashwin Kalyan, Stefan Lee, Anitha Kannan, Dhruv Batra

Many structured prediction problems (particularly in vision and language domains) are ambiguous, with multiple outputs being correct for an input - e.g. there are many ways of describing an image, multiple ways of translating a sentence; however, exhaustively annotating the applicability of all possible outputs is intractable due to exponentially large output spaces (e.g. all English sentences).

Multi-Label Classification Question Generation +3

Neural-Guided Deductive Search for Real-Time Program Synthesis from Examples

no code implementations • ICLR 2018 • Ashwin Kalyan, Abhishek Mohta, Oleksandr Polozov, Dhruv Batra, Prateek Jain, Sumit Gulwani

In this work, we propose Neural Guided Deductive Search (NGDS), a hybrid synthesis technique that combines the best of both symbolic logic techniques and statistical models.

Program Synthesis
