Search Results for author: Karthik Narasimhan

Found 73 papers, 43 papers with code

Can Language Models Solve Olympiad Programming?

1 code implementation • 16 Apr 2024 • Quan Shi, Michael Tang, Karthik Narasimhan, Shunyu Yao

In this paper, we introduce the USACO benchmark with 307 problems from the USA Computing Olympiad, along with high-quality unit tests, reference code, and official analyses for each problem.

RLHF Deciphered: A Critical Analysis of Reinforcement Learning from Human Feedback for LLMs

no code implementations • 12 Apr 2024 • Shreyas Chaudhari, Pranjal Aggarwal, Vishvak Murahari, Tanmay Rajpurohit, Ashwin Kalyan, Karthik Narasimhan, Ameet Deshpande, Bruno Castro da Silva

A promising approach is reinforcement learning from human feedback (RLHF), which leverages human feedback to update the model in accordance with human preferences and mitigate issues like toxicity and hallucinations.

Language Modelling reinforcement-learning

Language-Guided World Models: A Model-Based Approach to AI Control

no code implementations • 24 Jan 2024 • Alex Zhang, Khanh Nguyen, Jens Tuyls, Albert Lin, Karthik Narasimhan

Installing probabilistic world models into artificial agents opens an efficient channel for humans to communicate with and control these agents.

QualEval: Qualitative Evaluation for Model Improvement

1 code implementation • 6 Nov 2023 • Vishvak Murahari, Ameet Deshpande, Peter Clark, Tanmay Rajpurohit, Ashish Sabharwal, Karthik Narasimhan, Ashwin Kalyan

In this work, we address the shortcomings of quantitative metrics by proposing QualEval, which augments quantitative scalar metrics with automated qualitative evaluation as a vehicle for model improvement.

Progressively Efficient Learning

no code implementations • 13 Oct 2023 • Ruijie Zheng, Khanh Nguyen, Hal Daumé III, Furong Huang, Karthik Narasimhan

By equipping a learning agent with an abstract, dynamic language and an intrinsic motivation to learn with minimal communication effort, CEIL leads to the emergence of a human-like pattern in which the learner and the teacher communicate progressively more efficiently by exchanging increasingly abstract intentions.

Imitation Learning

SWE-bench: Can Language Models Resolve Real-World GitHub Issues?

no code implementations • 10 Oct 2023 • Carlos E. Jimenez, John Yang, Alexander Wettig, Shunyu Yao, Kexin Pei, Ofir Press, Karthik Narasimhan

We find real-world software engineering to be a rich, sustainable, and challenging testbed for evaluating the next generation of language models.

Bug fixing Code Generation +1

FireAct: Toward Language Agent Fine-tuning

no code implementations • 9 Oct 2023 • Baian Chen, Chang Shu, Ehsan Shareghi, Nigel Collier, Karthik Narasimhan, Shunyu Yao

Recent efforts have augmented language models (LMs) with external tools or environments, leading to the development of language agents that can reason and act.

Question Answering

Cognitive Architectures for Language Agents

2 code implementations • 5 Sep 2023 • Theodore R. Sumers, Shunyu Yao, Karthik Narasimhan, Thomas L. Griffiths

Recent efforts have augmented large language models (LLMs) with external resources (e.g., the Internet) or internal control flows (e.g., prompt chaining) for tasks requiring grounding or reasoning, leading to a new class of language agents.

Decision Making

Scaling Laws for Imitation Learning in Single-Agent Games

no code implementations • 18 Jul 2023 • Jens Tuyls, Dhruv Madeka, Kari Torkkola, Dean Foster, Karthik Narasimhan, Sham Kakade

Inspired by recent work in Natural Language Processing (NLP) where "scaling up" has resulted in increasingly more capable LLMs, we investigate whether carefully scaling up model and data size can bring similar improvements in the imitation learning setting for single-agent games.

Atari Games Imitation Learning +1

COLLIE: Systematic Construction of Constrained Text Generation Tasks

1 code implementation • 17 Jul 2023 • Shunyu Yao, Howard Chen, Austin W. Hanjie, Runzhe Yang, Karthik Narasimhan

Text generation under constraints has seen increasing interest in natural language processing, especially with the rapidly improving capabilities of large language models.

Logical Reasoning Sentence +1

InstructEval: Systematic Evaluation of Instruction Selection Methods

no code implementations • 1 Jul 2023 • Anirudh Ajith, Chris Pan, Mengzhou Xia, Ameet Deshpande, Karthik Narasimhan

In-context learning (ICL) performs tasks by prompting a large language model (LLM) using an instruction and a small set of annotated examples called demonstrations.

Benchmarking In-Context Learning +2

InterCode: Standardizing and Benchmarking Interactive Coding with Execution Feedback

2 code implementations • NeurIPS 2023 • John Yang, Akshara Prabhakar, Karthik Narasimhan, Shunyu Yao

Our framework is language and platform agnostic, uses self-contained Docker environments to provide safe and reproducible execution, and is compatible out-of-the-box with traditional seq2seq coding methods, while enabling the development of new methods for interactive code generation.

Benchmarking Code Generation +1

C-STS: Conditional Semantic Textual Similarity

1 code implementation • 24 May 2023 • Ameet Deshpande, Carlos E. Jimenez, Howard Chen, Vishvak Murahari, Victoria Graf, Tanmay Rajpurohit, Ashwin Kalyan, Danqi Chen, Karthik Narasimhan

Semantic textual similarity (STS), a cornerstone task in NLP, measures the degree of similarity between a pair of sentences, and has broad application in fields such as information retrieval and natural language understanding.

Information Retrieval Language Modelling +8

PruMUX: Augmenting Data Multiplexing with Model Compression

1 code implementation • 24 May 2023 • Yushan Su, Vishvak Murahari, Karthik Narasimhan, Kai Li

As language models increase in size by the day, methods for efficient inference are critical to leveraging their capabilities for various applications.

Knowledge Distillation Model Compression

Anthropomorphization of AI: Opportunities and Risks

no code implementations • 24 May 2023 • Ameet Deshpande, Tanmay Rajpurohit, Karthik Narasimhan, Ashwin Kalyan

With the widespread adoption of AI systems, and the push from stakeholders to make them human-like through alignment techniques, human voices, and pictorial avatars, the tendency for users to anthropomorphize them increases significantly.

Attribute

Referral Augmentation for Zero-Shot Information Retrieval

1 code implementation • 24 May 2023 • Michael Tang, Shunyu Yao, John Yang, Karthik Narasimhan

We propose Referral-Augmented Retrieval (RAR), a simple technique that concatenates document indices with referrals, i.e., text from other documents that cite or link to the given document, to provide significant performance gains for zero-shot information retrieval.

Information Retrieval Retrieval

Tree of Thoughts: Deliberate Problem Solving with Large Language Models

3 code implementations • NeurIPS 2023 • Shunyu Yao, Dian Yu, Jeffrey Zhao, Izhak Shafran, Thomas L. Griffiths, Yuan Cao, Karthik Narasimhan

Language models are increasingly being deployed for general problem solving across a wide range of tasks, but are still confined to token-level, left-to-right decision-making processes during inference.

Decision Making Language Modelling
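The Tree of Thoughts idea (searching over intermediate "thoughts" rather than decoding left to right) can be illustrated with a toy sketch. This is not the authors' implementation: `propose` and `score` are hypothetical stand-ins for LLM calls that generate and evaluate partial solutions, applied here to a trivial string-building problem.

```python
# Toy Tree-of-Thoughts-style breadth-first search: build the string "abc"
# one character at a time, keeping only the best partial "thoughts".
def propose(state):
    # Stand-in for an LLM proposing candidate next thoughts.
    return [state + ch for ch in "abc"]

def score(state, target="abc"):
    # Stand-in for an LLM value function: count matching prefix positions.
    return sum(1 for a, b in zip(state, target) if a == b)

def tree_of_thoughts(steps=3, beam=2):
    frontier = [""]  # root of the thought tree
    for _ in range(steps):
        candidates = [s for state in frontier for s in propose(state)]
        # Deliberate step: evaluate all candidates, keep the top `beam`.
        frontier = sorted(candidates, key=score, reverse=True)[:beam]
    return max(frontier, key=score)

print(tree_of_thoughts())  # → abc
```

The point of the sketch is the control flow: instead of committing to one token-by-token continuation, the search explores several thought branches and prunes by an explicit evaluation, which is the deliberate decision-making the abstract contrasts with left-to-right decoding.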

Toxicity in ChatGPT: Analyzing Persona-assigned Language Models

no code implementations • 11 Apr 2023 • Ameet Deshpande, Vishvak Murahari, Tanmay Rajpurohit, Ashwin Kalyan, Karthik Narasimhan

Large language models (LLMs) have shown incredible capabilities and transcended the natural language processing (NLP) community, with adoption throughout many services like healthcare, therapy, education, and customer service.

Reflexion: Language Agents with Verbal Reinforcement Learning

2 code implementations • NeurIPS 2023 • Noah Shinn, Federico Cassano, Edward Berman, Ashwin Gopinath, Karthik Narasimhan, Shunyu Yao

Large language models (LLMs) have been increasingly used to interact with external environments (e.g., games, compilers, APIs) as goal-driven agents.

Decision Making reinforcement-learning

MUX-PLMs: Data Multiplexing for High-throughput Language Models

1 code implementation • 24 Feb 2023 • Vishvak Murahari, Ameet Deshpande, Carlos E. Jimenez, Izhak Shafran, Mingqiu Wang, Yuan Cao, Karthik Narasimhan

The widespread adoption of large language models such as ChatGPT and Bard has led to unprecedented demand for these technologies.

SemSup-XC: Semantic Supervision for Zero and Few-shot Extreme Classification

1 code implementation • 26 Jan 2023 • Pranjal Aggarwal, Ameet Deshpande, Karthik Narasimhan

In this paper, we develop SemSup-XC, a model that achieves state-of-the-art zero-shot and few-shot performance on three XC datasets derived from legal, e-commerce, and Wikipedia data.

Contrastive Learning

Building Scalable Video Understanding Benchmarks through Sports

no code implementations • 17 Jan 2023 • Aniket Agarwal, Alex Zhang, Karthik Narasimhan, Igor Gilitschenski, Vishvak Murahari, Yash Kant

Our human studies indicate that ASAP can align videos and annotations with high fidelity, precision, and speed.

Video Understanding

Controllable Text Generation with Language Constraints

no code implementations • 20 Dec 2022 • Howard Chen, Huihan Li, Danqi Chen, Karthik Narasimhan

We consider the task of text generation in language models with constraints specified in natural language.

Attribute Language Modelling +1

SPARTAN: Sparse Hierarchical Memory for Parameter-Efficient Transformers

1 code implementation • 29 Nov 2022 • Ameet Deshpande, Md Arafat Sultan, Anthony Ferritto, Ashwin Kalyan, Karthik Narasimhan, Avirup Sil

Fine-tuning pre-trained language models (PLMs) achieves impressive performance on a range of downstream tasks, and their sizes have consequently been getting bigger.

ALIGN-MLM: Word Embedding Alignment is Crucial for Multilingual Pre-training

1 code implementation • 15 Nov 2022 • Henry Tang, Ameet Deshpande, Karthik Narasimhan

In particular, ALIGN-MLM outperforms XLM and MLM by 35 and 30 F1 points on POS-tagging for transfer between languages that differ both in their script and word order (left-to-right vs.

POS POS Tagging +2

ReAct: Synergizing Reasoning and Acting in Language Models

5 code implementations • 6 Oct 2022 • Shunyu Yao, Jeffrey Zhao, Dian Yu, Nan Du, Izhak Shafran, Karthik Narasimhan, Yuan Cao

While large language models (LLMs) have demonstrated impressive capabilities across tasks in language understanding and interactive decision making, their abilities for reasoning (e.g., chain-of-thought prompting) and acting (e.g., action plan generation) have primarily been studied as separate topics.

Decision Making Fact Verification +2
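The reasoning-and-acting interleaving that ReAct describes can be sketched with a toy loop. This is an illustration only, not the paper's implementation: the `FACTS` dictionary, `lookup` tool, and fixed "thought" string are hypothetical stand-ins for a real search API and an LLM policy.

```python
# Toy ReAct-style loop: interleave a reasoning step (thought), an action
# against a tool, and the resulting observation, accumulating all three
# in the running context that would condition the next LLM call.
FACTS = {"capital of France": "Paris"}

def lookup(query):
    # Stand-in for an external tool such as a search API.
    return FACTS.get(query, "no result")

def react(question, max_steps=3):
    context = [f"Question: {question}"]
    for _ in range(max_steps):
        thought = f"Thought: I should look up '{question}'."  # reasoning
        observation = lookup(question)                        # acting
        context += [thought, f"Action: search[{question}]",
                    f"Observation: {observation}"]
        if observation != "no result":
            return observation, context  # answer grounded in the tool
    return None, context

answer, trace = react("capital of France")
print(answer)  # → Paris
```

The design point is that each observation is fed back into the context, so subsequent "thoughts" can react to what the environment returned, rather than reasoning and acting being separate passes.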

WebShop: Towards Scalable Real-World Web Interaction with Grounded Language Agents

1 code implementation • 4 Jul 2022 • Shunyu Yao, Howard Chen, John Yang, Karthik Narasimhan

Existing benchmarks for grounding language in interactive environments either lack real-world linguistic elements, or prove difficult to scale up due to substantial human involvement in the collection of data or feedback signals.

Imitation Learning Navigate

Leveraging Language for Accelerated Learning of Tool Manipulation

no code implementations • 27 Jun 2022 • Allen Z. Ren, Bharat Govil, Tsung-Yen Yang, Karthik Narasimhan, Anirudha Majumdar

Robust and generalized tool manipulation requires an understanding of the properties and affordances of different tools.

Meta-Learning

Using Natural Language and Program Abstractions to Instill Human Inductive Biases in Machines

1 code implementation • 23 May 2022 • Sreejan Kumar, Carlos G. Correa, Ishita Dasgupta, Raja Marjieh, Michael Y. Hu, Robert D. Hawkins, Nathaniel D. Daw, Jonathan D. Cohen, Karthik Narasimhan, Thomas L. Griffiths

Co-training on these representations results in more human-like behavior in downstream meta-reinforcement learning agents than less abstract controls (synthetic language descriptions, program induction without learned primitives), suggesting that the abstraction supported by these representations is key.

Meta-Learning Meta Reinforcement Learning +2

Can Rationalization Improve Robustness?

1 code implementation • NAACL 2022 • Howard Chen, Jacqueline He, Karthik Narasimhan, Danqi Chen

Our experiments reveal that rationale models show promise for improving robustness, but they struggle in certain scenarios, such as when the rationalizer is sensitive to positional bias or lexical choices in the attack text.

Sentence

CARETS: A Consistency And Robustness Evaluative Test Suite for VQA

1 code implementation • ACL 2022 • Carlos E. Jimenez, Olga Russakovsky, Karthik Narasimhan

We introduce CARETS, a systematic test suite to measure consistency and robustness of modern VQA models through a series of six fine-grained capability tests.

Negation Question Generation +2

SemSup: Semantic Supervision for Simple and Scalable Zero-shot Generalization

1 code implementation • 26 Feb 2022 • Austin W. Hanjie, Ameet Deshpande, Karthik Narasimhan

Prior work in this vein has largely used expensive per-instance annotations or singular class-level descriptions, but per-instance descriptions are hard to scale and single class descriptions may not be rich enough.

Semantic Similarity Semantic Textual Similarity +3

DataMUX: Data Multiplexing for Neural Networks

1 code implementation • 18 Feb 2022 • Vishvak Murahari, Carlos E. Jimenez, Runzhe Yang, Karthik Narasimhan

In this paper, we introduce data multiplexing (DataMUX), a technique that enables deep neural networks to process multiple inputs simultaneously using a single compact representation.

Image Classification named-entity-recognition +5

Multi-Query Video Retrieval

1 code implementation • 10 Jan 2022 • Zeyu Wang, Yu Wu, Karthik Narasimhan, Olga Russakovsky

Retrieving target videos based on text descriptions is a task of great practical value and has received increasing attention over the past few years.

Retrieval Video Retrieval

Multi-Stage Episodic Control for Strategic Exploration in Text Games

1 code implementation • ICLR 2022 • Jens Tuyls, Shunyu Yao, Sham Kakade, Karthik Narasimhan

Text adventure games present unique challenges to reinforcement learning methods due to their combinatorially large action spaces and sparse rewards.

SILG: The Multi-domain Symbolic Interactive Language Grounding Benchmark

no code implementations • NeurIPS 2021 • Victor Zhong, Austin Hanjie, Sida Wang, Karthik Narasimhan, Luke Zettlemoyer

We hope SILG enables the community to quickly identify new methodologies for language grounding that generalize to a diverse set of environments and their associated challenges.

Grounded language learning NetHack

When is BERT Multilingual? Isolating Crucial Ingredients for Cross-lingual Transfer

2 code implementations • NAACL 2022 • Ameet Deshpande, Partha Talukdar, Karthik Narasimhan

While recent work on multilingual language models has demonstrated their capacity for cross-lingual zero-shot transfer on downstream tasks, there is a lack of consensus in the community as to what shared properties between languages enable such transfer.

Cross-Lingual Transfer

SILG: The Multi-environment Symbolic Interactive Language Grounding Benchmark

1 code implementation • 20 Oct 2021 • Victor Zhong, Austin W. Hanjie, Sida I. Wang, Karthik Narasimhan, Luke Zettlemoyer

We hope SILG enables the community to quickly identify new methodologies for language grounding that generalize to a diverse set of environments and their associated challenges.

Grounded language learning NetHack

Revelio: ML-Generated Debugging Queries for Distributed Systems

no code implementations • 28 Jun 2021 • Pradeep Dogga, Karthik Narasimhan, Anirudh Sivaraman, Shiv Kumar Saini, George Varghese, Ravi Netravali

A major difficulty in debugging distributed systems lies in manually determining which of the many available debugging tools to use and how to query its logs.

Self-Attention Networks Can Process Bounded Hierarchical Languages

1 code implementation • ACL 2021 • Shunyu Yao, Binghui Peng, Christos Papadimitriou, Karthik Narasimhan

Despite their impressive performance in NLP, self-attention networks were recently proved to be limited for processing formal languages with hierarchical structure, such as $\mathsf{Dyck}_k$, the language consisting of well-nested parentheses of $k$ types.

Hard Attention

Grounding Language to Entities and Dynamics for Generalization in Reinforcement Learning

1 code implementation • 19 Jan 2021 • Austin W. Hanjie, Victor Zhong, Karthik Narasimhan

We investigate the use of natural language to drive the generalization of control policies and introduce the new multi-task environment Messenger with free-form text manuals describing the environment dynamics.

reinforcement-learning Reinforcement Learning (RL) +1

Connecting Context-specific Adaptation in Humans to Meta-learning

no code implementations • 27 Nov 2020 • Rachit Dubey, Erin Grant, Michael Luo, Karthik Narasimhan, Thomas Griffiths

This work connects the context-sensitive nature of cognitive control to a method for meta-learning with context-conditioned adaptation.

Meta-Learning

Improving Dialog Systems for Negotiation with Personality Modeling

1 code implementation • ACL 2021 • Runzhe Yang, Jingxiao Chen, Karthik Narasimhan

In this paper, we explore the ability to model and infer personality types of opponents, predict their responses, and use this information to adapt a dialog agent's high-level strategy in negotiation tasks.

Projection-Based Constrained Policy Optimization

no code implementations • ICLR 2020 • Tsung-Yen Yang, Justinian Rosca, Karthik Narasimhan, Peter J. Ramadge

We consider the problem of learning control policies that optimize a reward function while satisfying constraints due to considerations of safety, fairness, or other costs.

Fairness

Keep CALM and Explore: Language Models for Action Generation in Text-based Games

1 code implementation • EMNLP 2020 • Shunyu Yao, Rohan Rao, Matthew Hausknecht, Karthik Narasimhan

In this paper, we propose the Contextual Action Language Model (CALM) to generate a compact set of action candidates at each game state.

Action Generation Language Modelling +1

Towards Unique and Informative Captioning of Images

1 code implementation • ECCV 2020 • Zeyu Wang, Berthy Feng, Karthik Narasimhan, Olga Russakovsky

We find that modern captioning systems return higher likelihoods for incorrect distractor sentences compared to ground truth captions, and that evaluation metrics like SPICE can be 'topped' using simple captioning systems relying on object detectors.

Image Captioning Re-Ranking

Accelerating Safe Reinforcement Learning with Constraint-mismatched Policies

no code implementations • 20 Jun 2020 • Tsung-Yen Yang, Justinian Rosca, Karthik Narasimhan, Peter J. Ramadge

We consider the problem of reinforcement learning when provided with (1) a baseline control policy and (2) a set of constraints that the learner must satisfy.

Fairness reinforcement-learning +2

Universal Adversarial Attacks with Natural Triggers for Text Classification

1 code implementation • NAACL 2021 • Liwei Song, Xinwei Yu, Hsuan-Tung Peng, Karthik Narasimhan

Recent work has demonstrated the vulnerability of modern text classifiers to universal adversarial attacks, which are input-agnostic sequences of words added to text processed by classifiers.

General Classification text-classification +1

Take the Scenic Route: Improving Generalization in Vision-and-Language Navigation

no code implementations • 31 Mar 2020 • Felix Yu, Zhiwei Deng, Karthik Narasimhan, Olga Russakovsky

In the Vision-and-Language Navigation (VLN) task, an agent with egocentric vision navigates to a destination given natural language instructions.

Vision and Language Navigation

A Generalized Algorithm for Multi-Objective Reinforcement Learning and Policy Adaptation

3 code implementations • NeurIPS 2019 • Runzhe Yang, Xingyuan Sun, Karthik Narasimhan

We introduce a new algorithm for multi-objective reinforcement learning (MORL) with linear preferences, with the goal of enabling few-shot adaptation to new tasks.

Multi-Objective Reinforcement Learning reinforcement-learning

Calibration, Entropy Rates, and Memory in Language Models

no code implementations • ICML 2020 • Mark Braverman, Xinyi Chen, Sham M. Kakade, Karthik Narasimhan, Cyril Zhang, Yi Zhang

Building accurate language models that capture meaningful long-term dependencies is a core challenge in natural language processing.

Task-Agnostic Dynamics Priors for Deep Reinforcement Learning

1 code implementation • 13 May 2019 • Yilun Du, Karthik Narasimhan

While model-based deep reinforcement learning (RL) holds great promise for sample efficiency and generalization, learning an accurate dynamics model is often challenging and requires substantial interaction with the environment.

reinforcement-learning Reinforcement Learning (RL)

Learning Physics Priors for Deep Reinforcement Learning

no code implementations • 27 Sep 2018 • Yilun Du, Karthik Narasimhan

While model-based deep reinforcement learning (RL) holds great promise for sample efficiency and generalization, learning an accurate dynamics model is challenging and often requires substantial interactions with the environment.

Reinforcement Learning (RL) Transfer Learning

Improving Language Understanding by Generative Pre-Training

11 code implementations • Preprint 2018 • Alec Radford, Karthik Narasimhan, Tim Salimans, Ilya Sutskever

We demonstrate that large gains on these tasks can be realized by generative pre-training of a language model on a diverse corpus of unlabeled text, followed by discriminative fine-tuning on each specific task.

Cloze Test Document Classification +6

Grounding Language for Transfer in Deep Reinforcement Learning

1 code implementation • 1 Aug 2017 • Karthik Narasimhan, Regina Barzilay, Tommi Jaakkola

In this paper, we explore the utilization of natural language to drive transfer for reinforcement learning (RL).

reinforcement-learning Reinforcement Learning (RL)

Representation Learning for Grounded Spatial Reasoning

1 code implementation • TACL 2018 • Michael Janner, Karthik Narasimhan, Regina Barzilay

The interpretation of spatial references is highly contextual, requiring joint inference over both language and the environment.

reinforcement-learning Reinforcement Learning (RL) +1

Unsupervised Learning of Morphological Forests

no code implementations • TACL 2017 • Jiaming Luo, Karthik Narasimhan, Regina Barzilay

This paper focuses on unsupervised modeling of morphological families, collectively comprising a forest over the language vocabulary.

Clustering

sk_p: a neural program corrector for MOOCs

no code implementations • 11 Jul 2016 • Yewen Pu, Karthik Narasimhan, Armando Solar-Lezama, Regina Barzilay

We present a novel technique for automatic program correction in MOOCs, capable of fixing both syntactic and semantic errors without manual, problem specific correction strategies.

Machine Translation Translation

An Unsupervised Method for Uncovering Morphological Chains

1 code implementation • TACL 2015 • Karthik Narasimhan, Regina Barzilay, Tommi Jaakkola

In contrast, we propose a model for unsupervised morphological analysis that integrates orthographic and semantic views of words.

Morphological Analysis

JUMP-Means: Small-Variance Asymptotics for Markov Jump Processes

no code implementations • 1 Mar 2015 • Jonathan H. Huggins, Karthik Narasimhan, Ardavan Saeedi, Vikash K. Mansinghka

We derive the small-variance asymptotics for parametric and nonparametric MJPs for both directly observed and hidden state models.
