Search Results for author: Shunyu Yao

Found 31 papers, 23 papers with code

Can Language Models Solve Olympiad Programming?

1 code implementation • 16 Apr 2024 • Quan Shi, Michael Tang, Karthik Narasimhan, Shunyu Yao

In this paper, we introduce the USACO benchmark with 307 problems from the USA Computing Olympiad, along with high-quality unit tests, reference code, and official analyses for each problem.

OS-Copilot: Towards Generalist Computer Agents with Self-Improvement

1 code implementation • 12 Feb 2024 • Zhiyong Wu, Chengcheng Han, Zichen Ding, Zhenmin Weng, Zhoumianze Liu, Shunyu Yao, Tao Yu, Lingpeng Kong

Autonomous interaction with the computer has been a longstanding challenge with great potential, and the recent proliferation of large language models (LLMs) has markedly accelerated progress in building digital agents.

Large Language Model for Multi-objective Evolutionary Optimization

1 code implementation • 19 Oct 2023 • Fei Liu, Xi Lin, Zhenkun Wang, Shunyu Yao, Xialiang Tong, Mingxuan Yuan, Qingfu Zhang

It is also promising to see the operator only learned from a few instances can have robust generalization performance on unseen problems with quite different patterns and settings.

Evolutionary Algorithms • Language Modelling +3

SWE-bench: Can Language Models Resolve Real-World GitHub Issues?

no code implementations • 10 Oct 2023 • Carlos E. Jimenez, John Yang, Alexander Wettig, Shunyu Yao, Kexin Pei, Ofir Press, Karthik Narasimhan

We find real-world software engineering to be a rich, sustainable, and challenging testbed for evaluating the next generation of language models.

Bug fixing • Code Generation +1

FireAct: Toward Language Agent Fine-tuning

no code implementations • 9 Oct 2023 • Baian Chen, Chang Shu, Ehsan Shareghi, Nigel Collier, Karthik Narasimhan, Shunyu Yao

Recent efforts have augmented language models (LMs) with external tools or environments, leading to the development of language agents that can reason and act.

Question Answering

Embers of Autoregression: Understanding Large Language Models Through the Problem They are Trained to Solve

no code implementations • 24 Sep 2023 • R. Thomas McCoy, Shunyu Yao, Dan Friedman, Matthew Hardy, Thomas L. Griffiths

This approach - which we call the teleological approach - leads us to identify three factors that we hypothesize will influence LLM accuracy: the probability of the task to be performed, the probability of the target output, and the probability of the provided input.

Cognitive Architectures for Language Agents

2 code implementations • 5 Sep 2023 • Theodore R. Sumers, Shunyu Yao, Karthik Narasimhan, Thomas L. Griffiths

Recent efforts have augmented large language models (LLMs) with external resources (e.g., the Internet) or internal control flows (e.g., prompt chaining) for tasks requiring grounding or reasoning, leading to a new class of language agents.

Decision Making

COLLIE: Systematic Construction of Constrained Text Generation Tasks

1 code implementation • 17 Jul 2023 • Shunyu Yao, Howard Chen, Austin W. Hanjie, Runzhe Yang, Karthik Narasimhan

Text generation under constraints has seen increasing interest in natural language processing, especially with the rapidly improving capabilities of large language models.

Logical Reasoning • Sentence +1

InterCode: Standardizing and Benchmarking Interactive Coding with Execution Feedback

2 code implementations • NeurIPS 2023 • John Yang, Akshara Prabhakar, Karthik Narasimhan, Shunyu Yao

Our framework is language and platform agnostic, uses self-contained Docker environments to provide safe and reproducible execution, and is compatible out-of-the-box with traditional seq2seq coding methods, while enabling the development of new methods for interactive code generation.

Benchmarking • Code Generation +1
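The interaction pattern InterCode standardizes can be sketched as a step function that executes submitted code and returns feedback for the next attempt. The real framework runs code inside self-contained Docker environments; this toy stand-in simply exec()s Python in-process and is only meant to illustrate the feedback loop.

```python
# Toy sketch of an interactive-coding step: execute the agent's code,
# return an observation or an error message as feedback.
# NOT the InterCode API; a minimal in-process stand-in for illustration.
def step(env_state, code):
    """Run `code` against the shared environment state; report success or error."""
    try:
        exec(code, env_state)
        return {"observation": "ok", "error": None}
    except Exception as e:
        return {"observation": None, "error": str(e)}

state = {}
fb = step(state, "x = 1 +")      # broken first attempt -> error feedback
assert fb["error"] is not None
fb = step(state, "x = 1 + 1")    # corrected attempt after feedback
assert fb["error"] is None and state["x"] == 2
```

Execution feedback of this kind is what lets interactive methods outperform single-shot seq2seq generation on the benchmark's tasks.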

Referral Augmentation for Zero-Shot Information Retrieval

1 code implementation • 24 May 2023 • Michael Tang, Shunyu Yao, John Yang, Karthik Narasimhan

We propose Referral-Augmented Retrieval (RAR), a simple technique that concatenates document indices with referrals, i.e., text from other documents that cite or link to the given document, to provide significant performance gains for zero-shot information retrieval.

Information Retrieval • Retrieval
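The referral idea can be sketched in a few lines: index each document together with the text that other documents use when citing or linking to it. The two-document corpus, the referral strings, and the word-overlap scorer below are toy stand-ins for a real corpus and retriever.

```python
# Sketch of Referral-Augmented Retrieval (RAR) with hypothetical data.
docs = {
    "d1": "A paper on tree search for language models.",
    "d2": "A paper on retrieval benchmarks.",
}
# Referrals: text from OTHER documents that cite or link to the given document.
referrals = {
    "d1": ["This work extends tree of thoughts prompting."],
    "d2": [],
}

def augment(doc_id):
    """Concatenate a document with the referral text that points to it."""
    return " ".join([docs[doc_id]] + referrals[doc_id])

def score(query, text):
    """Toy relevance score: count of shared lowercase tokens."""
    return len(set(query.lower().split()) & set(text.lower().split()))

def retrieve(query):
    return max(docs, key=lambda d: score(query, augment(d)))

print(retrieve("tree of thoughts"))  # referral text makes d1 the top hit
```

The point of the augmentation is visible here: the query terms never appear in d1's own text, only in how another document refers to it.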

Tree of Thoughts: Deliberate Problem Solving with Large Language Models

3 code implementations NeurIPS 2023 Shunyu Yao, Dian Yu, Jeffrey Zhao, Izhak Shafran, Thomas L. Griffiths, Yuan Cao, Karthik Narasimhan

Language models are increasingly being deployed for general problem solving across a wide range of tasks, but are still confined to token-level, left-to-right decision-making processes during inference.

Decision Making • Language Modelling
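The deliberate-search idea behind Tree of Thoughts can be sketched as a beam search over partial "thoughts", where an LM proposes candidate continuations and a value estimate prunes them. The digit-sum task and the propose/value functions below are toy stand-ins for the LM calls, not the paper's implementation.

```python
# Sketch of Tree-of-Thoughts-style search: expand partial thoughts
# breadth-first, keep the top-k per step by a value estimate.
TARGET = 9  # toy goal: build a digit sequence summing to TARGET

def propose(state):
    """Stand-in for an LM proposing next thoughts: extend by one digit 0-5."""
    return [state + [d] for d in range(0, 6)]

def value(state):
    """Stand-in for an LM value estimate: closeness to the target sum."""
    return -abs(TARGET - sum(state))

def tree_of_thoughts(steps=3, beam=2):
    frontier = [[]]
    for _ in range(steps):
        candidates = [s for st in frontier for s in propose(st)]
        frontier = sorted(candidates, key=value, reverse=True)[:beam]
    return frontier[0]

best = tree_of_thoughts()
assert sum(best) == TARGET
```

Unlike left-to-right decoding, the search keeps several partial solutions alive and can discard a locally attractive but globally poor branch.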

Personality Understanding of Fictional Characters during Book Reading

1 code implementation • 17 May 2023 • Mo Yu, Jiangnan Li, Shunyu Yao, Wenjie Pang, Xiaochen Zhou, Zhou Xiao, Fandong Meng, Jie Zhou

As readers engage with a story, their understanding of a character evolves based on new events and information; and multiple fine-grained aspects of personalities can be perceived.

EC^2: Emergent Communication for Embodied Control

no code implementations • 19 Apr 2023 • Yao Mu, Shunyu Yao, Mingyu Ding, Ping Luo, Chuang Gan

We learn embodied representations of video trajectories, emergent language, and natural language using a language model, which is then used to finetune a lightweight policy network for downstream control.

Contrastive Learning • Language Modelling

Reflexion: Language Agents with Verbal Reinforcement Learning

2 code implementations • NeurIPS 2023 • Noah Shinn, Federico Cassano, Edward Berman, Ashwin Gopinath, Karthik Narasimhan, Shunyu Yao

Large language models (LLMs) have been increasingly used to interact with external environments (e.g., games, compilers, APIs) as goal-driven agents.

Decision Making • reinforcement-learning

EC^2: Emergent Communication for Embodied Control

no code implementations • CVPR 2023 • Yao Mu, Shunyu Yao, Mingyu Ding, Ping Luo, Chuang Gan

We learn embodied representations of video trajectories, emergent language, and natural language using a language model, which is then used to finetune a lightweight policy network for downstream control.

Contrastive Learning • Language Modelling

Revisiting the Roles of "Text" in Text Games

no code implementations • 15 Oct 2022 • Yi Gu, Shunyu Yao, Chuang Gan, Joshua B. Tenenbaum, Mo Yu

Text games present opportunities for natural language understanding (NLU) methods to tackle reinforcement learning (RL) challenges.

Natural Language Understanding • Passage Retrieval +2

ReAct: Synergizing Reasoning and Acting in Language Models

5 code implementations • 6 Oct 2022 • Shunyu Yao, Jeffrey Zhao, Dian Yu, Nan Du, Izhak Shafran, Karthik Narasimhan, Yuan Cao

While large language models (LLMs) have demonstrated impressive capabilities across tasks in language understanding and interactive decision making, their abilities for reasoning (e.g., chain-of-thought prompting) and acting (e.g., action plan generation) have primarily been studied as separate topics.

Decision Making • Fact Verification +2
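The reason-then-act loop that ReAct interleaves can be sketched as alternating LM turns and tool calls, with each observation fed back into the context. The scripted llm stub, the tiny knowledge base, and the lookup/finish action names below are all hypothetical stand-ins chosen for illustration.

```python
# Sketch of a ReAct-style loop: thought -> action -> observation, repeated.
KB = {"capital of France": "Paris"}  # toy stand-in for a search tool

def llm(context):
    """Scripted stand-in for the language-model policy."""
    if "Observation" not in context:
        return "Thought: I should look this up.\nAction: lookup[capital of France]"
    return "Action: finish[Paris]"

def react(question, max_steps=4):
    context = f"Question: {question}"
    for _ in range(max_steps):
        step = llm(context)
        context += "\n" + step
        action = step.split("Action: ")[1]
        if action.startswith("finish["):
            return action[len("finish["):-1]
        query = action[len("lookup["):-1]
        context += f"\nObservation: {KB.get(query, 'unknown')}"
    return None

print(react("What is the capital of France?"))
```

The design point is that the reasoning trace and the tool results share one growing context, so each informs the other.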

WebShop: Towards Scalable Real-World Web Interaction with Grounded Language Agents

1 code implementation • 4 Jul 2022 • Shunyu Yao, Howard Chen, John Yang, Karthik Narasimhan

Existing benchmarks for grounding language in interactive environments either lack real-world linguistic elements, or prove difficult to scale up due to substantial human involvement in the collection of data or feedback signals.

Imitation Learning • Navigate

TVShowGuess: Character Comprehension in Stories as Speaker Guessing

1 code implementation • NAACL 2022 • Yisi Sang, Xiangyang Mou, Mo Yu, Shunyu Yao, Jing Li, Jeffrey Stanton

We propose a new task for assessing machines' skills of understanding fictional characters in narrative stories.

Linking Emergent and Natural Languages via Corpus Transfer

1 code implementation • ICLR 2022 • Shunyu Yao, Mo Yu, Yang Zhang, Karthik R Narasimhan, Joshua B. Tenenbaum, Chuang Gan

In this work, we propose a novel way to establish such a link by corpus transfer, i.e., pretraining on a corpus of emergent language for downstream natural language tasks, which is in contrast to prior work that directly transfers speaker and listener parameters.

Attribute • Disentanglement +2

Multi-Stage Episodic Control for Strategic Exploration in Text Games

1 code implementation • ICLR 2022 • Jens Tuyls, Shunyu Yao, Sham Kakade, Karthik Narasimhan

Text adventure games present unique challenges to reinforcement learning methods due to their combinatorially large action spaces and sparse rewards.

DFA-NeRF: Personalized Talking Head Generation via Disentangled Face Attributes Neural Rendering

no code implementations • 3 Jan 2022 • Shunyu Yao, RuiZhe Zhong, Yichao Yan, Guangtao Zhai, Xiaokang Yang

Specifically, neural radiance field takes lip movements features and personalized attributes as two disentangled conditions, where lip movements are directly predicted from the audio inputs to achieve lip-synchronized generation.

Neural Rendering • Talking Head Generation

Calibrated RGB-D Salient Object Detection

1 code implementation • CVPR 2021 • Wei Ji, Jingjing Li, Shuang Yu, Miao Zhang, Yongri Piao, Shunyu Yao, Qi Bi, Kai Ma, Yefeng Zheng, Huchuan Lu, Li Cheng

Complex backgrounds and similar appearances between objects and their surroundings are generally recognized as challenging scenarios in Salient Object Detection (SOD).

Object • object-detection +3

Self-Attention Networks Can Process Bounded Hierarchical Languages

1 code implementation • ACL 2021 • Shunyu Yao, Binghui Peng, Christos Papadimitriou, Karthik Narasimhan

Despite their impressive performance in NLP, self-attention networks were recently proved to be limited for processing formal languages with hierarchical structure, such as $\mathsf{Dyck}_k$, the language consisting of well-nested parentheses of $k$ types.

Hard Attention
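For reference, membership in the $\mathsf{Dyck}_k$ language the abstract mentions is checkable with a simple stack, one push per opening bracket and one matched pop per closing bracket. A minimal sketch for k=2 bracket types:

```python
# Stack-based membership check for Dyck_2: well-nested strings over () and [].
PAIRS = {")": "(", "]": "["}

def is_dyck(s):
    """Return True iff s is a well-nested string of ( ) and [ ]."""
    stack = []
    for ch in s:
        if ch in "([":
            stack.append(ch)               # opening bracket: push its type
        elif ch in PAIRS:
            if not stack or stack.pop() != PAIRS[ch]:
                return False               # closer with no matching opener
        else:
            return False                   # non-bracket symbol
    return not stack                       # all openers must be closed

assert is_dyck("([()[]])")
assert not is_dyck("([)]")
```

The stack depth here corresponds to the nesting bound in the paper's notion of bounded hierarchical languages.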

Keep CALM and Explore: Language Models for Action Generation in Text-based Games

1 code implementation • EMNLP 2020 • Shunyu Yao, Rohan Rao, Matthew Hausknecht, Karthik Narasimhan

In this paper, we propose the Contextual Action Language Model (CALM) to generate a compact set of action candidates at each game state.

Action Generation • Language Modelling +1
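The candidate-generation idea can be sketched as scoring a fixed action set against the current game observation and keeping only a compact top-k set for the agent to explore. The candidate list and the word-overlap score below are toy stand-ins for CALM's fine-tuned language model.

```python
# Sketch of CALM-style action candidate generation for a text game state.
CANDIDATES = ["open door", "take key", "eat lamp", "go north", "unlock door"]

def tokens(text):
    """Lowercase word set, punctuation stripped."""
    return set(text.lower().replace(".", "").split())

def lm_score(observation, action):
    """Toy stand-in for the LM's probability of an action in context."""
    return len(tokens(observation) & tokens(action))

def generate_actions(observation, k=2):
    """Return a compact set of the k most plausible action candidates."""
    return sorted(CANDIDATES, key=lambda a: lm_score(observation, a), reverse=True)[:k]

obs = "You see a locked door. A key lies on the floor."
print(generate_actions(obs))  # actions mentioning door/key rank first
```

Pruning the combinatorial action space to a few LM-plausible candidates is what makes downstream RL exploration tractable in these games.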

Modeling Expectation Violation in Intuitive Physics with Coarse Probabilistic Object Representations

1 code implementation • NeurIPS 2019 • Kevin Smith, Lingjie Mei, Shunyu Yao, Jiajun Wu, Elizabeth Spelke, Josh Tenenbaum, Tomer Ullman

We also present a new test set for measuring violations of physical expectations, using a range of scenarios derived from developmental psychology.

Scene Understanding

3D-Aware Scene Manipulation via Inverse Graphics

1 code implementation • NeurIPS 2018 • Shunyu Yao, Tzu Ming Harry Hsu, Jun-Yan Zhu, Jiajun Wu, Antonio Torralba, William T. Freeman, Joshua B. Tenenbaum

In this work, we propose 3D scene de-rendering networks (3D-SDN) to address the above issues by integrating disentangled representations for semantics, geometry, and appearance into a deep generative model.

Disentanglement • Object
