Search Results for author: Shuyan Zhou

Found 19 papers, 15 papers with code

OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments

no code implementations • 11 Apr 2024 • Tianbao Xie, Danyang Zhang, Jixuan Chen, Xiaochuan Li, Siheng Zhao, Ruisheng Cao, Toh Jing Hua, Zhoujun Cheng, Dongchan Shin, Fangyu Lei, Yitao Liu, Yiheng Xu, Shuyan Zhou, Silvio Savarese, Caiming Xiong, Victor Zhong, Tao Yu

Autonomous agents that accomplish complex computer tasks with minimal human interventions have the potential to transform human-computer interaction, significantly enhancing accessibility and productivity.

Benchmarking

VisualWebArena: Evaluating Multimodal Agents on Realistic Visual Web Tasks

1 code implementation • 24 Jan 2024 • Jing Yu Koh, Robert Lo, Lawrence Jang, Vikram Duvvur, Ming Chong Lim, Po-Yu Huang, Graham Neubig, Shuyan Zhou, Ruslan Salakhutdinov, Daniel Fried

Through extensive quantitative and qualitative analysis, we identify several limitations of text-only LLM agents, and reveal gaps in the capabilities of state-of-the-art multimodal language agents.

142

WebArena: A Realistic Web Environment for Building Autonomous Agents

1 code implementation • 25 Jul 2023 • Shuyan Zhou, Frank F. Xu, Hao Zhu, Xuhui Zhou, Robert Lo, Abishek Sridhar, Xianyi Cheng, Tianyue Ou, Yonatan Bisk, Daniel Fried, Uri Alon, Graham Neubig

Building upon our environment, we release a set of benchmark tasks focusing on evaluating the functional correctness of task completions.

540

Hierarchical Prompting Assists Large Language Model on Web Navigation

3 code implementations • 23 May 2023 • Abishek Sridhar, Robert Lo, Frank F. Xu, Hao Zhu, Shuyan Zhou

Large language models (LLMs) struggle to process complicated observations in interactive decision-making tasks.

Decision Making • Language Modelling • +1

540

Bridging the Gap: A Survey on Integrating (Human) Feedback for Natural Language Generation

no code implementations • 1 May 2023 • Patrick Fernandes, Aman Madaan, Emmy Liu, António Farinhas, Pedro Henrique Martins, Amanda Bertsch, José G. C. de Souza, Shuyan Zhou, Tongshuang Wu, Graham Neubig, André F. T. Martins

Many recent advances in natural language generation have been fueled by training large language models on internet-scale data.

Text Generation

CodeBERTScore: Evaluating Code Generation with Pretrained Models of Code

1 code implementation • 10 Feb 2023 • Shuyan Zhou, Uri Alon, Sumit Agarwal, Graham Neubig

We release five language-specific pretrained models to use with our publicly available code.

Code Generation

134

Causal Reasoning of Entities and Events in Procedural Texts

1 code implementation • 26 Jan 2023 • Li Zhang, Hainiu Xu, Yue Yang, Shuyan Zhou, Weiqiu You, Manni Arora, Chris Callison-Burch

By injecting the causal relations between entities and events as intermediate reasoning steps in our representation, we further boost the performance to .67 F1.

Execution-Based Evaluation for Open-Domain Code Generation

1 code implementation • 20 Dec 2022 • Zhiruo Wang, Shuyan Zhou, Daniel Fried, Graham Neubig

To extend the scope of coding queries to more realistic settings, we propose ODEX, the first Open-Domain EXecution-based natural language (NL) to Python code generation dataset.

Code Generation • Memorization

PAL: Program-aided Language Models

2 code implementations • 18 Nov 2022 • Luyu Gao, Aman Madaan, Shuyan Zhou, Uri Alon, Pengfei Liu, Yiming Yang, Jamie Callan, Graham Neubig

Much of this success can be attributed to prompting methods such as "chain-of-thought", which employ LLMs for both understanding the problem description by decomposing it into steps, as well as solving each step of the problem.

Ranked #17 on Arithmetic Reasoning on GSM8K

Arithmetic Reasoning • GSM8K • +2

1,169

Language Models of Code are Few-Shot Commonsense Learners

1 code implementation • 13 Oct 2022 • Aman Madaan, Shuyan Zhou, Uri Alon, Yiming Yang, Graham Neubig

In all these natural language tasks, we show that using our approach, a code generation LM (CODEX) outperforms natural-LMs that are fine-tuned on the target task (e.g., T5) and other strong LMs such as GPT-3 in the few-shot setting.

Code Generation

DocPrompting: Generating Code by Retrieving the Docs

2 code implementations • 13 Jul 2022 • Shuyan Zhou, Uri Alon, Frank F. Xu, Zhiruo Wang, Zhengbao Jiang, Graham Neubig

Publicly available source-code libraries are continuously growing and changing.

Code Generation

220

MCoNaLa: A Benchmark for Code Generation from Multiple Natural Languages

1 code implementation • 16 Mar 2022 • Zhiruo Wang, Grace Cuenca, Shuyan Zhou, Frank F. Xu, Graham Neubig

While there has been a recent burgeoning of applications at the intersection of natural and programming languages, such as code generation and code summarization, these applications are usually English-centric.

Code Generation • Code Summarization

Show Me More Details: Discovering Hierarchies of Procedures from Semi-structured Web Data

1 code implementation • ACL 2022 • Shuyan Zhou, Li Zhang, Yue Yang, Qing Lyu, Pengcheng Yin, Chris Callison-Burch, Graham Neubig

To this end, we develop a simple and efficient method that links steps (e.g., "purchase a camera") in an article to other articles with similar goals (e.g., "how to choose a camera"), recursively constructing the KB.

Retrieval • Video Retrieval

Procedures as Programs: Hierarchical Control of Situated Agents through Natural Language

no code implementations • NAACL (SUKI) 2022 • Shuyan Zhou, Pengcheng Yin, Graham Neubig

When humans conceive how to perform a particular task, they do so hierarchically: splitting higher-level tasks into smaller sub-tasks.

Instruction Following

Soft Gazetteers for Low-Resource Named Entity Recognition

1 code implementation • ACL 2020 • Shruti Rijhwani, Shuyan Zhou, Graham Neubig, Jaime Carbonell

However, designing such features for low-resource languages is challenging, because exhaustive entity gazetteers do not exist in these languages.

Cross-Lingual Entity Linking • Entity Linking • +4

Improving Candidate Generation for Low-resource Cross-lingual Entity Linking

1 code implementation • TACL 2020 • Shuyan Zhou, Shruti Rijhwani, John Wieting, Jaime Carbonell, Graham Neubig

Cross-lingual entity linking (XEL) is the task of finding referents in a target-language knowledge base (KB) for mentions extracted from source-language texts.

Cross-Lingual Entity Linking • Entity Linking • +1

Towards Zero-resource Cross-lingual Entity Linking

1 code implementation • WS 2019 • Shuyan Zhou, Shruti Rijhwani, Graham Neubig

Cross-lingual entity linking (XEL) grounds named entities in a source language to an English Knowledge Base (KB), such as Wikipedia.

Cross-Lingual Entity Linking • Entity Linking

Improving Robustness of Neural Machine Translation with Multi-task Learning

1 code implementation • WS 2019 • Shuyan Zhou, Xiangkai Zeng, Yingqi Zhou, Antonios Anastasopoulos, Graham Neubig

While neural machine translation (NMT) achieves remarkable performance on clean, in-domain text, performance is known to degrade drastically when facing text which is full of typos, grammatical errors and other varieties of noise.

Machine Translation • Multi-Task Learning • +2

Aggregated Semantic Matching for Short Text Entity Linking

no code implementations • CONLL 2018 • Feng Nie, Shuyan Zhou, Jing Liu, Jinpeng Wang, Chin-Yew Lin, Rong Pan

The task of entity linking aims to identify concepts mentioned in text fragments and link them to a reference knowledge base.

Card Games • Entity Linking • +2
