no code implementations • 11 Apr 2024 • Tianbao Xie, Danyang Zhang, Jixuan Chen, Xiaochuan Li, Siheng Zhao, Ruisheng Cao, Toh Jing Hua, Zhoujun Cheng, Dongchan Shin, Fangyu Lei, Yitao Liu, Yiheng Xu, Shuyan Zhou, Silvio Savarese, Caiming Xiong, Victor Zhong, Tao Yu
Autonomous agents that accomplish complex computer tasks with minimal human interventions have the potential to transform human-computer interaction, significantly enhancing accessibility and productivity.
1 code implementation • 24 Jan 2024 • Jing Yu Koh, Robert Lo, Lawrence Jang, Vikram Duvvur, Ming Chong Lim, Po-Yu Huang, Graham Neubig, Shuyan Zhou, Ruslan Salakhutdinov, Daniel Fried
Through extensive quantitative and qualitative analysis, we identify several limitations of text-only LLM agents, and reveal gaps in the capabilities of state-of-the-art multimodal language agents.
1 code implementation • 25 Jul 2023 • Shuyan Zhou, Frank F. Xu, Hao Zhu, Xuhui Zhou, Robert Lo, Abishek Sridhar, Xianyi Cheng, Tianyue Ou, Yonatan Bisk, Daniel Fried, Uri Alon, Graham Neubig
Building upon our environment, we release a set of benchmark tasks focusing on evaluating the functional correctness of task completions.
3 code implementations • 23 May 2023 • Abishek Sridhar, Robert Lo, Frank F. Xu, Hao Zhu, Shuyan Zhou
Large language models (LLMs) struggle to process complicated observations in interactive decision-making tasks.
no code implementations • 1 May 2023 • Patrick Fernandes, Aman Madaan, Emmy Liu, António Farinhas, Pedro Henrique Martins, Amanda Bertsch, José G. C. de Souza, Shuyan Zhou, Tongshuang Wu, Graham Neubig, André F. T. Martins
Many recent advances in natural language generation have been fueled by training large language models on internet-scale data.
1 code implementation • 10 Feb 2023 • Shuyan Zhou, Uri Alon, Sumit Agarwal, Graham Neubig
We release five language-specific pretrained models to use with our publicly available code.
1 code implementation • 26 Jan 2023 • Li Zhang, Hainiu Xu, Yue Yang, Shuyan Zhou, Weiqiu You, Manni Arora, Chris Callison-Burch
By injecting the causal relations between entities and events as intermediate reasoning steps in our representation, we further boost the performance to .67 F1.
1 code implementation • 20 Dec 2022 • Zhiruo Wang, Shuyan Zhou, Daniel Fried, Graham Neubig
To extend the scope of coding queries to more realistic settings, we propose ODEX, the first Open-Domain EXecution-based natural language (NL) to Python code generation dataset.
2 code implementations • 18 Nov 2022 • Luyu Gao, Aman Madaan, Shuyan Zhou, Uri Alon, Pengfei Liu, Yiming Yang, Jamie Callan, Graham Neubig
Much of this success can be attributed to prompting methods such as "chain-of-thought", which employ LLMs both for understanding the problem description by decomposing it into steps and for solving each step of the problem.
Ranked #18 on Arithmetic Reasoning on GSM8K
1 code implementation • 13 Oct 2022 • Aman Madaan, Shuyan Zhou, Uri Alon, Yiming Yang, Graham Neubig
In all these natural language tasks, we show that using our approach, a code generation LM (CODEX) outperforms natural-LMs that are fine-tuned on the target task (e.g., T5) and other strong LMs such as GPT-3 in the few-shot setting.
2 code implementations • 13 Jul 2022 • Shuyan Zhou, Uri Alon, Frank F. Xu, Zhiruo Wang, Zhengbao Jiang, Graham Neubig
Publicly available source-code libraries are continuously growing and changing.
1 code implementation • 16 Mar 2022 • Zhiruo Wang, Grace Cuenca, Shuyan Zhou, Frank F. Xu, Graham Neubig
While there has been a recent burgeoning of applications at the intersection of natural and programming languages, such as code generation and code summarization, these applications are usually English-centric.
1 code implementation • ACL 2022 • Shuyan Zhou, Li Zhang, Yue Yang, Qing Lyu, Pengcheng Yin, Chris Callison-Burch, Graham Neubig
To this end, we develop a simple and efficient method that links steps (e.g., "purchase a camera") in an article to other articles with similar goals (e.g., "how to choose a camera"), recursively constructing the KB.
no code implementations • NAACL (SUKI) 2022 • Shuyan Zhou, Pengcheng Yin, Graham Neubig
When humans conceive how to perform a particular task, they do so hierarchically: splitting higher-level tasks into smaller sub-tasks.
1 code implementation • ACL 2020 • Shruti Rijhwani, Shuyan Zhou, Graham Neubig, Jaime Carbonell
However, designing such features for low-resource languages is challenging, because exhaustive entity gazetteers do not exist in these languages.
1 code implementation • TACL 2020 • Shuyan Zhou, Shruti Rijhwani, John Wieting, Jaime Carbonell, Graham Neubig
Cross-lingual entity linking (XEL) is the task of finding referents in a target-language knowledge base (KB) for mentions extracted from source-language texts.
1 code implementation • WS 2019 • Shuyan Zhou, Shruti Rijhwani, Graham Neubig
Cross-lingual entity linking (XEL) grounds named entities in a source language to an English Knowledge Base (KB), such as Wikipedia.
1 code implementation • WS 2019 • Shuyan Zhou, Xiangkai Zeng, Yingqi Zhou, Antonios Anastasopoulos, Graham Neubig
While neural machine translation (NMT) achieves remarkable performance on clean, in-domain text, performance is known to degrade drastically on text that is full of typos, grammatical errors, and other varieties of noise.
no code implementations • CONLL 2018 • Feng Nie, Shuyan Zhou, Jing Liu, Jinpeng Wang, Chin-Yew Lin, Rong Pan
The task of entity linking aims to identify concepts mentioned in a text fragment and link them to a reference knowledge base.