Search Results for author: Zora Zhiruo Wang

Found 6 papers, 5 papers with code

Benchmarking Failures in Tool-Augmented Language Models

1 code implementation • 18 Mar 2025 • Eduardo Treviño, Hugo Contant, James Ngai, Graham Neubig, Zora Zhiruo Wang

Further, to study possible mitigation of the failures, we enable real-time human interaction, named the Ask-and-Help (AAH) method, to provide missing information or replace non-functional tools.

Benchmarking, Text Generation

CowPilot: A Framework for Autonomous and Human-Agent Collaborative Web Navigation

no code implementations • 28 Jan 2025 • Faria Huq, Zora Zhiruo Wang, Frank F. Xu, Tianyue Ou, Shuyan Zhou, Jeffrey P. Bigham, Graham Neubig

CowPilot reduces the number of steps humans need to perform by allowing agents to propose next steps, while users are able to pause, reject, or take alternative actions.

AutoPresent: Designing Structured Visuals from Scratch

1 code implementation • 1 Jan 2025 • Jiaxin Ge, Zora Zhiruo Wang, Xuhui Zhou, Yi-Hao Peng, Sanjay Subramanian, Qinyue Tan, Maarten Sap, Alane Suhr, Daniel Fried, Graham Neubig, Trevor Darrell

We benchmark end-to-end image generation and program generation methods with a variety of models, and find that programmatic methods produce higher-quality slides in user-interactable formats.

Image Generation

Agent Workflow Memory

1 code implementation • 11 Sep 2024 • Zora Zhiruo Wang, Jiayuan Mao, Daniel Fried, Graham Neubig

Despite the potential of language model-based agents to solve real-world tasks such as web navigation, current methods still struggle with long-horizon tasks with complex action trajectories.

AI Agent, Language Modeling, +1

ECCO: Can We Improve Model-Generated Code Efficiency Without Sacrificing Functional Correctness?

2 code implementations • 19 Jul 2024 • Siddhant Waghjale, Vishruth Veerendranath, Zora Zhiruo Wang, Daniel Fried

Although large language models (LLMs) have been largely successful in generating functionally correct programs, conditioning models to produce efficient solutions while ensuring correctness remains a challenge.

Benchmarking, Code Generation, +1

CodeRAG-Bench: Can Retrieval Augment Code Generation?

1 code implementation • 20 Jun 2024 • Zora Zhiruo Wang, Akari Asai, Xinyan Velocity Yu, Frank F. Xu, Yiqing Xie, Graham Neubig, Daniel Fried

We aggregate documents from five sources for models to retrieve contexts: competition solutions, online tutorials, library documentation, StackOverflow posts, and GitHub repositories.

Code Generation, RAG, +1
