Search Results for author: Lawrence Jang

Found 5 papers, 4 papers with code

VideoWebArena: Evaluating Long Context Multimodal Agents with Video Understanding Web Tasks

1 code implementation • 24 Oct 2024 • Lawrence Jang, Yinheng Li, Charles Ding, Justin Lin, Paul Pu Liang, Dan Zhao, Rogerio Bonatti, Kazuhito Koishida

Videos are often used to learn or extract the necessary information to complete tasks in ways different from what text and static imagery alone can provide.

Video Understanding

VLM Agents Generate Their Own Memories: Distilling Experience into Embodied Programs of Thought

no code implementations • 20 Jun 2024 • Gabriel Sarch, Lawrence Jang, Michael J. Tarr, William W. Cohen, Kenneth Marino, Katerina Fragkiadaki

We propose In-Context Abstraction Learning (ICAL), a method that builds a memory of multimodal experience from sub-optimal demonstrations and human feedback.
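A minimal, hypothetical sketch of what an ICAL-style loop could look like: distill sub-optimal demonstrations plus human feedback into reusable abstractions, store them in a memory, and retrieve them later as in-context examples. All names here (AbstractionMemory, summarize_with_vlm, build_memory) are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of an ICAL-style abstraction memory; not the paper's code.
from dataclasses import dataclass, field


@dataclass
class AbstractionMemory:
    """Stores distilled experience entries for later in-context retrieval."""
    entries: list = field(default_factory=list)

    def add(self, abstraction: str) -> None:
        self.entries.append(abstraction)

    def retrieve(self, task: str, k: int = 3) -> list:
        # Naive keyword retrieval: return up to k entries mentioning a task word.
        hits = [e for e in self.entries if any(w in e for w in task.split())]
        return hits[-k:] if hits else self.entries[-k:]


def summarize_with_vlm(demonstration: str, feedback: str) -> str:
    """Placeholder for a VLM call that rewrites a sub-optimal demonstration
    plus human feedback into a reusable lesson (a 'program of thought')."""
    return f"Lesson: given '{demonstration}', prefer '{feedback}'."


def build_memory(demos_and_feedback, memory: AbstractionMemory) -> None:
    # Distill each (demonstration, feedback) pair and add it to the memory.
    for demo, feedback in demos_and_feedback:
        memory.add(summarize_with_vlm(demo, feedback))


if __name__ == "__main__":
    mem = AbstractionMemory()
    build_memory(
        [("clicked the wrong tab before searching", "open the search tab first")],
        mem,
    )
    # Retrieved abstractions would be prepended to the agent's prompt as
    # in-context examples when attempting a new task.
    print(mem.retrieve("search the site"))
```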

Action Anticipation • Continual Learning • +5

VisualWebArena: Evaluating Multimodal Agents on Realistic Visual Web Tasks

1 code implementation • 24 Jan 2024 • Jing Yu Koh, Robert Lo, Lawrence Jang, Vikram Duvvur, Ming Chong Lim, Po-Yu Huang, Graham Neubig, Shuyan Zhou, Ruslan Salakhutdinov, Daniel Fried

Through extensive quantitative and qualitative analysis, we identify several limitations of text-only LLM agents, and reveal gaps in the capabilities of state-of-the-art multimodal language agents.
