no code implementations • 26 Feb 2024 • Anne Wu, Kianté Brantley, Yoav Artzi
This study evaluates three state-of-the-art MLLMs -- GPT-4V, Gemini Pro, and the open-source model IDEFICS -- on the compositional natural language vision reasoning task NLVR.
no code implementations • 16 Feb 2024 • Zhaolin Gao, Kianté Brantley, Thorsten Joachims
In this paper, we envision a use case where authors can receive LLM-generated reviews that uncover weak points in the current draft.
no code implementations • 6 Oct 2023 • Ge Gao, Jonathan D. Chang, Claire Cardie, Kianté Brantley, Thorsten Joachims
Text retrieval plays a crucial role in incorporating factual knowledge for decision making into language processing pipelines, ranging from chat-based web search to question answering systems.
1 code implementation • 10 Jul 2023 • Kianté Brantley, Zhichong Fang, Sarah Dean, Thorsten Joachims
The feedback that users provide through their choices (e.g., clicks, purchases) is one of the most common types of data readily available for training search and recommendation algorithms.
no code implementations • 2 Mar 2023 • Felix Faltings, Michel Galley, Baolin Peng, Kianté Brantley, Weixin Cai, Yizhe Zhang, Jianfeng Gao, Bill Dolan
Unfortunately, this means most of the research on text, code, and image generation has focused on non-interactive settings, whereby the model is expected to get everything right without accounting for any input from a user who may be willing to help.
no code implementations • 3 Nov 2022 • Anne Wu, Kianté Brantley, Noriyuki Kojima, Yoav Artzi
We present lilGym, a new benchmark for language-conditioned reinforcement learning in visual environments.
3 code implementations • 3 Oct 2022 • Rajkumar Ramamurthy, Prithviraj Ammanabrolu, Kianté Brantley, Jack Hessel, Rafet Sifa, Christian Bauckhage, Hannaneh Hajishirzi, Yejin Choi
To help answer this, we first introduce an open-source modular library, RL4LMs (Reinforcement Learning for Language Models), for optimizing language generators with RL.
no code implementations • 3 Mar 2021 • Kianté Brantley, Soroush Mehri, Geoffrey J. Gordon
They also form a natural bridge between model-based and model-free RL methods: like the former they make predictions about future experiences, and like the latter they allow efficient prediction of total discounted rewards.
1 code implementation • NeurIPS 2020 • Kianté Brantley, Miroslav Dudik, Thodoris Lykouris, Sobhan Miryoosefi, Max Simchowitz, Aleksandrs Slivkins, Wen Sun
We propose an algorithm for tabular episodic reinforcement learning with constraints.
1 code implementation • ACL 2020 • Kianté Brantley, Amr Sharaf, Hal Daumé III
Imitation learning algorithms provide state-of-the-art results on many structured prediction tasks by learning near-optimal search policies.
1 code implementation • NeurIPS 2019 • Sobhan Miryoosefi, Kianté Brantley, Hal Daumé III, Miroslav Dudik, Robert Schapire
In standard reinforcement learning (RL), a learning agent seeks to optimize the overall reward.
1 code implementation • WS 2019 • Sean Welleck, Kianté Brantley, Hal Daumé III, Kyunghyun Cho
Standard sequential generation methods assume a pre-specified generation order, such as text generation methods which generate words from left to right.
no code implementations • WS 2017 • Amr Sharaf, Shi Feng, Khanh Nguyen, Kianté Brantley, Hal Daumé III
We describe the University of Maryland machine translation systems submitted to the WMT17 German-English Bandit Learning Task.