Search Results for author: Yonatan Bisk

Found 83 papers, 38 papers with code

An HDP Model for Inducing Combinatory Categorial Grammars

no code implementations TACL 2013 Yonatan Bisk, Julia Hockenmaier

We introduce a novel nonparametric Bayesian model for the induction of Combinatory Categorial Grammars from POS-tagged text.

POS

Unsupervised Neural Hidden Markov Models

2 code implementations WS 2016 Ke Tran, Yonatan Bisk, Ashish Vaswani, Daniel Marcu, Kevin Knight

In this work, we present the first results for neuralizing an Unsupervised Hidden Markov Model.

TAG
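For context on the entry above: the paper replaces the multinomial parameters of an unsupervised hidden Markov model with neural networks, but the underlying inference machinery is the classic forward algorithm. A minimal pure-Python sketch of that algorithm (illustrative only, not the paper's neural parameterization):

```python
import math

def forward_loglik(pi, A, B, obs):
    """Log-likelihood of a discrete observation sequence under an HMM,
    computed with the forward algorithm in O(T * n^2) rather than by
    summing over all n^T hidden state paths."""
    n = len(pi)
    # alpha[i] = P(observations so far, current hidden state = i)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    for o in obs[1:]:
        alpha = [sum(alpha[j] * A[j][i] for j in range(n)) * B[i][o]
                 for i in range(n)]
    return math.log(sum(alpha))
```

Neuralizing such a model amounts to producing `pi`, `A`, and `B` from network outputs while keeping this dynamic program intact.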

Evaluating Induced CCG Parsers on Grounded Semantic Parsing

1 code implementation EMNLP 2016 Yonatan Bisk, Siva Reddy, John Blitzer, Julia Hockenmaier, Mark Steedman

We compare the effectiveness of four different syntactic CCG parsers for a semantic slot-filling task to explore how much syntactic supervision is required for downstream semantic analysis.

Semantic Parsing slot-filling +1

Natural Language Inference from Multiple Premises

no code implementations IJCNLP 2017 Alice Lai, Yonatan Bisk, Julia Hockenmaier

We define a novel textual entailment task that requires inference over multiple premise sentences.

Natural Language Inference

Synthetic and Natural Noise Both Break Neural Machine Translation

3 code implementations ICLR 2018 Yonatan Belinkov, Yonatan Bisk

Character-based neural machine translation (NMT) models alleviate out-of-vocabulary issues, learn morphology, and move us closer to completely end-to-end translation systems.

Machine Translation NMT +1
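One family of synthetic noise studied in this line of work perturbs characters inside a word. A sketch of one such perturbation, swapping a random pair of adjacent internal characters (the constraint of keeping first and last characters fixed is an assumption of this sketch, not necessarily the paper's exact recipe):

```python
import random

def swap_noise(word, rng=random):
    """Swap one random pair of adjacent internal characters; words of
    length <= 3 have no internal pair and pass through unchanged."""
    if len(word) <= 3:
        return word
    i = rng.randrange(1, len(word) - 2)  # pick an internal position
    chars = list(word)
    chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)
```

Applying such perturbations to test inputs is a cheap way to probe whether a character-based NMT model has learned robust representations or merely memorized clean spellings.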

Learning Interpretable Spatial Operations in a Rich 3D Blocks World

no code implementations 10 Dec 2017 Yonatan Bisk, Kevin J. Shih, Yejin Choi, Daniel Marcu

In this paper, we study the problem of mapping natural language instructions to complex spatial actions in a 3D blocks world.

CHALET: Cornell House Agent Learning Environment

2 code implementations 23 Jan 2018 Claudia Yan, Dipendra Misra, Andrew Bennett, Aaron Walsman, Yonatan Bisk, Yoav Artzi

We present CHALET, a 3D house simulator with support for navigation and manipulation.

Balancing Shared Autonomy with Human-Robot Communication

no code implementations 20 May 2018 Rosario Scalise, Yonatan Bisk, Maxwell Forbes, Daqing Yi, Yejin Choi, Siddhartha Srinivasa

Robotic agents that share autonomy with a human should leverage human domain knowledge and account for their preferences when completing a task.

Inducing Grammars with and for Neural Machine Translation

no code implementations ACL 2018 Ke Tran, Yonatan Bisk

To address both of these issues we introduce a model that simultaneously translates while inducing dependency trees.

Machine Translation NMT +1

SWAG: A Large-Scale Adversarial Dataset for Grounded Commonsense Inference

1 code implementation EMNLP 2018 Rowan Zellers, Yonatan Bisk, Roy Schwartz, Yejin Choi

Given a partial description like "she opened the hood of the car," humans can reason about the situation and anticipate what might come next ("then, she examined the engine").

Common Sense Reasoning Multiple-choice +2

Shifting the Baseline: Single Modality Performance on Visual Navigation & QA

no code implementations 1 Nov 2018 Jesse Thomason, Daniel Gordon, Yonatan Bisk

We demonstrate the surprising strength of unimodal baselines in multimodal domains, and make concrete recommendations for best practices in future research.

Question Answering Visual Navigation

Early Fusion for Goal Directed Robotic Vision

no code implementations 21 Nov 2018 Aaron Walsman, Yonatan Bisk, Saadia Gabriel, Dipendra Misra, Yoav Artzi, Yejin Choi, Dieter Fox

Building perceptual systems for robotics which perform well under tight computational budgets requires novel architectures which rethink the traditional computer vision pipeline.

Imitation Learning Retrieval

From Recognition to Cognition: Visual Commonsense Reasoning

4 code implementations CVPR 2019 Rowan Zellers, Yonatan Bisk, Ali Farhadi, Yejin Choi

While this task is easy for humans, it is tremendously difficult for today's vision systems, requiring higher-order cognition and commonsense reasoning about the world.

Multiple-choice Multiple Choice Question Answering (MCQA) +1

Character-based Surprisal as a Model of Reading Difficulty in the Presence of Error

no code implementations 2 Feb 2019 Michael Hahn, Frank Keller, Yonatan Bisk, Yonatan Belinkov

Also, transpositions are more difficult than misspellings, and a high error rate increases difficulty for all words, including correct ones.
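The reading-difficulty measure behind this entry is surprisal: the negative log-probability a model assigns to the next unit. The paper models characters; as an illustration only, here is the same quantity under a toy word-level unigram model:

```python
import math
from collections import Counter

def unigram_surprisals(corpus, sentence):
    """Per-word surprisal in bits under a unigram model estimated from
    `corpus`: rarer words get higher surprisal, which surprisal theory
    links to greater predicted reading difficulty."""
    counts = Counter(corpus)
    total = sum(counts.values())
    return [-math.log2(counts[w] / total) for w in sentence]
```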

Tactical Rewind: Self-Correction via Backtracking in Vision-and-Language Navigation

1 code implementation CVPR 2019 Liyiming Ke, Xiujun Li, Yonatan Bisk, Ari Holtzman, Zhe Gan, Jingjing Liu, Jianfeng Gao, Yejin Choi, Siddhartha Srinivasa

We present the Frontier Aware Search with backTracking (FAST) Navigator, a general framework for action decoding, that achieves state-of-the-art results on the Room-to-Room (R2R) Vision-and-Language navigation challenge of Anderson et al.

Vision and Language Navigation Vision-Language Navigation

Prospection: Interpretable Plans From Language By Predicting the Future

no code implementations 20 Mar 2019 Chris Paxton, Yonatan Bisk, Jesse Thomason, Arunkumar Byravan, Dieter Fox

High-level human instructions often correspond to behaviors with multiple implicit steps.

Improving Robot Success Detection using Static Object Data

1 code implementation 2 Apr 2019 Rosario Scalise, Jesse Thomason, Yonatan Bisk, Siddhartha Srinivasa

We collect over 13 hours of egocentric manipulation data for training a model to reason about whether a robot successfully placed unseen objects in or on one another.

Object

HellaSwag: Can a Machine Really Finish Your Sentence?

2 code implementations ACL 2019 Rowan Zellers, Ari Holtzman, Yonatan Bisk, Ali Farhadi, Yejin Choi

In this paper, we show that commonsense inference still proves difficult for even state-of-the-art models, by presenting HellaSwag, a new challenge dataset.

Natural Language Inference Sentence +1

Defending Against Neural Fake News

4 code implementations NeurIPS 2019 Rowan Zellers, Ari Holtzman, Hannah Rashkin, Yonatan Bisk, Ali Farhadi, Franziska Roesner, Yejin Choi

We find that best current discriminators can classify neural fake news from real, human-written, news with 73% accuracy, assuming access to a moderate level of training data.

Computer Security Fake News Detection +1

Shifting the Baseline: Single Modality Performance on Visual Navigation & QA

no code implementations NAACL 2019 Jesse Thomason, Daniel Gordon, Yonatan Bisk

We demonstrate the surprising strength of unimodal baselines in multimodal domains, and make concrete recommendations for best practices in future research.

Visual Navigation

Benchmarking Hierarchical Script Knowledge

1 code implementation NAACL 2019 Yonatan Bisk, Jan Buys, Karl Pichotta, Yejin Choi

Understanding procedural language requires reasoning about both hierarchical and temporal relations between events.

Benchmarking

Robust Navigation with Language Pretraining and Stochastic Sampling

1 code implementation IJCNLP 2019 Xiujun Li, Chunyuan Li, Qiaolin Xia, Yonatan Bisk, Asli Celikyilmaz, Jianfeng Gao, Noah Smith, Yejin Choi

Core to the vision-and-language navigation (VLN) challenge is building robust instruction representations and action decoding schemes, which can generalize well to previously unseen instructions and environments.

Vision and Language Navigation

ALFRED: A Benchmark for Interpreting Grounded Instructions for Everyday Tasks

7 code implementations CVPR 2020 Mohit Shridhar, Jesse Thomason, Daniel Gordon, Yonatan Bisk, Winson Han, Roozbeh Mottaghi, Luke Zettlemoyer, Dieter Fox

We present ALFRED (Action Learning From Realistic Environments and Directives), a benchmark for learning a mapping from natural language instructions and egocentric vision to sequences of actions for household tasks.

Natural Language Visual Grounding

Multi-View Learning for Vision-and-Language Navigation

no code implementations 2 Mar 2020 Qiaolin Xia, Xiujun Li, Chunyuan Li, Yonatan Bisk, Zhifang Sui, Jianfeng Gao, Yejin Choi, Noah A. Smith

Learning to navigate in a visual environment following natural language instructions is a challenging task because natural language instructions are highly variable, ambiguous, and under-specified.

Multi-View Learning Navigate +1

Experience Grounds Language

2 code implementations EMNLP 2020 Yonatan Bisk, Ari Holtzman, Jesse Thomason, Jacob Andreas, Yoshua Bengio, Joyce Chai, Mirella Lapata, Angeliki Lazaridou, Jonathan May, Aleksandr Nisnevich, Nicolas Pinto, Joseph Turian

Language understanding research is held back by a failure to relate language to the physical world it describes and to the social interactions it facilitates.

Representation Learning

RMM: A Recursive Mental Model for Dialog Navigation

1 code implementation 2 May 2020 Homero Roman Roman, Yonatan Bisk, Jesse Thomason, Asli Celikyilmaz, Jianfeng Gao

In this paper, we go beyond instruction following and introduce a two-agent task where one agent navigates and asks questions that a second, guiding agent answers.

Answer Generation Instruction Following

The Return of Lexical Dependencies: Neural Lexicalized PCFGs

3 code implementations 29 Jul 2020 Hao Zhu, Yonatan Bisk, Graham Neubig

In this paper we demonstrate that context-free grammar (CFG) based methods for grammar induction benefit from modeling lexical dependencies.

ALFWorld: Aligning Text and Embodied Environments for Interactive Learning

1 code implementation 8 Oct 2020 Mohit Shridhar, Xingdi Yuan, Marc-Alexandre Côté, Yonatan Bisk, Adam Trischler, Matthew Hausknecht

ALFWorld enables the creation of a new BUTLER agent whose abstract knowledge, learned in TextWorld, corresponds directly to concrete, visually grounded actions.

Natural Language Visual Grounding Scene Understanding

RMM: A Recursive Mental Model for Dialogue Navigation

1 code implementation Findings of the Association for Computational Linguistics 2020 Homero Roman Roman, Yonatan Bisk, Jesse Thomason, Asli Celikyilmaz, Jianfeng Gao

In this paper, we go beyond instruction following and introduce a two-agent task where one agent navigates and asks questions that a second, guiding agent answers.

Answer Generation Instruction Following

Imagining Grounded Conceptual Representations from Perceptual Information in Situated Guessing Games

no code implementations COLING 2020 Alessandro Suglia, Antonio Vergari, Ioannis Konstas, Yonatan Bisk, Emanuele Bastianelli, Andrea Vanzo, Oliver Lemon

However, as shown by Suglia et al. (2020), existing models fail to learn truly multi-modal representations, relying instead on gold category labels for objects in the scene both at training and inference time.

Object

Knowledge-driven Data Construction for Zero-shot Evaluation in Commonsense Question Answering

1 code implementation 7 Nov 2020 Kaixin Ma, Filip Ilievski, Jonathan Francis, Yonatan Bisk, Eric Nyberg, Alessandro Oltramari

Guided by a set of hypotheses, the framework studies how to transform various pre-existing knowledge resources into a form that is most effective for pre-training models.

Language Modelling Question Answering

BUTLER: Building Understanding in TextWorld via Language for Embodied Reasoning

no code implementations ICLR 2021 Mohit Shridhar, Xingdi Yuan, Marc-Alexandre Côté, Yonatan Bisk, Adam Trischler, Matthew Hausknecht

ALFWorld enables the creation of a new BUTLER agent whose abstract knowledge, learned in TextWorld, corresponds directly to concrete, visually grounded actions.

Scene Understanding

Token-Level Contrast for Video and Language Alignment

no code implementations1 Jan 2021 Jianwei Yang, Yonatan Bisk, Jianfeng Gao

Building video and language understanding models requires grounding linguistic concepts and video contents into a shared space.

Worst of Both Worlds: Biases Compound in Pre-trained Vision-and-Language Models

no code implementations NAACL (GeBNLP) 2022 Tejas Srinivasan, Yonatan Bisk

Numerous works have analyzed biases in vision and pre-trained language models individually - however, less attention has been paid to how these biases interact in multimodal settings.

Grounding 'Grounding' in NLP

no code implementations 4 Jun 2021 Khyathi Raghavi Chandu, Yonatan Bisk, Alan W Black

And finally, (3) How to advance our current definition to bridge the gap with Cognitive Science?

Few-shot Language Coordination by Modeling Theory of Mind

no code implementations 12 Jul 2021 Hao Zhu, Graham Neubig, Yonatan Bisk

Positive results from our experiments hint at the importance of explicitly modeling communication as a socio-pragmatic process.

Language Grounding with 3D Objects

2 code implementations 26 Jul 2021 Jesse Thomason, Mohit Shridhar, Yonatan Bisk, Chris Paxton, Luke Zettlemoyer

We introduce several CLIP-based models for distinguishing objects and demonstrate that while recent advances in jointly modeling vision and language are useful for robotic language understanding, it is still the case that these image-based models are weaker at understanding the 3D nature of objects -- properties which play a key role in manipulation.

TACo: Token-aware Cascade Contrastive Learning for Video-Text Alignment

no code implementations ICCV 2021 Jianwei Yang, Yonatan Bisk, Jianfeng Gao

This is motivated by the observation that for a video-text pair, the content words in the text, such as nouns and verbs, are more likely to be aligned with the visual contents in the video than the function words.

Ranked #3 on Temporal Action Localization on CrossTask (using extra training data)

Action Segmentation Contrastive Learning +5
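The contrastive objective at the heart of video-text alignment work like this is typically an InfoNCE-style loss: pull the aligned (positive) pair together and push negatives apart. A minimal single-anchor sketch over precomputed similarity scores (the cascade and token-weighting of TACo itself are not reproduced here):

```python
import math

def info_nce(sim_scores, pos_index, temperature=0.1):
    """InfoNCE loss for one anchor: sim_scores[pos_index] is the
    similarity to the aligned pair, the rest are negatives.
    Lower loss means the positive ranks higher."""
    logits = [s / temperature for s in sim_scores]
    m = max(logits)  # subtract max for numerical stability
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return -(logits[pos_index] - log_z)
```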

WebQA: Multihop and Multimodal QA

2 code implementations CVPR 2022 Yingshan Chang, Mridu Narang, Hisami Suzuki, Guihong Cao, Jianfeng Gao, Yonatan Bisk

Scaling Visual Question Answering (VQA) to the open-domain and multi-hop nature of web searches requires fundamental advances in visual representation learning, knowledge aggregation, and language generation.

Image Retrieval Multimodal Reasoning +4

Dependency Induction Through the Lens of Visual Perception

1 code implementation CoNLL (EMNLP) 2021 Ruisi Su, Shruti Rijhwani, Hao Zhu, Junxian He, Xinyu Wang, Yonatan Bisk, Graham Neubig

Our experiments find that concreteness is a strong indicator for learning dependency grammars, improving the direct attachment score (DAS) by over 50% as compared to state-of-the-art models trained on pure text.

Constituency Grammar Induction Dependency Parsing
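The DAS metric cited in the snippet above is simply the fraction of tokens whose predicted head matches the gold head. A minimal sketch, representing each parse as a list of per-token head indices:

```python
def direct_attachment_score(gold_heads, pred_heads):
    """Fraction of tokens whose predicted head index equals the gold
    head index; used to score induced dependency trees against gold."""
    assert len(gold_heads) == len(pred_heads)
    correct = sum(g == p for g, p in zip(gold_heads, pred_heads))
    return correct / len(gold_heads)
```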

Learning When and What to Ask: a Hierarchical Reinforcement Learning Framework

no code implementations 29 Sep 2021 Khanh Xuan Nguyen, Yonatan Bisk, Hal Daumé III

Results on a simulated human-assisted navigation problem demonstrate the effectiveness of our framework: aided with an interaction policy learned by our method, a navigation policy achieves up to a 7× improvement in task success rate compared to performing tasks only by itself.

Hierarchical Reinforcement Learning reinforcement-learning +1

Shaped Rewards Bias Emergent Language

no code implementations 29 Sep 2021 Brendon Boldt, Yonatan Bisk, David R Mortensen

The second is shaped rewards which are designed specifically to make the task easier to learn by introducing biases in the learning process.

Inductive Bias

Symmetric Machine Theory of Mind

no code implementations 29 Sep 2021 Melanie Sclar, Graham Neubig, Yonatan Bisk

Theory of mind (ToM), the ability to understand others' thoughts and desires, is a cornerstone of human intelligence.

FILM: Following Instructions in Language with Modular Methods

1 code implementation ICLR 2022 So Yeon Min, Devendra Singh Chaplot, Pradeep Ravikumar, Yonatan Bisk, Ruslan Salakhutdinov

In contrast, we propose a modular method with structured representations that (1) builds a semantic map of the scene and (2) performs exploration with a semantic search policy, to achieve the natural language goal.

Imitation Learning Instruction Following

A Framework for Learning to Request Rich and Contextually Useful Information from Humans

no code implementations 14 Oct 2021 Khanh Nguyen, Yonatan Bisk, Hal Daumé III

We show that the agent can take advantage of different types of information depending on the context, and analyze the benefits and challenges of learning the assistance-requesting policy when the assistant can recursively decompose tasks into subtasks.

Decision Making Hierarchical Reinforcement Learning

KAT: A Knowledge Augmented Transformer for Vision-and-Language

1 code implementation NAACL 2022 Liangke Gui, Borui Wang, Qiuyuan Huang, Alex Hauptmann, Yonatan Bisk, Jianfeng Gao

The primary focus of recent work with largescale transformers has been on optimizing the amount of information packed into the model's parameters.

Answer Generation Retrieval +1

Training Vision-Language Transformers from Captions

1 code implementation 19 May 2022 Liangke Gui, Yingshan Chang, Qiuyuan Huang, Subhojit Som, Alex Hauptmann, Jianfeng Gao, Yonatan Bisk

Vision-Language Transformers can be learned without low-level human labels (e.g. class labels, bounding boxes, etc.).

On Advances in Text Generation from Images Beyond Captioning: A Case Study in Self-Rationalization

no code implementations 24 May 2022 Shruti Palaskar, Akshita Bhagia, Yonatan Bisk, Florian Metze, Alan W Black, Ana Marasović

Combining the visual modality with pretrained language models has been surprisingly effective for simple descriptive tasks such as image captioning.

Descriptive Image Captioning +5

Transformers are Adaptable Task Planners

no code implementations 6 Jul 2022 Vidhi Jain, Yixin Lin, Eric Undersander, Yonatan Bisk, Akshara Rai

Every home is different, and every person likes things done in their particular way.

Attribute

Don't Copy the Teacher: Data and Model Challenges in Embodied Dialogue

1 code implementation 10 Oct 2022 So Yeon Min, Hao Zhu, Ruslan Salakhutdinov, Yonatan Bisk

We provide empirical comparisons of metrics, analysis of three models, and make suggestions for how the field might best progress.

Imitation Learning Instruction Following

Self-Supervised Object Goal Navigation with In-Situ Finetuning

no code implementations 9 Dec 2022 So Yeon Min, Yao-Hung Hubert Tsai, Wei Ding, Ali Farhadi, Ruslan Salakhutdinov, Yonatan Bisk, Jian Zhang

In contrast, our LocCon shows the most robust transfer in the real world among the set of models we compare to, and that the real-world performance of all models can be further improved with self-supervised LocCon in-situ training.

Contrastive Learning Navigate +2

EXCALIBUR: Encouraging and Evaluating Embodied Exploration

no code implementations CVPR 2023 Hao Zhu, Raghav Kapoor, So Yeon Min, Winson Han, Jiatai Li, Kaiwen Geng, Graham Neubig, Yonatan Bisk, Aniruddha Kembhavi, Luca Weihs

Humans constantly explore and learn about their environment out of curiosity, gather information, and update their models of the world.

The Framework Tax: Disparities Between Inference Efficiency in NLP Research and Deployment

1 code implementation 13 Feb 2023 Jared Fernandez, Jacob Kahn, Clara Na, Yonatan Bisk, Emma Strubell

In this work, we examine this phenomenon through a series of case studies analyzing the effects of model design decisions, framework paradigms, and hardware platforms on total model latency.

Computational Efficiency

Computational Language Acquisition with Theory of Mind

1 code implementation 2 Mar 2023 Andy Liu, Hao Zhu, Emmy Liu, Yonatan Bisk, Graham Neubig

We also find some evidence that increasing task difficulty in the training process results in more fluent and precise utterances in evaluation.

Language Acquisition

SPAE: Semantic Pyramid AutoEncoder for Multimodal Generation with Frozen LLMs

no code implementations NeurIPS 2023 Lijun Yu, Yong Cheng, Zhiruo Wang, Vivek Kumar, Wolfgang Macherey, Yanping Huang, David A. Ross, Irfan Essa, Yonatan Bisk, Ming-Hsuan Yang, Kevin Murphy, Alexander G. Hauptmann, Lu Jiang

In this work, we introduce Semantic Pyramid AutoEncoder (SPAE) for enabling frozen LLMs to perform both understanding and generation tasks involving non-linguistic modalities such as images or videos.

In-Context Learning multimodal generation

WebArena: A Realistic Web Environment for Building Autonomous Agents

1 code implementation 25 Jul 2023 Shuyan Zhou, Frank F. Xu, Hao Zhu, Xuhui Zhou, Robert Lo, Abishek Sridhar, Xianyi Cheng, Tianyue Ou, Yonatan Bisk, Daniel Fried, Uri Alon, Graham Neubig

Building upon our environment, we release a set of benchmark tasks focusing on evaluating the functional correctness of task completions.

MAEA: Multimodal Attribution for Embodied AI

no code implementations 25 Jul 2023 Vidhi Jain, Jayant Sravan Tamarapalli, Sahiti Yerramilli, Yonatan Bisk

Understanding multimodal perception for embodied AI is an open question because such inputs may contain highly complementary as well as redundant information for the task.

Reasoning about the Unseen for Efficient Outdoor Object Navigation

1 code implementation 18 Sep 2023 Quanting Xie, Tianyi Zhang, Kedi Xu, Matthew Johnson-Roberson, Yonatan Bisk

We introduce a new task OUTDOOR, a new mechanism for Large Language Models (LLMs) to accurately hallucinate possible futures, and a new computationally aware success metric for pushing research forward in this more complex domain.

Navigate Object

SOTOPIA: Interactive Evaluation for Social Intelligence in Language Agents

1 code implementation 18 Oct 2023 Xuhui Zhou, Hao Zhu, Leena Mathur, Ruohong Zhang, Haofei Yu, Zhengyang Qi, Louis-Philippe Morency, Yonatan Bisk, Daniel Fried, Graham Neubig, Maarten Sap

We present SOTOPIA, an open-ended environment to simulate complex social interactions between artificial agents and evaluate their social intelligence.

Toward General-Purpose Robots via Foundation Models: A Survey and Meta-Analysis

no code implementations 14 Dec 2023 Yafei Hu, Quanting Xie, Vidhi Jain, Jonathan Francis, Jay Patrikar, Nikhil Keetha, Seungchan Kim, Yaqi Xie, Tianyi Zhang, Shibo Zhao, Yu Quan Chong, Chen Wang, Katia Sycara, Matthew Johnson-Roberson, Dhruv Batra, Xiaolong Wang, Sebastian Scherer, Zsolt Kira, Fei Xia, Yonatan Bisk

Motivated by the impressive open-set performance and content generation capabilities of web-scale, large-capacity pre-trained models (i.e., foundation models) in research fields such as Natural Language Processing (NLP) and Computer Vision (CV), we devote this survey to exploring (i) how these existing foundation models from NLP and CV can be applied to the field of robotics, and also exploring (ii) what a robotics-specific foundation model would look like.

VISREAS: Complex Visual Reasoning with Unanswerable Questions

no code implementations 23 Feb 2024 Syeda Nahida Akter, Sangwu Lee, Yingshan Chang, Yonatan Bisk, Eric Nyberg

The unique feature of this task, validating question answerability with respect to an image before answering, and the poor performance of state-of-the-art models inspired the design of a new modular baseline, LOGIC2VISION that reasons by producing and executing pseudocode without any external modules to generate the answer.

Question Answering Visual Question Answering +1

SOTOPIA-π: Interactive Learning of Socially Intelligent Language Agents

1 code implementation 13 Mar 2024 Ruiyi Wang, Haofei Yu, Wenxin Zhang, Zhengyang Qi, Maarten Sap, Graham Neubig, Yonatan Bisk, Hao Zhu

Motivated by this gap, we propose an interactive learning method, SOTOPIA-π, improving the social intelligence of language agents.

Language Modelling Large Language Model

Skews in the Phenomenon Space Hinder Generalization in Text-to-Image Generation

no code implementations 25 Mar 2024 Yingshan Chang, Yasi Zhang, Zhiyuan Fang, YingNian Wu, Yonatan Bisk, Feng Gao

We hypothesize that the underlying phenomenological coverage has not been proportionally scaled up, leading to a skew of the presented phenomenon which harms generalization.

Relational Reasoning Text-to-Image Generation

Direct Preference Optimization of Video Large Multimodal Models from Language Model Reward

1 code implementation 1 Apr 2024 Ruohong Zhang, Liangke Gui, Zhiqing Sun, Yihao Feng, Keyang Xu, Yuanhan Zhang, Di Fu, Chunyuan Li, Alexander Hauptmann, Yonatan Bisk, Yiming Yang

Preference modeling techniques, such as direct preference optimization (DPO), have proven effective in enhancing the generalization abilities of large language models (LLMs).

Instruction Following Language Modelling +3
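The DPO objective referenced above has a standard closed form: a logistic loss on the policy-versus-reference log-probability margin between the chosen and rejected responses. A per-pair sketch of that standard form (the paper's video-specific reward model is not reproduced here):

```python
import math

def dpo_loss(policy_chosen_lp, policy_rejected_lp,
             ref_chosen_lp, ref_rejected_lp, beta=0.1):
    """Direct preference optimization loss for one preference pair:
    -log sigmoid(beta * [(policy - ref) margin on the chosen response
                         minus (policy - ref) margin on the rejected one]).
    Inputs are sequence log-probabilities."""
    margin = ((policy_chosen_lp - ref_chosen_lp)
              - (policy_rejected_lp - ref_rejected_lp))
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))
```

When the policy matches the reference exactly, the margin is zero and the loss is log 2; increasing the chosen response's relative log-probability drives the loss down.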

AgentKit: Flow Engineering with Graphs, not Coding

1 code implementation 17 Apr 2024 Yue Wu, Yewen Fan, So Yeon Min, Shrimai Prabhumoye, Stephen Mcaleer, Yonatan Bisk, Ruslan Salakhutdinov, Yuanzhi Li, Tom Mitchell

The chains of nodes can be designed to explicitly enforce a naturally structured "thought process".
