Search Results for author: Kenneth Marino

Found 13 papers, 5 papers with code

OK-VQA: A Visual Question Answering Benchmark Requiring External Knowledge

1 code implementation CVPR 2019 Kenneth Marino, Mohammad Rastegari, Ali Farhadi, Roozbeh Mottaghi

In this paper, we address the task of knowledge-based visual question answering and provide a benchmark, called OK-VQA, where the image content is not sufficient to answer the questions, encouraging methods that rely on external knowledge resources.

Object Detection +3

A-OKVQA: A Benchmark for Visual Question Answering using World Knowledge

1 code implementation 3 Jun 2022 Dustin Schwenk, Apoorv Khandelwal, Christopher Clark, Kenneth Marino, Roozbeh Mottaghi

In contrast to the existing knowledge-based VQA datasets, the questions generally cannot be answered by simply querying a knowledge base, and instead require some form of commonsense reasoning about the scene depicted in the image.

Question Answering Visual Question Answering +1

Ask Your Humans: Using Human Instructions to Improve Generalization in Reinforcement Learning

1 code implementation ICLR 2021 Valerie Chen, Abhinav Gupta, Kenneth Marino

We also find that incorporating natural language allows the model to generalize to unseen tasks in a zero-shot setting and to learn quickly from a few demonstrations.

Multi-Task Learning reinforcement-learning +1

The Pose Knows: Video Forecasting by Generating Pose Futures

1 code implementation ICCV 2017 Jacob Walker, Kenneth Marino, Abhinav Gupta, Martial Hebert

First, we explicitly model the high-level structure of the active objects in the scene, humans, and use a VAE to model their possible future movements in pose space.

Human Pose Forecasting Video Prediction
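The generative side of the approach above can be sketched as sampling latent codes and decoding them into pose-space trajectories. This is a toy NumPy illustration only, not the paper's model: the joint count, horizon, and the random linear "decoder" `W` are all hypothetical stand-ins for a trained VAE decoder.

```python
import numpy as np

# Hypothetical shapes: 17 2-D joints, a 10-step forecast horizon.
N_JOINTS, LATENT_DIM, HORIZON = 17, 8, 10

rng = np.random.default_rng(0)
# Stand-in for a trained VAE decoder: a fixed linear map from the latent
# space to a sequence of per-joint pose displacements.
W = rng.normal(scale=0.01, size=(LATENT_DIM, HORIZON * N_JOINTS * 2))

def sample_pose_futures(current_pose, n_samples=5):
    """Sample diverse future pose trajectories by drawing latents z ~ N(0, I)
    and decoding each into cumulative pose displacements."""
    futures = []
    for _ in range(n_samples):
        z = rng.normal(size=LATENT_DIM)                 # stochastic latent
        deltas = (z @ W).reshape(HORIZON, N_JOINTS, 2)  # per-step offsets
        futures.append(current_pose + np.cumsum(deltas, axis=0))
    return futures

pose_now = np.zeros((N_JOINTS, 2))
trajectories = sample_pose_futures(pose_now)
```

Because each sample draws a fresh latent, the decoded futures differ, which is the point of using a VAE here: one observed pose maps to many plausible continuations.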

The More You Know: Using Knowledge Graphs for Image Classification

no code implementations CVPR 2017 Kenneth Marino, Ruslan Salakhutdinov, Abhinav Gupta

One characteristic that sets humans apart from modern learning-based computer vision algorithms is the ability to acquire knowledge about the world and use that knowledge to reason about the visual world.

Classification General Classification +3

Hierarchical RL Using an Ensemble of Proprioceptive Periodic Policies

no code implementations ICLR 2019 Kenneth Marino, Abhinav Gupta, Rob Fergus, Arthur Szlam

The high-level policy is trained using a sparse, task-dependent reward, and operates by choosing which of the low-level policies to run at any given time.
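The control structure described above can be sketched in a toy 1-D task. Everything here is a hypothetical stand-in: the "low-level policies" are fixed step functions rather than learned periodic policies, and the "high-level policy" is a hand-written rule rather than one trained on the sparse reward.

```python
# Toy 1-D locomotion task: reach GOAL; the task reward is sparse.
GOAL = 4

def make_low_level_policy(step):
    # Stand-in for a pre-trained proprioceptive periodic policy:
    # here it just always steps in one direction.
    return lambda obs: step

LOW_LEVEL = [make_low_level_policy(-1), make_low_level_policy(+1)]

def high_level_policy(pos):
    """Stand-in for the learned high-level policy: choose which
    low-level policy to run, given the current observation."""
    return 1 if pos < GOAL else 0

def rollout(start=-3, steps=20, macro_len=4):
    pos, reward = start, 0
    active = high_level_policy(pos)
    for t in range(steps):
        if t % macro_len == 0:              # high level re-chooses every k steps
            active = high_level_policy(pos)
        pos += LOW_LEVEL[active](pos)       # chosen low-level policy acts
        reward += 1 if pos == GOAL else 0   # sparse, task-dependent reward
    return pos, reward
```

The key design point survives even in this toy form: the high-level policy never emits motor actions itself, it only switches among pre-trained low-level controllers at a coarser timescale.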

Agent as Scientist: Learning to Verify Hypotheses

no code implementations 25 Sep 2019 Kenneth Marino, Rob Fergus, Arthur Szlam, Abhinav Gupta

In order to train the agents, we exploit the underlying structure in the majority of hypotheses -- they can be formulated as triplets (pre-condition, action sequence, post-condition).
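The triplet structure described above can be made concrete with a minimal sketch. The key-and-door facts and the `apply` dynamics below are invented for illustration; only the (pre-condition, action sequence, post-condition) decomposition comes from the abstract.

```python
from dataclasses import dataclass

@dataclass
class Hypothesis:
    pre_condition: str       # fact that must hold before acting
    action_sequence: list    # actions the agent executes
    post_condition: str      # fact predicted to hold afterwards

def apply(state, action):
    # Hypothetical toy dynamics: a key-and-door world as a dict of facts.
    state = dict(state)
    if action == "pick_up_key":
        state["has_key"] = True
    elif action == "open_door" and state.get("has_key"):
        state["door_open"] = True
    return state

def verify(hypothesis, state):
    """Return True/False if the hypothesis was tested (pre-condition held
    and the post-condition did/didn't follow), or None if it was untestable."""
    if not state.get(hypothesis.pre_condition, False):
        return None  # pre-condition unmet; nothing to verify
    for action in hypothesis.action_sequence:
        state = apply(state, action)
    return state.get(hypothesis.post_condition, False)

h = Hypothesis("has_key", ["open_door"], "door_open")
```

Separating the three parts is what makes the hypotheses checkable by interaction: the agent first brings about the pre-condition, then runs the action sequence, then observes whether the post-condition holds.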

Collaborating with language models for embodied reasoning

no code implementations1 Feb 2023 Ishita Dasgupta, Christine Kaeser-Chen, Kenneth Marino, Arun Ahuja, Sheila Babayan, Felix Hill, Rob Fergus

On the other hand, Large Scale Language Models (LSLMs) have exhibited strong reasoning ability and the ability to adapt to new tasks through in-context learning.

In-Context Learning Language Modelling +2
