Search Results for author: Cristian Rodriguez-Opazo

Found 10 papers, 7 papers with code

LocFormer: Enabling Transformers to Perform Temporal Moment Localization on Long Untrimmed Videos With a Feature Sampling Approach

no code implementations • 19 Dec 2021 • Cristian Rodriguez-Opazo, Edison Marrese-Taylor, Basura Fernando, Hiroya Takamura, Qi Wu

We propose LocFormer, a Transformer-based model for video grounding that operates with a constant memory footprint regardless of the video length, i.e., the number of frames.
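The constant-memory claim rests on feature sampling: instead of processing every frame, a fixed number of frame features is drawn no matter how long the video is. As a rough illustration only (not the paper's actual sampling scheme; the function name and sample budget are assumptions), uniform index sampling achieves this:

```python
def sample_frame_indices(num_frames: int, num_samples: int = 128) -> list[int]:
    """Pick a fixed-size set of frame indices, uniformly spread over the video.

    Because len(result) is capped at num_samples, downstream memory cost
    stays constant regardless of num_frames (illustrative sketch only).
    """
    if num_frames <= num_samples:
        # Short video: keep every frame.
        return list(range(num_frames))
    step = num_frames / num_samples
    return [int(i * step) for i in range(num_samples)]
```

A 10,000-frame video and a 500-frame video both yield 128 indices, so the model's input size is decoupled from video length.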

Inductive Bias • Video Grounding

Image Retrieval on Real-life Images with Pre-trained Vision-and-Language Models

3 code implementations • ICCV 2021 • Zheyuan Liu, Cristian Rodriguez-Opazo, Damien Teney, Stephen Gould

We demonstrate that with a relatively simple architecture, CIRPLANT outperforms existing methods on open-domain images while matching state-of-the-art accuracy on existing narrow-domain datasets such as fashion.

Text-Image Retrieval • Visual Reasoning

Language and Visual Entity Relationship Graph for Agent Navigation

1 code implementation • NeurIPS 2020 • Yicong Hong, Cristian Rodriguez-Opazo, Yuankai Qi, Qi Wu, Stephen Gould

From both the textual and visual perspectives, we find that the relationships among the scene, its objects, and directional clues are essential for the agent to interpret complex instructions and correctly perceive the environment.

Dynamic Time Warping • Navigate • +2

DORi: Discovering Object Relationship for Moment Localization of a Natural-Language Query in Video

1 code implementation • 13 Oct 2020 • Cristian Rodriguez-Opazo, Edison Marrese-Taylor, Basura Fernando, Hongdong Li, Stephen Gould

This paper studies the task of temporal moment localization in a long untrimmed video using a natural-language query.

A Multi-modal Approach to Fine-grained Opinion Mining on Video Reviews

no code implementations • WS 2020 • Edison Marrese-Taylor, Cristian Rodriguez-Opazo, Jorge A. Balazs, Stephen Gould, Yutaka Matsuo

Despite the recent advances in opinion mining for written reviews, few works have tackled the problem on other sources of reviews.

Opinion Mining

Sub-Instruction Aware Vision-and-Language Navigation

1 code implementation • EMNLP 2020 • Yicong Hong, Cristian Rodriguez-Opazo, Qi Wu, Stephen Gould

Vision-and-language navigation requires an agent to navigate through a real 3D environment following natural language instructions.

Navigate • Vision and Language Navigation

Proposal-free Temporal Moment Localization of a Natural-Language Query in Video using Guided Attention

1 code implementation • 20 Aug 2019 • Cristian Rodriguez-Opazo, Edison Marrese-Taylor, Fatemeh Sadat Saleh, Hongdong Li, Stephen Gould

Given an untrimmed video and a sentence as the query, the goal is to determine the start and end of the visual moment in the video that corresponds to the query sentence.
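Predictions for this task are (start, end) segments, conventionally scored against the ground-truth moment by temporal intersection-over-union (this metric is standard for the task but is my addition here, not taken from the abstract). A minimal sketch:

```python
def temporal_iou(pred: tuple[float, float], gt: tuple[float, float]) -> float:
    """Temporal IoU between two (start, end) segments, e.g. in seconds.

    Intersection is the overlap of the two intervals; union spans from the
    earliest start to the latest end. Returns 0.0 for disjoint segments.
    """
    (ps, pe), (gs, ge) = pred, gt
    intersection = max(0.0, min(pe, ge) - max(ps, gs))
    union = max(pe, ge) - min(ps, gs)
    return intersection / union if union > 0 else 0.0
```

Reported numbers such as "R@1, IoU=0.5" then count a prediction as correct when this value exceeds the threshold.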
