Search Results for author: Paola Cascante-Bonilla

Found 15 papers, 8 papers with code

Natural Language Inference Improves Compositionality in Vision-Language Models

no code implementations 29 Oct 2024 Paola Cascante-Bonilla, Yu Hou, Yang Trista Cao, Hal Daumé III, Rachel Rudinger

Compositional reasoning in Vision-Language Models (VLMs) remains challenging as these models often struggle to relate objects, attributes, and spatial relationships.

Natural Language Inference

PropTest: Automatic Property Testing for Improved Visual Programming

no code implementations 25 Mar 2024 Jaywon Koo, Ziyan Yang, Paola Cascante-Bonilla, Baishakhi Ray, Vicente Ordonez

We propose PropTest, a general strategy that improves visual programming by further using an LLM to generate code that tests for visual properties in an initial round of proposed solutions.

Question Answering Referring Expression +3

Learning from Synthetic Data for Visual Grounding

no code implementations 20 Mar 2024 Ruozhen He, Ziyan Yang, Paola Cascante-Bonilla, Alexander C. Berg, Vicente Ordonez

This paper extensively investigates the effectiveness of synthetic training data to improve the capabilities of vision-and-language models for grounding textual descriptions to image regions.

Language Modelling Large Language Model +2

Improved Visual Grounding through Self-Consistent Explanations

no code implementations CVPR 2024 Ruozhen He, Paola Cascante-Bonilla, Ziyan Yang, Alexander C. Berg, Vicente Ordonez

Vision-and-language models trained to match images with text can be combined with visual explanation methods to point to the locations of specific objects in an image.

Language Modelling Large Language Model +1

Going Beyond Nouns With Vision & Language Models Using Synthetic Data

1 code implementation ICCV 2023 Paola Cascante-Bonilla, Khaled Shehada, James Seale Smith, Sivan Doveh, Donghyun Kim, Rameswar Panda, Gül Varol, Aude Oliva, Vicente Ordonez, Rogerio Feris, Leonid Karlinsky

We contribute Synthetic Visual Concepts (SyViC), a million-scale synthetic dataset and data generation codebase that allows generating additional suitable data to improve VLC understanding and compositional reasoning of VL models.

Sentence Visual Reasoning

On the Transferability of Visual Features in Generalized Zero-Shot Learning

1 code implementation 22 Nov 2022 Paola Cascante-Bonilla, Leonid Karlinsky, James Seale Smith, Yanjun Qi, Vicente Ordonez

Generalized Zero-Shot Learning (GZSL) aims to train a classifier that can generalize to unseen classes, using a set of attributes as auxiliary information, and the visual features extracted from a pre-trained convolutional neural network.

Generalized Zero-Shot Learning Knowledge Distillation +2

ConStruct-VL: Data-Free Continual Structured VL Concepts Learning

1 code implementation CVPR 2023 James Seale Smith, Paola Cascante-Bonilla, Assaf Arbelle, Donghyun Kim, Rameswar Panda, David Cox, Diyi Yang, Zsolt Kira, Rogerio Feris, Leonid Karlinsky

This leads to reasoning mistakes, which need to be corrected as they occur by teaching VL models the missing SVLC skills; often this must be done using private data where the issue was found, which naturally leads to a data-free continual (no task-id) VL learning setting.

SimVQA: Exploring Simulated Environments for Visual Question Answering

no code implementations CVPR 2022 Paola Cascante-Bonilla, Hui Wu, Letao Wang, Rogerio Feris, Vicente Ordonez

By exploiting 3D and physics simulation platforms, we provide a pipeline to generate synthetic data to expand and replace type-specific questions and answers without risking the exposure of sensitive or personal data that might be present in real images.

Data Augmentation Diversity +2

Evolving Image Compositions for Feature Representation Learning

no code implementations 16 Jun 2021 Paola Cascante-Bonilla, Arshdeep Sekhon, Yanjun Qi, Vicente Ordonez

This paper proposes PatchMix, a data augmentation method that creates new samples by composing patches from pairs of images in a grid-like pattern.

Data Augmentation Representation Learning +1

Curriculum Labeling: Revisiting Pseudo-Labeling for Semi-Supervised Learning

1 code implementation 16 Jan 2020 Paola Cascante-Bonilla, Fuwen Tan, Yanjun Qi, Vicente Ordonez

Pseudo-labeling works by assigning pseudo-labels to samples in the unlabeled set using a model trained on the combination of the labeled samples and any previously pseudo-labeled samples, and iteratively repeating this process in a self-training cycle.

Image Classification

Drill-down: Interactive Retrieval of Complex Scenes using Natural Language Queries

1 code implementation NeurIPS 2019 Fuwen Tan, Paola Cascante-Bonilla, Xiaoxiao Guo, Hui Wu, Song Feng, Vicente Ordonez

We show that using multiple rounds of natural language queries as input can be surprisingly effective to find arbitrarily specific images of complex scenes.

Image Retrieval Natural Language Queries +1

Chat-crowd: A Dialog-based Platform for Visual Layout Composition

no code implementations NAACL 2019 Paola Cascante-Bonilla, Xuwang Yin, Vicente Ordonez, Song Feng

In this paper we introduce Chat-crowd, an interactive environment for visual layout composition via conversational interactions.

Goal-Oriented Dialog
