Search Results for author: Iacer Calixto

Found 24 papers, 6 papers with code

VALSE: A Task-Independent Benchmark for Vision and Language Models Centered on Linguistic Phenomena

1 code implementation • 14 Dec 2021 Letitia Parcalabescu, Michele Cafagna, Lilitta Muradjan, Anette Frank, Iacer Calixto, Albert Gatt

We propose VALSE (Vision And Language Structured Evaluation), a novel benchmark designed for testing general-purpose pretrained vision and language (V&L) models for their visio-linguistic grounding capabilities on specific linguistic phenomena.

Wikipedia Entities as Rendezvous across Languages: Grounding Multilingual Language Models by Predicting Wikipedia Hyperlinks

no code implementations NAACL 2021 Iacer Calixto, Alessandro Raganato, Tommaso Pasini

Further adding extra languages leads to improvements in most tasks up to a certain point, but overall we found it non-trivial to scale improvements in model transferability by training on ever-increasing numbers of Wikipedia languages.

Knowledge Graphs

Seeing past words: Testing the cross-modal capabilities of pretrained V&L models on counting tasks

no code implementations ACL (mmsr, IWCS) 2021 Letitia Parcalabescu, Albert Gatt, Anette Frank, Iacer Calixto

We investigate the reasoning ability of pretrained vision and language (V&L) models in two tasks that require multimodal integration: (1) discriminating a correct image-sentence pair from an incorrect one, and (2) counting entities in an image.

A Study on the Autoregressive and non-Autoregressive Multi-label Learning

no code implementations • 3 Dec 2020 Elham J. Barezi, Iacer Calixto, Kyunghyun Cho, Pascale Fung

These tasks are hard because the label space is usually (i) very large, e.g. thousands or millions of labels, (ii) very sparse, i.e. very few labels apply to each input document, and (iii) highly correlated, meaning that the existence of one label changes the likelihood of predicting all other labels.

Multi-Label Learning
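The three label-space properties above can be made concrete on a toy label matrix. This is an illustrative numpy sketch with made-up data, not the paper's setup or datasets:

```python
import numpy as np

# Toy multi-label matrix: 6 documents x 8 labels.
# Real extreme multi-label tasks have thousands or millions of labels (i).
Y = np.array([
    [1, 1, 0, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0, 0, 1],
    [0, 0, 0, 1, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 0, 0],
])

# (ii) sparsity: only a small fraction of labels fires per document on average
density = Y.mean()

# (iii) correlation: co-occurrence counts reveal dependent labels
cooc = Y.T @ Y

print(f"label density: {density:.3f}")                 # → 0.208
print(f"labels 0 and 1 co-occur {cooc[0, 1]} times")   # they always fire together
```

In this toy matrix, observing label 0 makes label 1 near-certain, which is exactly the kind of dependency that separates autoregressive from non-autoregressive prediction of label sets.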

Are scene graphs good enough to improve Image Captioning?

1 code implementation Asian Chapter of the Association for Computational Linguistics 2020 Victor Milewski, Marie-Francine Moens, Iacer Calixto

Overall, we find no significant difference between models that use scene graph features and models that only use object detection features across different captioning metrics, which suggests that existing scene graph generation models are still too noisy to be useful in image captioning.

Graph Attention Graph Generation +3

ImagiFilter: A resource to enable the semi-automatic mining of images at scale

1 code implementation • 20 Aug 2020 Houda Alberts, Iacer Calixto

In this paper, we describe and publicly release an image dataset along with pretrained models designed to (semi-)automatically filter out undesirable images from very large image collections, possibly obtained from the web.

English Intermediate-Task Training Improves Zero-Shot Cross-Lingual Transfer Too

no code implementations Asian Chapter of the Association for Computational Linguistics 2020 Jason Phang, Iacer Calixto, Phu Mon Htut, Yada Pruksachatkun, Haokun Liu, Clara Vania, Katharina Kann, Samuel R. Bowman

Intermediate-task training (fine-tuning a pretrained model on an intermediate task before fine-tuning again on the target task) often improves model performance substantially on language understanding tasks in monolingual English settings.

Question Answering Zero-Shot Cross-Lingual Transfer

Latent Variable Model for Multi-modal Translation

1 code implementation ACL 2019 Iacer Calixto, Miguel Rios, Wilker Aziz

In this work, we propose to model the interaction between visual and textual features for multi-modal neural machine translation (MMT) through a latent variable model.

Multimodal Machine Translation Multi-Task Learning +1
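The core idea of conditioning translation on a latent variable can be sketched with the standard reparameterisation trick. All dimensions and weight matrices below are hypothetical placeholders; the paper's actual inference network and decoder differ:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions for illustration only.
d_txt, d_z = 16, 4
h_text = rng.normal(size=d_txt)  # encoder summary of the source sentence

# Inference-network sketch: predict a Gaussian over the latent z from text,
# so that the image is not required at translation time.
W_mu = rng.normal(size=(d_z, d_txt))
W_logvar = rng.normal(size=(d_z, d_txt))
mu, logvar = W_mu @ h_text, W_logvar @ h_text

# Reparameterisation trick: z = mu + sigma * eps, with eps ~ N(0, I),
# keeping sampling differentiable for training.
eps = rng.normal(size=d_z)
z = mu + np.exp(0.5 * logvar) * eps

# The decoder would then condition on both the text summary and z.
decoder_input = np.concatenate([h_text, z])
print(decoder_input.shape)  # (20,)
```

The latent z is where the visual features enter during training (e.g. by pushing the text-only posterior toward an image-informed one), which is what lets the model exploit images at training time without needing them at test time.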

Linguistic realisation as machine translation: Comparing different MT models for AMR-to-text generation

no code implementations WS 2017 Thiago Castro Ferreira, Iacer Calixto, Sander Wubben, Emiel Krahmer

In this paper, we study AMR-to-text generation, framing it as a translation task and comparing two different MT approaches (Phrase-based and Neural MT).

AMR-to-Text Generation Machine Translation +2

Sentence-Level Multilingual Multi-modal Embedding for Natural Language Processing

no code implementations RANLP 2017 Iacer Calixto, Qun Liu

We propose a novel discriminative ranking model that learns embeddings from multilingual and multi-modal data, meaning that our model can take advantage of images and descriptions in multiple languages to improve embedding quality.

Machine Translation Re-Ranking +2

Incorporating Global Visual Features into Attention-based Neural Machine Translation

no code implementations EMNLP 2017 Iacer Calixto, Qun Liu

We introduce multi-modal, attention-based neural machine translation (NMT) models which incorporate visual features into different parts of both the encoder and the decoder.

Machine Translation Text Generation +2

Using Images to Improve Machine-Translating E-Commerce Product Listings

no code implementations EACL 2017 Iacer Calixto, Daniel Stein, Evgeny Matusov, Pintu Lohar, Sheila Castilho, Andy Way

We evaluate our models quantitatively using BLEU and TER and find that (i) additional synthetic data has a general positive impact on text-only and multi-modal NMT models, and that (ii) using a multi-modal NMT model for re-ranking n-best lists improves TER significantly across different n-best list sizes.

Machine Translation Re-Ranking +1

Human Evaluation of Multi-modal Neural Machine Translation: A Case-Study on E-Commerce Listing Titles

no code implementations WS 2017 Iacer Calixto, Daniel Stein, Evgeny Matusov, Sheila Castilho, Andy Way

Nonetheless, human evaluators ranked translations from a multi-modal NMT model as better than those of a text-only NMT over 88% of the time, which suggests that images do help NMT in this use-case.

Machine Translation Translation

Doubly-Attentive Decoder for Multi-modal Neural Machine Translation

no code implementations ACL 2017 Iacer Calixto, Qun Liu, Nick Campbell

We introduce a Multi-modal Neural Machine Translation model in which a doubly-attentive decoder naturally incorporates spatial visual features obtained using pre-trained convolutional neural networks, bridging the gap between image description and translation.

Multimodal Machine Translation Translation
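A doubly-attentive decoder step can be sketched as two independent attention mechanisms, one over source-word annotations and one over spatial CNN features, whose context vectors are both fed to the decoder. The shapes and the dot-product scoring below are simplifying assumptions (the paper uses learned MLP attention):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(1)

# Hypothetical shapes: 5 source-word states, 7x7 = 49 spatial CNN regions.
src = rng.normal(size=(5, 8))    # source annotations (src_len, d)
img = rng.normal(size=(49, 8))   # flattened conv feature map (regions, d)
h_t = rng.normal(size=8)         # decoder hidden state at step t

# Two separate attention distributions, one per modality.
c_txt = softmax(src @ h_t) @ src   # textual context vector
c_img = softmax(img @ h_t) @ img   # visual context vector

# The decoder consumes both contexts alongside its own state
# when predicting the next target word.
fused = np.concatenate([h_t, c_txt, c_img])
print(fused.shape)  # (24,)
```

Keeping the two attention mechanisms separate lets the model decide, at each decoding step, how much to ground the next word in the source sentence versus in image regions.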

Multilingual Multi-modal Embeddings for Natural Language Processing

no code implementations • 3 Feb 2017 Iacer Calixto, Qun Liu, Nick Campbell

We propose a novel discriminative model that learns embeddings from multilingual and multi-modal data, meaning that our model can take advantage of images and descriptions in multiple languages to improve embedding quality.

Machine Translation Re-Ranking +2

Incorporating Global Visual Features into Attention-Based Neural Machine Translation

no code implementations • 23 Jan 2017 Iacer Calixto, Qun Liu, Nick Campbell

We introduce multi-modal, attention-based neural machine translation (NMT) models which incorporate visual features into different parts of both the encoder and the decoder.

Multimodal Machine Translation Translation

Developing a Dataset for Evaluating Approaches for Document Expansion with Images

no code implementations LREC 2016 Debasis Ganguly, Iacer Calixto, Gareth Jones

Motivated by the adage that a "picture is worth a thousand words", it can be reasoned that automatically enriching the textual content of a document with relevant images can increase the readability of a document.

Information Retrieval
