Search Results for author: Iacer Calixto

Found 33 papers, 10 papers with code

Multi3Generation: Multitask, Multilingual, Multimodal Language Generation

no code implementations EAMT 2022 Anabela Barreiro, José GC de Souza, Albert Gatt, Mehul Bhatt, Elena Lloret, Aykut Erdem, Dimitra Gkatzia, Helena Moniz, Irene Russo, Fabio Kepler, Iacer Calixto, Marcin Paprzycki, François Portet, Isabelle Augenstein, Mirela Alhasani

This paper presents the Multitask, Multilingual, Multimodal Language Generation COST Action – Multi3Generation (CA18231), an interdisciplinary network of research groups working on different aspects of language generation.

Text Generation

AnyMatch -- Efficient Zero-Shot Entity Matching with a Small Language Model

1 code implementation 6 Sep 2024 Zeyu Zhang, Paul Groth, Iacer Calixto, Sebastian Schelter

Furthermore, our approach exhibits major cost benefits: the average prediction quality of AnyMatch is within 4.4% of the state-of-the-art method MatchGPT with the proprietary trillion-parameter model GPT-4, yet AnyMatch requires four orders of magnitude fewer parameters and incurs a 3,899 times lower inference cost (in dollars per 1,000 tokens).

Attribute AutoML +3

Evaluating Linguistic Capabilities of Multimodal LLMs in the Lens of Few-Shot Learning

no code implementations 17 Jul 2024 Mustafa Dogan, Ilker Kesen, Iacer Calixto, Aykut Erdem, Erkut Erdem

This study aims to evaluate the performance of MLLMs on the VALSE benchmark, focusing on the efficacy of few-shot In-Context Learning (ICL) and Chain-of-Thought (CoT) prompting.

Few-Shot Learning In-Context Learning

ViLMA: A Zero-Shot Benchmark for Linguistic and Temporal Grounding in Video-Language Models

no code implementations 13 Nov 2023 Ilker Kesen, Andrea Pedrotti, Mustafa Dogan, Michele Cafagna, Emre Can Acikgoz, Letitia Parcalabescu, Iacer Calixto, Anette Frank, Albert Gatt, Aykut Erdem, Erkut Erdem

With the ever-increasing popularity of pretrained Video-Language Models (VidLMs), there is a pressing need to develop robust evaluation methodologies that delve deeper into their visio-linguistic capabilities.

counterfactual Language Modelling

Soft-prompt tuning to predict lung cancer using primary care free-text Dutch medical notes

1 code implementation 28 Mar 2023 Auke Elfrink, Iacopo Vagliano, Ameen Abu-Hanna, Iacer Calixto

We investigate different natural language processing (NLP) approaches based on contextualised word representations for the problem of early prediction of lung cancer using free-text patient medical notes of Dutch primary care physicians.

Contextualised Word Representations

Detecting Euphemisms with Literal Descriptions and Visual Imagery

1 code implementation 8 Nov 2022 İlker Kesen, Aykut Erdem, Erkut Erdem, Iacer Calixto

In the second stage, we integrate visual supervision into our system using visual imageries, two sets of images generated by a text-to-image model by taking terms and descriptions as input.

Endowing Language Models with Multimodal Knowledge Graph Representations

1 code implementation 27 Jun 2022 Ningyuan Huang, Yash R. Deshpande, Yibo Liu, Houda Alberts, Kyunghyun Cho, Clara Vania, Iacer Calixto

We use the recently released VisualSem KG as our external knowledge repository, which covers a subset of Wikipedia and WordNet entities, and compare a mix of tuple-based and graph-based algorithms to learn entity and relation representations that are grounded on the KG multimodal information.

Multilingual Named Entity Recognition named-entity-recognition +2

VALSE: A Task-Independent Benchmark for Vision and Language Models Centered on Linguistic Phenomena

1 code implementation ACL 2022 Letitia Parcalabescu, Michele Cafagna, Lilitta Muradjan, Anette Frank, Iacer Calixto, Albert Gatt

We propose VALSE (Vision And Language Structured Evaluation), a novel benchmark designed for testing general-purpose pretrained vision and language (V&L) models for their visio-linguistic grounding capabilities on specific linguistic phenomena.

image-sentence alignment valid

Wikipedia Entities as Rendezvous across Languages: Grounding Multilingual Language Models by Predicting Wikipedia Hyperlinks

no code implementations NAACL 2021 Iacer Calixto, Alessandro Raganato, Tommaso Pasini

Further adding extra languages leads to improvements in most tasks up to a certain point, but overall we found it non-trivial to scale improvements in model transferability by training on ever-increasing amounts of Wikipedia languages.

Knowledge Graphs

Seeing past words: Testing the cross-modal capabilities of pretrained V&L models on counting tasks

no code implementations ACL (mmsr, IWCS) 2021 Letitia Parcalabescu, Albert Gatt, Anette Frank, Iacer Calixto

We investigate the reasoning ability of pretrained vision and language (V&L) models in two tasks that require multimodal integration: (1) discriminating a correct image-sentence pair from an incorrect one, and (2) counting entities in an image.

Sentence Task 2

A Study on the Autoregressive and non-Autoregressive Multi-label Learning

no code implementations 3 Dec 2020 Elham J. Barezi, Iacer Calixto, Kyunghyun Cho, Pascale Fung

These tasks are hard because the label space is usually (i) very large, e.g. thousands or millions of labels, (ii) very sparse, i.e. very few labels apply to each input document, and (iii) highly correlated, meaning that the existence of one label changes the likelihood of predicting all other labels.

Multi-Label Learning

Are scene graphs good enough to improve Image Captioning?

1 code implementation Asian Chapter of the Association for Computational Linguistics 2020 Victor Milewski, Marie-Francine Moens, Iacer Calixto

Overall, we find no significant difference between models that use scene graph features and models that only use object detection features across different captioning metrics, which suggests that existing scene graph generation models are still too noisy to be useful in image captioning.

Decoder Graph Attention +6

ImagiFilter: A resource to enable the semi-automatic mining of images at scale

1 code implementation 20 Aug 2020 Houda Alberts, Iacer Calixto

In this paper, we describe and publicly release an image dataset along with pretrained models designed to (semi-)automatically filter out undesirable images from very large image collections, possibly obtained from the web.

English Intermediate-Task Training Improves Zero-Shot Cross-Lingual Transfer Too

no code implementations Asian Chapter of the Association for Computational Linguistics 2020 Jason Phang, Iacer Calixto, Phu Mon Htut, Yada Pruksachatkun, Haokun Liu, Clara Vania, Katharina Kann, Samuel R. Bowman

Intermediate-task training---fine-tuning a pretrained model on an intermediate task before fine-tuning again on the target task---often improves model performance substantially on language understanding tasks in monolingual English settings.

HellaSwag Question Answering +5

Latent Variable Model for Multi-modal Translation

1 code implementation ACL 2019 Iacer Calixto, Miguel Rios, Wilker Aziz

In this work, we propose to model the interaction between visual and textual features for multi-modal neural machine translation (MMT) through a latent variable model.

Decoder Multimodal Machine Translation +2

Linguistic realisation as machine translation: Comparing different MT models for AMR-to-text generation

no code implementations WS 2017 Thiago Castro Ferreira, Iacer Calixto, Sander Wubben, Emiel Krahmer

In this paper, we study AMR-to-text generation, framing it as a translation task and comparing two different MT approaches (Phrase-based and Neural MT).

AMR-to-Text Generation Machine Translation +2

Sentence-Level Multilingual Multi-modal Embedding for Natural Language Processing

no code implementations RANLP 2017 Iacer Calixto, Qun Liu

We propose a novel discriminative ranking model that learns embeddings from multilingual and multi-modal data, meaning that our model can take advantage of images and descriptions in multiple languages to improve embedding quality.

Machine Translation NMT +5

Incorporating Global Visual Features into Attention-based Neural Machine Translation.

no code implementations EMNLP 2017 Iacer Calixto, Qun Liu

We introduce multi-modal, attention-based neural machine translation (NMT) models which incorporate visual features into different parts of both the encoder and the decoder.

Decoder Machine Translation +5

Using Images to Improve Machine-Translating E-Commerce Product Listings.

no code implementations EACL 2017 Iacer Calixto, Daniel Stein, Evgeny Matusov, Pintu Lohar, Sheila Castilho, Andy Way

We evaluate our models quantitatively using BLEU and TER and find that (i) additional synthetic data has a general positive impact on text-only and multi-modal NMT models, and that (ii) using a multi-modal NMT model for re-ranking n-best lists improves TER significantly across different n-best list sizes.

Machine Translation NMT +2

Human Evaluation of Multi-modal Neural Machine Translation: A Case-Study on E-Commerce Listing Titles

no code implementations WS 2017 Iacer Calixto, Daniel Stein, Evgeny Matusov, Sheila Castilho, Andy Way

Nonetheless, human evaluators ranked translations from a multi-modal NMT model as better than those of a text-only NMT over 88% of the time, which suggests that images do help NMT in this use-case.

Machine Translation NMT +1

Doubly-Attentive Decoder for Multi-modal Neural Machine Translation

no code implementations ACL 2017 Iacer Calixto, Qun Liu, Nick Campbell

We introduce a Multi-modal Neural Machine Translation model in which a doubly-attentive decoder naturally incorporates spatial visual features obtained using pre-trained convolutional neural networks, bridging the gap between image description and translation.

Decoder Multimodal Machine Translation +1

Multilingual Multi-modal Embeddings for Natural Language Processing

no code implementations 3 Feb 2017 Iacer Calixto, Qun Liu, Nick Campbell

We propose a novel discriminative model that learns embeddings from multilingual and multi-modal data, meaning that our model can take advantage of images and descriptions in multiple languages to improve embedding quality.

Machine Translation NMT +5

Incorporating Global Visual Features into Attention-Based Neural Machine Translation

no code implementations 23 Jan 2017 Iacer Calixto, Qun Liu, Nick Campbell

We introduce multi-modal, attention-based neural machine translation (NMT) models which incorporate visual features into different parts of both the encoder and the decoder.

Decoder Multimodal Machine Translation +3

Developing a Dataset for Evaluating Approaches for Document Expansion with Images

no code implementations LREC 2016 Debasis Ganguly, Iacer Calixto, Gareth Jones

Motivated by the adage that a "picture is worth a thousand words", it can be reasoned that automatically enriching the textual content of a document with relevant images can increase the readability of a document.

Information Retrieval Retrieval
