no code implementations • EAMT 2022 • Anabela Barreiro, José GC de Souza, Albert Gatt, Mehul Bhatt, Elena Lloret, Aykut Erdem, Dimitra Gkatzia, Helena Moniz, Irene Russo, Fabio Kepler, Iacer Calixto, Marcin Paprzycki, François Portet, Isabelle Augenstein, Mirela Alhasani
This paper presents the Multitask, Multilingual, Multimodal Language Generation COST Action – Multi3Generation (CA18231), an interdisciplinary network of research groups working on different aspects of language generation.
1 code implementation • 6 Sep 2024 • Zeyu Zhang, Paul Groth, Iacer Calixto, Sebastian Schelter
Furthermore, our approach exhibits major cost benefits: the average prediction quality of AnyMatch is within 4.4% of the state-of-the-art method MatchGPT with the proprietary trillion-parameter model GPT-4, yet AnyMatch requires four orders of magnitude fewer parameters and incurs a 3,899 times lower inference cost (in dollars per 1,000 tokens).
no code implementations • 17 Jul 2024 • Mustafa Dogan, Ilker Kesen, Iacer Calixto, Aykut Erdem, Erkut Erdem
This study aims to evaluate the performance of MLLMs on the VALSE benchmark, focusing on the efficacy of few-shot In-Context Learning (ICL) and Chain-of-Thought (CoT) prompting.
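A minimal sketch of how such an evaluation loop could be structured, assuming a caption-vs-foil choice format in the spirit of VALSE; the `generate` callable, the exemplars, and the prompt wording are placeholders, not details from the paper.

```python
# Hedged sketch: few-shot ICL and Chain-of-Thought prompting for a binary
# caption/foil choice task. `generate` stands in for whichever multimodal LLM
# is being evaluated; exemplars and wording are illustrative only.

FEW_SHOT_EXEMPLARS = [
    ("A dog is chasing a ball.", "A dog is chasing a cat.", "A"),
    ("Two people are riding bikes.", "Two people are riding horses.", "A"),
]

def build_prompt(caption: str, foil: str, use_cot: bool = False) -> str:
    lines = ["Decide which sentence correctly describes the image."]
    for cap, fl, answer in FEW_SHOT_EXEMPLARS:          # few-shot ICL block
        lines.append(f"A: {cap}\nB: {fl}\nAnswer: {answer}")
    lines.append(f"A: {caption}\nB: {foil}")
    if use_cot:
        lines.append("Let's think step by step before answering.")  # CoT trigger
    lines.append("Answer:")
    return "\n\n".join(lines)

def evaluate(samples, generate, use_cot: bool = False) -> float:
    """`samples` is a list of (image, caption, foil) triples; `generate` calls the MLLM."""
    correct = 0
    for image, caption, foil in samples:
        prompt = build_prompt(caption, foil, use_cot)
        reply = generate(prompt=prompt, image=image)
        # The caption is always placed in slot A here; a real setup would
        # shuffle slots to control for position bias.
        correct += reply.strip().upper().startswith("A")
    return correct / len(samples)
```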
no code implementations • 19 Nov 2023 • Nishant Mishra, Gaurav Sahu, Iacer Calixto, Ameen Abu-Hanna, Issam H. Laradji
Generating high-quality summaries for chat dialogs often requires large labeled datasets.
no code implementations • 13 Nov 2023 • Ilker Kesen, Andrea Pedrotti, Mustafa Dogan, Michele Cafagna, Emre Can Acikgoz, Letitia Parcalabescu, Iacer Calixto, Anette Frank, Albert Gatt, Aykut Erdem, Erkut Erdem
With the ever-increasing popularity of pretrained Video-Language Models (VidLMs), there is a pressing need to develop robust evaluation methodologies that delve deeper into their visio-linguistic capabilities.
no code implementations • 3 Jul 2023 • Giovanni Cinà, Daniel Fernandez-Llaneza, Ludovico Deponte, Nishant Mishra, Tabea E. Röber, Sandro Pezzelle, Iacer Calixto, Rob Goedhart, Ş. İlker Birbil
Feature attribution methods have become a staple method to disentangle the complex behavior of black box models.
1 code implementation • 28 Mar 2023 • Auke Elfrink, Iacopo Vagliano, Ameen Abu-Hanna, Iacer Calixto
We investigate different natural language processing (NLP) approaches based on contextualised word representations for the problem of early prediction of lung cancer using free-text patient medical notes of Dutch primary care physicians.
1 code implementation • 8 Nov 2022 • İlker Kesen, Aykut Erdem, Erkut Erdem, Iacer Calixto
In the second stage, we integrate visual supervision into our system using visual imageries: two sets of images generated by a text-to-image model, taking the terms and their descriptions as input, respectively.
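A rough illustration of how two such image sets could be produced with an off-the-shelf text-to-image model; the Stable Diffusion checkpoint, the example term, and the number of images per prompt are assumptions, not details from the paper.

```python
# Hedged sketch: generating "visual imageries" (one image set from the term,
# one from its literal description) with an off-the-shelf text-to-image model.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

def imageries(term: str, description: str, n: int = 4):
    term_images = pipe(term, num_images_per_prompt=n).images          # set 1: the term itself
    desc_images = pipe(description, num_images_per_prompt=n).images   # set 2: its description
    return term_images, desc_images

# Hypothetical usage with an illustrative term/description pair:
term_imgs, desc_imgs = imageries(
    term="bean counter",
    description="a person who keeps close track of money and expenses",
)
```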
1 code implementation • 27 Jun 2022 • Ningyuan Huang, Yash R. Deshpande, Yibo Liu, Houda Alberts, Kyunghyun Cho, Clara Vania, Iacer Calixto
We use the recently released VisualSem KG as our external knowledge repository, which covers a subset of Wikipedia and WordNet entities, and compare a mix of tuple-based and graph-based algorithms to learn entity and relation representations that are grounded on the KG multimodal information.
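To make the tuple-based family of KG embedding algorithms mentioned here concrete, a minimal TransE-style scoring function with a margin loss; the dimensions, corruption scheme, and hyperparameters are illustrative and may differ from the models compared in the paper.

```python
# Hedged sketch of a tuple-based KG embedding objective (TransE-style):
# score(h, r, t) = -||h + r - t||, trained with a margin against corrupted triples.
import torch
import torch.nn as nn

class TransE(nn.Module):
    def __init__(self, n_entities: int, n_relations: int, dim: int = 200):
        super().__init__()
        self.ent = nn.Embedding(n_entities, dim)
        self.rel = nn.Embedding(n_relations, dim)

    def score(self, h, r, t):
        # Higher is better: negative distance between (head + relation) and tail.
        return -(self.ent(h) + self.rel(r) - self.ent(t)).norm(p=2, dim=-1)

    def margin_loss(self, pos, neg, margin: float = 1.0):
        # pos/neg are (h, r, t) index tuples; negatives replace head or tail at random.
        return torch.relu(margin - self.score(*pos) + self.score(*neg)).mean()
```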
1 code implementation • ACL 2022 • Letitia Parcalabescu, Michele Cafagna, Lilitta Muradjan, Anette Frank, Iacer Calixto, Albert Gatt
We propose VALSE (Vision And Language Structured Evaluation), a novel benchmark designed for testing general-purpose pretrained vision and language (V&L) models for their visio-linguistic grounding capabilities on specific linguistic phenomena.
1 code implementation • 6 Dec 2021 • Ruan Chaves Rodrigues, Marcelo Akira Inuzuka, Juliana Resplande Sant'Anna Gomes, Acquila Santos Rocha, Iacer Calixto, Hugo Alexandre Dantas do Nascimento
Hashtag segmentation, also known as hashtag decomposition, is a common step in preprocessing pipelines for social media datasets.
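To make the task concrete, a small word-frequency-style dynamic-programming segmenter; the toy vocabulary and scoring are placeholders, and the paper evaluates far stronger (neural) segmenters.

```python
# Hedged sketch: hashtag segmentation as a dynamic-programming search over a
# unigram vocabulary, minimising the number of characters not covered by known words.
from functools import lru_cache

VOCAB = {"we", "love", "new", "york", "the", "weekend"}  # toy word list

def segment(hashtag: str):
    text = hashtag.lstrip("#").lower()

    @lru_cache(maxsize=None)
    def best(i: int):
        # Returns (unknown_char_count, segmentation) for text[i:].
        if i == len(text):
            return (0, [])
        candidates = []
        for j in range(i + 1, len(text) + 1):
            piece = text[i:j]
            cost, rest = best(j)
            # Penalise pieces that are not known words by their length.
            candidates.append((cost + (0 if piece in VOCAB else len(piece)),
                               [piece] + rest))
        return min(candidates, key=lambda c: c[0])

    return best(0)[1]

print(segment("#welovenewyork"))  # ['we', 'love', 'new', 'york']
```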
no code implementations • NAACL 2021 • Iacer Calixto, Alessandro Raganato, Tommaso Pasini
Adding further languages leads to improvements in most tasks up to a certain point, but overall we found it non-trivial to scale improvements in model transferability by training on an ever-increasing number of Wikipedia languages.
no code implementations • ACL (mmsr, IWCS) 2021 • Letitia Parcalabescu, Albert Gatt, Anette Frank, Iacer Calixto
We investigate the reasoning ability of pretrained vision and language (V&L) models in two tasks that require multimodal integration: (1) discriminating a correct image-sentence pair from an incorrect one, and (2) counting entities in an image.
no code implementations • 3 Dec 2020 • Elham J. Barezi, Iacer Calixto, Kyunghyun Cho, Pascale Fung
These tasks are hard because the label space is usually (i) very large, e.g. thousands or millions of labels, (ii) very sparse, i.e. very few labels apply to each input document, and (iii) highly correlated, meaning that the existence of one label changes the likelihood of predicting all other labels.
1 code implementation • Asian Chapter of the Association for Computational Linguistics 2020 • Victor Milewski, Marie-Francine Moens, Iacer Calixto
Overall, we find no significant difference between models that use scene graph features and models that only use object detection features across different captioning metrics, which suggests that existing scene graph generation models are still too noisy to be useful in image captioning.
1 code implementation • 20 Aug 2020 • Houda Alberts, Iacer Calixto
In this paper, we describe and publicly release an image dataset along with pretrained models designed to (semi-)automatically filter out undesirable images from very large image collections, possibly obtained from the web.
1 code implementation • EMNLP (MRL) 2021 • Houda Alberts, Teresa Huang, Yash Deshpande, Yibo Liu, Kyunghyun Cho, Clara Vania, Iacer Calixto
We also release a neural multi-modal retrieval model that can use images or sentences as inputs and retrieves entities in the KG.
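A condensed sketch of a dual-encoder retrieval setup of the kind described: either modality is embedded into a shared space and matched against precomputed KG entity embeddings. The encoders, dimensions, and similarity choice are assumptions, not the released model.

```python
# Hedged sketch: multi-modal entity retrieval with a shared embedding space.
# A query (sentence or image) is encoded, then matched against precomputed
# KG entity embeddings by cosine similarity. Encoders here are placeholders.
import torch
import torch.nn.functional as F

class MultiModalRetriever(torch.nn.Module):
    def __init__(self, text_encoder, image_encoder, entity_embeddings):
        super().__init__()
        self.text_encoder = text_encoder        # maps a sentence -> (dim,) tensor
        self.image_encoder = image_encoder      # maps an image    -> (dim,) tensor
        # Precomputed (n_entities, dim) matrix of KG entity embeddings.
        self.register_buffer("entities", F.normalize(entity_embeddings, dim=-1))

    def retrieve(self, query, modality: str = "text", k: int = 10):
        encoder = self.text_encoder if modality == "text" else self.image_encoder
        q = F.normalize(encoder(query), dim=-1)
        scores = self.entities @ q               # cosine similarity to every entity
        return torch.topk(scores, k)             # top-k entity scores and indices
```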
no code implementations • WS 2020 • Diksha Meghwal, Katharina Kann, Iacer Calixto, Stanislaw Jastrzebski
Pretrained language models have obtained impressive results for a large set of natural language understanding tasks.
no code implementations • Asian Chapter of the Association for Computational Linguistics 2020 • Jason Phang, Iacer Calixto, Phu Mon Htut, Yada Pruksachatkun, Haokun Liu, Clara Vania, Katharina Kann, Samuel R. Bowman
Intermediate-task training---fine-tuning a pretrained model on an intermediate task before fine-tuning again on the target task---often improves model performance substantially on language understanding tasks in monolingual English settings.
Ranked #20 on Zero-Shot Cross-Lingual Transfer on XTREME
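A minimal sketch of the two-stage recipe described above, assuming an ordinary supervised fine-tuning loop; the optimizer, learning rate, and data loaders are placeholders rather than the paper's actual configuration.

```python
# Hedged sketch of intermediate-task training: fine-tune on an intermediate task
# first, then continue fine-tuning the same weights on the target task.
import torch

def fine_tune(model, loader, epochs: int = 1, lr: float = 2e-5):
    """Plain supervised fine-tuning loop; `loader` yields (inputs, labels) batches."""
    opt = torch.optim.AdamW(model.parameters(), lr=lr)
    loss_fn = torch.nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for inputs, labels in loader:
            opt.zero_grad()
            loss = loss_fn(model(inputs), labels)
            loss.backward()
            opt.step()
    return model

def intermediate_task_training(model, intermediate_loader, target_loader):
    # Stage 1: adapt the pretrained model on the intermediate task
    # (e.g. an English NLI or QA dataset).
    fine_tune(model, intermediate_loader)
    # Stage 2: fine-tune the same weights on the (English) target task,
    # then evaluate zero-shot in other languages.
    fine_tune(model, target_loader)
    return model
```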
1 code implementation • ACL 2019 • Iacer Calixto, Miguel Rios, Wilker Aziz
In this work, we propose to model the interaction between visual and textual features for multi-modal neural machine translation (MMT) through a latent variable model.
Ranked #9 on Multimodal Machine Translation on Multi30K
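A compressed sketch of the general idea of conditioning translation on a latent variable informed by both modalities; the Gaussian parameterisation, fusion scheme, and dimensions are illustrative assumptions, not the paper's exact model.

```python
# Hedged sketch: a Gaussian latent variable that mediates between visual and
# textual features before decoding, loosely in the spirit of latent-variable MMT.
import torch
import torch.nn as nn

class LatentFusion(nn.Module):
    def __init__(self, text_dim=512, image_dim=2048, latent_dim=128):
        super().__init__()
        self.to_mu = nn.Linear(text_dim + image_dim, latent_dim)
        self.to_logvar = nn.Linear(text_dim + image_dim, latent_dim)

    def forward(self, text_feats, image_feats):
        # Inference network q(z | text, image): condition on both modalities.
        h = torch.cat([text_feats, image_feats], dim=-1)
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterisation
        # KL term against a standard normal prior, added to the training objective.
        kl = 0.5 * (mu.pow(2) + logvar.exp() - logvar - 1).sum(-1).mean()
        return z, kl   # z is then fed to the translation decoder
```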
no code implementations • WS 2017 • Thiago Castro Ferreira, Iacer Calixto, Sander Wubben, Emiel Krahmer
In this paper, we study AMR-to-text generation, framing it as a translation task and comparing two different MT approaches (Phrase-based and Neural MT).
no code implementations • RANLP 2017 • Iacer Calixto, Qun Liu
We propose a novel discriminative ranking model that learns embeddings from multilingual and multi-modal data, meaning that our model can take advantage of images and descriptions in multiple languages to improve embedding quality.
no code implementations • EMNLP 2017 • Iacer Calixto, Qun Liu
We introduce multi-modal, attention-based neural machine translation (NMT) models which incorporate visual features into different parts of both the encoder and the decoder.
no code implementations • EACL 2017 • Iacer Calixto, Daniel Stein, Evgeny Matusov, Pintu Lohar, Sheila Castilho, Andy Way
We evaluate our models quantitatively using BLEU and TER and find that (i) additional synthetic data has a general positive impact on text-only and multi-modal NMT models, and that (ii) using a multi-modal NMT model for re-ranking n-best lists improves TER significantly across different n-best list sizes.
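A tiny sketch of the n-best re-ranking step mentioned in (ii), assuming a generic `score_hypothesis` callable for the multi-modal model's score of a candidate given the source sentence and image; nothing here reflects the paper's exact scoring.

```python
# Hedged sketch: re-ranking an n-best list from a text-only NMT system with a
# multi-modal NMT model's score (e.g. its log-probability of the candidate).

def rerank(nbest, source, image, score_hypothesis):
    """`nbest` is a list of candidate translations; the highest-scoring one wins."""
    return max(nbest, key=lambda hyp: score_hypothesis(source, image, hyp))

# Hypothetical usage:
# best = rerank(nbest_50, src_sentence, src_image, mmt_model.log_prob)
```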
no code implementations • WS 2017 • Iacer Calixto, Daniel Stein, Evgeny Matusov, Sheila Castilho, Andy Way
Nonetheless, human evaluators ranked translations from a multi-modal NMT model as better than those of a text-only NMT over 88% of the time, which suggests that images do help NMT in this use-case.
no code implementations • ACL 2017 • Iacer Calixto, Qun Liu, Nick Campbell
We introduce a Multi-modal Neural Machine Translation model in which a doubly-attentive decoder naturally incorporates spatial visual features obtained using pre-trained convolutional neural networks, bridging the gap between image description and translation.
Ranked #11 on Multimodal Machine Translation on Multi30K
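A schematic of a doubly-attentive decoding step as described above: one attention over source-word states and one over spatial CNN features, with both context vectors feeding next-word prediction. It is written with PyTorch's `MultiheadAttention` for brevity rather than the recurrent soft attention of that era, and all sizes are illustrative.

```python
# Hedged sketch of a doubly-attentive decoding step: attention over source states
# plus attention over spatial image features (e.g. a 14x14 grid flattened to 196 vectors).
import torch
import torch.nn as nn

class DoublyAttentiveStep(nn.Module):
    def __init__(self, hid=512, src_dim=512, img_dim=2048, vocab=10000):
        super().__init__()
        self.att_src = nn.MultiheadAttention(hid, 1, kdim=src_dim, vdim=src_dim, batch_first=True)
        self.att_img = nn.MultiheadAttention(hid, 1, kdim=img_dim, vdim=img_dim, batch_first=True)
        self.out = nn.Linear(hid * 3, vocab)

    def forward(self, dec_state, src_states, img_feats):
        # dec_state: (B, 1, hid); src_states: (B, T, src_dim); img_feats: (B, 196, img_dim)
        src_ctx, _ = self.att_src(dec_state, src_states, src_states)  # textual context
        img_ctx, _ = self.att_img(dec_state, img_feats, img_feats)    # visual context
        fused = torch.cat([dec_state, src_ctx, img_ctx], dim=-1)
        return self.out(fused)   # logits over the target vocabulary for this step
```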
no code implementations • 3 Feb 2017 • Iacer Calixto, Qun Liu, Nick Campbell
We propose a novel discriminative model that learns embeddings from multilingual and multi-modal data, meaning that our model can take advantage of images and descriptions in multiple languages to improve embedding quality.
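As an illustration of what a discriminative ranking objective over a shared image-sentence embedding space looks like, a pairwise max-margin loss in the usual form; the margin, normalisation, and batching choices are assumptions rather than the paper's setup.

```python
# Hedged sketch: pairwise max-margin ranking over image and (multilingual)
# sentence embeddings in a shared space. Mismatched pairs within the batch
# serve as negatives.
import torch
import torch.nn.functional as F

def ranking_loss(image_emb, sent_emb, margin: float = 0.2):
    """image_emb, sent_emb: (B, dim) embeddings of matching image-sentence pairs."""
    img = F.normalize(image_emb, dim=-1)
    txt = F.normalize(sent_emb, dim=-1)
    sims = img @ txt.t()                          # (B, B) similarity matrix
    pos = sims.diag().unsqueeze(1)                # similarity of the true pairs
    cost_s = torch.relu(margin + sims - pos)      # contrast each image against wrong sentences
    cost_i = torch.relu(margin + sims - pos.t())  # contrast each sentence against wrong images
    mask = 1.0 - torch.eye(sims.size(0), device=sims.device)
    return ((cost_s + cost_i) * mask).sum() / sims.size(0)
```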
no code implementations • 23 Jan 2017 • Iacer Calixto, Qun Liu, Nick Campbell
We introduce multi-modal, attention-based neural machine translation (NMT) models which incorporate visual features into different parts of both the encoder and the decoder.
Ranked #10 on Multimodal Machine Translation on Multi30K
no code implementations • LREC 2016 • Debasis Ganguly, Iacer Calixto, Gareth Jones
Motivated by the adage that "a picture is worth a thousand words", it can be reasoned that automatically enriching the textual content of a document with relevant images can increase its readability.