Search Results for author: Desmond Elliott

Found 43 papers, 24 papers with code

Revisiting Transformer-based Models for Long Document Classification

1 code implementation 14 Apr 2022 Xiang Dai, Ilias Chalkidis, Sune Darkner, Desmond Elliott

The recent literature in text classification is biased towards short text sequences (e.g., sentences or paragraphs).

Classification Document Classification

MDAPT: Multilingual Domain Adaptive Pretraining in a Single Model

1 code implementation Findings (EMNLP) 2021 Rasmus Kær Jørgensen, Mareike Hartmann, Xiang Dai, Desmond Elliott

Domain adaptive pretraining, i.e. the continued unsupervised pretraining of a language model on domain-specific text, improves the modelling of text for downstream tasks within the domain.

Language Modelling named-entity-recognition +2

Vision-and-Language or Vision-for-Language? On Cross-Modal Influence in Multimodal Transformers

2 code implementations EMNLP 2021 Stella Frank, Emanuele Bugliarello, Desmond Elliott

Models that have learned to construct cross-modal representations using both modalities are expected to perform worse when inputs are missing from a modality.

Language Modelling

The Role of Syntactic Planning in Compositional Image Captioning

1 code implementation EACL 2021 Emanuele Bugliarello, Desmond Elliott

Image captioning has focused on generalizing to images drawn from the same distribution as the training set, and not to the more challenging problem of generalizing to different distributions of images.

Image Captioning

Multimodal Pretraining Unmasked: A Meta-Analysis and a Unified Framework of Vision-and-Language BERTs

2 code implementations 30 Nov 2020 Emanuele Bugliarello, Ryan Cotterell, Naoaki Okazaki, Desmond Elliott

Large-scale pretraining and task-specific fine-tuning is now the standard methodology for many tasks in computer vision and natural language processing.

Natural Language Processing

Multimodal Speech Recognition with Unstructured Audio Masking

no code implementations EMNLP (nlpbt) 2020 Tejas Srinivasan, Ramon Sanabria, Florian Metze, Desmond Elliott

Our experiments on the Flickr 8K Audio Captions Corpus show that multimodal ASR can generalize to recover different types of masked words in this unstructured masking setting.

Automatic Speech Recognition

Fine-Grained Grounding for Multimodal Speech Recognition

1 code implementation Findings of the Association for Computational Linguistics 2020 Tejas Srinivasan, Ramon Sanabria, Florian Metze, Desmond Elliott

In experiments on the Flickr8K Audio Captions Corpus, we find that our model improves over approaches that use global visual features, that the proposals enable the model to recover entities and other related words, such as adjectives, and that improvements are due to the model's ability to localize the correct proposals.

Automatic Speech Recognition

On Forgetting to Cite Older Papers: An Analysis of the ACL Anthology

1 code implementation ACL 2020 Marcel Bollmann, Desmond Elliott

The field of natural language processing is experiencing a period of unprecedented growth, and with it a surge of published papers.

Natural Language Processing

CompGuessWhat?!: A Multi-task Evaluation Framework for Grounded Language Learning

no code implementations ACL 2020 Alessandro Suglia, Ioannis Konstas, Andrea Vanzo, Emanuele Bastianelli, Desmond Elliott, Stella Frank, Oliver Lemon

To remedy this, we present GROLLA, an evaluation framework for Grounded Language Learning with Attributes with three sub-tasks: 1) Goal-oriented evaluation; 2) Object attribute prediction evaluation; and 3) Zero-shot evaluation.

Grounded language learning

The Sensitivity of Language Models and Humans to Winograd Schema Perturbations

2 code implementations ACL 2020 Mostafa Abdou, Vinit Ravishankar, Maria Barrett, Yonatan Belinkov, Desmond Elliott, Anders Søgaard

Large-scale pretrained language models are the major driving force behind recent improvements in performance on the Winograd Schema Challenge, a widely employed test of common sense reasoning ability.

Common Sense Reasoning Pretrained Language Models

Multimodal Machine Translation through Visuals and Speech

no code implementations 28 Nov 2019 Umut Sulubacak, Ozan Caglayan, Stig-Arne Grönroos, Aku Rouhe, Desmond Elliott, Lucia Specia, Jörg Tiedemann

Multimodal machine translation involves drawing information from more than one modality, based on the assumption that the additional modalities will contain useful alternative views of the input data.

Image Captioning Multimodal Machine Translation +3

Bootstrapping Disjoint Datasets for Multilingual Multimodal Representation Learning

no code implementations 9 Nov 2019 Ákos Kádár, Grzegorz Chrupała, Afra Alishahi, Desmond Elliott

However, we do find that using an external machine translation model to generate the synthetic data sets results in better performance.

Machine Translation Representation Learning +2

Adversarial Removal of Demographic Attributes Revisited

no code implementations IJCNLP 2019 Maria Barrett, Yova Kementchedjhieva, Yanai Elazar, Desmond Elliott, Anders Søgaard

Elazar and Goldberg (2018) showed that protected attributes can be extracted from the representations of a debiased neural network for mention detection at above-chance levels, by evaluating a diagnostic classifier on a held-out subsample of the data it was trained on.

Understanding the Effect of Textual Adversaries in Multimodal Machine Translation

no code implementations WS 2019 Koel Dutta Chowdhury, Desmond Elliott

It is assumed that multimodal machine translation systems are better than text-only systems at translating phrases that have a direct correspondence in the image.

Multimodal Machine Translation Translation

Compositional Generalization in Image Captioning

1 code implementation CONLL 2019 Mitja Nikolaus, Mostafa Abdou, Matthew Lamm, Rahul Aralikatte, Desmond Elliott

Image captioning models are usually evaluated on their ability to describe a held-out set of images, not on their ability to generalize to unseen concepts.

Image Captioning

Cross-lingual Visual Verb Sense Disambiguation

1 code implementation NAACL 2019 Spandana Gella, Desmond Elliott, Frank Keller

We extend this line of work to the more challenging task of cross-lingual verb sense disambiguation, introducing the MultiSense dataset of 9,504 images annotated with English, German, and Spanish verbs.

Machine Translation Translation

Talking about other people: an endless range of possibilities

1 code implementation WS 2018 Emiel van Miltenburg, Desmond Elliott, Piek Vossen

This taxonomy serves as a reference point to think about how other people should be described, and can be used to classify and compute statistics about labels applied to people.

Text Generation

How2: A Large-scale Dataset for Multimodal Language Understanding

1 code implementation 1 Nov 2018 Ramon Sanabria, Ozan Caglayan, Shruti Palaskar, Desmond Elliott, Loïc Barrault, Lucia Specia, Florian Metze

In this paper, we introduce How2, a multimodal collection of instructional videos with English subtitles and crowdsourced Portuguese translations.

Automatic Speech Recognition Machine Translation +1

Adversarial Evaluation of Multimodal Machine Translation

no code implementations EMNLP 2018 Desmond Elliott

The promise of combining language and vision in multimodal machine translation is that systems will produce better translations by leveraging the image data.

Multimodal Machine Translation text similarity +1

Findings of the Third Shared Task on Multimodal Machine Translation

1 code implementation WS 2018 Loïc Barrault, Fethi Bougares, Lucia Specia, Chiraag Lala, Desmond Elliott, Stella Frank

In this task a source sentence in English is supplemented by an image and participating systems are required to generate a translation for such a sentence into German, French or Czech.

Multimodal Machine Translation Translation

Lessons learned in multilingual grounded language learning

1 code implementation CONLL 2018 Ákos Kádár, Desmond Elliott, Marc-Alexandre Côté, Grzegorz Chrupała, Afra Alishahi

Recent work has shown how to learn better visual-semantic embeddings by leveraging image descriptions in more than one language.

Grounded language learning

Measuring the Diversity of Automatic Image Descriptions

1 code implementation COLING 2018 Emiel van Miltenburg, Desmond Elliott, Piek Vossen

Automatic image description systems typically produce generic sentences that only make use of a small subset of the vocabulary available to them.

Text Generation

Cross-linguistic differences and similarities in image descriptions

1 code implementation WS 2017 Emiel van Miltenburg, Desmond Elliott, Piek Vossen

Automatic image description systems are commonly trained and evaluated on large image description datasets.

Imagination improves Multimodal Translation

no code implementations IJCNLP 2017 Desmond Elliott, Ákos Kádár

We decompose multimodal translation into two sub-tasks: learning to translate and learning visually grounded representations.

Translation

Room for improvement in automatic image description: an error analysis

1 code implementation 13 Apr 2017 Emiel van Miltenburg, Desmond Elliott

In recent years we have seen rapid and significant progress in automatic image description but what are the open problems in this area?

Pragmatic factors in image description: the case of negations

1 code implementation WS 2016 Emiel van Miltenburg, Roser Morante, Desmond Elliott

We provide a qualitative analysis of the descriptions containing negations (no, not, n't, nobody, etc.) in the Flickr30K corpus, and a categorization of negation uses.

A Corpus of Images and Text in Online News

no code implementations LREC 2016 Laura Hollink, Adriatik Bedjeti, Martin van Harmelen, Desmond Elliott

The corpus consists of JSON-LD files with the following data about each article: the original URL of the article on the news publisher's website, the date of publication, the headline of the article, the URL of the image displayed with the article (if any), and the caption of that image.

Natural Language Processing

1 Million Captioned Dutch Newspaper Images

no code implementations LREC 2016 Desmond Elliott, Martijn Kleppe

Images naturally appear alongside text in a wide variety of media, such as books, magazines, newspapers, and in online articles.

Data-to-Text Generation Image Captioning +3

Automatic Description Generation from Images: A Survey of Models, Datasets, and Evaluation Measures

no code implementations 15 Jan 2016 Raffaella Bernardi, Ruket Cakici, Desmond Elliott, Aykut Erdem, Erkut Erdem, Nazli Ikizler-Cinbis, Frank Keller, Adrian Muscat, Barbara Plank

Automatic description generation from natural images is a challenging problem that has recently received a large amount of interest from the computer vision and natural language processing communities.

Natural Language Processing

Multilingual Image Description with Neural Sequence Models

1 code implementation 15 Oct 2015 Desmond Elliott, Stella Frank, Eva Hasler

In this paper we present an approach to multi-language image description bringing together insights from neural machine translation and neural image description.

Image Captioning Translation
