Search Results for author: Ozan Caglayan

Found 19 papers, 10 papers with code

Supervised Visual Attention for Simultaneous Multimodal Machine Translation

no code implementations 23 Jan 2022 Veneta Haralampieva, Ozan Caglayan, Lucia Specia

A particular use for such multimodal systems is the task of simultaneous machine translation, where visual context has been shown to complement the partial information provided by the source sentence, especially in the early phases of translation.

Multimodal Machine Translation, Sentence (+1 more)

BERTGEN: Multi-task Generation through BERT

1 code implementation ACL 2021 Faidon Mitzalis, Ozan Caglayan, Pranava Madhyastha, Lucia Specia

We present BERTGEN, a novel generative, decoder-only model which extends BERT by fusing multimodal and multilingual pretrained models VL-BERT and M-BERT, respectively.

Image Captioning, Multimodal Machine Translation (+2 more)

Exploiting Multimodal Reinforcement Learning for Simultaneous Machine Translation

1 code implementation EACL 2021 Julia Ive, Andy Mingren Li, Yishu Miao, Ozan Caglayan, Pranava Madhyastha, Lucia Specia

This paper addresses the problem of simultaneous machine translation (SiMT) by exploring two main concepts: (a) adaptive policies that learn a good trade-off between high translation quality and low latency; and (b) visual information that supports this process by providing additional context, which may be available before the textual input is produced.

Machine Translation, reinforcement-learning (+2 more)
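Concept (a) above, an adaptive READ/WRITE policy, can be illustrated with a toy agent. The sketch below is purely illustrative and is not the paper's RL policy: the function name `adaptive_schedule` and the confidence model are made up for the example.

```python
# Toy adaptive simultaneous-translation agent (illustrative sketch, NOT the
# paper's RL policy): at each step it WRITEs a target token when the model's
# confidence is high enough, otherwise it READs more source context.
def adaptive_schedule(step_conf, num_src, num_tgt, threshold=0.7):
    """Return the READ/WRITE action sequence chosen by the policy.

    step_conf(read, written) -> confidence in the next target token,
    after having read `read` source tokens and emitted `written` ones.
    """
    actions, read, written = [], 0, 0
    while written < num_tgt:
        if read < num_src and step_conf(read, written) < threshold:
            actions.append("READ")
            read += 1
        else:
            actions.append("WRITE")
            written += 1
    return actions

# Made-up confidence model: grows with source context read,
# shrinks with the number of target tokens already emitted.
toy_conf = lambda read, written: read / (written + 3)
print(adaptive_schedule(toy_conf, num_src=4, num_tgt=4))
```

With this toy confidence model the agent front-loads three READs before its first WRITE, then interleaves, which is exactly the quality/latency trade-off an RL reward would tune.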

MSVD-Turkish: A Comprehensive Multimodal Dataset for Integrated Vision and Language Research in Turkish

no code implementations 13 Dec 2020 Begum Citamak, Ozan Caglayan, Menekse Kuyu, Erkut Erdem, Aykut Erdem, Pranava Madhyastha, Lucia Specia

We hope that the MSVD-Turkish dataset and the results reported in this work will lead to better video captioning and multimodal machine translation models for Turkish and other morphology rich and agglutinative languages.

Multimodal Machine Translation, Sentence (+3 more)

Simultaneous Machine Translation with Visual Context

1 code implementation EMNLP 2020 Ozan Caglayan, Julia Ive, Veneta Haralampieva, Pranava Madhyastha, Loïc Barrault, Lucia Specia

Simultaneous machine translation (SiMT) aims to translate a continuous input text stream into another language with the lowest latency and highest quality possible.

Machine Translation, Translation
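The fixed-schedule baseline that SiMT systems of this kind are usually compared against is wait-k decoding. A minimal sketch, assuming a hypothetical `wait_k_schedule` helper and toy sentence lengths (not from the paper):

```python
# Minimal wait-k simultaneous decoding schedule (a common SiMT baseline,
# not this paper's method): read k source tokens first, then alternate
# one WRITE per READ until the source is exhausted.
def wait_k_schedule(num_src_tokens, num_tgt_tokens, k=3):
    """Return the READ/WRITE action sequence for a wait-k policy."""
    actions = []
    read, written = 0, 0
    while written < num_tgt_tokens:
        if read < min(k + written, num_src_tokens):
            actions.append("READ")
            read += 1
        else:
            actions.append("WRITE")
            written += 1
    return actions

print(wait_k_schedule(5, 5, k=2))
```

Lower k means lower latency but less source context per emitted token, which is where the paper's visual context is meant to compensate in the early phases of translation.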

Multimodal Machine Translation through Visuals and Speech

no code implementations28 Nov 2019 Umut Sulubacak, Ozan Caglayan, Stig-Arne Grönroos, Aku Rouhe, Desmond Elliott, Lucia Specia, Jörg Tiedemann

Multimodal machine translation involves drawing information from more than one modality, based on the assumption that the additional modalities will contain useful alternative views of the input data.

Image Captioning, Multimodal Machine Translation (+4 more)

Transformer-based Cascaded Multimodal Speech Translation

no code implementations EMNLP (IWSLT) 2019 Zixiu Wu, Ozan Caglayan, Julia Ive, Josiah Wang, Lucia Specia

Upon conducting extensive experiments, we found that (i) the explored visual integration schemes often harm the translation performance for the transformer and additive deliberation, but considerably improve the cascade deliberation; (ii) the transformer and cascade deliberation integrate the visual modality better than the additive deliberation, as shown by the incongruence analysis.

Automatic Speech Recognition, Automatic Speech Recognition (ASR) (+3 more)

Imperial College London Submission to VATEX Video Captioning Task

no code implementations 16 Oct 2019 Ozan Caglayan, Zixiu Wu, Pranava Madhyastha, Josiah Wang, Lucia Specia

This paper describes the Imperial College London team's submission to the 2019 VATEX video captioning challenge, where we first explore two sequence-to-sequence models, namely a recurrent (GRU) model and a transformer model, which generate captions from the I3D action features.

Video Captioning

Probing the Need for Visual Context in Multimodal Machine Translation

no code implementations NAACL 2019 Ozan Caglayan, Pranava Madhyastha, Lucia Specia, Loïc Barrault

Current work on multimodal machine translation (MMT) has suggested that the visual modality is either unnecessary or only marginally beneficial.

Multimodal Machine Translation, Translation

Multimodal Grounding for Sequence-to-Sequence Speech Recognition

1 code implementation 9 Nov 2018 Ozan Caglayan, Ramon Sanabria, Shruti Palaskar, Loïc Barrault, Florian Metze

Specifically, in our previous work we proposed a multistep visual adaptive training approach that improves the accuracy of an audio-based Automatic Speech Recognition (ASR) system.

Automatic Speech Recognition, Automatic Speech Recognition (ASR) (+2 more)

How2: A Large-scale Dataset for Multimodal Language Understanding

2 code implementations 1 Nov 2018 Ramon Sanabria, Ozan Caglayan, Shruti Palaskar, Desmond Elliott, Loïc Barrault, Lucia Specia, Florian Metze

In this paper, we introduce How2, a multimodal collection of instructional videos with English subtitles and crowdsourced Portuguese translations.

Automatic Speech Recognition, Automatic Speech Recognition (ASR) (+3 more)

LIUM-CVC Submissions for WMT18 Multimodal Translation Task

no code implementations WS 2018 Ozan Caglayan, Adrien Bardet, Fethi Bougares, Loïc Barrault, Kai Wang, Marc Masana, Luis Herranz, Joost Van de Weijer

This paper describes the multimodal Neural Machine Translation systems developed by LIUM and CVC for the WMT18 Shared Task on Multimodal Translation.

Machine Translation, Translation

LIUM-CVC Submissions for WMT17 Multimodal Translation Task

no code implementations WS 2017 Ozan Caglayan, Walid Aransa, Adrien Bardet, Mercedes García-Martínez, Fethi Bougares, Loïc Barrault, Marc Masana, Luis Herranz, Joost Van de Weijer

This paper describes the monomodal and multimodal Neural Machine Translation systems developed by LIUM and CVC for the WMT17 Shared Task on Multimodal Translation.

Machine Translation, Translation

LIUM Machine Translation Systems for WMT17 News Translation Task

1 code implementation WS 2017 Mercedes García-Martínez, Ozan Caglayan, Walid Aransa, Adrien Bardet, Fethi Bougares, Loïc Barrault

This paper describes the LIUM submissions to the WMT17 News Translation Task for the English-German, English-Turkish, English-Czech and English-Latvian language pairs.

Machine Translation, Translation

NMTPY: A Flexible Toolkit for Advanced Neural Machine Translation Systems

1 code implementation 1 Jun 2017 Ozan Caglayan, Mercedes García-Martínez, Adrien Bardet, Walid Aransa, Fethi Bougares, Loïc Barrault

nmtpy has been used for LIUM's top-ranked submissions to WMT Multimodal Machine Translation and News Translation tasks in 2016 and 2017.

Multimodal Machine Translation, Translation

Multimodal Attention for Neural Machine Translation

1 code implementation 13 Sep 2016 Ozan Caglayan, Loïc Barrault, Fethi Bougares

We show that a dedicated attention mechanism for each modality achieves gains of up to 1.6 BLEU and METEOR points over a textual NMT baseline.

Image Captioning, Machine Translation (+2 more)
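The "dedicated attention per modality" idea can be sketched as two independent attention passes whose context vectors are fused. This is an illustrative NumPy toy with made-up dimensions and a simple sum fusion, not the paper's exact architecture:

```python
import numpy as np

# Sketch of one-attention-per-modality fusion (illustrative only): the
# decoder state attends separately over textual encoder states and over
# convolutional image features; the two context vectors are then summed.
def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attend(query, keys):
    """Dot-product attention: distribution over `keys`, weighted-sum context."""
    scores = keys @ query        # one score per annotation, shape (n,)
    alphas = softmax(scores)     # attention weights, sum to 1
    return alphas @ keys, alphas # context vector, weights

rng = np.random.default_rng(0)
dec_state = rng.normal(size=8)         # decoder hidden state (toy dim 8)
text_states = rng.normal(size=(6, 8))  # 6 source-token annotations
img_feats = rng.normal(size=(49, 8))   # 7x7 conv feature map, flattened

ctx_text, a_text = attend(dec_state, text_states)
ctx_img, a_img = attend(dec_state, img_feats)
fused = ctx_text + ctx_img             # multimodal context fed to the decoder

print(fused.shape, round(a_text.sum(), 3), round(a_img.sum(), 3))
```

Keeping the two attention distributions separate lets each modality be weighted on its own terms at every decoding step, which is the intuition behind the reported BLEU/METEOR gains.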
