Search Results for author: Ozan Caglayan

Found 19 papers, 10 papers with code

Supervised Visual Attention for Simultaneous Multimodal Machine Translation

no code implementations 23 Jan 2022 Veneta Haralampieva, Ozan Caglayan, Lucia Specia

A particular use for such multimodal systems is the task of simultaneous machine translation, where visual context has been shown to complement the partial information provided by the source sentence, especially in the early phases of translation.

Multimodal Machine Translation, Sentence (+1 more)

BERTGEN: Multi-task Generation through BERT

1 code implementation ACL 2021 Faidon Mitzalis, Ozan Caglayan, Pranava Madhyastha, Lucia Specia

We present BERTGEN, a novel generative, decoder-only model which extends BERT by fusing multimodal and multilingual pretrained models VL-BERT and M-BERT, respectively.

Image Captioning, Multimodal Machine Translation (+2 more)

Exploiting Multimodal Reinforcement Learning for Simultaneous Machine Translation

1 code implementation EACL 2021 Julia Ive, Andy Mingren Li, Yishu Miao, Ozan Caglayan, Pranava Madhyastha, Lucia Specia

This paper addresses the problem of simultaneous machine translation (SiMT) by exploring two main concepts: (a) adaptive policies that learn a good trade-off between high translation quality and low latency; and (b) visual information that supports this process by providing additional context, which may be available before the textual input is produced.

Machine Translation, reinforcement-learning (+2 more)
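Concept (a) above, an adaptive READ/WRITE policy, can be illustrated with a toy agent. The sketch below is purely illustrative and is not the paper's RL policy: the function name `adaptive_schedule` and the confidence model are made up for the example.

```python
# Toy adaptive simultaneous-translation agent (illustrative sketch, NOT the
# paper's RL policy): at each step it WRITEs a target token when the model's
# confidence is high enough, otherwise it READs more source context.
def adaptive_schedule(step_conf, num_src, num_tgt, threshold=0.7):
    """Return the READ/WRITE action sequence chosen by the policy.

    step_conf(read, written) -> confidence in the next target token,
    after having read `read` source tokens and emitted `written` ones.
    """
    actions, read, written = [], 0, 0
    while written < num_tgt:
        if read < num_src and step_conf(read, written) < threshold:
            actions.append("READ")
            read += 1
        else:
            actions.append("WRITE")
            written += 1
    return actions

# Made-up confidence model: grows with source context read,
# shrinks with the number of target tokens already emitted.
toy_conf = lambda read, written: read / (written + 3)
print(adaptive_schedule(toy_conf, num_src=4, num_tgt=4))
```

With this toy confidence model the agent front-loads three READs before its first WRITE, then interleaves, which is exactly the quality/latency trade-off an RL reward would tune.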

MSVD-Turkish: A Comprehensive Multimodal Dataset for Integrated Vision and Language Research in Turkish

no code implementations 13 Dec 2020 Begum Citamak, Ozan Caglayan, Menekse Kuyu, Erkut Erdem, Aykut Erdem, Pranava Madhyastha, Lucia Specia

We hope that the MSVD-Turkish dataset and the results reported in this work will lead to better video captioning and multimodal machine translation models for Turkish and other morphology rich and agglutinative languages.

Multimodal Machine Translation, Sentence (+3 more)

Simultaneous Machine Translation with Visual Context

1 code implementation EMNLP 2020 Ozan Caglayan, Julia Ive, Veneta Haralampieva, Pranava Madhyastha, Loïc Barrault, Lucia Specia

Simultaneous machine translation (SiMT) aims to translate a continuous input text stream into another language with the lowest latency and highest quality possible.

Machine Translation, Translation
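The fixed-schedule baseline that SiMT systems of this kind are usually compared against is wait-k decoding. A minimal sketch, assuming a hypothetical `wait_k_schedule` helper and toy sentence lengths (not from the paper):

```python
# Minimal wait-k simultaneous decoding schedule (a common SiMT baseline,
# not this paper's method): read k source tokens first, then alternate
# one WRITE per READ until the source is exhausted.
def wait_k_schedule(num_src_tokens, num_tgt_tokens, k=3):
    """Return the READ/WRITE action sequence for a wait-k policy."""
    actions = []
    read, written = 0, 0
    while written < num_tgt_tokens:
        if read < min(k + written, num_src_tokens):
            actions.append("READ")
            read += 1
        else:
            actions.append("WRITE")
            written += 1
    return actions

print(wait_k_schedule(5, 5, k=2))
```

Lower k means lower latency but less source context per emitted token, which is where the paper's visual context is meant to compensate in the early phases of translation.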

Multimodal Machine Translation through Visuals and Speech

no code implementations28 Nov 2019 Umut Sulubacak, Ozan Caglayan, Stig-Arne Grönroos, Aku Rouhe, Desmond Elliott, Lucia Specia, Jörg Tiedemann

Multimodal machine translation involves drawing information from more than one modality, based on the assumption that the additional modalities will contain useful alternative views of the input data.

Image Captioning, Multimodal Machine Translation (+4 more)

Transformer-based Cascaded Multimodal Speech Translation

no code implementations EMNLP (IWSLT) 2019 Zixiu Wu, Ozan Caglayan, Julia Ive, Josiah Wang, Lucia Specia

Upon conducting extensive experiments, we found that (i) the explored visual integration schemes often harm the translation performance for the transformer and additive deliberation, but considerably improve the cascade deliberation; (ii) the transformer and cascade deliberation integrate the visual modality better than the additive deliberation, as shown by the incongruence analysis.

Automatic Speech Recognition, Automatic Speech Recognition (ASR) (+3 more)

Imperial College London Submission to VATEX Video Captioning Task

no code implementations 16 Oct 2019 Ozan Caglayan, Zixiu Wu, Pranava Madhyastha, Josiah Wang, Lucia Specia

This paper describes the Imperial College London team's submission to the 2019 VATEX video captioning challenge, where we first explore two sequence-to-sequence models, namely a recurrent (GRU) model and a transformer model, which generate captions from the I3D action features.

Video Captioning

Probing the Need for Visual Context in Multimodal Machine Translation

no code implementations NAACL 2019 Ozan Caglayan, Pranava Madhyastha, Lucia Specia, Loïc Barrault

Current work on multimodal machine translation (MMT) has suggested that the visual modality is either unnecessary or only marginally beneficial.

Multimodal Machine Translation, Translation

Multimodal Grounding for Sequence-to-Sequence Speech Recognition

1 code implementation 9 Nov 2018 Ozan Caglayan, Ramon Sanabria, Shruti Palaskar, Loïc Barrault, Florian Metze

Specifically, in our previous work we proposed a multistep visual adaptive training approach that improves the accuracy of an audio-based Automatic Speech Recognition (ASR) system.

Automatic Speech Recognition, Automatic Speech Recognition (ASR) (+2 more)

How2: A Large-scale Dataset for Multimodal Language Understanding

2 code implementations 1 Nov 2018 Ramon Sanabria, Ozan Caglayan, Shruti Palaskar, Desmond Elliott, Loïc Barrault, Lucia Specia, Florian Metze

In this paper, we introduce How2, a multimodal collection of instructional videos with English subtitles and crowdsourced Portuguese translations.

Automatic Speech Recognition, Automatic Speech Recognition (ASR) (+3 more)

LIUM-CVC Submissions for WMT18 Multimodal Translation Task

no code implementations WS 2018 Ozan Caglayan, Adrien Bardet, Fethi Bougares, Loïc Barrault, Kai Wang, Marc Masana, Luis Herranz, Joost Van de Weijer

This paper describes the multimodal Neural Machine Translation systems developed by LIUM and CVC for the WMT18 Shared Task on Multimodal Translation.

Machine Translation, Translation

LIUM-CVC Submissions for WMT17 Multimodal Translation Task

no code implementations WS 2017 Ozan Caglayan, Walid Aransa, Adrien Bardet, Mercedes García-Martínez, Fethi Bougares, Loïc Barrault, Marc Masana, Luis Herranz, Joost Van de Weijer

This paper describes the monomodal and multimodal Neural Machine Translation systems developed by LIUM and CVC for the WMT17 Shared Task on Multimodal Translation.

Machine Translation, Translation

LIUM Machine Translation Systems for WMT17 News Translation Task

1 code implementation WS 2017 Mercedes García-Martínez, Ozan Caglayan, Walid Aransa, Adrien Bardet, Fethi Bougares, Loïc Barrault

This paper describes the LIUM submissions to the WMT17 News Translation Task for the English-German, English-Turkish, English-Czech and English-Latvian language pairs.

Machine Translation, Translation

NMTPY: A Flexible Toolkit for Advanced Neural Machine Translation Systems

1 code implementation 1 Jun 2017 Ozan Caglayan, Mercedes García-Martínez, Adrien Bardet, Walid Aransa, Fethi Bougares, Loïc Barrault

nmtpy has been used for LIUM's top-ranked submissions to WMT Multimodal Machine Translation and News Translation tasks in 2016 and 2017.

Multimodal Machine Translation, Translation

Multimodal Attention for Neural Machine Translation

1 code implementation 13 Sep 2016 Ozan Caglayan, Loïc Barrault, Fethi Bougares

We show that a dedicated attention mechanism for each modality achieves gains of up to 1.6 BLEU and METEOR points over a textual NMT baseline.

Image Captioning, Machine Translation (+2 more)
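The "dedicated attention per modality" idea can be sketched as two independent attention passes whose context vectors are fused. This is an illustrative NumPy toy with made-up dimensions and a simple sum fusion, not the paper's exact architecture:

```python
import numpy as np

# Sketch of one-attention-per-modality fusion (illustrative only): the
# decoder state attends separately over textual encoder states and over
# convolutional image features; the two context vectors are then summed.
def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attend(query, keys):
    """Dot-product attention: distribution over `keys`, weighted-sum context."""
    scores = keys @ query        # one score per annotation, shape (n,)
    alphas = softmax(scores)     # attention weights, sum to 1
    return alphas @ keys, alphas # context vector, weights

rng = np.random.default_rng(0)
dec_state = rng.normal(size=8)         # decoder hidden state (toy dim 8)
text_states = rng.normal(size=(6, 8))  # 6 source-token annotations
img_feats = rng.normal(size=(49, 8))   # 7x7 conv feature map, flattened

ctx_text, a_text = attend(dec_state, text_states)
ctx_img, a_img = attend(dec_state, img_feats)
fused = ctx_text + ctx_img             # multimodal context fed to the decoder

print(fused.shape, round(a_text.sum(), 3), round(a_img.sum(), 3))
```

Keeping the two attention distributions separate lets each modality be weighted on its own terms at every decoding step, which is the intuition behind the reported BLEU/METEOR gains.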
