no code implementations • 23 Jan 2022 • Veneta Haralampieva, Ozan Caglayan, Lucia Specia
A particular use for such multimodal systems is the task of simultaneous machine translation, where visual context has been shown to complement the partial information provided by the source sentence, especially in the early phases of translation.
1 code implementation • ACL 2021 • Faidon Mitzalis, Ozan Caglayan, Pranava Madhyastha, Lucia Specia
We present BERTGEN, a novel generative, decoder-only model which extends BERT by fusing the multimodal and multilingual pre-trained models VL-BERT and M-BERT, respectively.
1 code implementation • EACL 2021 • Julia Ive, Andy Mingren Li, Yishu Miao, Ozan Caglayan, Pranava Madhyastha, Lucia Specia
This paper addresses the problem of simultaneous machine translation (SiMT) by exploring two main concepts: (a) adaptive policies to learn a good trade-off between high translation quality and low latency; and (b) visual information to support this process by providing additional (visual) contextual information which may be available before the textual input is produced.
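For intuition, here is a minimal sketch of the classic fixed wait-k reading/writing schedule that adaptive policies like those explored in this paper aim to improve upon; `translate_prefix` is a hypothetical stand-in for any incremental MT model and is not part of the paper's code.

```python
# Minimal sketch of a fixed wait-k simultaneous decoding loop (illustrative only).
# `translate_prefix(source_prefix, target_so_far)` is a hypothetical callable that
# proposes the next target token from a partial source sentence.

def wait_k_decode(source_tokens, translate_prefix, k=3, max_len=100):
    target = []
    read = min(k, len(source_tokens))  # READ the first k source tokens up front
    while len(target) < max_len:
        # WRITE: predict the next target token from the current source prefix.
        next_token = translate_prefix(source_tokens[:read], target)
        if next_token == "</s>":
            break
        target.append(next_token)
        # READ: consume one more source token, if any remain
        # (smaller k means lower latency but less context per prediction).
        if read < len(source_tokens):
            read += 1
    return target
```

An adaptive policy replaces this fixed READ/WRITE alternation with a learned decision at each step, which is where additional signals such as visual context can be injected.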
1 code implementation • EACL 2021 • Ozan Caglayan, Menekse Kuyu, Mustafa Sercan Amac, Pranava Madhyastha, Erkut Erdem, Aykut Erdem, Lucia Specia
Pre-trained language models have been shown to substantially improve performance in many natural language tasks.
no code implementations • 13 Dec 2020 • Begum Citamak, Ozan Caglayan, Menekse Kuyu, Erkut Erdem, Aykut Erdem, Pranava Madhyastha, Lucia Specia
We hope that the MSVD-Turkish dataset and the results reported in this work will lead to better video captioning and multimodal machine translation models for Turkish and other morphologically rich and agglutinative languages.
no code implementations • COLING 2020 • Ozan Caglayan, Pranava Madhyastha, Lucia Specia
Automatic evaluation of language generation systems is a well-studied problem in Natural Language Processing.
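As a concrete reference point for what metric-based automatic evaluation usually looks like in practice, the snippet below computes corpus-level BLEU with the sacrebleu library on made-up hypotheses and references; it illustrates the standard workflow only and is not the evaluation protocol of this particular paper.

```python
import sacrebleu

# Made-up system outputs and a single reference stream, purely for illustration.
hypotheses = ["the cat sat on the mat", "a dog runs in the park"]
references = [["the cat sat on the mat", "a dog is running in the park"]]

# corpus_bleu takes the system outputs and a list of reference streams,
# each stream parallel to the hypotheses.
bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(f"BLEU = {bleu.score:.1f}")
```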
1 code implementation • EMNLP 2020 • Ozan Caglayan, Julia Ive, Veneta Haralampieva, Pranava Madhyastha, Loïc Barrault, Lucia Specia
Simultaneous machine translation (SiMT) aims to translate a continuous input text stream into another language with the lowest latency and highest quality possible.
no code implementations • 28 Nov 2019 • Umut Sulubacak, Ozan Caglayan, Stig-Arne Grönroos, Aku Rouhe, Desmond Elliott, Lucia Specia, Jörg Tiedemann
Multimodal machine translation involves drawing information from more than one modality, based on the assumption that the additional modalities will contain useful alternative views of the input data.
Ranked #4 on Multimodal Machine Translation on Multi30K
no code implementations • EMNLP (IWSLT) 2019 • Zixiu Wu, Ozan Caglayan, Julia Ive, Josiah Wang, Lucia Specia
Upon conducting extensive experiments, we found that (i) the explored visual integration schemes often harm the translation performance for the transformer and additive deliberation, but considerably improve the cascade deliberation; (ii) the transformer and cascade deliberation integrate the visual modality better than the additive deliberation, as shown by the incongruence analysis.
Automatic Speech Recognition (ASR) +3
no code implementations • 16 Oct 2019 • Ozan Caglayan, Zixiu Wu, Pranava Madhyastha, Josiah Wang, Lucia Specia
This paper describes the Imperial College London team's submission to the 2019 VATEX video captioning challenge, where we first explore two sequence-to-sequence models, namely a recurrent (GRU) model and a transformer model, which generate captions from the I3D action features.
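A minimal sketch of the kind of recurrent (GRU) captioner described above, decoding from pooled I3D action features; the dimensions, mean pooling, and module names are illustrative assumptions rather than the exact challenge submission.

```python
import torch
import torch.nn as nn

class I3DCaptioner(nn.Module):
    """Rough sketch: a GRU decoder conditioned on mean-pooled I3D clip features."""
    def __init__(self, vocab_size, feat_dim=1024, hid_dim=512, emb_dim=256):
        super().__init__()
        self.feat_proj = nn.Linear(feat_dim, hid_dim)   # video features -> initial state
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.gru = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, vocab_size)

    def forward(self, i3d_feats, caption_tokens):
        # i3d_feats: (B, N, feat_dim) per-segment features; caption_tokens: (B, L)
        h0 = torch.tanh(self.feat_proj(i3d_feats.mean(dim=1))).unsqueeze(0)  # (1, B, hid)
        emb = self.embed(caption_tokens)                                     # (B, L, emb)
        dec_out, _ = self.gru(emb, h0)                                       # (B, L, hid)
        return self.out(dec_out)                                             # logits over vocab
```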
no code implementations • NAACL 2019 • Ozan Caglayan, Pranava Madhyastha, Lucia Specia, Loïc Barrault
Current work on multimodal machine translation (MMT) has suggested that the visual modality is either unnecessary or only marginally beneficial.
1 code implementation • 9 Nov 2018 • Ozan Caglayan, Ramon Sanabria, Shruti Palaskar, Loïc Barrault, Florian Metze
Specifically, in our previous work, we proposed a multistep visual adaptive training approach which improves the accuracy of an audio-based Automatic Speech Recognition (ASR) system.
Automatic Speech Recognition (ASR) +2
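The sketch below illustrates one simple way to ground a speech encoder in a visual embedding, broadly in the spirit of the visual adaptive training described above; the architecture, dimensions, and additive conditioning are assumptions for illustration, not the paper's exact multistep recipe.

```python
import torch
import torch.nn as nn

class VisualAdaptiveEncoder(nn.Module):
    """Sketch: bias each acoustic frame representation with a projected visual embedding."""
    def __init__(self, feat_dim=40, hid_dim=256, vis_dim=2048):
        super().__init__()
        self.speech_rnn = nn.GRU(feat_dim, hid_dim, batch_first=True, bidirectional=True)
        self.vis_proj = nn.Linear(vis_dim, 2 * hid_dim)

    def forward(self, speech_frames, visual_feats):
        # speech_frames: (B, T, feat_dim); visual_feats: (B, vis_dim), e.g. pooled CNN features
        enc_out, _ = self.speech_rnn(speech_frames)       # (B, T, 2*hid_dim)
        vis = self.vis_proj(visual_feats).unsqueeze(1)    # (B, 1, 2*hid_dim)
        return enc_out + vis                              # visually shifted encoder states
```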
2 code implementations • 1 Nov 2018 • Ramon Sanabria, Ozan Caglayan, Shruti Palaskar, Desmond Elliott, Loïc Barrault, Lucia Specia, Florian Metze
In this paper, we introduce How2, a multimodal collection of instructional videos with English subtitles and crowdsourced Portuguese translations.
Automatic Speech Recognition (ASR) +3
no code implementations • WS 2018 • Ozan Caglayan, Adrien Bardet, Fethi Bougares, Loïc Barrault, Kai Wang, Marc Masana, Luis Herranz, Joost Van de Weijer
This paper describes the multimodal Neural Machine Translation systems developed by LIUM and CVC for the WMT18 Shared Task on Multimodal Translation.
no code implementations • WS 2017 • Ozan Caglayan, Walid Aransa, Adrien Bardet, Mercedes García-Martínez, Fethi Bougares, Loïc Barrault, Marc Masana, Luis Herranz, Joost Van de Weijer
This paper describes the monomodal and multimodal Neural Machine Translation systems developed by LIUM and CVC for the WMT17 Shared Task on Multimodal Translation.
1 code implementation • WS 2017 • Mercedes García-Martínez, Ozan Caglayan, Walid Aransa, Adrien Bardet, Fethi Bougares, Loïc Barrault
This paper describes LIUM's submissions to the WMT17 News Translation Task for the English-German, English-Turkish, English-Czech and English-Latvian language pairs.
1 code implementation • 1 Jun 2017 • Ozan Caglayan, Mercedes García-Martínez, Adrien Bardet, Walid Aransa, Fethi Bougares, Loïc Barrault
nmtpy has been used for LIUM's top-ranked submissions to the WMT Multimodal Machine Translation and News Translation tasks in 2016 and 2017.
1 code implementation • 13 Sep 2016 • Ozan Caglayan, Loïc Barrault, Fethi Bougares
We show that a dedicated attention for each modality achieves up to 1.6 points in BLEU and METEOR compared to a textual NMT baseline.
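For intuition, here is a rough sketch of per-modality attention followed by a simple additive fusion, in the spirit of the dedicated attention described above; the use of PyTorch's nn.MultiheadAttention, single heads, and summation fusion are illustrative choices, not the paper's exact architecture.

```python
import torch.nn as nn

class DualModalityAttention(nn.Module):
    """Sketch: separate attention over text states and image regions, fused by summation."""
    def __init__(self, dec_dim, txt_dim, img_dim):
        super().__init__()
        self.txt_attn = nn.MultiheadAttention(dec_dim, num_heads=1,
                                              kdim=txt_dim, vdim=txt_dim, batch_first=True)
        self.img_attn = nn.MultiheadAttention(dec_dim, num_heads=1,
                                              kdim=img_dim, vdim=img_dim, batch_first=True)

    def forward(self, dec_state, txt_states, img_regions):
        # dec_state: (B, 1, dec_dim); txt_states: (B, T, txt_dim); img_regions: (B, R, img_dim)
        txt_ctx, _ = self.txt_attn(dec_state, txt_states, txt_states)
        img_ctx, _ = self.img_attn(dec_state, img_regions, img_regions)
        return txt_ctx + img_ctx  # simple additive fusion of the two modality contexts
```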
1 code implementation • WS 2016 • Ozan Caglayan, Walid Aransa, Yaxing Wang, Marc Masana, Mercedes García-Martínez, Fethi Bougares, Loïc Barrault, Joost Van de Weijer
This paper presents the systems developed by LIUM and CVC for the WMT16 Multimodal Machine Translation challenge.