no code implementations • COLING 2022 • Yu Yu, Shahram Khadivi, Jia Xu
This paper introduces our Diversity Advanced Actor-Critic reinforcement learning (A2C) framework (DAAC) to improve the generalization and accuracy of Natural Language Processing (NLP).
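To make the underlying objective concrete, here is a minimal sketch of a generic advantage actor-critic (A2C) loss; it illustrates plain A2C only, not the DAAC framework, and all tensor names and coefficients are assumptions for the example.

```python
import torch

def a2c_loss(log_probs, values, returns, entropy,
             value_coef=0.5, entropy_coef=0.01):
    """Generic advantage actor-critic (A2C) objective.
    log_probs: (batch,) log-probabilities of the sampled actions
    values:    (batch,) critic estimates V(s)
    returns:   (batch,) empirical (e.g. n-step) returns
    entropy:   (batch,) policy entropy, which encourages exploration"""
    advantages = returns - values.detach()          # no gradient through the critic here
    policy_loss = -(log_probs * advantages).mean()  # policy gradient with advantage baseline
    value_loss = (returns - values).pow(2).mean()   # critic regression towards the returns
    return policy_loss + value_coef * value_loss - entropy_coef * entropy.mean()
```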
1 code implementation • Findings (ACL) 2022 • Thuy-Trang Vu, Shahram Khadivi, Dinh Phung, Gholamreza Haffari
Generalising to unseen domains is under-explored and remains a challenge in neural machine translation.
no code implementations • 22 Oct 2023 • Baohao Liao, Michael Kozielski, Sanjika Hewavitharana, Jiangbo Yuan, Shahram Khadivi, Tomer Lancewicki
Teaching a model to learn embeddings from different modalities without neglecting information from the less dominant modality is challenging.
no code implementations • 18 Oct 2023 • Frithjof Petrick, Christian Herold, Pavel Petrushkov, Shahram Khadivi, Hermann Ney
Finally, we explore language model fusion in the light of recent advancements in large language models.
no code implementations • 6 May 2023 • Thuy-Trang Vu, Shahram Khadivi, Mahsa Ghorbanali, Dinh Phung, Gholamreza Haffari
Acquiring new knowledge without forgetting what has been learned in a sequence of tasks is the central focus of continual learning (CL).
no code implementations • ICCV 2023 • Samitha Herath, Basura Fernando, Ehsan Abbasnejad, Munawar Hayat, Shahram Khadivi, Mehrtash Harandi, Hamid Rezatofighi, Gholamreza Haffari
Energy-based learning (EBL) can be used to (1) improve instance selection for a self-training task on the unlabelled target domain, while (2) aligning and normalizing energy scores can learn domain-invariant representations.
no code implementations • 20 Oct 2022 • Thuy-Trang Vu, Shahram Khadivi, Xuanli He, Dinh Phung, Gholamreza Haffari
Previous works mostly focus on either multilingual or multi-domain aspects of neural machine translation (NMT).
1 code implementation • 24 Mar 2022 • Iñigo Urteaga, Moulay-Zaïdane Draïdia, Tomer Lancewicki, Shahram Khadivi
We propose a multi-armed bandit framework for the sequential selection of TLM pre-training hyperparameters, aimed at optimizing language model performance in a resource-efficient manner.
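As a sketch of the general recipe (a bandit choosing among candidate pre-training configurations), the UCB1 loop below treats each hyperparameter value as an arm; `evaluate` is a hypothetical stand-in for a short pre-training run and does not reflect the paper's actual reward or search space.

```python
import math
import random

# Hypothetical discrete hyperparameter "arms" (here: learning rates).
ARMS = [1e-5, 3e-5, 1e-4, 3e-4]

def evaluate(lr: float) -> float:
    """Stand-in for one short pre-training run returning a reward,
    e.g. negative validation loss; replace with a real trainer."""
    return -abs(math.log10(lr) + 4.0) + random.gauss(0.0, 0.1)

def ucb1(n_rounds: int = 40) -> float:
    counts = [0] * len(ARMS)
    sums = [0.0] * len(ARMS)
    for t in range(1, n_rounds + 1):
        if 0 in counts:                      # play every arm once first
            i = counts.index(0)
        else:                                # then pick the largest UCB index
            i = max(range(len(ARMS)),
                    key=lambda a: sums[a] / counts[a]
                    + math.sqrt(2 * math.log(t) / counts[a]))
        reward = evaluate(ARMS[i])
        counts[i] += 1
        sums[i] += reward
    return ARMS[max(range(len(ARMS)), key=lambda a: sums[a] / counts[a])]

if __name__ == "__main__":
    print("selected learning rate:", ucb1())
```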
no code implementations • loresmt (COLING) 2022 • Mohaddeseh Bastan, Shahram Khadivi
Neural Machine Translation (NMT) models are strong enough to convey semantic and syntactic information from the source language to the target language.
no code implementations • ACL (IWSLT) 2021 • Evgeniia Tokarchuk, Jan Rosendahl, Weiyue Wang, Pavel Petrushkov, Tomer Lancewicki, Shahram Khadivi, Hermann Ney
Complex natural language applications such as speech translation or pivot translation traditionally rely on cascaded models.
no code implementations • 27 Sep 2021 • Evgeniia Tokarchuk, Jan Rosendahl, Weiyue Wang, Pavel Petrushkov, Tomer Lancewicki, Shahram Khadivi, Hermann Ney
Pivot-based neural machine translation (NMT) is commonly used in low-resource setups, especially for translation between non-English language pairs.
1 code implementation • WMT (EMNLP) 2021 • Baohao Liao, Shahram Khadivi, Sanjika Hewavitharana
Surprisingly, smaller vocabularies perform better, and the extensive monolingual English data offers a modest improvement.
no code implementations • WMT (EMNLP) 2020 • Jingjing Huo, Christian Herold, Yingbo Gao, Leonard Dahlmann, Shahram Khadivi, Hermann Ney
Context-aware neural machine translation (NMT) is a promising direction to improve translation quality by making use of additional context, e.g., document-level translation or meta-information.
no code implementations • IJCNLP 2019 • Yunsu Kim, Petre Petrov, Pavel Petrushkov, Shahram Khadivi, Hermann Ney
We present effective pre-training strategies for neural machine translation (NMT) using parallel corpora involving a pivot language, i.e., source-pivot and pivot-target, leading to a significant improvement in source-target translation.
no code implementations • WS 2019 • Miguel Graça, Yunsu Kim, Julian Schamper, Shahram Khadivi, Hermann Ney
Back-translation (data augmentation by translating target monolingual data) is a crucial component in modern neural machine translation (NMT).
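A minimal sketch of the back-translation recipe follows; `reverse_translate` stands in for any trained target-to-source model, and the names are illustrative rather than taken from the paper.

```python
from typing import Callable, Iterable, List, Tuple

def back_translate(
    target_monolingual: Iterable[str],
    reverse_translate: Callable[[str], str],
) -> List[Tuple[str, str]]:
    """Create synthetic (source, target) pairs by translating target-side
    monolingual sentences back into the source language."""
    synthetic = []
    for tgt in target_monolingual:
        synthetic_src = reverse_translate(tgt)  # machine-generated source side
        synthetic.append((synthetic_src, tgt))  # target side stays human-written
    return synthetic

# Usage: mix the synthetic pairs with genuine parallel data before training
# the forward (source-to-target) model, e.g.
# train_data = real_parallel_pairs + back_translate(mono_target, reverse_model)
```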
no code implementations • IWSLT (EMNLP) 2018 • Shen Yan, Leonard Dahlmann, Pavel Petrushkov, Sanjika Hewavitharana, Shahram Khadivi
Pre-training a model with word weights improves fine-tuning by up to 1.24% BLEU absolute and 1.64% TER, respectively.
no code implementations • WS 2019 • Yunsu Kim, Hendrik Rosendahl, Nick Rossenbach, Jan Rosendahl, Shahram Khadivi, Hermann Ney
We propose a novel model architecture and training algorithm to learn bilingual sentence embeddings from a combination of parallel and monolingual data.
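The paper's architecture is not spelled out in this excerpt; purely as an illustration of learning bilingual sentence embeddings from parallel data, a symmetric InfoNCE-style contrastive loss over paired sentence vectors might look as follows (all names and the temperature value are assumptions).

```python
import torch
import torch.nn.functional as F

def parallel_contrastive_loss(src_emb: torch.Tensor,
                              tgt_emb: torch.Tensor,
                              temperature: float = 0.05) -> torch.Tensor:
    """InfoNCE-style loss over a batch of parallel sentence pairs.
    src_emb, tgt_emb: (batch, dim) embeddings of aligned sentences;
    each source sentence should score highest with its own translation."""
    src = F.normalize(src_emb, dim=-1)
    tgt = F.normalize(tgt_emb, dim=-1)
    logits = src @ tgt.t() / temperature              # (batch, batch) similarities
    targets = torch.arange(src.size(0), device=src.device)
    # Symmetric cross-entropy: source-to-target and target-to-source retrieval.
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))
```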
no code implementations • ACL 2018 • Pavel Petrushkov, Shahram Khadivi, Evgeny Matusov
We empirically investigate learning from partial feedback in neural machine translation (NMT), when partial feedback is collected by asking users to highlight a correct chunk of a translation.
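One simple way to exploit such chunk highlights is to weight the token-level log-likelihood by a per-token reward derived from the highlighted span; the sketch below shows that generic idea and is not claimed to be the paper's exact objective.

```python
import torch

def chunk_feedback_loss(log_probs: torch.Tensor,
                        chunk_mask: torch.Tensor) -> torch.Tensor:
    """Reward-weighted log-likelihood for partial (chunk-level) feedback.
    log_probs:  (batch, seq_len) token log-probabilities of the system output
    chunk_mask: (batch, seq_len) 1.0 where the user marked the token as part
                of a correct chunk, 0.0 elsewhere (a simple assumed reward)."""
    rewarded = log_probs * chunk_mask
    per_sentence = rewarded.sum(dim=1) / chunk_mask.sum(dim=1).clamp_min(1.0)
    return -per_sentence.mean()
```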
no code implementations • NAACL 2018 • Julia Kreutzer, Shahram Khadivi, Evgeny Matusov, Stefan Riezler
We present the first real-world application of methods for improving neural machine translation (NMT) with human reinforcement, based on explicit and implicit user feedback collected on the eBay e-commerce platform.
no code implementations • MTSummit 2017 • Shahram Khadivi, Patrick Wilken, Leonard Dahlmann, Evgeny Matusov
In this paper, we discuss different methods which use meta information and richer context that may accompany source language input to improve machine translation quality.
no code implementations • EMNLP 2017 • Leonard Dahlmann, Evgeny Matusov, Pavel Petrushkov, Shahram Khadivi
In this paper, we introduce a hybrid search for attention-based neural machine translation (NMT).
1 code implementation • ALTA 2014 • Mohammad Aliannejadi, Masoud Kiaeeha, Shahram Khadivi, Saeed Shiry Ghidary
We experiment with graph-based Semi-Supervised Learning (SSL) of Conditional Random Fields (CRF) for the application of Spoken Language Understanding (SLU) on unaligned data.
no code implementations • 7 Jan 2017 • Mohaddeseh Bastan, Shahram Khadivi, Mohammad Mehdi Homayounpour
This new loss function yields a total improvement of 1.87 BLEU points in translation quality.
1 code implementation • AMTA 2016 • Wenhu Chen, Evgeny Matusov, Shahram Khadivi, Jan-Thorsten Peter
In this paper, we propose an effective way for biasing the attention mechanism of a sequence-to-sequence neural machine translation (NMT) model towards the well-studied statistical word alignment models.
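A common way to realize such a bias is an auxiliary loss that pulls the attention weights towards an externally computed word alignment; the sketch below shows a generic guided-attention penalty of that kind (shapes and the cross-entropy form are assumptions, not necessarily the paper's formulation).

```python
import torch

def guided_alignment_loss(attention: torch.Tensor,
                          alignment: torch.Tensor,
                          eps: float = 1e-8) -> torch.Tensor:
    """Cross-entropy between NMT attention weights and an external word
    alignment (e.g. from IBM-model/GIZA++ alignments), both shaped
    (batch, target_len, source_len)."""
    align_dist = alignment / alignment.sum(dim=-1, keepdim=True).clamp_min(eps)
    return -(align_dist * attention.clamp_min(eps).log()).sum(dim=-1).mean()

# The total training loss would combine the usual NMT cross-entropy with
# this term, e.g. loss = nmt_loss + lambda_align * guided_alignment_loss(...)
```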
no code implementations • LREC 2012 • Mahdi Khademian, Kaveh Taghipour, Saab Mansour, Shahram Khadivi
Achieving accurate translation with statistical machine translation systems, especially for multi-domain documents, requires ever more bilingual text, and this need becomes even more critical when training such systems for language pairs with scarce training data.