1 code implementation • WMT (EMNLP) 2021 • Pinzhen Chen, Jindřich Helcl, Ulrich Germann, Laurie Burchell, Nikolay Bogoychev, Antonio Valerio Miceli Barone, Jonas Waldendorf, Alexandra Birch, Kenneth Heafield
This paper presents the University of Edinburgh’s constrained submissions of English-German and English-Hausa systems to the WMT 2021 shared task on news translation.
1 code implementation • WMT (EMNLP) 2021 • Proyag Pal, Alham Fikri Aji, Pinzhen Chen, Sukanta Sen
We describe the University of Edinburgh’s Bengali↔Hindi constrained systems submitted to the WMT21 News Translation task.
1 code implementation • WMT (EMNLP) 2021 • Maximiliana Behnke, Nikolay Bogoychev, Alham Fikri Aji, Kenneth Heafield, Graeme Nail, Qianqian Zhu, Svetlana Tchistiakova, Jelmer Van der Linde, Pinzhen Chen, Sidharth Kashyap, Roman Grundkiewicz
We participated in all tracks of the WMT 2021 efficient machine translation task: single-core CPU, multi-core CPU, and GPU hardware with throughput and latency conditions.
1 code implementation • SemEval (NAACL) 2022 • Pinzhen Chen, Zheng Zhao
This paper presents a winning submission to the SemEval 2022 Task 1 on two sub-tasks: reverse dictionary and definition modelling.
no code implementations • 22 Apr 2024 • Dawei Zhu, Pinzhen Chen, Miaoran Zhang, Barry Haddow, Xiaoyu Shen, Dietrich Klakow
Traditionally, success in multilingual machine translation can be attributed to three key factors in training data: large volume, diverse translation directions, and high quality.
no code implementations • 7 Apr 2024 • Shaoxiong Ji, Pinzhen Chen
Fine-tuning large language models for multilingual downstream tasks requires a diverse set of languages to capture the nuances and structures of different linguistic contexts effectively.
1 code implementation • 1 Apr 2024 • Yijun Yang, Jie He, Pinzhen Chen, Víctor Gutiérrez-Basulto, Jeff Z. Pan
We hypothesize that simultaneously debiasing these objectives can be the key to generalisation over unseen prompts.
1 code implementation • 12 Mar 2024 • Hanxu Hu, Pinzhen Chen, Edoardo M. Ponti
Targeting the scarcity of sequential instructions in present-day data, we propose sequential instruction tuning, a simple yet effective strategy to automatically augment instruction tuning data and equip LLMs with the ability to execute multiple sequential instructions.
1 code implementation • 4 Mar 2024 • Zhanghao Hu, Yijun Yang, Junjie Xu, Yifu Qiu, Pinzhen Chen
Current approaches to question answering rely on pre-trained language models (PLMs) like RoBERTa.
no code implementations • 16 Nov 2023 • Nikolay Bogoychev, Pinzhen Chen, Barry Haddow, Alexandra Birch
Large language model (LLM) inference is computation- and memory-intensive, so we adapt lexical shortlisting to it in the hope of improving both.
no code implementations • 9 Oct 2023 • Nikolay Bogoychev, Pinzhen Chen
Terminology correctness is important in the downstream application of machine translation, and a prevalent way to ensure this is to inject terminology constraints into a translation system.
no code implementations • 20 Sep 2023 • Vivek Iyer, Pinzhen Chen, Alexandra Birch
Resolving semantic ambiguity has long been recognised as a central challenge in the field of Machine Translation.
1 code implementation • 16 Sep 2023 • Pinzhen Chen, Shaoxiong Ji, Nikolay Bogoychev, Andrey Kutuzov, Barry Haddow, Kenneth Heafield
Foundational large language models (LLMs) can be instruction-tuned to perform open-domain question answering, facilitating applications like chat assistants.
no code implementations • 6 Jun 2023 • Pinzhen Chen, Zhicheng Guo, Barry Haddow, Kenneth Heafield
In this paper, we propose iterative translation refinement to leverage the power of large language models for more natural translation and post-editing.
1 code implementation • 15 May 2023 • Ashok Urlana, Pinzhen Chen, Zheng Zhao, Shay B. Cohen, Manish Shrivastava, Barry Haddow
This paper introduces PMIndiaSum, a multilingual and massively parallel summarization corpus focused on languages in India.
1 code implementation • 5 Feb 2023 • Pinzhen Chen, Gerasimos Lampouras
Advances in natural language processing, such as transfer learning from pre-trained language models, have impacted how models are trained for programming language tasks too.
1 code implementation • 20 Oct 2022 • Faheem Kirefu, Vivek Iyer, Pinzhen Chen, Laurie Burchell
For subtask 1 we explored the effects of constrained decoding on English and transliterated subwords in order to produce Hinglish.
1 code implementation • CCL 2022 • Zheng Zhao, Pinzhen Chen
Recent advances in the field of abstractive summarization leverage pre-trained language models rather than train a model from scratch.
1 code implementation • 9 May 2022 • Pinzhen Chen, Zheng Zhao
We build a dual-way neural dictionary to retrieve words given definitions and to produce definitions for queried words.
1 code implementation • EMNLP (insights) 2021 • Nikolay Bogoychev, Pinzhen Chen
Machine translation systems are vulnerable to domain mismatch, especially in a low-resource scenario.