1 code implementation • WMT (EMNLP) 2021 • Pinzhen Chen, Jindřich Helcl, Ulrich Germann, Laurie Burchell, Nikolay Bogoychev, Antonio Valerio Miceli Barone, Jonas Waldendorf, Alexandra Birch, Kenneth Heafield
This paper presents the University of Edinburgh’s constrained submissions of English-German and English-Hausa systems to the WMT 2021 shared task on news translation.
1 code implementation • WMT (EMNLP) 2021 • Proyag Pal, Alham Fikri Aji, Pinzhen Chen, Sukanta Sen
We describe the University of Edinburgh’s Bengali↔Hindi constrained systems submitted to the WMT21 News Translation task.
1 code implementation • WMT (EMNLP) 2021 • Maximiliana Behnke, Nikolay Bogoychev, Alham Fikri Aji, Kenneth Heafield, Graeme Nail, Qianqian Zhu, Svetlana Tchistiakova, Jelmer Van der Linde, Pinzhen Chen, Sidharth Kashyap, Roman Grundkiewicz
We participated in all tracks of the WMT 2021 efficient machine translation task: single-core CPU, multi-core CPU, and GPU hardware with throughput and latency conditions.
1 code implementation • SemEval (NAACL) 2022 • Pinzhen Chen, Zheng Zhao
This paper presents a winning submission to the SemEval 2022 Task 1 on two sub-tasks: reverse dictionary and definition modelling.
no code implementations • 22 Apr 2024 • Dawei Zhu, Pinzhen Chen, Miaoran Zhang, Barry Haddow, Xiaoyu Shen, Dietrich Klakow
Traditionally, success in multilingual machine translation can be attributed to three key factors in training data: large volume, diverse translation directions, and high quality.
no code implementations • 7 Apr 2024 • Shaoxiong Ji, Pinzhen Chen
Fine-tuning large language models for multilingual downstream tasks requires a diverse set of languages to capture the nuances and structures of different linguistic contexts effectively.
1 code implementation • 1 Apr 2024 • Yijun Yang, Jie He, Pinzhen Chen, Víctor Gutiérrez-Basulto, Jeff Z. Pan
We hypothesize that simultaneously debiasing these objectives can be the key to generalisation over unseen prompts.
1 code implementation • 12 Mar 2024 • Hanxu Hu, Pinzhen Chen, Edoardo M. Ponti
Targeting the scarcity of sequential instructions in present-day data, we propose sequential instruction tuning, a simple yet effective strategy to automatically augment instruction tuning data and equip LLMs with the ability to execute multiple sequential instructions.
1 code implementation • 4 Mar 2024 • Zhanghao Hu, Yijun Yang, Junjie Xu, Yifu Qiu, Pinzhen Chen
Current approaches to question answering rely on pre-trained language models (PLMs) like RoBERTa.
no code implementations • 16 Nov 2023 • Nikolay Bogoychev, Pinzhen Chen, Barry Haddow, Alexandra Birch
Large language model (LLM) inference is computation- and memory-intensive, so we adapt lexical shortlisting to it in the hope of improving both.
no code implementations • 9 Oct 2023 • Nikolay Bogoychev, Pinzhen Chen
Terminology correctness is important in the downstream application of machine translation, and a prevalent way to ensure this is to inject terminology constraints into a translation system.
no code implementations • 20 Sep 2023 • Vivek Iyer, Pinzhen Chen, Alexandra Birch
Resolving semantic ambiguity has long been recognised as a central challenge in the field of Machine Translation.
1 code implementation • 16 Sep 2023 • Pinzhen Chen, Shaoxiong Ji, Nikolay Bogoychev, Andrey Kutuzov, Barry Haddow, Kenneth Heafield
Foundational large language models (LLMs) can be instruction-tuned to perform open-domain question answering, facilitating applications like chat assistants.
no code implementations • 6 Jun 2023 • Pinzhen Chen, Zhicheng Guo, Barry Haddow, Kenneth Heafield
In this paper, we propose iterative translation refinement to leverage the power of large language models for more natural translation and post-editing.
1 code implementation • 15 May 2023 • Ashok Urlana, Pinzhen Chen, Zheng Zhao, Shay B. Cohen, Manish Shrivastava, Barry Haddow
This paper introduces PMIndiaSum, a multilingual and massively parallel summarization corpus focused on languages in India.
1 code implementation • 5 Feb 2023 • Pinzhen Chen, Gerasimos Lampouras
Advances in natural language processing, such as transfer learning from pre-trained language models, have impacted how models are trained for programming language tasks too.
1 code implementation • 20 Oct 2022 • Faheem Kirefu, Vivek Iyer, Pinzhen Chen, Laurie Burchell
For subtask 1 we explored the effects of constrained decoding on English and transliterated subwords in order to produce Hinglish.
1 code implementation • CCL 2022 • Zheng Zhao, Pinzhen Chen
Recent advances in the field of abstractive summarization leverage pre-trained language models rather than train a model from scratch.
1 code implementation • 9 May 2022 • Pinzhen Chen, Zheng Zhao
We build a dual-way neural dictionary to retrieve words given definitions and to produce definitions for queried words.
1 code implementation • EMNLP (insights) 2021 • Nikolay Bogoychev, Pinzhen Chen
Machine translation systems are vulnerable to domain mismatch, especially in a low-resource scenario.