Search Results for author: Dawei Zhu

Found 26 papers, 15 papers with code

Long Context Alignment with Short Instructions and Synthesized Positions

no code implementations7 May 2024 Wenhao Wu, Yizhong Wang, Yao Fu, Xiang Yue, Dawei Zhu, Sujian Li

Effectively handling instructions with extremely long context remains a challenge for Large Language Models (LLMs), typically necessitating high-quality long data and substantial computational resources.

16k Instruction Following

Fine-Tuning Large Language Models to Translate: Will a Touch of Noisy Data in Misaligned Languages Suffice?

no code implementations22 Apr 2024 Dawei Zhu, Pinzhen Chen, Miaoran Zhang, Barry Haddow, Xiaoyu Shen, Dietrich Klakow

Traditionally, success in multilingual machine translation can be attributed to three key factors in training data: large volume, diverse translation directions, and high quality.

Machine Translation Translation

LongEmbed: Extending Embedding Models for Long Context Retrieval

1 code implementation18 Apr 2024 Dawei Zhu, Liang Wang, Nan Yang, YiFan Song, Wenhao Wu, Furu Wei, Sujian Li

This paper explores context window extension of existing embedding models, pushing the limit to 32k without requiring additional training.

4k 8k +3

A Preference-driven Paradigm for Enhanced Translation with Large Language Models

no code implementations17 Apr 2024 Dawei Zhu, Sony Trenous, Xiaoyu Shen, Dietrich Klakow, Bill Byrne, Eva Hasler

Recent research has shown that large language models (LLMs) can achieve remarkable translation performance through supervised fine-tuning (SFT) using only a small amount of parallel data.

Sentence Translation

Robust Pronoun Fidelity with English LLMs: Are they Reasoning, Repeating, or Just Biased?

2 code implementations4 Apr 2024 Vagrant Gautam, Eileen Bingert, Dawei Zhu, Anne Lauscher, Dietrich Klakow

Robust, faithful and harm-free pronoun use for individuals is an important goal for language models as their use increases, but prior work tends to study only one or two of these characteristics at a time.

Decoder Sentence

CoUDA: Coherence Evaluation via Unified Data Augmentation

1 code implementation31 Mar 2024 Dawei Zhu, Wenhao Wu, YiFan Song, Fangwei Zhu, Ziqiang Cao, Sujian Li

Due to the scarcity of annotated data, data augmentation is commonly used for training coherence evaluation models.

Coherence Evaluation Data Augmentation

Retrieval-based Full-length Wikipedia Generation for Emergent Events

no code implementations28 Feb 2024 Jiebin Zhang, Eugene J. Yu, Qinyu Chen, Chenhao Xiong, Dawei Zhu, Han Qian, Mingbo Song, Xiaoguang Li, Qun Liu, Sujian Li

In today's fast-paced world, the growing demand to quickly generate comprehensive and accurate Wikipedia documents for emerging events is both crucial and challenging.


LawBench: Benchmarking Legal Knowledge of Large Language Models

1 code implementation28 Sep 2023 Zhiwei Fei, Xiaoyu Shen, Dawei Zhu, Fengzhe Zhou, Zhuo Han, Songyang Zhang, Kai Chen, Zongwen Shen, Jidong Ge

We hope this benchmark provides in-depth understanding of the LLMs' domain-specified capabilities and speed up the development of LLMs in the legal domain.

Benchmarking Memorization +1

PoSE: Efficient Context Window Extension of LLMs via Positional Skip-wise Training

1 code implementation19 Sep 2023 Dawei Zhu, Nan Yang, Liang Wang, YiFan Song, Wenhao Wu, Furu Wei, Sujian Li

To decouple train length from target length for efficient context window extension, we propose Positional Skip-wisE (PoSE) training that smartly simulates long inputs using a fixed context window.

2k Position

RestGPT: Connecting Large Language Models with Real-World RESTful APIs

no code implementations11 Jun 2023 YiFan Song, Weimin Xiong, Dawei Zhu, Wenhao Wu, Han Qian, Mingbo Song, Hailiang Huang, Cheng Li, Ke Wang, Rong Yao, Ye Tian, Sujian Li

To address the practical challenges of tackling complex instructions, we propose RestGPT, which exploits the power of LLMs and conducts a coarse-to-fine online planning mechanism to enhance the abilities of task decomposition and API selection.

Large Language Models are not Fair Evaluators

1 code implementation29 May 2023 Peiyi Wang, Lei LI, Liang Chen, Zefan Cai, Dawei Zhu, Binghuai Lin, Yunbo Cao, Qi Liu, Tianyu Liu, Zhifang Sui

In this paper, we uncover a systematic bias in the evaluation paradigm of adopting large language models~(LLMs), e. g., GPT-4, as a referee to score and compare the quality of responses generated by candidate models.

Language Modelling Large Language Model +1

Weaker Than You Think: A Critical Look at Weakly Supervised Learning

1 code implementation27 May 2023 Dawei Zhu, Xiaoyu Shen, Marius Mosbach, Andreas Stephan, Dietrich Klakow

In this paper, we revisit the setup of these approaches and find that the benefits brought by these approaches are significantly overestimated.

Weakly-supervised Learning

RepCL: Exploring Effective Representation for Continual Text Classification

no code implementations12 May 2023 YiFan Song, Peiyi Wang, Dawei Zhu, Tianyu Liu, Zhifang Sui, Sujian Li

Continual learning (CL) aims to constantly learn new knowledge over time while avoiding catastrophic forgetting on old tasks.

Continual Learning Representation Learning +2

ConFiguRe: Exploring Discourse-level Chinese Figures of Speech

1 code implementation COLING 2022 Dawei Zhu, Qiusi Zhan, Zhejian Zhou, YiFan Song, Jiebin Zhang, Sujian Li

Different from previous token-level or sentence-level counterparts, ConFiguRe aims at extracting a figurative unit from discourse-level context, and classifying the figurative unit into the right figure type.

Natural Language Understanding Sentence

Meta Self-Refinement for Robust Learning with Weak Supervision

1 code implementation15 May 2022 Dawei Zhu, Xiaoyu Shen, Michael A. Hedderich, Dietrich Klakow

Training deep neural networks (DNNs) under weak supervision has attracted increasing research attention as it can significantly reduce the annotation cost.

GraphPrompt: Graph-Based Prompt Templates for Biomedical Synonym Prediction

1 code implementation13 Nov 2021 Hanwen Xu, Jiayou Zhang, Zhirui Wang, Shizhuo Zhang, Megh Manoj Bhalerao, Yucong Liu, Dawei Zhu, Sheng Wang

In the expansion of biomedical dataset, the same category may be labeled with different terms, thus being tedious and onerous to curate these terms.

Neural Data-to-Text Generation with LM-based Text Augmentation

no code implementations EACL 2021 Ernie Chang, Xiaoyu Shen, Dawei Zhu, Vera Demberg, Hui Su

Our approach automatically augments the data available for training by (i) generating new text samples based on replacing specific values by alternative ones from the same category, (ii) generating new text samples based on GPT-2, and (iii) proposing an automatic method for pairing the new text samples with data samples.

Data-to-Text Generation Text Augmentation

Analysing the Noise Model Error for Realistic Noisy Label Data

3 code implementations24 Jan 2021 Michael A. Hedderich, Dawei Zhu, Dietrich Klakow

Distant and weak supervision allow to obtain large amounts of labeled training data quickly and cheaply, but these automatic annotations tend to contain a high amount of errors.

Image Manipulation with Natural Language using Two-sidedAttentive Conditional Generative Adversarial Network

no code implementations16 Dec 2019 Dawei Zhu, Aditya Mogadala, Dietrich Klakow

We propose the Two-sidEd Attentive conditional Generative Adversarial Network (TEA-cGAN) to generate semantically manipulated images while preserving other contents such as background intact.

Generative Adversarial Network Image Manipulation

Cannot find the paper you are looking for? You can Submit a new open access paper.