1 code implementation • 18 Apr 2024 • Dawei Zhu, Liang Wang, Nan Yang, YiFan Song, Wenhao Wu, Furu Wei, Sujian Li
This paper explores context window extension of existing embedding models, pushing the limit to 32k without requiring additional training.
no code implementations • 17 Apr 2024 • Dawei Zhu, Sony Trenous, Xiaoyu Shen, Dietrich Klakow, Bill Byrne, Eva Hasler
Recent research has shown that large language models (LLMs) can achieve remarkable translation performance through supervised fine-tuning (SFT) using only a small amount of parallel data.
1 code implementation • 4 Apr 2024 • Vagrant Gautam, Eileen Bingert, Dawei Zhu, Anne Lauscher, Dietrich Klakow
We find that while models can mostly faithfully reuse previously specified pronouns when no distractors are present, they perform significantly worse on she/her/her, singular they, and neopronouns.
1 code implementation • 31 Mar 2024 • Dawei Zhu, Wenhao Wu, YiFan Song, Fangwei Zhu, Ziqiang Cao, Sujian Li
Due to the scarcity of annotated data, data augmentation is commonly used for training coherence evaluation models.
no code implementations • 28 Feb 2024 • Jiebin Zhang, Eugene J. Yu, Qinyu Chen, Chenhao Xiong, Dawei Zhu, Han Qian, Mingbo Song, Xiaoguang Li, Qun Liu, Sujian Li
In today's fast-paced world, the growing demand to quickly generate comprehensive and accurate Wikipedia documents for emerging events is both crucial and challenging.
1 code implementation • 10 Oct 2023 • YiFan Song, Peiyi Wang, Weimin Xiong, Dawei Zhu, Tianyu Liu, Zhifang Sui, Sujian Li
Continual learning (CL) aims to constantly learn new knowledge over time while avoiding catastrophic forgetting on old tasks.
1 code implementation • 28 Sep 2023 • Zhiwei Fei, Xiaoyu Shen, Dawei Zhu, Fengzhe Zhou, Zhuo Han, Songyang Zhang, Kai Chen, Zongwen Shen, Jidong Ge
We hope this benchmark provides an in-depth understanding of LLMs' domain-specific capabilities and speeds up the development of LLMs in the legal domain.
1 code implementation • 19 Sep 2023 • Dawei Zhu, Nan Yang, Liang Wang, YiFan Song, Wenhao Wu, Furu Wei, Sujian Li
To decouple train length from target length for efficient context window extension, we propose Positional Skip-wisE (PoSE) training that smartly simulates long inputs using a fixed context window.
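The core idea of PoSE, as described above, is to train on fixed-length chunks whose position ids are manipulated to cover the much longer target window. A minimal sketch of that position-id construction is below; the function name, segment count, and sampling scheme are illustrative assumptions, not the paper's exact recipe.

```python
import random

def pose_position_ids(chunk_len: int, target_len: int, num_segments: int = 2):
    """Sketch of Positional Skip-wisE training: split a fixed-length training
    chunk into segments and offset each segment's position ids by a random
    skip, so the model observes relative positions spanning the full target
    window without ever processing a target-length input."""
    # Random segment boundaries within the fixed training chunk.
    cuts = sorted(random.sample(range(1, chunk_len), num_segments - 1))
    bounds = [0] + cuts + [chunk_len]
    budget = target_len - chunk_len  # extra positions available to skip over
    position_ids, offset = [], 0
    for i in range(num_segments):
        start, end = bounds[i], bounds[i + 1]
        if i > 0:
            # Insert a random skip before each segment after the first.
            skip = random.randint(0, budget)
            budget -= skip
            offset += skip
        position_ids.extend(range(start + offset, end + offset))
    return position_ids
```

With `chunk_len=2048` and `target_len=32768`, every training step still attends over only 2048 tokens, but the sampled position ids range anywhere up to 32767, which is what decouples train length from target length.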
no code implementations • 11 Jun 2023 • YiFan Song, Weimin Xiong, Dawei Zhu, Wenhao Wu, Han Qian, Mingbo Song, Hailiang Huang, Cheng Li, Ke Wang, Rong Yao, Ye Tian, Sujian Li
To address the practical challenges of tackling complex instructions, we propose RestGPT, which exploits the power of LLMs and adopts a coarse-to-fine online planning mechanism to enhance task decomposition and API selection.
1 code implementation • 29 May 2023 • Peiyi Wang, Lei LI, Liang Chen, Zefan Cai, Dawei Zhu, Binghuai Lin, Yunbo Cao, Qi Liu, Tianyu Liu, Zhifang Sui
In this paper, we uncover a systematic bias in the evaluation paradigm of adopting large language models (LLMs), e.g., GPT-4, as a referee to score and compare the quality of responses generated by candidate models.
1 code implementation • 27 May 2023 • Dawei Zhu, Xiaoyu Shen, Marius Mosbach, Andreas Stephan, Dietrich Klakow
In this paper, we revisit the setup of these approaches and find that the benefits brought by these approaches are significantly overestimated.
no code implementations • 12 May 2023 • YiFan Song, Peiyi Wang, Dawei Zhu, Tianyu Liu, Zhifang Sui, Sujian Li
Continual learning (CL) aims to constantly learn new knowledge over time while avoiding catastrophic forgetting on old tasks.
1 code implementation • 20 Mar 2023 • Hongbo Wang, Weimin Xiong, YiFan Song, Dawei Zhu, Yu Xia, Sujian Li
Joint entity and relation extraction (JERE) is one of the most important tasks in information extraction.
1 code implementation • COLING 2022 • Dawei Zhu, Qiusi Zhan, Zhejian Zhou, YiFan Song, Jiebin Zhang, Sujian Li
Unlike previous token-level or sentence-level counterparts, ConFiguRe aims to extract a figurative unit from discourse-level context and classify it into the correct figure type.
no code implementations • 3 Jun 2022 • Dawei Zhu, Michael A. Hedderich, Fangzhou Zhai, David Ifeoluwa Adelani, Dietrich Klakow
However, text classification in low-resource languages is still challenging due to the lack of annotated data.
1 code implementation • 15 May 2022 • Dawei Zhu, Xiaoyu Shen, Michael A. Hedderich, Dietrich Klakow
Training deep neural networks (DNNs) under weak supervision has attracted increasing research attention as it can significantly reduce the annotation cost.
1 code implementation • insights (ACL) 2022 • Dawei Zhu, Michael A. Hedderich, Fangzhou Zhai, David Ifeoluwa Adelani, Dietrich Klakow
Incorrect labels in training data occur when human annotators make mistakes or when the data is generated via weak or distant supervision.
1 code implementation • 13 Nov 2021 • Hanwen Xu, Jiayou Zhang, Zhirui Wang, Shizhuo Zhang, Megh Manoj Bhalerao, Yucong Liu, Dawei Zhu, Sheng Wang
As biomedical datasets expand, the same category may be labeled with different terms, making it tedious and onerous to curate these terms.
no code implementations • EACL 2021 • Ernie Chang, Xiaoyu Shen, Dawei Zhu, Vera Demberg, Hui Su
Our approach automatically augments the data available for training by (i) generating new text samples by replacing specific values with alternative ones from the same category, (ii) generating new text samples with GPT-2, and (iii) proposing an automatic method for pairing the new text samples with data samples.
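Augmentation step (i) above can be sketched as a simple value-swap over a structured data record and its reference text. All names below (`augment_by_value_swap`, the category pools, the sample record) are illustrative assumptions, not the paper's actual interface.

```python
import random

def augment_by_value_swap(text: str, data: dict, categories: dict):
    """Sketch of step (i): replace each attribute value in a text sample with
    an alternative value from the same category, yielding a new, consistent
    (data, text) training pair."""
    new_data, new_text = dict(data), text
    for attr, value in data.items():
        # Pool of same-category alternatives, excluding the current value.
        alternatives = [v for v in categories.get(attr, []) if v != value]
        if alternatives and value in new_text:
            replacement = random.choice(alternatives)
            new_text = new_text.replace(value, replacement)
            new_data[attr] = replacement
    return new_data, new_text
```

Because the data record and the text are rewritten together, the augmented pair stays aligned, which is what makes it usable as extra data-to-text training data.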
3 code implementations • 24 Jan 2021 • Michael A. Hedderich, Dawei Zhu, Dietrich Klakow
Distant and weak supervision make it possible to obtain large amounts of labeled training data quickly and cheaply, but these automatic annotations tend to contain many errors.
1 code implementation • EMNLP 2020 • Michael A. Hedderich, David Adelani, Dawei Zhu, Jesujoba Alabi, Udia Markus, Dietrich Klakow
Multilingual transformer models like mBERT and XLM-RoBERTa have achieved great improvements on many NLP tasks across a variety of languages.
no code implementations • 18 Mar 2020 • David Ifeoluwa Adelani, Michael A. Hedderich, Dawei Zhu, Esther van den Berg, Dietrich Klakow
Techniques such as distant and weak supervision can be used to create labeled data in a (semi-) automatic way.
no code implementations • 19 Dec 2019 • Yue Ma, Zengfeng Zeng, Dawei Zhu, Xuan Li, Yiying Yang, Xiaoyuan Yao, Kaijie Zhou, Jianping Shen
This paper describes our approach in DSTC 8 Track 4: Schema-Guided Dialogue State Tracking.
no code implementations • 16 Dec 2019 • Dawei Zhu, Aditya Mogadala, Dietrich Klakow
We propose the Two-sidEd Attentive conditional Generative Adversarial Network (TEA-cGAN) to generate semantically manipulated images while keeping other content, such as the background, intact.