Search Results for author: Hongyuan Lu

Found 19 papers, 6 papers with code

On Controlling Fallback Responses for Grounded Dialogue Generation

no code implementations • Findings (ACL) 2022 • Hongyuan Lu, Wai Lam, Hong Cheng, Helen Meng

We propose a novel framework that automatically generates a control token with the generator to bias the succeeding response towards informativeness for answerable contexts and fallback for unanswerable contexts in an end-to-end manner.

Dialogue Generation • Informativeness
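
A minimal sketch of the control-token idea above, assuming illustrative token names ([INFORM], [FALLBACK]) and a toy answerability heuristic; the paper's end-to-end architecture and training objective are not reproduced here.

```python
# Hypothetical sketch: prefix each training target with a control token so the
# generator learns to emit the token first and then condition the rest of the
# response on it. Token names and the answerability heuristic are assumptions.

INFORM = "[INFORM]"
FALLBACK = "[FALLBACK]"

def label_target(context: str, knowledge: str, response: str) -> str:
    """Prefix the gold response with a control token.

    Answerability is approximated here by lexical overlap between the
    dialogue context and the grounding knowledge; the paper generates the
    token automatically with the model rather than with a heuristic.
    """
    overlap = set(context.lower().split()) & set(knowledge.lower().split())
    token = INFORM if overlap else FALLBACK
    return f"{token} {response}"

# At inference time, the first generated token acts as a soft switch:
# [FALLBACK] biases decoding towards a safe fallback response, while
# [INFORM] biases it towards a grounded, informative one.
print(label_target("where was it filmed", "It was filmed in Prague in 2019.",
                   "It was filmed in Prague."))
```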

Dictionary Insertion Prompting for Multilingual Reasoning on Multilingual Large Language Models

no code implementations • 2 Nov 2024 • Hongyuan Lu, Zixuan Li, Wai Lam

As current training data for Large Language Models (LLMs) are dominated by English corpora, they are English-centric and present impressive performance on English reasoning tasks. (This paper primarily studies English-centric models, but our method could be universal by using the centric language in the dictionary for non-English-centric LLMs.)

GSM8K • Math
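
A rough sketch of what dictionary insertion could look like in a prompt: word-level translations into the model's centric language (English here) are inserted before the non-English question. The dictionary contents and template below are illustrative assumptions, not the paper's exact format.

```python
# Hypothetical sketch of dictionary insertion prompting: insert word-level
# dictionary translations (into the English "centric" language) before a
# non-English reasoning question. Dictionary and template are illustrative.

toy_dict = {"cuántas": "how many", "manzanas": "apples", "tiene": "has"}

def build_prompt(question: str) -> str:
    words = question.lower().replace("¿", "").rstrip("?").split()
    hints = [f'"{w}" means "{toy_dict[w]}"' for w in words if w in toy_dict]
    return ("Dictionary: " + "; ".join(hints) + "\n"
            f"Question: {question}\n"
            "Answer step by step in English.")

print(build_prompt("¿Cuántas manzanas tiene Ana si compra 3 y come 1?"))
```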

Clean Evaluations on Contaminated Visual Language Models

no code implementations • 9 Oct 2024 • Hongyuan Lu, Shujie Miao, Wai Lam

We found that it is a simple yet effective method for reducing the effect of data contamination and, fortunately, it is also harmful when used as a data augmentation method during training.

Data Augmentation

Toxic Subword Pruning for Dialogue Response Generation on Large Language Models

no code implementations • 5 Oct 2024 • Hongyuan Lu, Wai Lam

Fortunately, our findings suggest that ToxPrune simultaneously yields a clear improvement for the toxic language model NSFW-3B on the task of dialogue response generation.

Language Modelling • Machine Translation +2
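
The title suggests pruning toxic subword tokens from the vocabulary; below is a minimal, speculative sketch of one way to realise that idea, by dropping subword tokens whose surface form matches a toxic word list. The word list, the BPE-prefix handling, and the matching rule are all assumptions; the paper's actual pruning procedure may differ.

```python
# Speculative sketch of toxic subword pruning: drop subword tokens whose
# surface form is a word on a toxic list, so the decoder can no longer
# produce them. The toxic list and matching rule are assumptions for
# illustration, not the paper's procedure.

TOXIC_WORDS = {"idiot", "stupid"}  # placeholder list

def prune_vocab(vocab: dict[str, int]) -> dict[str, int]:
    def is_toxic(token: str) -> bool:
        piece = token.lstrip("▁Ġ").lower()  # strip common BPE word-boundary markers
        return piece in TOXIC_WORDS
    return {tok: idx for tok, idx in vocab.items() if not is_toxic(tok)}

toy_vocab = {"hello": 0, "▁stupid": 1, "id": 2, "iot": 3, "▁friend": 4}
print(prune_vocab(toy_vocab))  # "▁stupid" is removed; the other tokens are kept
```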

Stephanie: Step-by-Step Dialogues for Mimicking Human Interactions in Social Conversations

no code implementations • 4 Jul 2024 • Hao Yang, Hongyuan Lu, Xinhua Zeng, Yang Liu, Xiang Zhang, Haoran Yang, Yumeng Zhang, Shan Huang, Yiran Wei, Wai Lam

In the rapidly evolving field of natural language processing, dialogue systems primarily employ a single-step dialogue paradigm.

Chatbot

Unveiling the Generalization Power of Fine-Tuned Large Language Models

1 code implementation • 14 Mar 2024 • Haoran Yang, Yumeng Zhang, Jiaqi Xu, Hongyuan Lu, Pheng Ann Heng, Wai Lam

While Large Language Models (LLMs) have demonstrated exceptional multitasking abilities, fine-tuning these models on downstream, domain-specific datasets is often necessary to yield superior performance on test sets compared to their counterparts without fine-tuning.

In-Context Learning

Consecutive Batch Model Editing with HooK Layers

1 code implementation • 8 Mar 2024 • Shuaiyi Li, Yang Deng, Deng Cai, Hongyuan Lu, Liang Chen, Wai Lam

As the typical retraining paradigm is unacceptably time- and resource-consuming, researchers are turning to model editing in search of an effective way to edit model behavior directly while supporting both consecutive and batch scenarios.

Model Editing
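
A hypothetical sketch of what a hook layer could look like: a small trainable correction wrapped around a frozen linear layer, so consecutive or batched edits can be written into the hook without retraining the base model. The class name, the low-rank parameterisation, and the zero initialisation are assumptions; the paper's editing objective is not reproduced.

```python
import torch
import torch.nn as nn

# Hypothetical sketch of a "hook layer": wrap a frozen linear layer and add a
# trainable low-rank correction to its output, so edits can be accumulated in
# the hook while the base weights stay untouched. Names and the rank-4
# parameterisation are illustrative assumptions.

class HookedLinear(nn.Module):
    def __init__(self, base: nn.Linear, rank: int = 4):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)  # the base model stays frozen
        # zero-initialised correction: behaviour is unchanged until an edit is written
        self.u = nn.Parameter(torch.zeros(base.out_features, rank))
        self.v = nn.Parameter(torch.zeros(rank, base.in_features))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + x @ self.v.T @ self.u.T

layer = HookedLinear(nn.Linear(16, 16))
x = torch.randn(2, 16)
print(torch.allclose(layer(x), layer.base(x)))  # True before any edit is applied
```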

TPE: Towards Better Compositional Reasoning over Conceptual Tools with Multi-persona Collaboration

no code implementations • 28 Sep 2023 • Hongru Wang, Huimin Wang, Lingzhi Wang, Minda Hu, Rui Wang, Boyang Xue, Hongyuan Lu, Fei Mi, Kam-Fai Wong

Large language models (LLMs) have demonstrated exceptional performance in planning the use of various functional tools, such as calculators and retrievers, particularly in question-answering tasks.

Question Answering • Response Generation

Not All Metrics Are Guilty: Improving NLG Evaluation by Diversifying References

2 code implementations • 24 May 2023 • Tianyi Tang, Hongyuan Lu, Yuchen Eleanor Jiang, Haoyang Huang, Dongdong Zhang, Wayne Xin Zhao, Tom Kocmi, Furu Wei

Most research about natural language generation (NLG) relies on evaluation benchmarks with limited references for a sample, which may result in poor correlations with human judgements.

Machine Translation • NLG Evaluation +3
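
One reading of the abstract is that scoring each output against a larger, more diverse reference set yields scores that correlate better with human judgement. The snippet below only illustrates multi-reference scoring with sacrebleu; the metric choice and the toy references are assumptions, and the paper's reference-diversification procedure is not reproduced.

```python
import sacrebleu

# Illustrative multi-reference scoring: the same hypothesis is scored against a
# single reference and against a diversified reference set. Sentence-level BLEU
# is used only as a stand-in metric.

hypothesis = "The cat sat on the mat."
single_ref = ["A cat was sitting on the rug."]
diverse_refs = [
    "A cat was sitting on the rug.",
    "The cat sat on the mat.",
    "There is a cat sitting on the mat.",
]

print(sacrebleu.sentence_bleu(hypothesis, single_ref).score)    # lower with one reference
print(sacrebleu.sentence_bleu(hypothesis, diverse_refs).score)  # higher with diverse references
```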

Chain-of-Symbol Prompting Elicits Planning in Large Language Models

1 code implementation • 17 May 2023 • Hanxu Hu, Hongyuan Lu, Huajian Zhang, Yun-Ze Song, Wai Lam, Yue Zhang

To this end, we propose a novel method called CoS (Chain-of-Symbol Prompting) that represents the complex environments with condensed symbolic spatial representations during the chained intermediate thinking steps.
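
A rough illustration of the condensed symbolic representation mentioned above: a verbose spatial description is compressed into a short symbolic string that the intermediate reasoning steps can operate on. The symbol convention and prompt wording are assumptions, not the paper's templates.

```python
# Hypothetical sketch of Chain-of-Symbol prompting: compress a verbose spatial
# description into condensed symbols ("/" read as "is on top of") and let the
# chained intermediate steps reason over the symbols. The convention and
# wording below are illustrative assumptions.

def to_symbols(stack_bottom_to_top: list[str]) -> str:
    # ["table", "box", "book"] (bottom to top) -> "book / box / table"
    return " / ".join(reversed(stack_bottom_to_top))

state = to_symbols(["table", "box", "book"])
prompt = (f"State: {state}   (A / B means A is on top of B)\n"
          "Question: which object must be removed first to reach the table?\n"
          "Think step by step using the symbols above.")
print(prompt)
```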

Chain-of-Dictionary Prompting Elicits Translation in Large Language Models

1 code implementation • 11 May 2023 • Hongyuan Lu, Haoran Yang, Haoyang Huang, Dongdong Zhang, Wai Lam, Furu Wei

Large language models (LLMs) have shown surprisingly good performance in multilingual neural machine translation (MNMT) even when trained without parallel data.

In-Context Learning • Machine Translation +1
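
As a companion to the dictionary-insertion sketch earlier, the snippet below illustrates one way chained dictionary hints could be prepended to a translation prompt, with each source word linked to translations in several languages. The chain format and dictionary entries are illustrative assumptions, not the paper's exact prompts.

```python
# Hypothetical sketch of chained multilingual dictionary hints for translation:
# each known source word is prepended with a chain of dictionary translations
# before the translation request. Entries and format are illustrative.

chained_dict = {
    "cat": {"French": "chat", "German": "Katze"},
    "milk": {"French": "lait", "German": "Milch"},
}

def cod_prompt(sentence: str, target_lang: str) -> str:
    chains = []
    for word in sentence.lower().rstrip(".").split():
        if word in chained_dict:
            links = ", ".join(f'"{t}" in {lang}'
                              for lang, t in chained_dict[word].items())
            chains.append(f'"{word}" means {links}.')
    return "\n".join(chains) + f"\nTranslate into {target_lang}: {sentence}"

print(cod_prompt("The cat drinks milk.", "German"))
```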

Advancing Multilingual Pre-training: TRIP Triangular Document-level Pre-training for Multilingual Language Models

no code implementations • 15 Dec 2022 • Hongyuan Lu, Haoyang Huang, Shuming Ma, Dongdong Zhang, Wai Lam, Furu Wei

Despite the success of multilingual sequence-to-sequence pre-training, most existing approaches rely on document-level monolingual corpora in many different languages and sentence-level bilingual corpora (in this paper, we use 'bilingual corpora' to denote parallel corpora with 'bilingual translation pairs' in many different language pairs, each consisting of two sentences/documents with the same meaning written in different languages).

Abstractive Text Summarization • Cross-Lingual Abstractive Summarization +4

Revamping Multilingual Agreement Bidirectionally via Switched Back-translation for Multilingual Neural Machine Translation

no code implementations • 28 Sep 2022 • Hongyuan Lu, Haoyang Huang, Shuming Ma, Dongdong Zhang, Furu Wei, Wai Lam

Despite the fact that multilingual agreement (MA) has shown its importance for multilingual neural machine translation (MNMT), current methodologies in the field have two shortcomings: (i) they require parallel data between multiple language pairs, which is not always realistic, and (ii) they optimize the agreement in an ambiguous direction, which hampers the translation performance.

Document Level Machine Translation • Document Translation +2

PCC: Paraphrasing with Bottom-k Sampling and Cyclic Learning for Curriculum Data Augmentation

1 code implementation • 17 Aug 2022 • Hongyuan Lu, Wai Lam

This paper presents PCC: Paraphrasing with Bottom-k Sampling and Cyclic Learning for Curriculum Data Augmentation, a novel CDA framework via paraphrasing, which exploits the textual paraphrase similarity as the curriculum difficulty measure.

Data Augmentation • Dialogue Generation +3
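
A minimal sketch of the curriculum idea stated above: paraphrase-augmented examples are ordered by a paraphrase-similarity score used as the difficulty measure (more similar to the original is treated as easier and scheduled earlier). SequenceMatcher stands in for the paper's similarity measure, and bottom-k sampling and cyclic learning are not reproduced.

```python
from difflib import SequenceMatcher

# Illustrative curriculum ordering by paraphrase similarity: more similar
# paraphrases are treated as easier and scheduled first. The similarity
# function is a stand-in; bottom-k sampling and cyclic learning are omitted.

def similarity(a: str, b: str) -> float:
    return SequenceMatcher(None, a, b).ratio()

original = "Could you book me a table for two tonight?"
paraphrases = [
    "Could you reserve me a table for two tonight?",
    "I need a dinner reservation for two people this evening.",
    "Two seats tonight, please.",
]

# easy-to-hard curriculum: sort by descending similarity to the original
curriculum = sorted(paraphrases, key=lambda p: similarity(original, p), reverse=True)
for p in curriculum:
    print(f"{similarity(original, p):.2f}  {p}")
```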

Partner Personas Generation for Diverse Dialogue Generation

no code implementations • 27 Nov 2021 • Hongyuan Lu, Wai Lam, Hong Cheng, Helen M. Meng

We incorporate reinforcement learning with a dedicated critic network for reward judgement.

Dialogue Generation • Response Generation
