no code implementations • dialdoc (ACL) 2022 • Kun Li, Tianhua Zhang, Liping Tang, Junan Li, Hongyuan Lu, Xixin Wu, Helen Meng
For the response generator, we use grounding span prediction as an auxiliary task to be jointly trained with the main task of response generation.
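A minimal sketch of what such joint training can look like, assuming a HuggingFace-style seq2seq backbone; the span head, loss mixing weight, and batch field names below are illustrative assumptions, not the authors' implementation:

```python
import torch.nn as nn

class GroundedResponseGenerator(nn.Module):
    """Response generator with an auxiliary grounding-span prediction head."""

    def __init__(self, seq2seq, hidden_size, span_loss_weight=0.5):
        super().__init__()
        self.seq2seq = seq2seq                      # e.g. a BART/T5-style model
        self.span_head = nn.Linear(hidden_size, 2)  # start/end logits per source token
        self.span_loss_weight = span_loss_weight    # assumed mixing weight

    def forward(self, batch):
        out = self.seq2seq(
            input_ids=batch["input_ids"],
            attention_mask=batch["attention_mask"],
            labels=batch["response_ids"],           # main task: response generation
            output_hidden_states=True,
        )
        gen_loss = out.loss

        # Auxiliary task: predict the grounding span over the encoder tokens.
        enc_hidden = out.encoder_hidden_states[-1]
        start_logits, end_logits = self.span_head(enc_hidden).split(1, dim=-1)
        ce = nn.CrossEntropyLoss()
        span_loss = (ce(start_logits.squeeze(-1), batch["span_start"])
                     + ce(end_logits.squeeze(-1), batch["span_end"]))

        return gen_loss + self.span_loss_weight * span_loss
```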
no code implementations • Findings (ACL) 2022 • Hongyuan Lu, Wai Lam, Hong Cheng, Helen Meng
We propose a novel framework that automatically generates a control token with the generator to bias the succeeding response towards informativeness for answerable contexts and fallback for unanswerable contexts in an end-to-end manner.
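A small sketch of the control-token idea under assumed token names ("[INFORM]", "[FALLBACK]"): the gold control token is prepended to the training target so the generator learns to emit it first at inference, which approximates the end-to-end behaviour described above.

```python
# Control-token sketch; the token strings and helper names are assumptions.
def build_target(response: str, answerable: bool) -> str:
    """Prepend the gold control token to the training target."""
    control = "[INFORM]" if answerable else "[FALLBACK]"
    return f"{control} {response}"

def generate_response(model, tokenizer, context: str) -> str:
    """At inference the generator emits the control token itself, then the response."""
    inputs = tokenizer(context, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=64)
    text = tokenizer.decode(output_ids[0], skip_special_tokens=False)
    # The leading control token can be stripped before showing the response to users.
    return text.split(maxsplit=1)[-1]
```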
no code implementations • NAACL 2022 • Hongyuan Lu, Wai Lam, Hong Cheng, Helen Meng
Incorporating persona information enables diverse and engaging responses in dialogue response generation.
no code implementations • 2 Nov 2024 • Hongyuan Lu, Zixuan Li, Wai Lam
As the current training data for Large Language Models (LLMs) is dominated by English corpora, these models are English-centric and show impressive performance on English reasoning tasks. (This paper primarily studies English-centric models, but our method could be made universal by using the centric language in the dictionary for non-English-centric LLMs.)
no code implementations • 9 Oct 2024 • Hongyuan Lu, Shujie Miao, Wai Lam
We found that it is a simple yet effective method for reducing the effect of data contamination, and fortunately, it is also harmless when used as a data augmentation method during training.
no code implementations • 5 Oct 2024 • Hongyuan Lu, Wai Lam
Fortunately, our findings suggest that ToxPrune simultaneously and noticeably improves the toxic language model NSFW-3B on the task of dialogue response generation.
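The excerpt does not detail the pruning procedure, but the core idea of removing toxic sub-words from the decoding search space can be sketched with decode-time token banning; the word list, model, and use of `bad_words_ids` here are assumptions for illustration, not ToxPrune's actual mechanism:

```python
# Decode-time banning of tokens from a toxic word list (illustrative only).
TOXIC_WORDS = ["badword1", "badword2"]  # placeholder list

def generate_pruned(model, tokenizer, prompt: str) -> str:
    bad_words_ids = [
        tokenizer(word, add_special_tokens=False).input_ids for word in TOXIC_WORDS
    ]
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(
        **inputs,
        bad_words_ids=bad_words_ids,  # exclude these token sequences from generation
        max_new_tokens=64,
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```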
no code implementations • 4 Jul 2024 • Hao Yang, Hongyuan Lu, Xinhua Zeng, Yang Liu, Xiang Zhang, Haoran Yang, Yumeng Zhang, Shan Huang, Yiran Wei, Wai Lam
In the rapidly evolving field of natural language processing, dialogue systems primarily employ a single-step dialogue paradigm.
1 code implementation • 14 Mar 2024 • Haoran Yang, Yumeng Zhang, Jiaqi Xu, Hongyuan Lu, Pheng Ann Heng, Wai Lam
While Large Language Models (LLMs) have demonstrated exceptional multitasking abilities, fine-tuning these models on downstream, domain-specific datasets is often necessary to yield superior performance on test sets compared to their counterparts without fine-tuning.
1 code implementation • 8 Mar 2024 • Shuaiyi Li, Yang Deng, Deng Cai, Hongyuan Lu, Liang Chen, Wai Lam
As the typical retraining paradigm is unacceptably time- and resource-consuming, researchers are turning to model editing as an effective way to modify model behavior directly while supporting both consecutive and batch editing scenarios.
no code implementations • 15 Nov 2023 • Wenhong Zhu, Hongkun Hao, Zhiwei He, Yunze Song, Yumeng Zhang, Hanxu Hu, Yiran Wei, Rui Wang, Hongyuan Lu
The best candidate is finally selected from this set based on the BLEURT score.
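A minimal re-ranking sketch for this selection step, assuming a reference (or pseudo-reference) is available and using the Hugging Face `evaluate` wrapper around BLEURT; the exact checkpoint and scoring setup in the paper may differ:

```python
import evaluate  # loading "bleurt" also requires the BLEURT package to be installed

bleurt = evaluate.load("bleurt")

def select_best_candidate(candidates, reference):
    """Return the candidate with the highest BLEURT score against the reference."""
    scores = bleurt.compute(
        predictions=candidates,
        references=[reference] * len(candidates),
    )["scores"]
    best_idx = max(range(len(candidates)), key=lambda i: scores[i])
    return candidates[best_idx]
```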
no code implementations • 28 Sep 2023 • Hongru Wang, Huimin Wang, Lingzhi Wang, Minda Hu, Rui Wang, Boyang Xue, Hongyuan Lu, Fei Mi, Kam-Fai Wong
Large language models (LLMs) have demonstrated exceptional performance in planning the use of various functional tools, such as calculators and retrievers, particularly in question-answering tasks.
no code implementations • 9 Sep 2023 • Hongyuan Lu, Wai Lam
This is well motivated as augmenting data via paraphrasing effectively improves neural language models.
2 code implementations • 24 May 2023 • Tianyi Tang, Hongyuan Lu, Yuchen Eleanor Jiang, Haoyang Huang, Dongdong Zhang, Wayne Xin Zhao, Tom Kocmi, Furu Wei
Most research about natural language generation (NLG) relies on evaluation benchmarks with limited references for a sample, which may result in poor correlations with human judgements.
1 code implementation • 17 May 2023 • Hanxu Hu, Hongyuan Lu, Huajian Zhang, Yun-Ze Song, Wai Lam, Yue Zhang
To this end, we propose a novel method called CoS (Chain-of-Symbol Prompting) that represents the complex environments with condensed symbolic spatial representations during the chained intermediate thinking steps.
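As a rough illustration of the prompting style (the symbol set and task below are assumed examples, not the exact prompts from the paper), spatial relations are compressed into symbols rather than spelled out in natural language:

```python
# Assumed Chain-of-Symbol-style prompt: '/' condenses the relation "is left of".
cos_prompt = """Objects: A=book, B=lamp, C=mug
Layout: A / B / C
Question: what object is two steps to the right of the book?
Reasoning: A / B / C, A + 2 steps = C
Answer: the mug"""
```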
1 code implementation • 11 May 2023 • Hongyuan Lu, Haoran Yang, Haoyang Huang, Dongdong Zhang, Wai Lam, Furu Wei
Large language models (LLMs) have shown surprisingly good performance in multilingual neural machine translation (MNMT) even when trained without parallel data.
no code implementations • 15 Dec 2022 • Hongyuan Lu, Haoyang Huang, Shuming Ma, Dongdong Zhang, Wai Lam, Furu Wei
Despite the success of multilingual sequence-to-sequence pre-training, most existing approaches rely on document-level monolingual corpora in many different languages, sentence-level bilingual corpora (in this paper, we use 'bilingual corpora' to denote parallel corpora with 'bilingual translation pairs' in many different language pairs, each consisting of two sentences/documents with the same meaning written in different languages).
Abstractive Text Summarization
Cross-Lingual Abstractive Summarization
no code implementations • 28 Sep 2022 • Hongyuan Lu, Haoyang Huang, Shuming Ma, Dongdong Zhang, Furu Wei, Wai Lam
Despite the fact that multilingual agreement (MA) has shown its importance for multilingual neural machine translation (MNMT), current methodologies in the field have two shortcomings: (i) they require parallel data between multiple language pairs, which is not always realistic, and (ii) they optimize the agreement in an ambiguous direction, which hampers translation performance.
1 code implementation • 17 Aug 2022 • Hongyuan Lu, Wai Lam
This paper presents PCC: Paraphrasing with Bottom-k Sampling and Cyclic Learning for Curriculum Data Augmentation, a novel CDA framework via paraphrasing that exploits textual paraphrase similarity as the curriculum difficulty measure.
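A toy sketch of the curriculum scheduling idea: augmented paraphrases are ordered from most similar to the original (easy) to least similar (hard). The similarity function below is a simple placeholder assumption, not the paraphrase similarity measure used in PCC:

```python
from difflib import SequenceMatcher  # placeholder similarity, for illustration only

def curriculum_order(original: str, paraphrases: list[str]) -> list[str]:
    """Order paraphrased augmentations from easiest (most similar) to hardest."""
    def similarity(p: str) -> float:
        return SequenceMatcher(None, original, p).ratio()
    return sorted(paraphrases, key=similarity, reverse=True)

# Example usage
print(curriculum_order(
    "the service was excellent",
    ["the service was great",
     "excellent service",
     "the staff were wonderful and attentive"],
))
```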
no code implementations • 27 Nov 2021 • Hongyuan Lu, Wai Lam, Hong Cheng, Helen M. Meng
We incorporate reinforcement learning with a dedicated critic network for reward judgement.