Search Results for author: Yasheng Wang

Found 47 papers, 22 papers with code

NEZHA: Neural Contextualized Representation for Chinese Language Understanding

10 code implementations31 Aug 2019 Junqiu Wei, Xiaozhe Ren, Xiaoguang Li, Wenyong Huang, Yi Liao, Yasheng Wang, Jiashu Lin, Xin Jiang, Xiao Chen, Qun Liu

The pre-trained language models have achieved great successes in various natural language understanding (NLU) tasks due to its capacity to capture the deep contextualized information in text by pre-training on large-scale corpora.

named-entity-recognition Named Entity Recognition +6

GPT-based Generation for Classical Chinese Poetry

2 code implementations29 Jun 2019 Yi Liao, Yasheng Wang, Qun Liu, Xin Jiang

We present a simple yet effective method for generating high quality classical Chinese poetry with Generative Pre-trained Language Model (GPT).

Language Modelling

JABER and SABER: Junior and Senior Arabic BERt

1 code implementation8 Dec 2021 Abbas Ghaddar, Yimeng Wu, Ahmad Rashid, Khalil Bibi, Mehdi Rezagholizadeh, Chao Xing, Yasheng Wang, Duan Xinyu, Zhefeng Wang, Baoxing Huai, Xin Jiang, Qun Liu, Philippe Langlais

Language-specific pre-trained models have proven to be more accurate than multilingual ones in a monolingual evaluation setting, Arabic is no exception.

Language Modelling NER

Unified Mandarin TTS Front-end Based on Distilled BERT Model

1 code implementation31 Dec 2020 Yang Zhang, Liqun Deng, Yasheng Wang

The front-end module in a typical Mandarin text-to-speech system (TTS) is composed of a long pipeline of text processing components, which requires extensive efforts to build and is prone to large accumulative model size and cascade errors.

Knowledge Distillation Language Modelling +1

Multi-channel Reverse Dictionary Model

1 code implementation18 Dec 2019 Lei Zhang, Fanchao Qi, Zhiyuan Liu, Yasheng Wang, Qun Liu, Maosong Sun

A reverse dictionary takes the description of a target word as input and outputs the target word together with other words that match the description.

Reverse Dictionary Sentence

Hidden Killer: Invisible Textual Backdoor Attacks with Syntactic Trigger

2 code implementations ACL 2021 Fanchao Qi, Mukai Li, Yangyi Chen, Zhengyan Zhang, Zhiyuan Liu, Yasheng Wang, Maosong Sun

As far as we know, almost all existing textual backdoor attack methods insert additional contents into normal samples as triggers, which causes the trigger-embedded samples to be detected and the backdoor attacks to be blocked without much effort.

Backdoor Attack

Sub-Character Tokenization for Chinese Pretrained Language Models

2 code implementations1 Jun 2021 Chenglei Si, Zhengyan Zhang, Yingfa Chen, Fanchao Qi, Xiaozhi Wang, Zhiyuan Liu, Yasheng Wang, Qun Liu, Maosong Sun

2) Pronunciation-based SubChar tokenizers can encode Chinese homophones into the same transliteration sequences and produce the same tokenization output, hence being robust to homophone typos.

Chinese Word Segmentation Computational Efficiency +2

Red Alarm for Pre-trained Models: Universal Vulnerability to Neuron-Level Backdoor Attacks

1 code implementation ICML Workshop AML 2021 Zhengyan Zhang, Guangxuan Xiao, Yongwei Li, Tian Lv, Fanchao Qi, Zhiyuan Liu, Yasheng Wang, Xin Jiang, Maosong Sun

In this work, we demonstrate the universal vulnerability of PTMs, where fine-tuned PTMs can be easily controlled by backdoor attacks in arbitrary downstream tasks.

Backdoor Attack

Towards Identifying Social Bias in Dialog Systems: Frame, Datasets, and Benchmarks

1 code implementation16 Feb 2022 Jingyan Zhou, Jiawen Deng, Fei Mi, Yitong Li, Yasheng Wang, Minlie Huang, Xin Jiang, Qun Liu, Helen Meng

The research of open-domain dialog systems has been greatly prospered by neural models trained on large-scale corpora, however, such corpora often introduce various safety problems (e. g., offensive languages, biases, and toxic behaviors) that significantly hinder the deployment of dialog systems in practice.

Bias Detection Open-Domain Dialog

Better Robustness by More Coverage: Adversarial Training with Mixup Augmentation for Robust Fine-tuning

1 code implementation31 Dec 2020 Chenglei Si, Zhengyan Zhang, Fanchao Qi, Zhiyuan Liu, Yasheng Wang, Qun Liu, Maosong Sun

In this work, we propose a simple and effective method to cover a much larger proportion of the attack search space, called Adversarial and Mixup Data Augmentation (AMDA).

Adversarial Robustness Text Augmentation +2

Planning, Creation, Usage: Benchmarking LLMs for Comprehensive Tool Utilization in Real-World Complex Scenarios

1 code implementation30 Jan 2024 Shijue Huang, Wanjun Zhong, Jianqiao Lu, Qi Zhu, Jiahui Gao, Weiwen Liu, Yutai Hou, Xingshan Zeng, Yasheng Wang, Lifeng Shang, Xin Jiang, Ruifeng Xu, Qun Liu

The recent trend of using Large Language Models (LLMs) as tool agents in real-world applications underscores the necessity for comprehensive evaluations of their capabilities, particularly in complex scenarios involving planning, creating, and using tools.

Benchmarking

Improving Sequence Modeling Ability of Recurrent Neural Networks via Sememes

1 code implementation20 Oct 2019 Yujia Qin, Fanchao Qi, Sicong Ouyang, Zhiyuan Liu, Cheng Yang, Yasheng Wang, Qun Liu, Maosong Sun

Sememes, the minimum semantic units of human languages, have been successfully utilized in various natural language processing applications.

Adversarial Attack Language Modelling +2

Sparse Structure Search for Delta Tuning

1 code implementation NIPS 2022 Shengding Hu, Zhen Zhang, Ning Ding, Yadao Wang, Yasheng Wang, Zhiyuan Liu, Maosong Sun

Generally, DT methods exquisitely design delta modules (DT modules) which could be applied to arbitrary fine-grained positions inside PTMs.

Constructing Highly Inductive Contexts for Dialogue Safety through Controllable Reverse Generation

1 code implementation4 Dec 2022 Zhexin Zhang, Jiale Cheng, Hao Sun, Jiawen Deng, Fei Mi, Yasheng Wang, Lifeng Shang, Minlie Huang

In order to detect such toxic generations, existing methods rely on templates, real-world data extraction, crowdsourcing workers, or automatic generation to construct adversarial contexts that are likely to induce toxic generations.

Response Generation

DecompEval: Evaluating Generated Texts as Unsupervised Decomposed Question Answering

1 code implementation13 Jul 2023 Pei Ke, Fei Huang, Fei Mi, Yasheng Wang, Qun Liu, Xiaoyan Zhu, Minlie Huang

Existing evaluation metrics for natural language generation (NLG) tasks face the challenges on generalization ability and interpretability.

Dialogue Generation nlg evaluation +3

PanGu-Coder: Program Synthesis with Function-Level Language Modeling

1 code implementation22 Jul 2022 Fenia Christopoulou, Gerasimos Lampouras, Milan Gritta, Guchun Zhang, Yinpeng Guo, Zhongqi Li, Qi Zhang, Meng Xiao, Bo Shen, Lin Li, Hao Yu, Li Yan, Pingyi Zhou, Xin Wang, Yuchi Ma, Ignacio Iacobacci, Yasheng Wang, Guangtai Liang, Jiansheng Wei, Xin Jiang, Qianxiang Wang, Qun Liu

We present PanGu-Coder, a pretrained decoder-only language model adopting the PanGu-Alpha architecture for text-to-code generation, i. e. the synthesis of programming language solutions given a natural language problem description.

Code Generation Language Modelling +2

Improving Factual Consistency for Knowledge-Grounded Dialogue Systems via Knowledge Enhancement and Alignment

1 code implementation12 Oct 2023 Boyang Xue, Weichao Wang, Hongru Wang, Fei Mi, Rui Wang, Yasheng Wang, Lifeng Shang, Xin Jiang, Qun Liu, Kam-Fai Wong

Inspired by previous work which identified that feed-forward networks (FFNs) within Transformers are responsible for factual knowledge expressions, we investigate two methods to efficiently improve the factual expression capability {of FFNs} by knowledge enhancement and alignment respectively.

SynCoBERT: Syntax-Guided Multi-Modal Contrastive Pre-Training for Code Representation

no code implementations10 Aug 2021 Xin Wang, Yasheng Wang, Fei Mi, Pingyi Zhou, Yao Wan, Xiao Liu, Li Li, Hao Wu, Jin Liu, Xin Jiang

Code representation learning, which aims to encode the semantics of source code into distributed vectors, plays an important role in recent deep-learning-based models for code intelligence.

Clone Detection Code Search +5

CINS: Comprehensive Instruction for Few-shot Learning in Task-oriented Dialog Systems

no code implementations10 Sep 2021 Fei Mi, Yitong Li, Yasheng Wang, Xin Jiang, Qun Liu

As labeling cost for different modules in task-oriented dialog (ToD) systems is high, a major challenge in practice is to learn different tasks with the least amount of labeled data.

dialog state tracking Few-Shot Learning +3

UniMS: A Unified Framework for Multimodal Summarization with Knowledge Distillation

no code implementations13 Sep 2021 Zhengkun Zhang, Xiaojun Meng, Yasheng Wang, Xin Jiang, Qun Liu, Zhenglu Yang

Specially, we adopt knowledge distillation from a vision-language pretrained model to improve image selection, which avoids any requirement on the existence and quality of image captions.

Abstractive Text Summarization Image Captioning +2

Pan More Gold from the Sand: Refining Open-domain Dialogue Training with Noisy Self-Retrieval Generation

no code implementations COLING 2022 Yihe Wang, Yitong Li, Yasheng Wang, Fei Mi, Pingyi Zhou, Xin Wang, Jin Liu, Xin Jiang, Qun Liu

Experiments over publicly available datasets demonstrate that our method can help models generate better responses, even such training data are usually impressed as low-quality data.

Dialogue Generation Retrieval

HyperPELT: Unified Parameter-Efficient Language Model Tuning for Both Language and Vision-and-Language Tasks

no code implementations8 Mar 2022 Zhengkun Zhang, Wenya Guo, Xiaojun Meng, Yasheng Wang, Yadao Wang, Xin Jiang, Qun Liu, Zhenglu Yang

In this paper, we design a novel unified parameter-efficient transfer learning framework that works effectively on both pure language and V&L tasks.

Language Modelling Multi-Task Learning

Compilable Neural Code Generation with Compiler Feedback

no code implementations Findings (ACL) 2022 Xin Wang, Yasheng Wang, Yao Wan, Fei Mi, Yitong Li, Pingyi Zhou, Jin Liu, Hao Wu, Xin Jiang, Qun Liu

Automatically generating compilable programs with (or without) natural language descriptions has always been a touchstone problem for computational linguistics and automated software engineering.

Code Completion Code Generation +3

CODE-MVP: Learning to Represent Source Code from Multiple Views with Contrastive Pre-Training

no code implementations Findings (NAACL) 2022 Xin Wang, Yasheng Wang, Yao Wan, Jiawei Wang, Pingyi Zhou, Li Li, Hao Wu, Jin Liu

Specifically, we first extract multiple code views using compiler tools, and learn the complementary information among them under a contrastive learning framework.

Contrastive Learning Defect Detection +2

Sparse Structure Search for Parameter-Efficient Tuning

no code implementations15 Jun 2022 Shengding Hu, Zhen Zhang, Ning Ding, Yadao Wang, Yasheng Wang, Zhiyuan Liu, Maosong Sun

The searched structures preserve more than 99\% fine-tuning performance with 0. 01\% trainable parameters.

Lexicon-injected Semantic Parsing for Task-Oriented Dialog

no code implementations26 Nov 2022 Xiaojun Meng, Wenlin Dai, Yasheng Wang, Baojun Wang, Zhiyong Wu, Xin Jiang, Qun Liu

Then we present a novel lexicon-injected semantic parser, which collects slot labels of tree representation as a lexicon, and injects lexical features to the span representation of parser.

Semantic Parsing

Momentum Contrastive Pre-training for Question Answering

no code implementations12 Dec 2022 Minda Hu, Muzhi Li, Yasheng Wang, Irwin King

In order to address this problem, we propose a novel Momentum Contrastive pRe-training fOr queStion anSwering (MCROSS) method for extractive QA.

Benchmarking Contrastive Learning +3

MultiCoder: Multi-Programming-Lingual Pre-Training for Low-Resource Code Completion

no code implementations19 Dec 2022 Zi Gong, Yinpeng Guo, Pingyi Zhou, Cuiyun Gao, Yasheng Wang, Zenglin Xu

On the other hand, there are few studies exploring the effects of multi-programming-lingual (MultiPL) pre-training for the code completion, especially the impact on low-resource programming languages.

Code Completion

Learning Summary-Worthy Visual Representation for Abstractive Summarization in Video

no code implementations8 May 2023 Zenan Xu, Xiaojun Meng, Yasheng Wang, Qinliang Su, Zexuan Qiu, Xin Jiang, Qun Liu

Multimodal abstractive summarization for videos (MAS) requires generating a concise textual summary to describe the highlights of a video according to multimodal resources, in our case, the video content and its transcript.

Abstractive Text Summarization Language Modelling

Large Language Models as Source Planner for Personalized Knowledge-grounded Dialogue

no code implementations13 Oct 2023 Hongru Wang, Minda Hu, Yang Deng, Rui Wang, Fei Mi, Weichao Wang, Yasheng Wang, Wai-Chung Kwan, Irwin King, Kam-Fai Wong

Open-domain dialogue system usually requires different sources of knowledge to generate more informative and evidential responses.

Response Generation

YODA: Teacher-Student Progressive Learning for Language Models

no code implementations28 Jan 2024 Jianqiao Lu, Wanjun Zhong, YuFei Wang, Zhijiang Guo, Qi Zhu, Wenyong Huang, Yanlin Wang, Fei Mi, Baojun Wang, Yasheng Wang, Lifeng Shang, Xin Jiang, Qun Liu

With the teacher's guidance, the student learns to iteratively refine its answer with feedback, and forms a robust and comprehensive understanding of the posed questions.

GSM8K Math

Understanding the planning of LLM agents: A survey

no code implementations5 Feb 2024 Xu Huang, Weiwen Liu, Xiaolong Chen, Xingmei Wang, Hao Wang, Defu Lian, Yasheng Wang, Ruiming Tang, Enhong Chen

As Large Language Models (LLMs) have shown significant intelligence, the progress to leverage LLMs as planning modules of autonomous agents has attracted more attention.

UniRetriever: Multi-task Candidates Selection for Various Context-Adaptive Conversational Retrieval

no code implementations26 Feb 2024 Hongru Wang, Boyang Xue, Baohang Zhou, Rui Wang, Fei Mi, Weichao Wang, Yasheng Wang, Kam-Fai Wong

Conversational retrieval refers to an information retrieval system that operates in an iterative and interactive manner, requiring the retrieval of various external resources, such as persona, knowledge, and even response, to effectively engage with the user and successfully complete the dialogue.

Information Retrieval Retrieval

Evaluating Robustness of Generative Search Engine on Adversarial Factual Questions

no code implementations25 Feb 2024 Xuming Hu, Xiaochuan Li, Junzhe Chen, Yinghui Li, Yangning Li, Xiaoguang Li, Yasheng Wang, Qun Liu, Lijie Wen, Philip S. Yu, Zhijiang Guo

To this end, we propose evaluating the robustness of generative search engines in the realistic and high-risk setting, where adversaries have only black-box system access and seek to deceive the model into returning incorrect responses.

Retrieval

Improving Language Model Reasoning with Self-motivated Learning

no code implementations10 Apr 2024 Yunlong Feng, Yang Xu, Libo Qin, Yasheng Wang, Wanxiang Che

The framework motivates the model itself to automatically generate rationales on existing datasets.

Language Modelling

WESE: Weak Exploration to Strong Exploitation for LLM Agents

no code implementations11 Apr 2024 Xu Huang, Weiwen Liu, Xiaolong Chen, Xingmei Wang, Defu Lian, Yasheng Wang, Ruiming Tang, Enhong Chen

Concretely, WESE involves decoupling the exploration and exploitation process, employing a cost-effective weak agent to perform exploration tasks for global knowledge.

Decision Making Prompt Engineering

Cannot find the paper you are looking for? You can Submit a new open access paper.