Search Results for author: Hua Wu

Found 164 papers, 81 papers with code

Question Answering over Knowledge Base with Neural Attention Combining Global Knowledge Information

no code implementations 3 Jun 2016 Yuanzhe Zhang, Kang Liu, Shizhu He, Guoliang Ji, Zhanyi Liu, Hua Wu, Jun Zhao

With the rapid growth of knowledge bases (KBs) on the web, how to take full advantage of them becomes increasingly important.

Question Answering

Semi-Supervised Learning for Neural Machine Translation

no code implementations ACL 2016 Yong Cheng, Wei Xu, Zhongjun He, Wei He, Hua Wu, Maosong Sun, Yang Liu

While end-to-end neural machine translation (NMT) has made remarkable progress recently, NMT systems only rely on parallel corpora for parameter estimation.

Machine Translation NMT +1

Latent Topic Embedding

no code implementations COLING 2016 Di Jiang, Lei Shi, Rongzhong Lian, Hua Wu

Topic modeling and word embedding are two important techniques for deriving latent semantics from data.

Sentence Topic Models +1

DuReader: a Chinese Machine Reading Comprehension Dataset from Real-world Applications

3 code implementations WS 2018 Wei He, Kai Liu, Jing Liu, Yajuan Lyu, Shiqi Zhao, Xinyan Xiao, Yu-An Liu, Yizhong Wang, Hua Wu, Qiaoqiao She, Xuan Liu, Tian Wu, Haifeng Wang

Experiments show that human performance is well above current state-of-the-art baseline systems, leaving plenty of room for the community to make improvements.

Machine Reading Comprehension

Multi-channel Encoder for Neural Machine Translation

no code implementations 6 Dec 2017 Hao Xiong, Zhongjun He, Xiaoguang Hu, Hua Wu

This design of the encoder yields a relatively uniform composition of the source sentence, despite the gating mechanism employed in the encoding RNN.

Machine Translation NMT +2

A New Method of Region Embedding for Text Classification

1 code implementation ICLR 2018 Chao Qiao, Bo Huang, Guocheng Niu, Daren Li, Daxiang Dong, Wei He, Dianhai Yu, Hua Wu

In this paper, we propose a new method of learning and utilizing task-specific distributed representations of n-grams, referred to as “region embeddings”.

General Classification text-classification +1
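As a rough illustration of the region-embedding idea (a toy numpy sketch, not the paper's exact formulation; all dimensions and names below are illustrative): each word owns a small "local context unit" that re-weights the embeddings of its neighbours, and the window is max-pooled into one region vector.

```python
import numpy as np

rng = np.random.default_rng(0)
V, d, c = 10, 4, 1  # vocab size, embedding dim, half window (region size 2c+1)

emb = rng.normal(size=(V, d))              # word embeddings
ctx = rng.normal(size=(V, 2 * c + 1, d))   # per-word local context units

def region_embedding(tokens, center):
    """Word-context region embedding: project each neighbour through the
    centre word's context unit, then max-pool over the window."""
    w = tokens[center]
    projected = []
    for offset in range(-c, c + 1):
        j = center + offset
        if 0 <= j < len(tokens):
            projected.append(ctx[w, offset + c] * emb[tokens[j]])
    return np.max(np.stack(projected), axis=0)

r = region_embedding([1, 2, 3], 1)  # region vector around the middle token
```

In a classifier, such region vectors would be summed or pooled over the document before a softmax layer; that downstream part is omitted here.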

Multi-Passage Machine Reading Comprehension with Cross-Passage Answer Verification

no code implementations ACL 2018 Yizhong Wang, Kai Liu, Jing Liu, Wei He, Yajuan Lyu, Hua Wu, Sujian Li, Haifeng Wang

Machine reading comprehension (MRC) on real web data usually requires the machine to answer a question by analyzing multiple passages retrieved by a search engine.

Machine Reading Comprehension Question Answering

Familia: A Configurable Topic Modeling Framework for Industrial Text Engineering

1 code implementation 11 Aug 2018 Di Jiang, Yuanfeng Song, Rongzhong Lian, Siqi Bao, Jinhua Peng, Huang He, Hua Wu

To relieve the burden on software engineers without knowledge of Bayesian networks, Familia is able to conduct automatic parameter inference for a variety of topic models.

Topic Models

Addressing Troublesome Words in Neural Machine Translation

no code implementations EMNLP 2018 Yang Zhao, Jiajun Zhang, Zhongjun He, Cheng-qing Zong, Hua Wu

One of the weaknesses of Neural Machine Translation (NMT) is in handling low-frequency and ambiguous words, which we refer to as troublesome words.

Machine Translation NMT +1

Learning to Select Knowledge for Response Generation in Dialog Systems

1 code implementation 13 Feb 2019 Rongzhong Lian, Min Xie, Fan Wang, Jinhua Peng, Hua Wu

Specifically, a posterior distribution over knowledge is inferred from both utterances and responses, and it ensures the appropriate selection of knowledge during the training process.

Response Generation
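The prior/posterior knowledge-selection idea can be sketched roughly as follows (a toy numpy illustration under assumed vector encodings, not the paper's model): the prior p(k|x) sees only the utterance, the posterior p(k|x,y) also sees the response, and training pulls the prior toward the posterior.

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

rng = np.random.default_rng(0)
d, n_knowledge = 8, 5
K = rng.normal(size=(n_knowledge, d))   # knowledge candidate encodings
x = rng.normal(size=d)                  # utterance encoding
y = rng.normal(size=d)                  # response encoding (training only)

prior = softmax(K @ x)                  # p(k | x): all we have at inference
posterior = softmax(K @ (x + y))        # p(k | x, y): also sees the response

# Training signal: pull the prior toward the posterior (KL divergence).
kl = float(np.sum(posterior * (np.log(posterior) - np.log(prior))))
```

At inference time only the prior is available, which is why matching it to the response-aware posterior during training matters.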

Knowledge Aware Conversation Generation with Explainable Reasoning over Augmented Graphs

1 code implementation IJCNLP 2019 Zhibin Liu, Zheng-Yu Niu, Hua Wu, Haifeng Wang

Two types of knowledge, triples from knowledge graphs and texts from documents, have been studied for knowledge aware open-domain conversation generation, in which graph paths can narrow down vertex candidates for knowledge selection decision, and texts can provide rich information for response generation.

Knowledge Graphs Machine Reading Comprehension +1

End-to-End Speech Translation with Knowledge Distillation

no code implementations 17 Apr 2019 Yuchen Liu, Hao Xiong, Zhongjun He, Jiajun Zhang, Hua Wu, Haifeng Wang, Cheng-qing Zong

End-to-end speech translation (ST), which directly translates from source language speech into target language text, has attracted intensive attention in recent years.

Knowledge Distillation speech-recognition +2

Know More about Each Other: Evolving Dialogue Strategy via Compound Assessment

1 code implementation ACL 2019 Siqi Bao, Huang He, Fan Wang, Rongzhong Lian, Hua Wu

In this paper, a novel Generation-Evaluation framework is developed for multi-turn conversations with the objective of letting both participants know more about each other.

Informativeness

Generating Multiple Diverse Responses with Multi-Mapping and Posterior Mapping Selection

1 code implementation 5 Jun 2019 Chaotao Chen, Jinhua Peng, Fan Wang, Jun Xu, Hua Wu

In this paper, we propose a multi-mapping mechanism to better capture the one-to-many relationship, where multiple mapping modules are employed as latent mechanisms to model the semantic mappings from an input post to its diverse responses.
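The multi-mapping idea can be sketched as K candidate mapping modules with posterior selection of the one whose output best matches the reference response during training (a toy numpy illustration; the selection mechanism here is deliberately simplified).

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_maps = 6, 4
maps = rng.normal(size=(n_maps, d, d))   # K mapping modules (latent mechanisms)
post = rng.normal(size=d)                # encoded input post
resp = rng.normal(size=d)                # encoded reference response (training)

candidates = maps @ post                 # each module's mapped representation
# Posterior mapping selection: during training, pick the module whose output
# is closest to the reference response; at inference a module is sampled
# instead, which yields diverse responses for the same post.
chosen = int(np.argmin(np.linalg.norm(candidates - resp, axis=1)))
mapped = candidates[chosen]
```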

Proactive Human-Machine Conversation with Explicit Conversation Goals

8 code implementations 13 Jun 2019 Wenquan Wu, Zhen Guo, Xiangyang Zhou, Hua Wu, Xiyuan Zhang, Rongzhong Lian, Haifeng Wang

DuConv enables a very challenging task as the model needs to both understand dialogue and plan over the given knowledge graph.

ARNOR: Attention Regularization based Noise Reduction for Distant Supervision Relation Classification

no code implementations ACL 2019 Wei Jia, Dai Dai, Xinyan Xiao, Hua Wu

In this paper, we propose ARNOR, a novel Attention Regularization based NOise Reduction framework for distant supervision relation classification.

Classification General Classification +3

Proactive Human-Machine Conversation with Explicit Conversation Goal

no code implementations ACL 2019 Wenquan Wu, Zhen Guo, Xiangyang Zhou, Hua Wu, Xiyuan Zhang, Rongzhong Lian, Haifeng Wang

Konv enables a very challenging task as the model needs to both understand dialogue and plan over the given knowledge graph.

ERNIE 2.0: A Continual Pre-training Framework for Language Understanding

3 code implementations 29 Jul 2019 Yu Sun, Shuohuan Wang, Yukun Li, Shikun Feng, Hao Tian, Hua Wu, Haifeng Wang

Recently, pre-trained models have achieved state-of-the-art results in various language understanding tasks, which indicates that pre-training on large-scale corpora may play a crucial role in natural language processing.

Chinese Named Entity Recognition Chinese Reading Comprehension +8

Baidu Neural Machine Translation Systems for WMT19

no code implementations WS 2019 Meng Sun, Bojian Jiang, Hao Xiong, Zhongjun He, Hua Wu, Haifeng Wang

In this paper we introduce the systems Baidu submitted for the WMT19 shared task on Chinese<->English news translation.

Data Augmentation Domain Adaptation +4

Multi-agent Learning for Neural Machine Translation

no code implementations IJCNLP 2019 Tianchi Bi, Hao Xiong, Zhongjun He, Hua Wu, Haifeng Wang

Conventional Neural Machine Translation (NMT) models benefit from training with an additional agent, e.g., dual learning, and bidirectional decoding with one agent decoding from left to right and the other decoding in the opposite direction.

Machine Translation NMT +1

D-NET: A Pre-Training and Fine-Tuning Framework for Improving the Generalization of Machine Reading Comprehension

1 code implementation WS 2019 Hongyu Li, Xiyuan Zhang, Yibing Liu, Yiming Zhang, Quan Wang, Xiangyang Zhou, Jing Liu, Hua Wu, Haifeng Wang

In this paper, we introduce a simple system that Baidu submitted for the MRQA (Machine Reading for Question Answering) 2019 Shared Task, which focused on the generalization of machine reading comprehension (MRC) models.

Machine Reading Comprehension Multi-Task Learning +1

CoKE: Contextualized Knowledge Graph Embedding

3 code implementations 6 Nov 2019 Quan Wang, Pingping Huang, Haifeng Wang, Songtai Dai, Wenbin Jiang, Jing Liu, Yajuan Lyu, Yong Zhu, Hua Wu

This work presents Contextualized Knowledge Graph Embedding (CoKE), a novel paradigm that takes into account such contextual nature, and learns dynamic, flexible, and fully contextualized entity and relation embeddings.

Knowledge Graph Embedding Link Prediction +1

Synchronous Speech Recognition and Speech-to-Text Translation with Interactive Decoding

1 code implementation 16 Dec 2019 Yuchen Liu, Jiajun Zhang, Hao Xiong, Long Zhou, Zhongjun He, Hua Wu, Haifeng Wang, Cheng-qing Zong

Speech-to-text translation (ST), which translates source language speech into target language text, has attracted intensive attention in recent years.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +4

ERNIE-GEN: An Enhanced Multi-Flow Pre-training and Fine-tuning Framework for Natural Language Generation

4 code implementations 26 Jan 2020 Dongling Xiao, Han Zhang, Yukun Li, Yu Sun, Hao Tian, Hua Wu, Haifeng Wang

Current pre-training works in natural language generation pay little attention to the problem of exposure bias on downstream tasks.

 Ranked #1 on Question Generation on SQuAD1.1 (using extra training data)

Abstractive Text Summarization Dialogue Generation +3

Exploring Contextual Word-level Style Relevance for Unsupervised Style Transfer

1 code implementation ACL 2020 Chulun Zhou, Liang-Yu Chen, Jiachen Liu, Xinyan Xiao, Jinsong Su, Sheng Guo, Hua Wu

Unsupervised style transfer aims to change the style of an input sentence while preserving its original content without using parallel training data.

Denoising Sentence +1

Towards Conversational Recommendation over Multi-Type Dialogs

2 code implementations ACL 2020 Zeming Liu, Haifeng Wang, Zheng-Yu Niu, Hua Wu, Wanxiang Che, Ting Liu

We propose a new task of conversational recommendation over multi-type dialogs, where the bots can proactively and naturally lead a conversation from a non-recommendation dialog (e.g., QA) to a recommendation dialog, taking into account user's interests and feedback.

Vocal Bursts Type Prediction

SKEP: Sentiment Knowledge Enhanced Pre-training for Sentiment Analysis

7 code implementations ACL 2020 Hao Tian, Can Gao, Xinyan Xiao, Hao Liu, Bolei He, Hua Wu, Haifeng Wang, Feng Wu

In particular, the prediction of aspect-sentiment pairs is converted into multi-label classification, aiming to capture the dependency between words in a pair.

Multi-Label Classification Sentiment Analysis +1
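The conversion of aspect-sentiment pair prediction into multi-label classification can be sketched like this (a toy illustration; the label set and encoder below are hypothetical, not from the paper): each (aspect, polarity) pair becomes one independent binary label scored with a sigmoid.

```python
import numpy as np

# Hypothetical label set: each (aspect, polarity) pair is one binary label.
labels = [("food", "pos"), ("food", "neg"), ("service", "pos"), ("service", "neg")]

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
d = 6
W = rng.normal(size=(len(labels), d))   # one classifier head per pair
h = rng.normal(size=d)                  # sentence encoding from some encoder

probs = sigmoid(W @ h)                  # independent probability per pair
predicted = [lab for lab, p in zip(labels, probs) if p > 0.5]
```

Because the labels are scored independently, a sentence can express several aspect-sentiment pairs at once, which is exactly what the multi-label view captures.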

Leveraging Graph to Improve Abstractive Multi-Document Summarization

2 code implementations ACL 2020 Wei Li, Xinyan Xiao, Jiachen Liu, Hua Wu, Haifeng Wang, Junping Du

Graphs that capture relations between textual units have great benefits for detecting salient information from multiple documents and generating overall coherent summaries.

Document Summarization Multi-Document Summarization

Conversational Graph Grounded Policy Learning for Open-Domain Conversation Generation

no code implementations ACL 2020 Jun Xu, Haifeng Wang, Zheng-Yu Niu, Hua Wu, Wanxiang Che, Ting Liu

To address the challenge of policy learning in open-domain multi-turn conversation, we propose to represent prior information about dialog transitions as a graph and learn a graph grounded dialog policy, aimed at fostering a more coherent and controllable dialog.

Response Generation

Two-dimensional ferromagnetic semiconductor VBr3 with tunable anisotropy

no code implementations 20 Aug 2020 Lu Liu, Ke Yang, Guangyu Wang, Hua Wu

Two-dimensional (2D) ferromagnets (FMs) have attracted widespread attention due to their prospects in spintronic applications.

Materials Science Strongly Correlated Electrons

Discovering Dialog Structure Graph for Open-Domain Dialog Generation

no code implementations 31 Dec 2020 Jun Xu, Zeyang Lei, Haifeng Wang, Zheng-Yu Niu, Hua Wu, Wanxiang Che, Ting Liu

Learning interpretable dialog structure from human-human dialogs yields basic insights into the structure of conversation, and also provides background knowledge to facilitate dialog generation.

Open-Domain Dialog

ERNIE-M: Enhanced Multilingual Representation by Aligning Cross-lingual Semantics with Monolingual Corpora

2 code implementations EMNLP 2021 Xuan Ouyang, Shuohuan Wang, Chao Pang, Yu Sun, Hao Tian, Hua Wu, Haifeng Wang

In this paper, we propose ERNIE-M, a new training method that encourages the model to align the representation of multiple languages with monolingual corpora, to overcome the constraint that the parallel corpus size places on the model performance.

Sentence Translation

ERNIE-Doc: A Retrospective Long-Document Modeling Transformer

4 code implementations ACL 2021 Siyu Ding, Junyuan Shang, Shuohuan Wang, Yu Sun, Hao Tian, Hua Wu, Haifeng Wang

Transformers are not suited for processing long documents, due to their quadratically increasing memory and time consumption.

Language Modelling Question Answering +2

Knowledge Distillation based Ensemble Learning for Neural Machine Translation

no code implementations 1 Jan 2021 Chenze Shao, Meng Sun, Yang Feng, Zhongjun He, Hua Wu, Haifeng Wang

Under this framework, we introduce word-level ensemble learning and sequence-level ensemble learning for neural machine translation, where sequence-level ensemble learning is capable of aggregating translation models with different decoding strategies.

Ensemble Learning Knowledge Distillation +2
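Word-level ensemble learning of the kind described can be sketched as averaging the per-step token distributions of several member models (a minimal numpy illustration, not the paper's framework; the member models here are stand-in random logits).

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

rng = np.random.default_rng(0)
vocab = 6
# Logits from three hypothetical member models for the same decoding step.
member_logits = rng.normal(size=(3, vocab))

# Word-level ensembling: average the per-step next-token distributions,
# then decode from the combined distribution.
ensemble = np.mean([softmax(l) for l in member_logits], axis=0)
next_token = int(np.argmax(ensemble))
```

Sequence-level ensembling instead aggregates whole candidate translations, which is what lets it combine models with different decoding strategies.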

Learning to Select External Knowledge with Multi-Scale Negative Sampling

1 code implementation 3 Feb 2021 Huang He, Hua Lu, Siqi Bao, Fan Wang, Hua Wu, Zheng-Yu Niu, Haifeng Wang

The Track-1 of DSTC9 aims to effectively answer user requests or questions during task-oriented dialogues, which are out of the scope of APIs/DB.

Response Generation

A Unified Pre-training Framework for Conversational AI

1 code implementation 6 May 2021 Siqi Bao, Bingjin Chen, Huang He, Xin Tian, Han Zhou, Fan Wang, Hua Wu, Haifeng Wang, Wenquan Wu, Yingzhan Lin

In this work, we explore the application of PLATO-2 on various dialogue systems, including open-domain conversation, knowledge grounded dialogue, and task-oriented conversation.

Chatbot Interactive Evaluation of Dialog +1

BASS: Boosting Abstractive Summarization with Unified Semantic Graph

no code implementations ACL 2021 Wenhao Wu, Wei Li, Xinyan Xiao, Jiachen Liu, Ziqiang Cao, Sujian Li, Hua Wu, Haifeng Wang

Abstractive summarization for long-document or multi-document remains challenging for the Seq2Seq architecture, as Seq2Seq is not good at analyzing long-distance relations in text.

Abstractive Text Summarization Document Summarization +2

ERNIE-Tiny : A Progressive Distillation Framework for Pretrained Transformer Compression

1 code implementation 4 Jun 2021 Weiyue Su, Xuyi Chen, Shikun Feng, Jiaxiang Liu, Weixin Liu, Yu Sun, Hao Tian, Hua Wu, Haifeng Wang

Specifically, the first stage, General Distillation, performs distillation with guidance from the pretrained teacher, general data, and a latent distillation loss.

Knowledge Distillation

Discovering Dialog Structure Graph for Coherent Dialog Generation

no code implementations ACL 2021 Jun Xu, Zeyang Lei, Haifeng Wang, Zheng-Yu Niu, Hua Wu, Wanxiang Che

Learning discrete dialog structure graph from human-human dialogs yields basic insights into the structure of conversation, and also provides background knowledge to facilitate dialog generation.

Management

DuTrust: A Sentiment Analysis Dataset for Trustworthiness Evaluation

no code implementations 30 Aug 2021 Lijie Wang, Hao Liu, Shuyuan Peng, Hongxuan Tang, Xinyan Xiao, Ying Chen, Hua Wu, Haifeng Wang

Therefore, in order to systematically evaluate the factors for building trustworthy systems, we propose a novel and well-annotated sentiment analysis dataset to evaluate robustness and interpretability.

Sentiment Analysis

Evolving Decomposed Plasticity Rules for Information-Bottlenecked Meta-Learning

2 code implementations 8 Sep 2021 Fan Wang, Hao Tian, Haoyi Xiong, Hua Wu, Jie Fu, Yang Cao, Yu Kang, Haifeng Wang

In contrast, biological neural networks (BNNs) can adapt to various new tasks by continually updating the neural connections based on the inputs, which is aligned with the paradigm of learning effective learning rules in addition to static parameters, e.g., meta-learning.

Memorization Meta-Learning

Mixup Decoding for Diverse Machine Translation

no code implementations Findings (EMNLP) 2021 Jicheng Li, Pengzhi Gao, Xuanfu Wu, Yang Feng, Zhongjun He, Hua Wu, Haifeng Wang

To further improve the faithfulness and diversity of the translations, we propose two simple but effective approaches to select diverse sentence pairs in the training corpus and adjust the interpolation weight for each pair correspondingly.

Machine Translation Sentence +1
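The interpolation at the heart of mixup-style decoding can be sketched as a convex combination of two training pairs' embeddings with a per-pair weight (a toy numpy illustration; real systems interpolate inside the encoder/decoder and handle sequences of unequal length).

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8
# Embedded source sentences of two training pairs (same length for simplicity).
src_a = rng.normal(size=(3, d))
src_b = rng.normal(size=(3, d))

def mixup(a, b, lam):
    """Linear interpolation in embedding space with weight lam in [0, 1]."""
    return lam * a + (1.0 - lam) * b

mixed = mixup(src_a, src_b, 0.7)  # leans toward pair A
```

Adjusting `lam` per pair is what the snippet above refers to as tuning the interpolation weight to balance faithfulness and diversity.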

Fine-grained Entity Typing via Label Reasoning

no code implementations EMNLP 2021 Qing Liu, Hongyu Lin, Xinyan Xiao, Xianpei Han, Le Sun, Hua Wu

Conventional entity typing approaches are based on independent classification paradigms, which make them difficult to recognize inter-dependent, long-tailed and fine-grained entity types.

Attribute Entity Typing

Controllable Dialogue Generation with Disentangled Multi-grained Style Specification and Attribute Consistency Reward

no code implementations 14 Sep 2021 Zhe Hu, Zhiwei Cao, Hou Pong Chan, Jiachen Liu, Xinyan Xiao, Jinsong Su, Hua Wu

Controllable text generation is an appealing but challenging task, which allows users to specify particular attributes of the generated outputs.

Attribute Dialogue Generation +1

A Multimodal Sentiment Dataset for Video Recommendation

no code implementations 17 Sep 2021 Hongxuan Tang, Hao Liu, Xinyan Xiao, Hua Wu

Based on this, we propose a multimodal sentiment analysis dataset, named baiDu Video Sentiment dataset (DuVideoSenti), and introduce a new sentiment system designed to describe the sentimental style of a video in recommendation scenarios.

Multimodal Sentiment Analysis Video Understanding

DuRecDial 2.0: A Bilingual Parallel Corpus for Conversational Recommendation

1 code implementation EMNLP 2021 Zeming Liu, Haifeng Wang, Zheng-Yu Niu, Hua Wu, Wanxiang Che

In this paper, we provide a bilingual parallel human-to-human recommendation dialog dataset (DuRecDial 2.0) to enable researchers to explore a challenging task of multilingual and cross-lingual conversational recommendation.

PLATO-XL: Exploring the Large-scale Pre-training of Dialogue Generation

3 code implementations 20 Sep 2021 Siqi Bao, Huang He, Fan Wang, Hua Wu, Haifeng Wang, Wenquan Wu, Zhihua Wu, Zhen Guo, Hua Lu, Xinxian Huang, Xin Tian, Xinchao Xu, Yingzhan Lin, Zheng-Yu Niu

To explore the limit of dialogue generation pre-training, we present the models of PLATO-XL with up to 11 billion parameters, trained on both Chinese and English social media conversations.

Dialogue Generation

ERNIE-SPARSE: Robust Efficient Transformer Through Hierarchically Unifying Isolated Information

no code implementations 29 Sep 2021 Yang Liu, Jiaxiang Liu, Yuxiang Lu, Shikun Feng, Yu Sun, Zhida Feng, Li Chen, Hao Tian, Hua Wu, Haifeng Wang

The first factor is information bottleneck sensitivity, which is caused by the key feature of Sparse Transformer — only a small number of global tokens can attend to all other tokens.

text-classification Text Classification

Do What Nature Did To Us: Evolving Plastic Recurrent Neural Networks For Generalized Tasks

no code implementations 29 Sep 2021 Fan Wang, Hao Tian, Haoyi Xiong, Hua Wu, Yang Cao, Yu Kang, Haifeng Wang

While artificial neural networks (ANNs) have been widely adopted in machine learning, researchers are increasingly focused on the gaps between ANNs and natural neural networks (NNNs).

Meta-Learning

Building Chinese Biomedical Language Models via Multi-Level Text Discrimination

1 code implementation 14 Oct 2021 Quan Wang, Songtai Dai, Benfeng Xu, Yajuan Lyu, Yong Zhu, Hua Wu, Haifeng Wang

In this work we introduce eHealth, a Chinese biomedical PLM built from scratch with a new pre-training framework.

Domain Adaptation

SgSum: Transforming Multi-document Summarization into Sub-graph Selection

1 code implementation 25 Oct 2021 Moye Chen, Wei Li, Jiachen Liu, Xinyan Xiao, Hua Wu, Haifeng Wang

Compared with traditional methods, our method has two main advantages: (1) the relations between sentences are captured by modeling both the graph structure of the whole document set and the candidate sub-graphs; (2) it directly outputs an integrated summary in the form of a sub-graph, which is more informative and coherent.

Document Summarization Multi-Document Summarization +1

Amendable Generation for Dialogue State Tracking

1 code implementation EMNLP (NLP4ConvAI) 2021 Xin Tian, Liankai Huang, Yingzhan Lin, Siqi Bao, Huang He, Yunyi Yang, Hua Wu, Fan Wang, Shuqi Sun

In this paper, we propose a novel Amendable Generation for Dialogue State Tracking (AG-DST), which contains a two-pass generation process: (1) generating a primitive dialogue state based on the dialogue of the current turn and the previous dialogue state, and (2) amending the primitive dialogue state from the first pass.

Dialogue State Tracking Multi-domain Dialogue State Tracking +1

HelixMO: Sample-Efficient Molecular Optimization in Scene-Sensitive Latent Space

no code implementations 30 Nov 2021 Zhiyuan Chen, Xiaomin Fang, Zixu Hua, Yueyang Huang, Fan Wang, Hua Wu

Efficient exploration of the chemical space to search the candidate drugs that satisfy various constraints is a fundamental task of drug discovery.

Drug Discovery Efficient Exploration

Learning with Noisy Correspondence for Cross-modal Matching

1 code implementation NeurIPS 2021 Zhenyu Huang, Guocheng Niu, Xiao Liu, Wenbiao Ding, Xinyan Xiao, Hua Wu, Xi Peng

Based on this observation, we reveal and study a latent and challenging direction in cross-modal matching, named noisy correspondence, which could be regarded as a new paradigm of noisy labels.

Image-text matching Memorization +2

DuQM: A Chinese Dataset of Linguistically Perturbed Natural Questions for Evaluating the Robustness of Question Matching Models

1 code implementation 16 Dec 2021 Hongyu Zhu, Yan Chen, Jing Yan, Jing Liu, Yu Hong, Ying Chen, Hua Wu, Haifeng Wang

For this purpose, we create a Chinese dataset, namely DuQM, which contains natural questions with linguistic perturbations to evaluate the robustness of question matching models.

Natural Questions

ERNIE-ViLG: Unified Generative Pre-training for Bidirectional Vision-Language Generation

2 code implementations 31 Dec 2021 Han Zhang, Weichong Yin, Yewei Fang, Lanxin Li, Boqiang Duan, Zhihua Wu, Yu Sun, Hao Tian, Hua Wu, Haifeng Wang

To explore the landscape of large-scale pre-training for bidirectional text-image generation, we train a 10-billion parameter ERNIE-ViLG model on a large-scale dataset of 145 million (Chinese) image-text pairs, which achieves state-of-the-art performance for both text-to-image and image-to-text tasks, obtaining an FID of 7.9 on MS-COCO for text-to-image synthesis and best results on COCO-CN and AIC-ICC for image captioning.

Image Captioning Quantization +2

Faithfulness in Natural Language Generation: A Systematic Survey of Analysis, Evaluation and Optimization Methods

no code implementations 10 Mar 2022 Wei Li, Wenhao Wu, Moye Chen, Jiachen Liu, Xinyan Xiao, Hua Wu

In this survey, we provide a systematic overview of the research progress on the faithfulness problem of NLG, including problem analysis, evaluation metrics and optimization methods.

Abstractive Text Summarization Data-to-Text Generation +2

UNIMO-2: End-to-End Unified Vision-Language Grounded Learning

1 code implementation Findings (ACL) 2022 Wei Li, Can Gao, Guocheng Niu, Xinyan Xiao, Hao Liu, Jiachen Liu, Hua Wu, Haifeng Wang

In particular, we propose to conduct grounded learning on both images and texts via a shared grounded space, which helps bridge unaligned images and texts, and align the visual and textual semantic spaces on different types of corpora.

PLANET: Dynamic Content Planning in Autoregressive Transformers for Long-form Text Generation

no code implementations ACL 2022 Zhe Hu, Hou Pong Chan, Jiachen Liu, Xinyan Xiao, Hua Wu, Lifu Huang

Despite recent progress of pre-trained language models on generating fluent text, existing methods still suffer from incoherence problems in long-form text generation tasks that require proper content control and planning to form a coherent high-level logical flow.

Contrastive Learning Sentence +1

ERNIE-SPARSE: Learning Hierarchical Efficient Transformer Through Regularized Self-Attention

no code implementations 23 Mar 2022 Yang Liu, Jiaxiang Liu, Li Chen, Yuxiang Lu, Shikun Feng, Zhida Feng, Yu Sun, Hao Tian, Hua Wu, Haifeng Wang

We argue that two factors, information bottleneck sensitivity and inconsistency between different attention topologies, could affect the performance of the Sparse Transformer.

Sparse Learning text-classification +1

Towards Multi-Turn Empathetic Dialogs with Positive Emotion Elicitation

no code implementations 22 Apr 2022 Shihang Wang, Xinchao Xu, Wenquan Wu, Zheng-Yu Niu, Hua Wu, Haifeng Wang

In this task, the agent conducts empathetic responses along with the target of eliciting the user's positive emotions in the multi-turn dialog.

A Thorough Examination on Zero-shot Dense Retrieval

no code implementations 27 Apr 2022 Ruiyang Ren, Yingqi Qu, Jing Liu, Wayne Xin Zhao, Qifei Wu, Yuchen Ding, Hua Wu, Haifeng Wang, Ji-Rong Wen

Recent years have witnessed significant advances in dense retrieval (DR) based on powerful pre-trained language models (PLMs).

Retrieval

HelixADMET: a robust and endpoint extensible ADMET system incorporating self-supervised knowledge transfer

no code implementations 17 May 2022 Shanzhuo Zhang, Zhiyuan Yan, Yueyang Huang, Lihang Liu, Donglong He, Wei Wang, Xiaomin Fang, Xiaonan Zhang, Fan Wang, Hua Wu, Haifeng Wang

Additionally, the pre-trained model provided by H-ADMET can be fine-tuned to generate new and customised ADMET endpoints, meeting the various demands of drug research and development.

Drug Discovery Self-Supervised Learning +1

ERNIE-Search: Bridging Cross-Encoder with Dual-Encoder via Self On-the-fly Distillation for Dense Passage Retrieval

no code implementations 18 May 2022 Yuxiang Lu, Yiding Liu, Jiaxiang Liu, Yunsheng Shi, Zhengjie Huang, Shikun Feng, Yu Sun, Hao Tian, Hua Wu, Shuaiqiang Wang, Dawei Yin, Haifeng Wang

Our method 1) introduces a self on-the-fly distillation method that can effectively distill late interaction (i.e., ColBERT) to a vanilla dual-encoder, and 2) incorporates a cascade distillation process to further improve the performance with a cross-encoder teacher.

Knowledge Distillation Open-Domain Question Answering +2
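The gap that such distillation bridges can be sketched as follows: a ColBERT-style late-interaction teacher scores a query-document pair with per-token MaxSim, while the dual-encoder student uses a single dot product of pooled vectors (a toy numpy sketch; mean pooling and the random encodings are assumptions, not the paper's setup).

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8

def late_interaction(q_tokens, d_tokens):
    """ColBERT-style MaxSim: each query token matches its best document token."""
    sims = q_tokens @ d_tokens.T          # (n_q, n_d) token similarities
    return float(sims.max(axis=1).sum())  # sum of per-query-token maxima

def dual_encoder(q_vec, d_vec):
    """Vanilla dual-encoder: one dot product between pooled vectors."""
    return float(q_vec @ d_vec)

q_tok = rng.normal(size=(4, d))           # query token representations
d_tok = rng.normal(size=(6, d))           # document token representations

teacher = late_interaction(q_tok, d_tok)
student = dual_encoder(q_tok.mean(axis=0), d_tok.mean(axis=0))

# On-the-fly distillation would minimise a gap like this during training:
gap = (teacher - student) ** 2
```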

A Fine-grained Interpretability Evaluation Benchmark for Neural NLP

no code implementations 23 May 2022 Lijie Wang, Yaozong Shen, Shuyuan Peng, Shuai Zhang, Xinyan Xiao, Hao Liu, Hongxuan Tang, Ying Chen, Hua Wu, Haifeng Wang

Based on this benchmark, we conduct experiments on three typical models with three saliency methods, and unveil their strengths and weaknesses in terms of interpretability.

Reading Comprehension Sentiment Analysis

Less Learn Shortcut: Analyzing and Mitigating Learning of Spurious Feature-Label Correlation

1 code implementation 25 May 2022 Yanrui Du, Jing Yan, Yan Chen, Jing Liu, Sendong Zhao, Qiaoqiao She, Hua Wu, Haifeng Wang, Bing Qin

In this study, we focus on the spurious correlation between word features and labels that models learn from the biased data distribution of training data.

Natural Language Inference Sentiment Analysis

Link the World: Improving Open-domain Conversation with Dynamic Spatiotemporal-aware Knowledge

no code implementations 28 Jun 2022 Han Zhou, Xinchao Xu, Wenquan Wu, Zheng-Yu Niu, Hua Wu, Siqi Bao, Fan Wang, Haifeng Wang

Making chatbots world aware in a conversation like a human is a crucial challenge, where the world may contain dynamic knowledge and spatiotemporal state.

Informativeness

HelixFold-Single: MSA-free Protein Structure Prediction by Using Protein Language Model as an Alternative

1 code implementation 28 Jul 2022 Xiaomin Fang, Fan Wang, Lihang Liu, Jingzhou He, Dayong Lin, Yingfei Xiang, Xiaonan Zhang, Hua Wu, Hui Li, Le Song

Our proposed method, HelixFold-Single, first pre-trains a large-scale protein language model (PLM) on billions of primary sequences using the self-supervised learning paradigm; this PLM is then used as an alternative to MSAs for learning the co-evolution information.

Protein Language Model Protein Structure Prediction +1

An Interpretability Evaluation Benchmark for Pre-trained Language Models

no code implementations 28 Jul 2022 Yaozong Shen, Lijie Wang, Ying Chen, Xinyan Xiao, Jing Liu, Hua Wu

To fill in the gap, we propose a novel evaluation benchmark providing both English and Chinese annotated data.

GEM-2: Next Generation Molecular Property Prediction Network by Modeling Full-range Many-body Interactions

1 code implementation 11 Aug 2022 Lihang Liu, Donglong He, Xiaomin Fang, Shanzhuo Zhang, Fan Wang, Jingzhou He, Hua Wu

Full-range many-body interactions between electrons have been proven effective in obtaining an accurate solution of the Schrödinger equation by classical computational chemistry methods, although modeling such interactions incurs a high computational cost.

Drug Discovery Graph Regression +2

SeSQL: Yet Another Large-scale Session-level Chinese Text-to-SQL Dataset

no code implementations 26 Aug 2022 Saihao Huang, Lijie Wang, Zhenghua Li, Zeyang Liu, Chenhui Dou, Fukang Yan, Xinyan Xiao, Hua Wu, Min Zhang

As the first session-level Chinese dataset, CHASE contains two separate parts, i.e., 2,003 sessions manually constructed from scratch (CHASE-C), and 3,456 sessions translated from English SParC (CHASE-T).

SQL Parsing Text-To-SQL

Towards Boosting the Open-Domain Chatbot with Human Feedback

1 code implementation 30 Aug 2022 Hua Lu, Siqi Bao, Huang He, Fan Wang, Hua Wu, Haifeng Wang

Many open-domain dialogue models pre-trained with social media comments can generate coherent replies but have difficulties producing engaging responses when interacting with real users.

Chatbot

ERNIE-ViL 2.0: Multi-view Contrastive Learning for Image-Text Pre-training

1 code implementation 30 Sep 2022 Bin Shan, Weichong Yin, Yu Sun, Hao Tian, Hua Wu, Haifeng Wang

They attempt to learn cross-modal representations using contrastive learning on image-text pairs; however, the inter-modal correlations they build rely on only a single view for each modality.

Computational Efficiency Contrastive Learning +7

Q-TOD: A Query-driven Task-oriented Dialogue System

1 code implementation 14 Oct 2022 Xin Tian, Yingzhan Lin, Mengfei Song, Siqi Bao, Fan Wang, Huang He, Shuqi Sun, Hua Wu

Firstly, as the query is in the form of natural language and not confined to the schema of the knowledge base, the issue of domain adaptation is alleviated remarkably in Q-TOD.

Domain Adaptation Response Generation +2

Clip-Tuning: Towards Derivative-free Prompt Learning with a Mixture of Rewards

no code implementations 21 Oct 2022 Yekun Chai, Shuohuan Wang, Yu Sun, Hao Tian, Hua Wu, Haifeng Wang

Derivative-free prompt learning has emerged as a lightweight alternative to prompt tuning, which only requires model inference to optimize the prompts.

FRSUM: Towards Faithful Abstractive Summarization via Enhancing Factual Robustness

no code implementations 1 Nov 2022 Wenhao Wu, Wei Li, Jiachen Liu, Xinyan Xiao, Ziqiang Cao, Sujian Li, Hua Wu

We first measure a model's factual robustness by its success rate to defend against adversarial attacks when generating factual information.

Abstractive Text Summarization

PLATO-K: Internal and External Knowledge Enhanced Dialogue Generation

no code implementations • 2 Nov 2022 • Siqi Bao, Huang He, Jun Xu, Hua Lu, Fan Wang, Hua Wu, Han Zhou, Wenquan Wu, Zheng-Yu Niu, Haifeng Wang

Recently, the practical deployment of open-domain dialogue systems has been plagued by the knowledge issue of information deficiency and factual inaccuracy.

Dialogue Generation Memorization +1

CLOP: Video-and-Language Pre-Training with Knowledge Regularizations

no code implementations • 7 Nov 2022 • Guohao Li, Hu Yang, Feng He, Zhifan Feng, Yajuan Lyu, Hua Wu, Haifeng Wang

To this end, we propose a Cross-modaL knOwledge-enhanced Pre-training (CLOP) method with Knowledge Regularizations.

Contrastive Learning Retrieval +1

ERNIE-SAT: Speech and Text Joint Pretraining for Cross-Lingual Multi-Speaker Text-to-Speech

2 code implementations • 7 Nov 2022 • Xiaoran Fan, Chao Pang, Tian Yuan, He Bai, Renjie Zheng, Pengfei Zhu, Shuohuan Wang, Junkun Chen, Zeyu Chen, Liang Huang, Yu Sun, Hua Wu

In this paper, we extend the pretraining method for cross-lingual multi-speaker speech synthesis tasks, including cross-lingual multi-speaker voice cloning and cross-lingual multi-speaker speech editing.

Representation Learning Speech Synthesis +2

ERNIE-UniX2: A Unified Cross-lingual Cross-modal Framework for Understanding and Generation

no code implementations • 9 Nov 2022 • Bin Shan, Yaqian Han, Weichong Yin, Shuohuan Wang, Yu Sun, Hao Tian, Hua Wu, Haifeng Wang

Recent cross-lingual cross-modal works attempt to extend Vision-Language Pre-training (VLP) models to non-English inputs and achieve impressive performance.

Contrastive Learning Language Modelling +4

ERNIE-Code: Beyond English-Centric Cross-lingual Pretraining for Programming Languages

1 code implementation • 13 Dec 2022 • Yekun Chai, Shuohuan Wang, Chao Pang, Yu Sun, Hao Tian, Hua Wu

Extensive results show that ERNIE-Code outperforms previous multilingual LLMs for PL or NL across a wide range of end tasks of code intelligence, including multilingual code-to-text, text-to-code, code-to-code, and text-to-text generation.

Code Summarization Language Modelling +2

Query Enhanced Knowledge-Intensive Conversation via Unsupervised Joint Modeling

1 code implementation • 19 Dec 2022 • Mingzhu Cai, Siqi Bao, Xin Tian, Huang He, Fan Wang, Hua Wu

In this paper, we propose an unsupervised query enhanced approach for knowledge-intensive conversations, namely QKConv.

Conversational Question Answering Retrieval

ERNIE 3.0 Tiny: Frustratingly Simple Method to Improve Task-Agnostic Distillation Generalization

1 code implementation • 9 Jan 2023 • Weixin Liu, Xuyi Chen, Jiaxiang Liu, Shikun Feng, Yu Sun, Hao Tian, Hua Wu

Experimental results demonstrate that our method yields a student with much better generalization, significantly outperforms existing baselines, and establishes a new state-of-the-art result on in-domain, out-domain, and low-resource datasets in the setting of task-agnostic distillation.
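Task-agnostic distillation of the kind referenced here is usually built on a temperature-scaled divergence between teacher and student output distributions. A generic sketch of that objective (standard Hinton-style distillation, not ERNIE 3.0 Tiny's specific recipe; the logits and temperature below are made up):

```python
import math

def softmax(logits, t=1.0):
    exps = [math.exp(x / t) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]

def kd_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions,
    scaled by T^2 so gradient magnitudes stay comparable across temperatures."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    return temperature ** 2 * kl

teacher = [3.0, 1.0, 0.2]
aligned_student = [2.9, 1.1, 0.1]   # close to the teacher -> small loss
random_student = [0.0, 0.0, 3.0]    # disagrees with the teacher -> large loss
```

In practice this soft-label term is combined with a hard-label cross-entropy on task data; the "task-agnostic" variant applies it during general pre-training, before any downstream fine-tuning.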

Knowledge Distillation Language Modelling +1

Universal Information Extraction as Unified Semantic Matching

no code implementations • 9 Jan 2023 • Jie Lou, Yaojie Lu, Dai Dai, Wei Jia, Hongyu Lin, Xianpei Han, Le Sun, Hua Wu

Based on this paradigm, we propose to universally model various IE tasks with Unified Semantic Matching (USM) framework, which introduces three unified token linking operations to model the abilities of structuring and conceptualizing.

ERNIE-Music: Text-to-Waveform Music Generation with Diffusion Models

no code implementations • 9 Feb 2023 • Pengfei Zhu, Chao Pang, Yekun Chai, Lei LI, Shuohuan Wang, Yu Sun, Hao Tian, Hua Wu

To fill this gap, this paper introduces a text-to-waveform music generation model built on diffusion models.

Music Generation Text-to-Music Generation

SMoA: Sparse Mixture of Adapters to Mitigate Multiple Dataset Biases

no code implementations • 28 Feb 2023 • Yanchen Liu, Jing Yan, Yan Chen, Jing Liu, Hua Wu

Recent studies reveal that various biases exist in different NLP tasks, and over-reliance on biases results in models' poor generalization ability and low adversarial robustness.

Adversarial Robustness Natural Language Inference +1

Improving Zero-shot Multilingual Neural Machine Translation by Leveraging Cross-lingual Consistency Regularization

1 code implementation • 12 May 2023 • Pengzhi Gao, Liwen Zhang, Zhongjun He, Hua Wu, Haifeng Wang

The experimental analysis also proves that CrossConST could close the sentence representation gap and better align the representation space.

Machine Translation NMT +2

TOME: A Two-stage Approach for Model-based Retrieval

no code implementations • 18 May 2023 • Ruiyang Ren, Wayne Xin Zhao, Jing Liu, Hua Wu, Ji-Rong Wen, Haifeng Wang

Recently, model-based retrieval has emerged as a new paradigm in text retrieval that discards the index in the traditional retrieval model and instead memorizes the candidate corpora using model parameters.

Natural Questions Retrieval +1

Learning In-context Learning for Named Entity Recognition

2 code implementations • 18 May 2023 • Jiawei Chen, Yaojie Lu, Hongyu Lin, Jie Lou, Wei Jia, Dai Dai, Hua Wu, Boxi Cao, Xianpei Han, Le Sun

PLMs are modeled as a meta-function $\lambda_{\text{instruction, demonstrations, text}}.\,\mathcal{M}$, and a new entity extractor can be implicitly constructed by applying new instruction and demonstrations to PLMs, i.e., $(\lambda.\mathcal{M})(\text{instruction, demonstrations})$.

few-shot-ner Few-shot NER +4

A Simple yet Effective Self-Debiasing Framework for Transformer Models

1 code implementation • 2 Jun 2023 • Xiaoyue Wang, Lijie Wang, Xin Liu, Suhang Wu, Jinsong Su, Hua Wu

In this way, the top-layer sentence representation will be trained to ignore the common biased features encoded by the low-layer sentence representation and focus on task-relevant unbiased features.

Natural Language Understanding Sentence

Learning Multilingual Sentence Representations with Cross-lingual Consistency Regularization

1 code implementation • 12 Jun 2023 • Pengzhi Gao, Liwen Zhang, Zhongjun He, Hua Wu, Haifeng Wang

Multilingual sentence representations are the foundation for similarity-based bitext mining, which is crucial for scaling multilingual neural machine translation (NMT) system to more languages.

Machine Translation NMT +2

Investigating the Factual Knowledge Boundary of Large Language Models with Retrieval Augmentation

1 code implementation • 20 Jul 2023 • Ruiyang Ren, Yuhao Wang, Yingqi Qu, Wayne Xin Zhao, Jing Liu, Hao Tian, Hua Wu, Ji-Rong Wen, Haifeng Wang

In this study, we present an initial analysis of the factual knowledge boundaries of LLMs and how retrieval augmentation affects LLMs on open-domain QA.

Open-Domain Question Answering Retrieval +1

An Empirical Study of Consistency Regularization for End-to-End Speech-to-Text Translation

1 code implementation • 28 Aug 2023 • Pengzhi Gao, Ruiqing Zhang, Zhongjun He, Hua Wu, Haifeng Wang

Consistency regularization methods, such as R-Drop (Liang et al., 2021) and CrossConST (Gao et al., 2023), have achieved impressive supervised and zero-shot performance in the neural machine translation (NMT) field.
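R-Drop, one of the consistency regularization methods named in this excerpt, runs the same input through the model twice with different dropout masks and penalizes the symmetric KL divergence between the two predicted distributions. A toy sketch with a simulated two-output linear "model" (the weights, input, and dropout rate are illustrative assumptions):

```python
import math
import random

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]

def kl(p, q):
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

def forward_with_dropout(x, weights, rng, p_drop=0.1):
    """One stochastic forward pass: randomly zero inputs (with inverted
    scaling), then apply a linear layer and softmax."""
    kept = [0.0 if rng.random() < p_drop else xi / (1 - p_drop) for xi in x]
    logits = [sum(w * k for w, k in zip(row, kept)) for row in weights]
    return softmax(logits)

def r_drop_consistency(x, weights, seed=0):
    rng = random.Random(seed)
    p1 = forward_with_dropout(x, weights, rng)
    p2 = forward_with_dropout(x, weights, rng)
    # symmetric KL between the two sub-model predictions
    return 0.5 * (kl(p1, p2) + kl(p2, p1))

x = [0.5, -1.2, 0.8, 0.3]
weights = [[0.2, -0.1, 0.4, 0.0], [-0.3, 0.5, 0.1, 0.2]]
```

In training, this consistency term is added to the usual task loss, pushing the dropout sub-models toward agreement; CrossConST applies the same idea across languages rather than across dropout masks.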

Machine Translation NMT +2

Tool-Augmented Reward Modeling

1 code implementation • 2 Oct 2023 • Lei LI, Yekun Chai, Shuohuan Wang, Yu Sun, Hao Tian, Ningyu Zhang, Hua Wu

We validate our approach across a wide range of domains, incorporating seven distinct external tools.

IBADR: an Iterative Bias-Aware Dataset Refinement Framework for Debiasing NLU models

no code implementations • 1 Nov 2023 • Xiaoyue Wang, Xin Liu, Lijie Wang, Yaoxiang Wang, Jinsong Su, Hua Wu

Then, we pair each sample with a bias indicator representing its bias degree, and use these extended samples to train a sample generator.

Natural Language Understanding

Towards Boosting Many-to-Many Multilingual Machine Translation with Large Language Models

1 code implementation • 11 Jan 2024 • Pengzhi Gao, Zhongjun He, Hua Wu, Haifeng Wang

The training paradigm for machine translation has gradually shifted, from learning neural machine translation (NMT) models with extensive parallel corpora to instruction finetuning on multilingual large language models (LLMs) with high-quality translation pairs.

Machine Translation NMT +1

DeepRicci: Self-supervised Graph Structure-Feature Co-Refinement for Alleviating Over-squashing

no code implementations • 23 Jan 2024 • Li Sun, Zhenhao Huang, Hua Wu, Junda Ye, Hao Peng, Zhengtao Yu, Philip S. Yu

Graph Neural Networks (GNNs) have shown great power for learning and mining on graphs, and Graph Structure Learning (GSL) plays an important role in boosting GNNs with a refined graph.

Contrastive Learning Graph structure learning

BASES: Large-scale Web Search User Simulation with Large Language Model based Agents

no code implementations • 27 Feb 2024 • Ruiyang Ren, Peng Qiu, Yingqi Qu, Jing Liu, Wayne Xin Zhao, Hua Wu, Ji-Rong Wen, Haifeng Wang

Due to the excellent capacities of large language models (LLMs), it becomes feasible to develop LLM-based agents for reliable user simulation.

Information Retrieval Language Modelling +3

On Training Data Influence of GPT Models

1 code implementation • 11 Apr 2024 • Qingyi Liu, Yekun Chai, Shuohuan Wang, Yu Sun, Qiwei Peng, Keze Wang, Hua Wu

This paper presents GPTfluence, a novel approach that leverages a featurized simulation to assess the impact of training examples on the training dynamics of GPT models.

Natural Language Understanding

Dual Modalities of Text: Visual and Textual Generative Pre-training

no code implementations • 16 Apr 2024 • Yekun Chai, Qingyi Liu, Jingwu Xiao, Shuohuan Wang, Yu Sun, Hua Wu

Harnessing visual texts represents a burgeoning frontier in the evolution of language modeling.

Language Modelling

DuReader_vis: A Chinese Dataset for Open-domain Document Visual Question Answering

1 code implementation • Findings (ACL) 2022 • Le Qi, Shangwen Lv, Hongyu Li, Jing Liu, Yu Zhang, Qiaoqiao She, Hua Wu, Haifeng Wang, Ting Liu

Open-domain question answering has been used in a wide range of applications, such as web search and enterprise search, which usually takes clean texts extracted from various formats of documents (e.g., web pages, PDFs, or Word documents) as the information source.

document understanding Open-Domain Question Answering +1

PLATO-KAG: Unsupervised Knowledge-Grounded Conversation via Joint Modeling

no code implementations • EMNLP (NLP4ConvAI) 2021 • Xinxian Huang, Huang He, Siqi Bao, Fan Wang, Hua Wu, Haifeng Wang

Large-scale conversation models are turning to leveraging external knowledge to improve the factual accuracy in response generation.

Response Generation

Diversified Multiple Instance Learning for Document-Level Multi-Aspect Sentiment Classification

no code implementations • EMNLP 2020 • Yunjie Ji, Hao Liu, Bolei He, Xinyan Xiao, Hua Wu, Yanhua Yu

To this end, we propose a novel Diversified Multiple Instance Learning Network (D-MILN), which is able to achieve aspect-level sentiment classification with only document-level weak supervision.

General Classification Multiple Instance Learning +2

Learning Adaptive Segmentation Policy for End-to-End Simultaneous Translation

no code implementations • ACL 2022 • Ruiqing Zhang, Zhongjun He, Hua Wu, Haifeng Wang

End-to-end simultaneous speech-to-text translation aims to directly perform translation from streaming source speech to target text with high translation quality and low latency.

Segmentation Simultaneous Speech-to-Text Translation +1

Learning Adaptive Segmentation Policy for Simultaneous Translation

no code implementations • EMNLP 2020 • Ruiqing Zhang, Chuanqiang Zhang, Zhongjun He, Hua Wu, Haifeng Wang

The policy learns to segment the source text by considering possible translations produced by the translation model, maintaining consistency between the segmentation and translation.

Segmentation Translation

SgSum: Transforming Multi-document Summarization into Sub-graph Selection

1 code implementation • EMNLP 2021 • Moye Chen, Wei Li, Jiachen Liu, Xinyan Xiao, Hua Wu, Haifeng Wang

Compared with traditional methods, our method has two main advantages: (1) the relations between sentences are captured by modeling both the graph structure of the whole document set and the candidate sub-graphs; (2) it directly outputs an integrated summary in the form of a sub-graph, which is more informative and coherent.

Document Summarization Multi-Document Summarization +1