Search Results for author: Daxin Jiang

Found 92 papers, 34 papers with code

KnowDA: All-in-One Knowledge Mixture Model for Data Augmentation in Few-Shot NLP

no code implementations21 Jun 2022 YuFei Wang, Jiayi Zheng, Can Xu, Xiubo Geng, Tao Shen, Chongyang Tao, Daxin Jiang

To combat this issue, we propose the Knowledge Mixture Data Augmentation Model (KnowDA): an encoder-decoder LM pretrained on a mixture of diverse NLP tasks using Knowledge Mixture Training (KoMT).

Data Augmentation Denoising +3

Towards Robust Ranker for Text Retrieval

no code implementations16 Jun 2022 Yucheng Zhou, Tao Shen, Xiubo Geng, Chongyang Tao, Can Xu, Guodong Long, Binxing Jiao, Daxin Jiang

A ranker plays an indispensable role in the de facto 'retrieval & rerank' pipeline, but its training still lags behind -- learning from moderate negatives and/or serving as an auxiliary module for a retriever.

Passage Retrieval

Unsupervised Context Aware Sentence Representation Pretraining for Multi-lingual Dense Retrieval

1 code implementation7 Jun 2022 Ning Wu, Yaobo Liang, Houxing Ren, Linjun Shou, Nan Duan, Ming Gong, Daxin Jiang

On the multilingual sentence retrieval task Tatoeba, our model achieves new SOTA results among methods without using bilingual data.

Language Modelling Passage Retrieval +3

Task-Specific Expert Pruning for Sparse Mixture-of-Experts

no code implementations1 Jun 2022 Tianyu Chen, Shaohan Huang, Yuan Xie, Binxing Jiao, Daxin Jiang, Haoyi Zhou, JianXin Li, Furu Wei

The sparse Mixture-of-Experts (MoE) model is powerful for large-scale pre-training and has achieved promising results due to its model capacity.

THE-X: Privacy-Preserving Transformer Inference with Homomorphic Encryption

no code implementations Findings (ACL) 2022 Tianyu Chen, Hangbo Bao, Shaohan Huang, Li Dong, Binxing Jiao, Daxin Jiang, Haoyi Zhou, JianXin Li, Furu Wei

As more and more pre-trained language models adopt on-cloud deployment, privacy issues grow quickly, mainly due to the exposure of plain-text user data (e.g., search history, medical records, bank accounts).

Privacy Preserving

Negative Sampling for Contrastive Representation Learning: A Review

no code implementations1 Jun 2022 Lanling Xu, Jianxun Lian, Wayne Xin Zhao, Ming Gong, Linjun Shou, Daxin Jiang, Xing Xie, Ji-Rong Wen

The learn-to-compare paradigm of contrastive representation learning (CRL), which compares positive samples with negative ones for representation learning, has achieved great success in a wide range of domains, including natural language processing, computer vision, information retrieval and graph learning.

Graph Learning Information Retrieval +2

Multi-level Contrastive Learning for Cross-lingual Spoken Language Understanding

no code implementations7 May 2022 Shining Liang, Linjun Shou, Jian Pei, Ming Gong, Wanli Zuo, Xianglin Zuo, Daxin Jiang

Although spoken language understanding (SLU) has achieved great success in high-resource languages, such as English, it remains challenging in low-resource languages, mainly due to the lack of high-quality training data.

Contrastive Learning Spoken Language Understanding +1

Bridging the Gap between Language Models and Cross-Lingual Sequence Labeling

no code implementations11 Apr 2022 Nuo Chen, Linjun Shou, Ming Gong, Jian Pei, Daxin Jiang

Large-scale cross-lingual pre-trained language models (xPLMs) have shown effectiveness in cross-lingual sequence labeling tasks (xSL), such as cross-lingual machine reading comprehension (xMRC) by transferring knowledge from a high-resource language to low-resource languages.

Contrastive Learning Language Modelling +1

Transformer-Empowered Content-Aware Collaborative Filtering

no code implementations2 Apr 2022 Weizhe Lin, Linjun Shou, Ming Gong, Pei Jian, Zhilin Wang, Bill Byrne, Daxin Jiang

Knowledge graph (KG) based Collaborative Filtering is an effective approach to personalizing recommendation systems for relatively static domains such as movies and books, by leveraging structured information from KG to enrich both item and user representations.

Collaborative Filtering Contrastive Learning +1

HeterMPC: A Heterogeneous Graph Neural Network for Response Generation in Multi-Party Conversations

1 code implementation ACL 2022 Jia-Chen Gu, Chao-Hong Tan, Chongyang Tao, Zhen-Hua Ling, Huang Hu, Xiubo Geng, Daxin Jiang

To address these challenges, we present HeterMPC, a heterogeneous graph-based neural network for response generation in MPCs which models the semantics of utterances and interlocutors simultaneously with two types of nodes in a graph.

Response Generation

Multi-View Document Representation Learning for Open-Domain Dense Retrieval

no code implementations ACL 2022 Shunyu Zhang, Yaobo Liang, Ming Gong, Daxin Jiang, Nan Duan

Second, to prevent multi-view embeddings from collapsing to the same one, we further propose a global-local loss with annealed temperature to encourage the multiple viewers to better align with different potential queries.

Representation Learning

NÜWA-LIP: Language Guided Image Inpainting with Defect-free VQGAN

no code implementations10 Feb 2022 Minheng Ni, Chenfei Wu, Haoyang Huang, Daxin Jiang, WangMeng Zuo, Nan Duan

Language guided image inpainting aims to fill in the defective regions of an image under the guidance of text while keeping non-defective regions unchanged.

Image Inpainting

PCL: Peer-Contrastive Learning with Diverse Augmentations for Unsupervised Sentence Embeddings

no code implementations28 Jan 2022 Qiyu Wu, Chongyang Tao, Tao Shen, Can Xu, Xiubo Geng, Daxin Jiang

A straightforward solution is to resort to more diverse positives from a multi-augmenting strategy, while an open question remains: how to learn, without supervision, from diverse positives of uneven augmentation quality in the text domain.

Contrastive Learning Natural Language Processing +1

CodeRetriever: Unimodal and Bimodal Contrastive Learning

1 code implementation26 Jan 2022 Xiaonan Li, Yeyun Gong, Yelong Shen, Xipeng Qiu, Hang Zhang, Bolun Yao, Weizhen Qi, Daxin Jiang, Weizhu Chen, Nan Duan

For bimodal contrastive learning, we leverage the documentation and in-line comments of code to build text-code pairs.

Code Search Contrastive Learning

From Good to Best: Two-Stage Training for Cross-lingual Machine Reading Comprehension

no code implementations9 Dec 2021 Nuo Chen, Linjun Shou, Min Gong, Jian Pei, Daxin Jiang

Cross-lingual Machine Reading Comprehension (xMRC) is challenging due to the lack of training data in low-resource languages.

Contrastive Learning Machine Reading Comprehension

NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion

1 code implementation24 Nov 2021 Chenfei Wu, Jian Liang, Lei Ji, Fan Yang, Yuejian Fang, Daxin Jiang, Nan Duan

To cover language, image, and video at the same time for different scenarios, a 3D transformer encoder-decoder framework is designed, which can not only deal with videos as 3D data but also adapt to texts and images as 1D and 2D data, respectively.

Text to image generation Text-to-Image Generation +3

Multimodal Dialogue Response Generation

no code implementations ACL 2022 Qingfeng Sun, Yujing Wang, Can Xu, Kai Zheng, Yaming Yang, Huang Hu, Fei Xu, Jessica Zhang, Xiubo Geng, Daxin Jiang

In such a low-resource setting, we devise a novel conversational agent, Divter, in order to isolate parameters that depend on multimodal dialogues from the entire generation model.

Dialogue Generation Response Generation

EventBERT: A Pre-Trained Model for Event Correlation Reasoning

no code implementations13 Oct 2021 Yucheng Zhou, Xiubo Geng, Tao Shen, Guodong Long, Daxin Jiang

Event correlation reasoning infers whether a natural language paragraph containing multiple events conforms to human common sense.

Cloze Test Common Sense Reasoning +1

Building an Efficient and Effective Retrieval-based Dialogue System via Mutual Learning

no code implementations1 Oct 2021 Chongyang Tao, Jiazhan Feng, Chang Liu, Juntao Li, Xiubo Geng, Daxin Jiang

For this task, the adoption of pre-trained language models (such as BERT) has led to remarkable progress in a number of benchmarks.

Re-Ranking

Learning to Ground Visual Objects for Visual Dialog

no code implementations Findings (EMNLP) 2021 Feilong Chen, Xiuyi Chen, Can Xu, Daxin Jiang

Specifically, a posterior distribution over visual objects is inferred from both context (history and questions) and answers, and it ensures the appropriate grounding of visual objects during the training process.

Visual Dialog

Learning from Multiple Noisy Augmented Data Sets for Better Cross-Lingual Spoken Language Understanding

no code implementations EMNLP 2021 YingMei Guo, Linjun Shou, Jian Pei, Ming Gong, Mingxing Xu, Zhiyong Wu, Daxin Jiang

Although various data augmentation approaches have been proposed to synthesize training data in low-resource target languages, the augmented data sets are often noisy, and thus impede the performance of SLU models.

Data Augmentation Denoising +1

Smart Bird: Learnable Sparse Attention for Efficient and Effective Transformer

no code implementations20 Aug 2021 Chuhan Wu, Fangzhao Wu, Tao Qi, Binxing Jiao, Daxin Jiang, Yongfeng Huang, Xing Xie

We then sample token pairs based on their probability scores derived from the sketched attention matrix to generate different sparse attention index matrices for different attention heads.

Reasoning over Entity-Action-Location Graph for Procedural Text Understanding

no code implementations ACL 2021 Hao Huang, Xiubo Geng, Jian Pei, Guodong Long, Daxin Jiang

Procedural text understanding aims at tracking the states (e.g., create, move, destroy) and locations of the entities mentioned in a given paragraph.

graph construction Procedural Text Understanding +1

Neural Rule-Execution Tracking Machine For Transformer-Based Text Generation

no code implementations NeurIPS 2021 YuFei Wang, Can Xu, Huang Hu, Chongyang Tao, Stephen Wan, Mark Dras, Mark Johnson, Daxin Jiang

Sequence-to-Sequence (S2S) neural text generation models, especially the pre-trained ones (e.g., BART and T5), have exhibited compelling performance on various natural language generation tasks.

Text Generation

Language Scaling for Universal Suggested Replies Model

no code implementations NAACL 2021 Qianlan Ying, Payal Bajaj, Budhaditya Deb, Yu Yang, Wei Wang, Bojia Lin, Milad Shokouhi, Xia Song, Yang Yang, Daxin Jiang

Faced with increased compute requirements and low resources for language expansion, we build a single universal model for improving the quality and reducing run-time costs of our production system.

Continual Learning Cross-Lingual Transfer

MPC-BERT: A Pre-Trained Language Model for Multi-Party Conversation Understanding

1 code implementation ACL 2021 Jia-Chen Gu, Chongyang Tao, Zhen-Hua Ling, Can Xu, Xiubo Geng, Daxin Jiang

Recently, various neural models for multi-party conversation (MPC) have achieved impressive improvements on a variety of tasks such as addressee recognition, speaker identification and response prediction.

Language Modelling Speaker Identification

Improving Zero-Shot Cross-lingual Transfer for Multilingual Question Answering over Knowledge Graph

no code implementations NAACL 2021 Yucheng Zhou, Xiubo Geng, Tao Shen, Wenqiang Zhang, Daxin Jiang

That is, we can only access training data in a high-resource language, while needing to answer multilingual questions without any labeled data in the target languages.

Bilingual Lexicon Induction Question Answering +1

Maria: A Visual Experience Powered Conversational Agent

1 code implementation ACL 2021 Zujie Liang, Huang Hu, Can Xu, Chongyang Tao, Xiubo Geng, Yining Chen, Fan Liang, Daxin Jiang

The retriever aims to retrieve an image correlated with the dialog from an image index, while the visual concept detector extracts rich visual knowledge from the image.

ChemistryQA: A Complex Question Answering Dataset from Chemistry

no code implementations1 Jan 2021 Zhuoyu Wei, Wei Ji, Xiubo Geng, Yining Chen, Baihua Chen, Tao Qin, Daxin Jiang

We notice that some real-world QA tasks are more complex and cannot be solved by end-to-end neural networks or translated into any kind of formal representation.

Machine Reading Comprehension Question Answering

Syntax-Enhanced Pre-trained Model

1 code implementation ACL 2021 Zenan Xu, Daya Guo, Duyu Tang, Qinliang Su, Linjun Shou, Ming Gong, Wanjun Zhong, Xiaojun Quan, Nan Duan, Daxin Jiang

We study the problem of leveraging the syntactic structure of text to enhance pre-trained models such as BERT and RoBERTa.

Entity Typing Question Answering +1

Reinforced Multi-Teacher Selection for Knowledge Distillation

no code implementations11 Dec 2020 Fei Yuan, Linjun Shou, Jian Pei, Wutao Lin, Ming Gong, Yan Fu, Daxin Jiang

When multiple teacher models are available in distillation, the state-of-the-art methods assign each teacher model a fixed weight for the whole distillation process.

Knowledge Distillation Model Compression +1

CalibreNet: Calibration Networks for Multilingual Sequence Labeling

no code implementations11 Nov 2020 Shining Liang, Linjun Shou, Jian Pei, Ming Gong, Wanli Zuo, Daxin Jiang

To tackle the lack of training data in low-resource languages, we develop a novel unsupervised phrase boundary recovery pre-training task to enhance the multilingual boundary detection capability of CalibreNet.

Boundary Detection Cross-Lingual NER +3

Cross-lingual Machine Reading Comprehension with Language Branch Knowledge Distillation

no code implementations COLING 2020 Junhao Liu, Linjun Shou, Jian Pei, Ming Gong, Min Yang, Daxin Jiang

Then, we devise a multilingual distillation approach to amalgamate knowledge from multiple language branch models to a single model for all target languages.

Knowledge Distillation Machine Reading Comprehension +1

A Graph Representation of Semi-structured Data for Web Question Answering

no code implementations COLING 2020 Xingyao Zhang, Linjun Shou, Jian Pei, Ming Gong, Lijie Wen, Daxin Jiang

The abundant semi-structured data on the Web, such as HTML-based tables and lists, provide commercial search engines a rich information source for question answering (QA).

Question Answering

Towards Interpretable Reasoning over Paragraph Effects in Situation

1 code implementation EMNLP 2020 Mucheng Ren, Xiubo Geng, Tao Qin, Heyan Huang, Daxin Jiang

We focus on the task of reasoning over paragraph effects in situation, which requires a model to understand the cause and effect described in a background paragraph, and apply the knowledge to a novel situation.

Knowledge-Aware Procedural Text Understanding with Multi-Stage Training

no code implementations28 Sep 2020 Zhihan Zhang, Xiubo Geng, Tao Qin, Yunfang Wu, Daxin Jiang

In this work, we focus on the task of procedural text understanding, which aims to comprehend such documents and track entities' states and locations during a process.

Procedural Text Understanding

No Answer is Better Than Wrong Answer: A Reflection Model for Document Level Machine Reading Comprehension

no code implementations Findings of the Association for Computational Linguistics 2020 Xuguang Wang, Linjun Shou, Ming Gong, Nan Duan, Daxin Jiang

The Natural Questions (NQ) benchmark set brings new challenges to Machine Reading Comprehension: the answers are not only at different levels of granularity (long and short), but also of richer types (including no-answer, yes/no, single-span and multi-span).

Machine Reading Comprehension

Difference-aware Knowledge Selection for Knowledge-grounded Conversation Generation

1 code implementation Findings of the Association for Computational Linguistics 2020 Chujie Zheng, Yunbo Cao, Daxin Jiang, Minlie Huang

In a multi-turn knowledge-grounded dialog, the difference between the knowledge selected at different turns usually provides potential clues to knowledge selection, which has been largely neglected in previous research.

GraphCodeBERT: Pre-training Code Representations with Data Flow

1 code implementation ICLR 2021 Daya Guo, Shuo Ren, Shuai Lu, Zhangyin Feng, Duyu Tang, Shujie Liu, Long Zhou, Nan Duan, Alexey Svyatkovskiy, Shengyu Fu, Michele Tufano, Shao Kun Deng, Colin Clement, Dawn Drain, Neel Sundaresan, Jian Yin, Daxin Jiang, Ming Zhou

Instead of taking the syntactic-level structure of code, such as the abstract syntax tree (AST), we use data flow in the pre-training stage, a semantic-level structure of code that encodes the "where-the-value-comes-from" relation between variables.

Clone Detection Code Completion +7

Learning an Effective Context-Response Matching Model with Self-Supervised Tasks for Retrieval-based Dialogues

no code implementations14 Sep 2020 Ruijian Xu, Chongyang Tao, Daxin Jiang, Xueliang Zhao, Dongyan Zhao, Rui Yan

To address these issues, in this paper, we propose learning a context-response matching model with auxiliary self-supervised tasks designed for the dialogue data based on pre-trained language models.

Conversational Response Selection

Evidence-Aware Inferential Text Generation with Vector Quantised Variational AutoEncoder

1 code implementation ACL 2020 Daya Guo, Duyu Tang, Nan Duan, Jian Yin, Daxin Jiang, Ming Zhou

Generating inferential texts about an event from different perspectives requires reasoning over the different contexts in which the event occurs.

Text Generation

Mining Implicit Relevance Feedback from User Behavior for Web Question Answering

no code implementations13 Jun 2020 Linjun Shou, Shining Bo, Feixiang Cheng, Ming Gong, Jian Pei, Daxin Jiang

In this paper, we make the first study to explore the correlation between user behavior and passage relevance, and propose a novel approach for mining training data for Web QA.

Passage Ranking Question Answering

Dance Revolution: Long-Term Dance Generation with Music via Curriculum Learning

1 code implementation ICLR 2021 Ruozi Huang, Huang Hu, Wei Wu, Kei Sawada, Mi Zhang, Daxin Jiang

In this paper, we formalize the music-conditioned dance generation as a sequence-to-sequence learning problem and devise a novel seq2seq architecture to efficiently process long sequences of music features and capture the fine-grained correspondence between music and dance.

motion synthesis

Document Modeling with Graph Attention Networks for Multi-grained Machine Reading Comprehension

1 code implementation ACL 2020 Bo Zheng, Haoyang Wen, Yaobo Liang, Nan Duan, Wanxiang Che, Daxin Jiang, Ming Zhou, Ting Liu

Natural Questions is a new challenging machine reading comprehension benchmark with two-grained answers, which are a long answer (typically a paragraph) and a short answer (one or more entities inside the long answer).

Graph Attention Machine Reading Comprehension

RikiNet: Reading Wikipedia Pages for Natural Question Answering

no code implementations ACL 2020 Dayiheng Liu, Yeyun Gong, Jie Fu, Yu Yan, Jiusheng Chen, Daxin Jiang, Jiancheng Lv, Nan Duan

The representations are then fed into the predictor to obtain the span of the short answer, the paragraph of the long answer, and the answer type in a cascaded manner.

Natural Language Understanding Question Answering

Enhancing Answer Boundary Detection for Multilingual Machine Reading Comprehension

no code implementations ACL 2020 Fei Yuan, Linjun Shou, Xuanyu Bai, Ming Gong, Yaobo Liang, Nan Duan, Yan Fu, Daxin Jiang

Multilingual pre-trained models could leverage the training data from a rich source language (such as English) to improve performance on low resource languages.

Boundary Detection Machine Reading Comprehension +1

Pre-training Text Representations as Meta Learning

no code implementations12 Apr 2020 Shangwen Lv, Yuechen Wang, Daya Guo, Duyu Tang, Nan Duan, Fuqing Zhu, Ming Gong, Linjun Shou, Ryan Ma, Daxin Jiang, Guihong Cao, Ming Zhou, Songlin Hu

In this work, we introduce a learning algorithm which directly optimizes the model's ability to learn text representations for effective learning of downstream tasks.

Language Modelling Meta-Learning +3

Diverse, Controllable, and Keyphrase-Aware: A Corpus and Method for News Multi-Headline Generation

1 code implementation EMNLP 2020 Dayiheng Liu, Yeyun Gong, Jie Fu, Wei Liu, Yu Yan, Bo Shao, Daxin Jiang, Jiancheng Lv, Nan Duan

Furthermore, we propose a simple and effective method to mine the keyphrases of interest in the news article and build the first large-scale keyphrase-aware news headline corpus, which contains over 180K aligned triples of <news article, headline, keyphrase>.

Headline generation

XGLUE: A New Benchmark Dataset for Cross-lingual Pre-training, Understanding and Generation

2 code implementations3 Apr 2020 Yaobo Liang, Nan Duan, Yeyun Gong, Ning Wu, Fenfei Guo, Weizhen Qi, Ming Gong, Linjun Shou, Daxin Jiang, Guihong Cao, Xiaodong Fan, Ruofei Zhang, Rahul Agrawal, Edward Cui, Sining Wei, Taroon Bharti, Ying Qiao, Jiun-Hung Chen, Winnie Wu, Shuguang Liu, Fan Yang, Daniel Campos, Rangan Majumder, Ming Zhou

In this paper, we introduce XGLUE, a new benchmark dataset that can be used to train large-scale cross-lingual pre-trained models using multilingual and bilingual corpora and evaluate their performance across a diverse set of cross-lingual tasks.

Natural Language Understanding

DC-BERT: Decoupling Question and Document for Efficient Contextual Encoding

no code implementations28 Feb 2020 Yuyu Zhang, Ping Nie, Xiubo Geng, Arun Ramamurthy, Le Song, Daxin Jiang

Recent studies on open-domain question answering have achieved prominent performance improvement using pre-trained language models such as BERT.

Open-Domain Question Answering

Model Compression with Two-stage Multi-teacher Knowledge Distillation for Web Question Answering System

no code implementations18 Oct 2019 Ze Yang, Linjun Shou, Ming Gong, Wutao Lin, Daxin Jiang

The experiment results show that our method can significantly outperform the baseline methods and even achieve comparable results with the original teacher models, along with substantial speedup of model inference.

Knowledge Distillation Model Compression +1

Neural Semantic Parsing in Low-Resource Settings with Back-Translation and Meta-Learning

no code implementations12 Sep 2019 Yibo Sun, Duyu Tang, Nan Duan, Yeyun Gong, Xiaocheng Feng, Bing Qin, Daxin Jiang

Neural semantic parsing has achieved impressive results in recent years, yet its success relies on the availability of large amounts of supervised data.

Meta-Learning Semantic Parsing +1

NeuronBlocks: Building Your NLP DNN Models Like Playing Lego

2 code implementations IJCNLP 2019 Ming Gong, Linjun Shou, Wutao Lin, Zhijie Sang, Quanjia Yan, Ze Yang, Feixiang Cheng, Daxin Jiang

Deep Neural Networks (DNN) have been widely employed in industry to address various Natural Language Processing (NLP) tasks.

Natural Language Processing

Model Compression with Multi-Task Knowledge Distillation for Web-scale Question Answering System

no code implementations21 Apr 2019 Ze Yang, Linjun Shou, Ming Gong, Wutao Lin, Daxin Jiang

Deep pre-training and fine-tuning models (like BERT, OpenAI GPT) have demonstrated excellent results in question answering areas.

Knowledge Distillation Model Compression +1

Assertion-based QA with Question-Aware Open Information Extraction

no code implementations23 Jan 2018 Zhao Yan, Duyu Tang, Nan Duan, Shujie Liu, Wendi Wang, Daxin Jiang, Ming Zhou, Zhoujun Li

We present assertion based question answering (ABQA), an open domain question answering task that takes a question and a passage as inputs, and outputs a semi-structured assertion consisting of a subject, a predicate and a list of arguments.

Learning-To-Rank Open-Domain Question Answering +2
