Search Results for author: Yelong Shen

Found 39 papers, 18 papers with code

Finding the Dominant Winning Ticket in Pre-Trained Language Models

no code implementations Findings (ACL) 2022 Zhuocheng Gong, Di He, Yelong Shen, Tie-Yan Liu, Weizhu Chen, Dongyan Zhao, Ji-Rong Wen, Rui Yan

Empirically, we show that (a) the dominant winning ticket can achieve performance that is comparable with that of the full-parameter model, (b) the dominant winning ticket is transferable across different tasks, (c) and the dominant winning ticket has a natural structure within each parameter matrix.

Joint Generator-Ranker Learning for Natural Language Generation

no code implementations28 Jun 2022 Weizhou Shen, Yeyun Gong, Yelong Shen, Song Wang, Xiaojun Quan, Nan Duan, Weizhu Chen

Due to exposure bias, most existing natural language generation (NLG) models trained by maximizing the likelihood objective predict poor text results during the inference stage.

Question Generation Response Generation +1

A Self-Paced Mixed Distillation Method for Non-Autoregressive Generation

no code implementations23 May 2022 Weizhen Qi, Yeyun Gong, Yelong Shen, Jian Jiao, Yu Yan, Houqiang Li, Ruofei Zhang, Weizhu Chen, Nan Duan

To further illustrate the commercial value of our approach, we conduct experiments on three generation tasks in real-world advertisements applications.

Question Generation Text Generation

CAMERO: Consistency Regularized Ensemble of Perturbed Language Models with Weight Sharing

1 code implementation ACL 2022 Chen Liang, Pengcheng He, Yelong Shen, Weizhu Chen, Tuo Zhao

To retain ensemble benefits while maintaining a low memory cost, we propose a consistency-regularized ensemble learning approach based on perturbed models, named CAMERO.

Ensemble Learning

Controllable Natural Language Generation with Contrastive Prefixes

no code implementations Findings (ACL) 2022 Jing Qian, Li Dong, Yelong Shen, Furu Wei, Weizhu Chen

We propose a novel supervised method and also an unsupervised method to train the prefixes for single-aspect control while the combination of these two methods can achieve multi-aspect control.

Language Modelling Pretrained Language Models +1

CodeRetriever: Unimodal and Bimodal Contrastive Learning

1 code implementation26 Jan 2022 Xiaonan Li, Yeyun Gong, Yelong Shen, Xipeng Qiu, Hang Zhang, Bolun Yao, Weizhen Qi, Daxin Jiang, Weizhu Chen, Nan Duan

For bimodal contrastive learning, we leverage the documentation and in-line comments of code to build text-code pairs.

Code Search Contrastive Learning

Knowledge-Grounded Dialogue Generation with a Unified Knowledge Representation

no code implementations NAACL 2022 Yu Li, Baolin Peng, Yelong Shen, Yi Mao, Lars Liden, Zhou Yu, Jianfeng Gao

To address these challenges, we present PLUG, a language model that homogenizes different knowledge sources to a unified knowledge representation for knowledge-grounded dialogue generation tasks.

Dialogue Generation Language Modelling

Adversarial Retriever-Ranker for dense text retrieval

1 code implementation ICLR 2022 Hang Zhang, Yeyun Gong, Yelong Shen, Jiancheng Lv, Nan Duan, Weizhu Chen

To address these challenges, we present Adversarial Retriever-Ranker (AR2), which consists of a dual-encoder retriever plus a cross-encoder ranker.

Natural Questions TriviaQA

LoRA: Low-Rank Adaptation of Large Language Models

3 code implementations ICLR 2022 Edward J. Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, Weizhu Chen

We propose Low-Rank Adaptation, or LoRA, which freezes the pre-trained model weights and injects trainable rank decomposition matrices into each layer of the Transformer architecture, greatly reducing the number of trainable parameters for downstream tasks.

Language Modelling Natural Language Processing

Poolingformer: Long Document Modeling with Pooling Attention

no code implementations10 May 2021 Hang Zhang, Yeyun Gong, Yelong Shen, Weisheng Li, Jiancheng Lv, Nan Duan, Weizhu Chen

We first evaluate Poolingformer on two long sequence QA tasks: the monolingual NQ and the multilingual TyDi QA.

UnitedQA: A Hybrid Approach for Open Domain Question Answering

no code implementations ACL 2021 Hao Cheng, Yelong Shen, Xiaodong Liu, Pengcheng He, Weizhu Chen, Jianfeng Gao

To date, most of recent work under the retrieval-reader framework for open-domain QA focuses on either extractive or generative reader exclusively.

Open-Domain Question Answering TriviaQA

Rider: Reader-Guided Passage Reranking for Open-Domain Question Answering

1 code implementation1 Jan 2021 Yuning Mao, Pengcheng He, Xiaodong Liu, Yelong Shen, Jianfeng Gao, Jiawei Han, Weizhu Chen

Current open-domain question answering systems often follow a Retriever-Reader architecture, where the retriever first retrieves relevant passages and the reader then reads the retrieved passages to form an answer.

Natural Questions Open-Domain Question Answering +1

Adversarial Attacks on Deep Graph Matching

no code implementations NeurIPS 2020 Zijie Zhang, Zeru Zhang, Yang Zhou, Yelong Shen, Ruoming Jin, Dejing Dou

Despite achieving remarkable performance, deep graph learning models, such as node classification and network embedding, suffer from harassment caused by small adversarial perturbations.

Adversarial Attack Density Estimation +5

Improving Self-supervised Pre-training via a Fully-Explored Masked Language Model

no code implementations12 Oct 2020 Mingzhi Zheng, Dinghan Shen, Yelong Shen, Weizhu Chen, Lin Xiao

We prove, from a theoretical perspective, that the gradients derived from this new masking schema have a smaller variance and can lead to more efficient self-supervised training.

Language Modelling Sentence Classification

Generation-Augmented Retrieval for Open-domain Question Answering

1 code implementation ACL 2021 Yuning Mao, Pengcheng He, Xiaodong Liu, Yelong Shen, Jianfeng Gao, Jiawei Han, Weizhu Chen

We demonstrate that the generated contexts substantially enrich the semantics of the queries and GAR with sparse representations (BM25) achieves comparable or better performance than state-of-the-art dense retrieval methods such as DPR.

Natural Questions Open-Domain Question Answering +3

Recurrent Chunking Mechanisms for Long-Text Machine Reading Comprehension

1 code implementation ACL 2020 Hongyu Gong, Yelong Shen, Dian Yu, Jianshu Chen, Dong Yu

In this paper, we study machine reading comprehension (MRC) on long texts, where a model takes as inputs a lengthy document and a question and then extracts a text span from the document as an answer.

Chunking Machine Reading Comprehension +1

MART: Memory-Augmented Recurrent Transformer for Coherent Video Paragraph Captioning

1 code implementation ACL 2020 Jie Lei, Li-Wei Wang, Yelong Shen, Dong Yu, Tamara L. Berg, Mohit Bansal

Generating multi-sentence descriptions for videos is one of the most challenging captioning tasks due to its high requirements for not only visual relevance but also discourse-based coherence across the sentences in the paragraph.

A Hybrid Retrieval-Generation Neural Conversation Model

1 code implementation19 Apr 2019 Liu Yang, Junjie Hu, Minghui Qiu, Chen Qu, Jianfeng Gao, W. Bruce Croft, Xiaodong Liu, Yelong Shen, Jingjing Liu

In this paper, we propose a hybrid neural conversation model that combines the merits of both response retrieval and generation methods.

Text Generation

Multi-task Learning with Sample Re-weighting for Machine Reading Comprehension

5 code implementations NAACL 2019 Yichong Xu, Xiaodong Liu, Yelong Shen, Jingjing Liu, Jianfeng Gao

We propose a multi-task learning framework to learn a joint Machine Reading Comprehension (MRC) model that can be applied to a wide range of MRC tasks in different domains.

Machine Reading Comprehension Machine Translation +3

M-Walk: Learning to Walk over Graphs using Monte Carlo Tree Search

no code implementations NeurIPS 2018 Yelong Shen, Jianshu Chen, Po-Sen Huang, Yuqing Guo, Jianfeng Gao

In order to effectively train the agent from sparse rewards, we combine MCTS with the neural policy to generate trajectories yielding more positive rewards.

Ranked #35 on Link Prediction on WN18RR (Hits@3 metric)

Knowledge Base Completion Link Prediction +1

Dynamic Fusion Networks for Machine Reading Comprehension

no code implementations14 Nov 2017 Yichong Xu, Jingjing Liu, Jianfeng Gao, Yelong Shen, Xiaodong Liu

This paper presents a novel neural model - Dynamic Fusion Network (DFN), for machine reading comprehension (MRC).

Machine Reading Comprehension

An Empirical Analysis of Multiple-Turn Reasoning Strategies in Reading Comprehension Tasks

no code implementations IJCNLP 2017 Yelong Shen, Xiaodong Liu, Kevin Duh, Jianfeng Gao

Using a state-of-the-art RC model, we empirically investigate the performance of single-turn and multiple-turn reasoning on the SQuAD and MS MARCO datasets.

Reading Comprehension reinforcement-learning

Link Prediction using Embedded Knowledge Graphs

no code implementations14 Nov 2016 Yelong Shen, Po-Sen Huang, Ming-Wei Chang, Jianfeng Gao

Since large knowledge bases are typically incomplete, missing facts need to be inferred from observed facts in a task called knowledge base completion.

Knowledge Base Completion Knowledge Graphs +1

ReasoNet: Learning to Stop Reading in Machine Comprehension

no code implementations17 Sep 2016 Yelong Shen, Po-Sen Huang, Jianfeng Gao, Weizhu Chen

Teaching a computer to read and answer general questions pertaining to a document is a challenging yet unsolved problem.

Question Answering Reading Comprehension +1

End-to-end Learning of LDA by Mirror-Descent Back Propagation over a Deep Architecture

1 code implementation NeurIPS 2015 Jianshu Chen, Ji He, Yelong Shen, Lin Xiao, Xiaodong He, Jianfeng Gao, Xinying Song, Li Deng

We develop a fully discriminative learning approach for supervised Latent Dirichlet Allocation (LDA) model using Back Propagation (i. e., BP-sLDA), which maximizes the posterior probability of the prediction variable given the input document.

General Classification Topic Models

A Deep Embedding Model for Co-occurrence Learning

no code implementations11 Apr 2015 Yelong Shen, Ruoming Jin, Jianshu Chen, Xiaodong He, Jianfeng Gao, Li Deng

Co-occurrence Data is a common and important information source in many areas, such as the word co-occurrence in the sentences, friends co-occurrence in social networks and products co-occurrence in commercial transaction data, etc, which contains rich correlation and clustering information about the items.

Cannot find the paper you are looking for? You can Submit a new open access paper.