Search Results for author: Yelong Shen

Found 29 papers, 13 papers with code

LoRA: Low-Rank Adaptation of Large Language Models

1 code implementation17 Jun 2021 Edward J. Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Weizhu Chen

We propose Low-Rank Adaptation, or LoRA, which freezes the pre-trained model weights and injects trainable rank decomposition matrices into each layer of the Transformer architecture, greatly reducing the number of trainable parameters for downstream tasks.

Language Modelling

Memory-Efficient Differentiable Transformer Architecture Search

no code implementations31 May 2021 Yuekai Zhao, Li Dong, Yelong Shen, Zhihua Zhang, Furu Wei, Weizhu Chen

To this end, we propose a multi-split reversible network and combine it with DARTS.

Poolingformer: Long Document Modeling with Pooling Attention

no code implementations10 May 2021 Hang Zhang, Yeyun Gong, Yelong Shen, Weisheng Li, Jiancheng Lv, Nan Duan, Weizhu Chen

We first evaluate Poolingformer on two long sequence QA tasks: the monolingual NQ and the multilingual TyDi QA.

UnitedQA: A Hybrid Approach for Open Domain Question Answering

no code implementations ACL 2021 Hao Cheng, Yelong Shen, Xiaodong Liu, Pengcheng He, Weizhu Chen, Jianfeng Gao

To date, most of recent work under the retrieval-reader framework for open-domain QA focuses on either extractive or generative reader exclusively.

Open-Domain Question Answering

Rider: Reader-Guided Passage Reranking for Open-Domain Question Answering

1 code implementation1 Jan 2021 Yuning Mao, Pengcheng He, Xiaodong Liu, Yelong Shen, Jianfeng Gao, Jiawei Han, Weizhu Chen

Current open-domain question answering systems often follow a Retriever-Reader architecture, where the retriever first retrieves relevant passages and the reader then reads the retrieved passages to form an answer.

Open-Domain Question Answering

Adversarial Attacks on Deep Graph Matching

no code implementations NeurIPS 2020 Zijie Zhang, Zeru Zhang, Yang Zhou, Yelong Shen, Ruoming Jin, Dejing Dou

Despite achieving remarkable performance, deep graph learning models, such as node classification and network embedding, suffer from harassment caused by small adversarial perturbations.

Adversarial Attack Density Estimation +5

Improving Self-supervised Pre-training via a Fully-Explored Masked Language Model

no code implementations12 Oct 2020 Mingzhi Zheng, Dinghan Shen, Yelong Shen, Weizhu Chen, Lin Xiao

We prove, from a theoretical perspective, that the gradients derived from this new masking schema have a smaller variance and can lead to more efficient self-supervised training.

Language Modelling Sentence Classification

Generation-Augmented Retrieval for Open-domain Question Answering

1 code implementation ACL 2021 Yuning Mao, Pengcheng He, Xiaodong Liu, Yelong Shen, Jianfeng Gao, Jiawei Han, Weizhu Chen

We demonstrate that the generated contexts substantially enrich the semantics of the queries and GAR with sparse representations (BM25) achieves comparable or better performance than state-of-the-art dense retrieval methods such as DPR.

Open-Domain Question Answering Text Generation

Recurrent Chunking Mechanisms for Long-Text Machine Reading Comprehension

1 code implementation ACL 2020 Hongyu Gong, Yelong Shen, Dian Yu, Jianshu Chen, Dong Yu

In this paper, we study machine reading comprehension (MRC) on long texts, where a model takes as inputs a lengthy document and a question and then extracts a text span from the document as an answer.

Chunking Machine Reading Comprehension

MART: Memory-Augmented Recurrent Transformer for Coherent Video Paragraph Captioning

1 code implementation ACL 2020 Jie Lei, Li-Wei Wang, Yelong Shen, Dong Yu, Tamara L. Berg, Mohit Bansal

Generating multi-sentence descriptions for videos is one of the most challenging captioning tasks due to its high requirements for not only visual relevance but also discourse-based coherence across the sentences in the paragraph.

A Hybrid Retrieval-Generation Neural Conversation Model

1 code implementation19 Apr 2019 Liu Yang, Junjie Hu, Minghui Qiu, Chen Qu, Jianfeng Gao, W. Bruce Croft, Xiaodong Liu, Yelong Shen, Jingjing Liu

In this paper, we propose a hybrid neural conversation model that combines the merits of both response retrieval and generation methods.

Text Generation

Multi-task Learning with Sample Re-weighting for Machine Reading Comprehension

5 code implementations NAACL 2019 Yichong Xu, Xiaodong Liu, Yelong Shen, Jingjing Liu, Jianfeng Gao

We propose a multi-task learning framework to learn a joint Machine Reading Comprehension (MRC) model that can be applied to a wide range of MRC tasks in different domains.

Machine Reading Comprehension Machine Translation +2

M-Walk: Learning to Walk over Graphs using Monte Carlo Tree Search

no code implementations NeurIPS 2018 Yelong Shen, Jianshu Chen, Po-Sen Huang, Yuqing Guo, Jianfeng Gao

In order to effectively train the agent from sparse rewards, we combine MCTS with the neural policy to generate trajectories yielding more positive rewards.

Ranked #25 on Link Prediction on WN18RR (Hits@3 metric)

Knowledge Base Completion Link Prediction +1

Stochastic Answer Networks for Machine Reading Comprehension

5 code implementations ACL 2018 Xiaodong Liu, Yelong Shen, Kevin Duh, Jianfeng Gao

We propose a simple yet robust stochastic answer network (SAN) that simulates multi-step reasoning in machine reading comprehension.

Machine Reading Comprehension Question Answering

Dynamic Fusion Networks for Machine Reading Comprehension

no code implementations14 Nov 2017 Yichong Xu, Jingjing Liu, Jianfeng Gao, Yelong Shen, Xiaodong Liu

This paper presents a novel neural model - Dynamic Fusion Network (DFN), for machine reading comprehension (MRC).

Machine Reading Comprehension

An Empirical Analysis of Multiple-Turn Reasoning Strategies in Reading Comprehension Tasks

no code implementations IJCNLP 2017 Yelong Shen, Xiaodong Liu, Kevin Duh, Jianfeng Gao

Using a state-of-the-art RC model, we empirically investigate the performance of single-turn and multiple-turn reasoning on the SQuAD and MS MARCO datasets.

Reading Comprehension

Link Prediction using Embedded Knowledge Graphs

no code implementations14 Nov 2016 Yelong Shen, Po-Sen Huang, Ming-Wei Chang, Jianfeng Gao

Since large knowledge bases are typically incomplete, missing facts need to be inferred from observed facts in a task called knowledge base completion.

Knowledge Base Completion Knowledge Graphs +1

ReasoNet: Learning to Stop Reading in Machine Comprehension

no code implementations17 Sep 2016 Yelong Shen, Po-Sen Huang, Jianfeng Gao, Weizhu Chen

Teaching a computer to read and answer general questions pertaining to a document is a challenging yet unsolved problem.

Question Answering Reading Comprehension

End-to-end Learning of LDA by Mirror-Descent Back Propagation over a Deep Architecture

1 code implementation NeurIPS 2015 Jianshu Chen, Ji He, Yelong Shen, Lin Xiao, Xiaodong He, Jianfeng Gao, Xinying Song, Li Deng

We develop a fully discriminative learning approach for supervised Latent Dirichlet Allocation (LDA) model using Back Propagation (i. e., BP-sLDA), which maximizes the posterior probability of the prediction variable given the input document.

General Classification Topic Models

A Deep Embedding Model for Co-occurrence Learning

no code implementations11 Apr 2015 Yelong Shen, Ruoming Jin, Jianshu Chen, Xiaodong He, Jianfeng Gao, Li Deng

Co-occurrence Data is a common and important information source in many areas, such as the word co-occurrence in the sentences, friends co-occurrence in social networks and products co-occurrence in commercial transaction data, etc, which contains rich correlation and clustering information about the items.

Cannot find the paper you are looking for? You can Submit a new open access paper.