Search Results for author: Yelong Shen

Found 64 papers, 33 papers with code

Finding the Dominant Winning Ticket in Pre-Trained Language Models

no code implementations • Findings (ACL) 2022 • Zhuocheng Gong, Di He, Yelong Shen, Tie-Yan Liu, Weizhu Chen, Dongyan Zhao, Ji-Rong Wen, Rui Yan

Empirically, we show that (a) the dominant winning ticket can achieve performance that is comparable with that of the full-parameter model, (b) the dominant winning ticket is transferable across different tasks, (c) and the dominant winning ticket has a natural structure within each parameter matrix.

Paper
Add Code

Reader-Guided Passage Reranking for Open-Domain Question Answering

1 code implementation • Findings (ACL) 2021 • Yuning Mao, Pengcheng He, Xiaodong Liu, Yelong Shen, Jianfeng Gao, Jiawei Han, Weizhu Chen

Open-Domain Question Answering

Paper
Code

Rho-1: Not All Tokens Are What You Need

2 code implementations • 11 Apr 2024 • Zhenghao Lin, Zhibin Gou, Yeyun Gong, Xiao Liu, Yelong Shen, Ruochen Xu, Chen Lin, Yujiu Yang, Jian Jiao, Nan Duan, Weizhu Chen

After fine-tuning, Rho-1-1B and 7B achieved state-of-the-art results of 40. 6% and 51. 8% on MATH dataset, respectively - matching DeepSeekMath with only 3% of the pretraining tokens.

Continual Pretraining Language Modelling +1

190

Paper
Code

Exploring the Mystery of Influential Data for Mathematical Reasoning

no code implementations • 1 Apr 2024 • Xinzhe Ni, Yeyun Gong, Zhibin Gou, Yelong Shen, Yujiu Yang, Nan Duan, Weizhu Chen

Additionally, we showcase the use of QaDS in creating efficient fine-tuning mixtures with various selection ratios, and analyze the quality of a wide range of open-source datasets, which can perform as a reference for future works on mathematical reasoning tasks.

Math Mathematical Reasoning

Paper
Add Code

Ensuring Safe and High-Quality Outputs: A Guideline Library Approach for Language Models

1 code implementation • 18 Mar 2024 • Yi Luo, Zhenghao Lin, Yuhao Zhang, Jiashuo Sun, Chen Lin, Chengjin Xu, Xiangdong Su, Yelong Shen, Jian Guo, Yeyun Gong

Subsequently, the retrieval model correlates new inputs with relevant guidelines, which guide LLMs in response generation to ensure safe and high-quality outputs, thereby aligning with human values.

Response Generation Retrieval

Paper
Code

Key-Point-Driven Data Synthesis with its Enhancement on Mathematical Reasoning

1 code implementation • 4 Mar 2024 • Yiming Huang, Xiao Liu, Yeyun Gong, Zhibin Gou, Yelong Shen, Nan Duan, Weizhu Chen

Large language models (LLMs) have shown great potential in complex reasoning tasks, yet their performance is often hampered by the scarcity of high-quality and reasoning-focused training datasets.

Ranked #49 on Math Word Problem Solving on MATH (using extra training data)

Math Math Word Problem Solving

16,544

Paper
Code

Multi-LoRA Composition for Image Generation

no code implementations • 26 Feb 2024 • Ming Zhong, Yelong Shen, Shuohang Wang, Yadong Lu, Yizhu Jiao, Siru Ouyang, Donghan Yu, Jiawei Han, Weizhu Chen

Low-Rank Adaptation (LoRA) is extensively utilized in text-to-image models for the accurate rendition of specific elements like distinct characters or unique styles in generated images.

Denoising Image Generation

Paper
Add Code

Competition-Level Problems are Effective LLM Evaluators

no code implementations • 4 Dec 2023 • Yiming Huang, Zhenghao Lin, Xiao Liu, Yeyun Gong, Shuai Lu, Fangyu Lei, Yaobo Liang, Yelong Shen, Chen Lin, Nan Duan, Weizhu Chen

Large language models (LLMs) have demonstrated impressive reasoning capabilities, yet there is ongoing debate about these abilities and the potential data contamination problem recently.

Paper
Add Code

Language Models can be Logical Solvers

no code implementations • 10 Nov 2023 • Jiazhan Feng, Ruochen Xu, Junheng Hao, Hiteshi Sharma, Yelong Shen, Dongyan Zhao, Weizhu Chen

Despite their impressive performance, any parsing errors will inevitably result in the failure of the execution of the external logical solver and no answer to the logical questions.

Decision Making Language Modelling +1

Paper
Add Code

Adapting LLM Agents with Universal Feedback in Communication

no code implementations • 1 Oct 2023 • Kuan Wang, Yadong Lu, Michael Santacroce, Yeyun Gong, Chao Zhang, Yelong Shen

To optimize agent interactions for task-specific learning with our universal buffer and pipeline, we introduce diverse communication patterns tailored for both single-agent and multi-agent environments.

Decision Making GSM8K

Paper
Add Code

ToRA: A Tool-Integrated Reasoning Agent for Mathematical Problem Solving

1 code implementation • 29 Sep 2023 • Zhibin Gou, Zhihong Shao, Yeyun Gong, Yelong Shen, Yujiu Yang, Minlie Huang, Nan Duan, Weizhu Chen

Large language models have made significant progress in various language tasks, yet they still struggle with complex mathematics.

Ranked #10 on Math Word Problem Solving on MATH (using extra training data)

Arithmetic Reasoning Computational Efficiency +3

812

Paper
Code

An Empirical Study of Scaling Instruct-Tuned Large Multimodal Models

1 code implementation • 18 Sep 2023 • Yadong Lu, Chunyuan Li, Haotian Liu, Jianwei Yang, Jianfeng Gao, Yelong Shen

We find that scaling LMM consistently enhances model performance and improves language capabilities, and performance of LoRA/QLoRA tuning of LMM are comparable to the performance of full-model fine-tuning.

Ranked #45 on Visual Question Answering on MM-Vet

Visual Question Answering

15,912

Paper
Code

Efficient RLHF: Reducing the Memory Usage of PPO

no code implementations • 1 Sep 2023 • Michael Santacroce, Yadong Lu, Han Yu, Yuanzhi Li, Yelong Shen

To address this issue, we present a comprehensive analysis the memory usage, performance, and training time of memory-savings techniques for PPO.

Language Modelling

Paper
Add Code

GRILL: Grounded Vision-language Pre-training via Aligning Text and Image Regions

no code implementations • 24 May 2023 • Woojeong Jin, Subhabrata Mukherjee, Yu Cheng, Yelong Shen, Weizhu Chen, Ahmed Hassan Awadallah, Damien Jose, Xiang Ren

Generalization to unseen tasks is an important ability for few-shot learners to achieve better zero-/few-shot performance on diverse tasks.

Object Question Answering +2

Paper
Add Code

Enhancing Retrieval-Augmented Large Language Models with Iterative Retrieval-Generation Synergy

no code implementations • 24 May 2023 • Zhihong Shao, Yeyun Gong, Yelong Shen, Minlie Huang, Nan Duan, Weizhu Chen

In this paper, we show that strong performance can be achieved by a method we call Iter-RetGen, which synergizes retrieval and generation in an iterative manner.

Fact Verification Multi-hop Question Answering +2

Paper
Add Code

CRITIC: Large Language Models Can Self-Correct with Tool-Interactive Critiquing

1 code implementation • 19 May 2023 • Zhibin Gou, Zhihong Shao, Yeyun Gong, Yelong Shen, Yujiu Yang, Nan Duan, Weizhu Chen

Unlike these models, humans typically utilize external tools to cross-check and refine their initial content, like using a search engine for fact-checking, or a code interpreter for debugging.

Fact Checking Natural Questions +4

614

Paper
Code

AR-Diffusion: Auto-Regressive Diffusion Model for Text Generation

1 code implementation • NeurIPS 2023 • Tong Wu, Zhihao Fan, Xiao Liu, Yeyun Gong, Yelong Shen, Jian Jiao, Hai-Tao Zheng, Juntao Li, Zhongyu Wei, Jian Guo, Nan Duan, Weizhu Chen

Diffusion models have gained significant attention in the realm of image generation due to their exceptional performance.

Common Sense Reasoning Denoising +4

614

Paper
Code

In-Context Learning Unlocked for Diffusion Models

1 code implementation • NeurIPS 2023 • Zhendong Wang, Yifan Jiang, Yadong Lu, Yelong Shen, Pengcheng He, Weizhu Chen, Zhangyang Wang, Mingyuan Zhou

We present Prompt Diffusion, a framework for enabling in-context learning in diffusion-based generative models.

In-Context Learning text-guided-image-editing

349

Paper
Code

Enhancing Chain-of-Thoughts Prompting with Iterative Bootstrapping in Large Language Models

1 code implementation • 23 Apr 2023 • Jiashuo Sun, Yi Luo, Yeyun Gong, Chen Lin, Yelong Shen, Jian Guo, Nan Duan

By utilizing iterative bootstrapping, our approach enables LLMs to autonomously rectify errors, resulting in more precise and comprehensive reasoning chains.

Paper
Code

What Matters In The Structured Pruning of Generative Language Models?

1 code implementation • 7 Feb 2023 • Michael Santacroce, Zixin Wen, Yelong Shen, Yuanzhi Li

Auto-regressive large language models such as GPT-3 require enormous computational resources to use.

Text Generation

374

Paper
Code

Synthetic Prompting: Generating Chain-of-Thought Demonstrations for Large Language Models

no code implementations • 1 Feb 2023 • Zhihong Shao, Yeyun Gong, Yelong Shen, Minlie Huang, Nan Duan, Weizhu Chen

However, the quality of the prompts depends on the demonstrations given to the models, and creating many of them by hand is costly.

Paper
Add Code

Text Generation with Diffusion Language Models: A Pre-training Approach with Continuous Paragraph Denoise

1 code implementation • 22 Dec 2022 • Zhenghao Lin, Yeyun Gong, Yelong Shen, Tong Wu, Zhihao Fan, Chen Lin, Nan Duan, Weizhu Chen

In this paper, we introduce a novel dIffusion language modEl pre-training framework for text generation, which we call GENIE.

Denoising Language Modelling +1

614

Paper
Code

Generation-Augmented Query Expansion For Code Retrieval

no code implementations • 20 Dec 2022 • Dong Li, Yelong Shen, Ruoming Jin, Yi Mao, Kuan Wang, Weizhu Chen

Pre-trained language models have achieved promising success in code retrieval tasks, where a natural language documentation query is given to find the most relevant existing code snippet.

Code Generation Retrieval

Paper
Add Code

GENIUS: Sketch-based Language Model Pre-training via Extreme and Selective Masking for Text Generation and Augmentation

2 code implementations • 18 Nov 2022 • Biyang Guo, Yeyun Gong, Yelong Shen, Songqiao Han, Hailiang Huang, Nan Duan, Weizhu Chen

We introduce GENIUS: a conditional text generation model using sketches as input, which can fill in the missing contexts for a given sketch (key information consisting of textual spans, phrases, or words, concatenated by mask tokens).

Conditional Text Generation Data Augmentation +8

175

Paper
Code

SimANS: Simple Ambiguous Negatives Sampling for Dense Text Retrieval

1 code implementation • 21 Oct 2022 • Kun Zhou, Yeyun Gong, Xiao Liu, Wayne Xin Zhao, Yelong Shen, Anlei Dong, Jingwen Lu, Rangan Majumder, Ji-Rong Wen, Nan Duan, Weizhu Chen

Thus, we propose a simple ambiguous negatives sampling method, SimANS, which incorporates a new sampling probability distribution to sample more ambiguous negatives.

Retrieval Text Retrieval

Paper
Code

Soft-Labeled Contrastive Pre-training for Function-level Code Representation

1 code implementation • 18 Oct 2022 • Xiaonan Li, Daya Guo, Yeyun Gong, Yun Lin, Yelong Shen, Xipeng Qiu, Daxin Jiang, Weizhu Chen, Nan Duan

In this paper, we present \textbf{SCodeR}, a \textbf{S}oft-labeled contrastive pre-training framework with two positive sample construction methods to learn functional-level \textbf{Code} \textbf{R}epresentation.

Paper
Code

Explanations from Large Language Models Make Small Reasoners Better

no code implementations • 13 Oct 2022 • Shiyang Li, Jianshu Chen, Yelong Shen, Zhiyu Chen, Xinlu Zhang, Zekun Li, Hong Wang, Jing Qian, Baolin Peng, Yi Mao, Wenhu Chen, Xifeng Yan

Integrating free-text explanations to in-context learning of large language models (LLM) is shown to elicit strong reasoning capabilities along with reasonable explanations.

Explanation Generation In-Context Learning +1

Paper
Add Code

Joint Generator-Ranker Learning for Natural Language Generation

2 code implementations • 28 Jun 2022 • Weizhou Shen, Yeyun Gong, Yelong Shen, Song Wang, Xiaojun Quan, Nan Duan, Weizhu Chen

Generate-then-rank is a widely used mechanism for text generation, where a generator produces multiple text candidates and a ranker chooses the best one among the text candidates.

Question Generation Question-Generation +2

614

Paper
Code

A Self-Paced Mixed Distillation Method for Non-Autoregressive Generation

no code implementations • 23 May 2022 • Weizhen Qi, Yeyun Gong, Yelong Shen, Jian Jiao, Yu Yan, Houqiang Li, Ruofei Zhang, Weizhu Chen, Nan Duan

To further illustrate the commercial value of our approach, we conduct experiments on three generation tasks in real-world advertisements applications.

Question Generation Question-Generation +1

Paper
Add Code

CAMERO: Consistency Regularized Ensemble of Perturbed Language Models with Weight Sharing

1 code implementation • ACL 2022 • Chen Liang, Pengcheng He, Yelong Shen, Weizhu Chen, Tuo Zhao

To retain ensemble benefits while maintaining a low memory cost, we propose a consistency-regularized ensemble learning approach based on perturbed models, named CAMERO.

Ensemble Learning

Paper
Code

Controllable Natural Language Generation with Contrastive Prefixes

no code implementations • Findings (ACL) 2022 • Jing Qian, Li Dong, Yelong Shen, Furu Wei, Weizhu Chen

We propose a novel supervised method and also an unsupervised method to train the prefixes for single-aspect control while the combination of these two methods can achieve multi-aspect control.

Attribute Language Modelling +1

Paper
Add Code

CodeRetriever: Unimodal and Bimodal Contrastive Learning for Code Search

1 code implementation • 26 Jan 2022 • Xiaonan Li, Yeyun Gong, Yelong Shen, Xipeng Qiu, Hang Zhang, Bolun Yao, Weizhen Qi, Daxin Jiang, Weizhu Chen, Nan Duan

For bimodal contrastive learning, we leverage the documentation and in-line comments of code to build code-text pairs.

Code Search Contrastive Learning

Paper
Code

Knowledge-Grounded Dialogue Generation with a Unified Knowledge Representation

no code implementations • NAACL 2022 • Yu Li, Baolin Peng, Yelong Shen, Yi Mao, Lars Liden, Zhou Yu, Jianfeng Gao

To address these challenges, we present PLUG, a language model that homogenizes different knowledge sources to a unified knowledge representation for knowledge-grounded dialogue generation tasks.

Dialogue Generation Language Modelling

Paper
Add Code

A Good Prompt Is Worth Millions of Parameters: Low-resource Prompt-based Learning for Vision-Language Models

1 code implementation • ACL 2022 • Woojeong Jin, Yu Cheng, Yelong Shen, Weizhu Chen, Xiang Ren

Large pre-trained vision-language (VL) models can learn a new task with a handful of examples and generalize to a new task without fine-tuning.

Ranked #4 on Image Captioning on Flickr30k Captions test (SPICE metric)

Image Captioning Language Modelling +2

Paper
Code

Adversarial Retriever-Ranker for dense text retrieval

1 code implementation • ICLR 2022 • Hang Zhang, Yeyun Gong, Yelong Shen, Jiancheng Lv, Nan Duan, Weizhu Chen

To address these challenges, we present Adversarial Retriever-Ranker (AR2), which consists of a dual-encoder retriever plus a cross-encoder ranker.

Natural Questions Retrieval +2

Paper
Code

LoRA: Low-Rank Adaptation of Large Language Models

48 code implementations • ICLR 2022 • Edward J. Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, Weizhu Chen

We propose Low-Rank Adaptation, or LoRA, which freezes the pre-trained model weights and injects trainable rank decomposition matrices into each layer of the Transformer architecture, greatly reducing the number of trainable parameters for downstream tasks.

Language Modelling

28,740

Paper
Code

Memory-Efficient Differentiable Transformer Architecture Search

no code implementations • Findings (ACL) 2021 • Yuekai Zhao, Li Dong, Yelong Shen, Zhihua Zhang, Furu Wei, Weizhu Chen

To this end, we propose a multi-split reversible network and combine it with DARTS.

Paper
Add Code

Poolingformer: Long Document Modeling with Pooling Attention

no code implementations • 10 May 2021 • Hang Zhang, Yeyun Gong, Yelong Shen, Weisheng Li, Jiancheng Lv, Nan Duan, Weizhu Chen

We first evaluate Poolingformer on two long sequence QA tasks: the monolingual NQ and the multilingual TyDi QA.

Paper
Add Code

NeurIPS 2020 EfficientQA Competition: Systems, Analyses and Lessons Learned

no code implementations • 1 Jan 2021 • Sewon Min, Jordan Boyd-Graber, Chris Alberti, Danqi Chen, Eunsol Choi, Michael Collins, Kelvin Guu, Hannaneh Hajishirzi, Kenton Lee, Jennimaria Palomaki, Colin Raffel, Adam Roberts, Tom Kwiatkowski, Patrick Lewis, Yuxiang Wu, Heinrich Küttler, Linqing Liu, Pasquale Minervini, Pontus Stenetorp, Sebastian Riedel, Sohee Yang, Minjoon Seo, Gautier Izacard, Fabio Petroni, Lucas Hosseini, Nicola De Cao, Edouard Grave, Ikuya Yamada, Sonse Shimaoka, Masatoshi Suzuki, Shumpei Miyawaki, Shun Sato, Ryo Takahashi, Jun Suzuki, Martin Fajcik, Martin Docekal, Karel Ondrej, Pavel Smrz, Hao Cheng, Yelong Shen, Xiaodong Liu, Pengcheng He, Weizhu Chen, Jianfeng Gao, Barlas Oguz, Xilun Chen, Vladimir Karpukhin, Stan Peshterliev, Dmytro Okhonko, Michael Schlichtkrull, Sonal Gupta, Yashar Mehdad, Wen-tau Yih

We review the EfficientQA competition from NeurIPS 2020.

Open-Domain Question Answering Retrieval

Paper
Add Code

UnitedQA: A Hybrid Approach for Open Domain Question Answering

no code implementations • ACL 2021 • Hao Cheng, Yelong Shen, Xiaodong Liu, Pengcheng He, Weizhu Chen, Jianfeng Gao

To date, most of recent work under the retrieval-reader framework for open-domain QA focuses on either extractive or generative reader exclusively.

Ranked #1 on Open-Domain Question Answering on TriviaQA

Open-Domain Question Answering Retrieval +1

Paper
Add Code

Rider: Reader-Guided Passage Reranking for Open-Domain Question Answering

1 code implementation • 1 Jan 2021 • Yuning Mao, Pengcheng He, Xiaodong Liu, Yelong Shen, Jianfeng Gao, Jiawei Han, Weizhu Chen

Current open-domain question answering systems often follow a Retriever-Reader architecture, where the retriever first retrieves relevant passages and the reader then reads the retrieved passages to form an answer.

Natural Questions Open-Domain Question Answering +2

Paper
Code

Adversarial Attacks on Deep Graph Matching

no code implementations • NeurIPS 2020 • Zijie Zhang, Zeru Zhang, Yang Zhou, Yelong Shen, Ruoming Jin, Dejing Dou

Despite achieving remarkable performance, deep graph learning models, such as node classification and network embedding, suffer from harassment caused by small adversarial perturbations.

Adversarial Attack Density Estimation +5

Paper
Add Code

CoDA: Contrast-enhanced and Diversity-promoting Data Augmentation for Natural Language Understanding

no code implementations • ICLR 2021 • Yanru Qu, Dinghan Shen, Yelong Shen, Sandra Sajeev, Jiawei Han, Weizhu Chen

To verify the effectiveness of the proposed framework, we apply CoDA to Transformer-based models on a wide range of natural language understanding tasks.

Data Augmentation Natural Language Understanding

Paper
Add Code

Improving Self-supervised Pre-training via a Fully-Explored Masked Language Model

no code implementations • 12 Oct 2020 • Mingzhi Zheng, Dinghan Shen, Yelong Shen, Weizhu Chen, Lin Xiao

We prove, from a theoretical perspective, that the gradients derived from this new masking schema have a smaller variance and can lead to more efficient self-supervised training.

Ranked #1 on Sentence Classification on ACL-ARC

Language Modelling Sentence Classification

Paper
Add Code

A Simple but Tough-to-Beat Data Augmentation Approach for Natural Language Understanding and Generation

2 code implementations • 29 Sep 2020 • Dinghan Shen, Mingzhi Zheng, Yelong Shen, Yanru Qu, Weizhu Chen

Adversarial training has been shown effective at endowing the learned representations with stronger generalization ability.

Ranked #8 on Machine Translation on IWSLT2014 German-English

Data Augmentation Machine Translation +3

Paper
Code

Generation-Augmented Retrieval for Open-domain Question Answering

1 code implementation • ACL 2021 • Yuning Mao, Pengcheng He, Xiaodong Liu, Yelong Shen, Jianfeng Gao, Jiawei Han, Weizhu Chen

We demonstrate that the generated contexts substantially enrich the semantics of the queries and GAR with sparse representations (BM25) achieves comparable or better performance than state-of-the-art dense retrieval methods such as DPR.

Ranked #9 on Passage Retrieval on Natural Questions

Natural Questions Open-Domain Question Answering +4

Paper
Code

Recurrent Chunking Mechanisms for Long-Text Machine Reading Comprehension

1 code implementation • ACL 2020 • Hongyu Gong, Yelong Shen, Dian Yu, Jianshu Chen, Dong Yu

In this paper, we study machine reading comprehension (MRC) on long texts, where a model takes as inputs a lengthy document and a question and then extracts a text span from the document as an answer.

Chunking Machine Reading Comprehension +1

Paper
Code

MART: Memory-Augmented Recurrent Transformer for Coherent Video Paragraph Captioning

1 code implementation • ACL 2020 • Jie Lei, Li-Wei Wang, Yelong Shen, Dong Yu, Tamara L. Berg, Mohit Bansal

Generating multi-sentence descriptions for videos is one of the most challenging captioning tasks due to its high requirements for not only visual relevance but also discourse-based coherence across the sentences in the paragraph.

Ranked #5 on Video Captioning on ActivityNet Captions

Sentence

168

Paper
Code

A Hybrid Retrieval-Generation Neural Conversation Model

1 code implementation • 19 Apr 2019 • Liu Yang, Junjie Hu, Minghui Qiu, Chen Qu, Jianfeng Gao, W. Bruce Croft, Xiaodong Liu, Yelong Shen, Jingjing Liu

In this paper, we propose a hybrid neural conversation model that combines the merits of both response retrieval and generation methods.

Retrieval Text Generation +1

Paper
Code

Unsupervised Deep Structured Semantic Models for Commonsense Reasoning

no code implementations • NAACL 2019 • Shuohang Wang, Sheng Zhang, Yelong Shen, Xiaodong Liu, Jingjing Liu, Jianfeng Gao, Jing Jiang

Commonsense reasoning is fundamental to natural language understanding.

Ranked #2 on Natural Language Understanding on PDP60

Common Sense Reasoning Coreference Resolution +2

Paper
Add Code

StoryGAN: A Sequential Conditional GAN for Story Visualization

1 code implementation • CVPR 2019 • Yitong Li, Zhe Gan, Yelong Shen, Jingjing Liu, Yu Cheng, Yuexin Wu, Lawrence Carin, David Carlson, Jianfeng Gao

We therefore propose a new story-to-image-sequence generation model, StoryGAN, based on the sequential conditional GAN framework.

Sentence Story Visualization +1

231

Paper
Code

Multi-task Learning with Sample Re-weighting for Machine Reading Comprehension

5 code implementations • NAACL 2019 • Yichong Xu, Xiaodong Liu, Yelong Shen, Jingjing Liu, Jianfeng Gao

We propose a multi-task learning framework to learn a joint Machine Reading Comprehension (MRC) model that can be applied to a wide range of MRC tasks in different domains.

Machine Reading Comprehension Machine Translation +3

148

Paper
Code

M-Walk: Learning to Walk over Graphs using Monte Carlo Tree Search

no code implementations • NeurIPS 2018 • Yelong Shen, Jianshu Chen, Po-Sen Huang, Yuqing Guo, Jianfeng Gao

In order to effectively train the agent from sparse rewards, we combine MCTS with the neural policy to generate trajectories yielding more positive rewards.

Ranked #44 on Link Prediction on WN18RR (Hits@3 metric)

Knowledge Base Completion Link Prediction +2

Paper
Add Code

Stochastic Answer Networks for Machine Reading Comprehension

5 code implementations • ACL 2018 • Xiaodong Liu, Yelong Shen, Kevin Duh, Jianfeng Gao

We propose a simple yet robust stochastic answer network (SAN) that simulates multi-step reasoning in machine reading comprehension.

Ranked #24 on Question Answering on SQuAD1.1 dev

Machine Reading Comprehension Question Answering +2

148

Paper
Code

Language-Based Image Editing with Recurrent Attentive Models

1 code implementation • CVPR 2018 • Jianbo Chen, Yelong Shen, Jianfeng Gao, Jingjing Liu, Xiaodong Liu

First, we introduce a synthetic dataset, called CoSaL, to evaluate the end-to-end performance of our LBIE system.

Colorization Image Colorization +3

Paper
Code

FusionNet: Fusing via Fully-Aware Attention with Application to Machine Comprehension

3 code implementations • ICLR 2018 • Hsin-Yuan Huang, Chenguang Zhu, Yelong Shen, Weizhu Chen

This paper introduces a new neural structure called FusionNet, which extends existing attention approaches from three perspectives.

Ranked #26 on Question Answering on SQuAD1.1 dev

Question Answering Reading Comprehension

135

Paper
Code

Dynamic Fusion Networks for Machine Reading Comprehension

no code implementations • 14 Nov 2017 • Yichong Xu, Jingjing Liu, Jianfeng Gao, Yelong Shen, Xiaodong Liu

This paper presents a novel neural model - Dynamic Fusion Network (DFN), for machine reading comprehension (MRC).

Machine Reading Comprehension

Paper
Add Code

An Empirical Analysis of Multiple-Turn Reasoning Strategies in Reading Comprehension Tasks

no code implementations • IJCNLP 2017 • Yelong Shen, Xiaodong Liu, Kevin Duh, Jianfeng Gao

Using a state-of-the-art RC model, we empirically investigate the performance of single-turn and multiple-turn reasoning on the SQuAD and MS MARCO datasets.

Descriptive Reading Comprehension +1

Paper
Add Code

Modeling Large-Scale Structured Relationships with Shared Memory for Knowledge Base Completion

no code implementations • WS 2017 • Yelong Shen, Po-Sen Huang, Ming-Wei Chang, Jianfeng Gao

However, due to the size of knowledge bases, learning multi-step relations directly on top of observed triplets could be costly.

Knowledge Base Completion Question Answering +1

Paper
Add Code

Link Prediction using Embedded Knowledge Graphs

no code implementations • 14 Nov 2016 • Yelong Shen, Po-Sen Huang, Ming-Wei Chang, Jianfeng Gao

Since large knowledge bases are typically incomplete, missing facts need to be inferred from observed facts in a task called knowledge base completion.

Knowledge Base Completion Knowledge Graphs +1

Paper
Add Code

ReasoNet: Learning to Stop Reading in Machine Comprehension

no code implementations • 17 Sep 2016 • Yelong Shen, Po-Sen Huang, Jianfeng Gao, Weizhu Chen

Teaching a computer to read and answer general questions pertaining to a document is a challenging yet unsolved problem.

Ranked #7 on Question Answering on CNN / Daily Mail

Question Answering Reading Comprehension

Paper
Add Code

End-to-end Learning of LDA by Mirror-Descent Back Propagation over a Deep Architecture

1 code implementation • NeurIPS 2015 • Jianshu Chen, Ji He, Yelong Shen, Lin Xiao, Xiaodong He, Jianfeng Gao, Xinying Song, Li Deng

We develop a fully discriminative learning approach for supervised Latent Dirichlet Allocation (LDA) model using Back Propagation (i. e., BP-sLDA), which maximizes the posterior probability of the prediction variable given the input document.

General Classification Topic Models

Paper
Code

A Deep Embedding Model for Co-occurrence Learning

no code implementations • 11 Apr 2015 • Yelong Shen, Ruoming Jin, Jianshu Chen, Xiaodong He, Jianfeng Gao, Li Deng

Co-occurrence Data is a common and important information source in many areas, such as the word co-occurrence in the sentences, friends co-occurrence in social networks and products co-occurrence in commercial transaction data, etc, which contains rich correlation and clustering information about the items.

Clustering

Paper
Add Code

Deep Sentence Embedding Using Long Short-Term Memory Networks: Analysis and Application to Information Retrieval

no code implementations • 24 Feb 2015 • Hamid Palangi, Li Deng, Yelong Shen, Jianfeng Gao, Xiaodong He, Jianshu Chen, Xinying Song, Rabab Ward

The results show that the proposed method in this paper significantly outperforms it for web document retrieval task.

Information Retrieval Retrieval +3

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.