Search Results for author: Jianshu Chen

Found 57 papers, 23 papers with code

From Language Modeling to Instruction Following: Understanding the Behavior Shift in LLMs after Instruction Tuning

no code implementations 30 Sep 2023 Xuansheng Wu, Wenlin Yao, Jianshu Chen, Xiaoman Pan, Xiaoyang Wang, Ninghao Liu, Dong Yu

Large Language Models (LLMs) have achieved remarkable success, demonstrating powerful instruction-following capabilities across diverse tasks.

Skills-in-Context Prompting: Unlocking Compositionality in Large Language Models

no code implementations 1 Aug 2023 Jiaao Chen, Xiaoman Pan, Dian Yu, Kaiqiang Song, Xiaoyang Wang, Dong Yu, Jianshu Chen

Compositional generalization empowers the LLMs to solve problems that are harder than the ones they have seen (i.e., easy-to-hard generalization), which is a critical reasoning capability of human-like intelligence.

Mathematical Reasoning Math Word Problem Solving

Thrust: Adaptively Propels Large Language Models with External Knowledge

no code implementations 19 Jul 2023 Xinran Zhao, Hongming Zhang, Xiaoman Pan, Wenlin Yao, Dong Yu, Jianshu Chen

Although large-scale pre-trained language models (PTLMs) are shown to encode rich knowledge in their model parameters, the inherent knowledge in PTLMs can be opaque or static, making external knowledge necessary.

Information Retrieval Retrieval

A Stitch in Time Saves Nine: Detecting and Mitigating Hallucinations of LLMs by Validating Low-Confidence Generation

no code implementations 8 Jul 2023 Neeraj Varshney, Wenlin Yao, Hongming Zhang, Jianshu Chen, Dong Yu

Specifically, the detection technique achieves a recall of ~88% and the mitigation technique successfully mitigates 57.6% of the correctly detected hallucinations.

PIVOINE: Instruction Tuning for Open-world Information Extraction

1 code implementation 24 May 2023 Keming Lu, Xiaoman Pan, Kaiqiang Song, Hongming Zhang, Dong Yu, Jianshu Chen

In particular, we construct INSTRUCTOPENWIKI, a substantial instruction tuning dataset for Open-world IE enriched with a comprehensive corpus, extensive annotations, and diverse instructions.

Instruction Following Language Modelling +1

CEO: Corpus-based Open-Domain Event Ontology Induction

no code implementations 22 May 2023 Nan Xu, Hongming Zhang, Jianshu Chen

Existing event-centric NLP models often only apply to the pre-defined ontology, which significantly restricts their generalization capabilities.

Learning Language Representations with Logical Inductive Bias

no code implementations 19 Feb 2023 Jianshu Chen

We construct a set of neural logic operators as learnable Horn clauses, which are further forward-chained into a fully differentiable neural architecture (FOLNet).

Inductive Bias Representation Learning

ZeroKBC: A Comprehensive Benchmark for Zero-Shot Knowledge Base Completion

1 code implementation 6 Dec 2022 Pei Chen, Wenlin Yao, Hongming Zhang, Xiaoman Pan, Dian Yu, Dong Yu, Jianshu Chen

However, there has been limited research on the zero-shot KBC settings, where we need to deal with unseen entities and relations that emerge in a constantly growing knowledge base.

Knowledge Base Completion Knowledge Graphs

Knowledge-in-Context: Towards Knowledgeable Semi-Parametric Language Models

no code implementations 28 Oct 2022 Xiaoman Pan, Wenlin Yao, Hongming Zhang, Dian Yu, Dong Yu, Jianshu Chen

In this paper, we develop a novel semi-parametric language model architecture, Knowledge-in-Context (KiC), which empowers a parametric text-to-text language model with a knowledge-rich external memory.

Language Modelling

Z-LaVI: Zero-Shot Language Solver Fueled by Visual Imagination

1 code implementation 21 Oct 2022 Yue Yang, Wenlin Yao, Hongming Zhang, Xiaoyang Wang, Dong Yu, Jianshu Chen

Large-scale pretrained language models have made significant advances in solving downstream language understanding tasks.

Language Modelling Retrieval +1

Explanations from Large Language Models Make Small Reasoners Better

no code implementations 13 Oct 2022 Shiyang Li, Jianshu Chen, Yelong Shen, Zhiyu Chen, Xinlu Zhang, Zekun Li, Hong Wang, Jing Qian, Baolin Peng, Yi Mao, Wenhu Chen, Xifeng Yan

Integrating free-text explanations to in-context learning of large language models (LLM) is shown to elicit strong reasoning capabilities along with reasonable explanations.

Explanation Generation Multi-Task Learning

Join-Chain Network: A Logical Reasoning View of the Multi-head Attention in Transformer

no code implementations 6 Oct 2022 Jianyi Zhang, Yiran Chen, Jianshu Chen

Developing neural architectures that are capable of logical reasoning has become increasingly important for a wide range of applications (e.g., natural language processing).

Logical Reasoning Natural Language Understanding

Zemi: Learning Zero-Shot Semi-Parametric Language Models from Multiple Tasks

1 code implementation 1 Oct 2022 Zhenhailong Wang, Xiaoman Pan, Dian Yu, Dong Yu, Jianshu Chen, Heng Ji

Notably, our proposed $\text{Zemi}_\text{LARGE}$ outperforms T0-3B by 16% on all seven evaluation tasks while being 3.9x smaller in model size.

Language Modelling Retrieval +1

Blessing of Class Diversity in Pre-training

no code implementations 7 Sep 2022 Yulai Zhao, Jianshu Chen, Simon S. Du

Here, $n$ is the number of pre-training data and $m$ is the number of data in the downstream task, and typically $n \gg m$.

Language Modelling Transfer Learning

Learning-by-Narrating: Narrative Pre-Training for Zero-Shot Dialogue Comprehension

1 code implementation ACL 2022 Chao Zhao, Wenlin Yao, Dian Yu, Kaiqiang Song, Dong Yu, Jianshu Chen

Comprehending a dialogue requires a model to capture diverse kinds of key information in the utterances, which are either scattered around or implicitly implied in different turns of conversations.

PRIMA: Planner-Reasoner Inside a Multi-task Reasoning Agent

no code implementations 1 Feb 2022 Daoming Lyu, Bo Liu, Jianshu Chen

We consider the problem of multi-task reasoning (MTR), where an agent can solve multiple tasks via (first-order) logic reasoning.

Connect-the-Dots: Bridging Semantics between Words and Definitions via Aligning Word Sense Inventories

2 code implementations EMNLP 2021 Wenlin Yao, Xiaoman Pan, Lifeng Jin, Jianshu Chen, Dian Yu, Dong Yu

We then train a model to identify semantic equivalence between a target word in context and one of its glosses using these aligned inventories, which exhibits strong transfer capability to many WSD tasks.

Word Sense Disambiguation

Comprehensive Image Captioning via Scene Graph Decomposition

1 code implementation ECCV 2020 Yiwu Zhong, Li-Wei Wang, Jianshu Chen, Dong Yu, Yin Li

We address the challenging problem of image captioning by revisiting the representation of image scene graph.

Image Captioning

ZPR2: Joint Zero Pronoun Recovery and Resolution using Multi-Task Learning and BERT

no code implementations ACL 2020 Linfeng Song, Kun Xu, Yue Zhang, Jianshu Chen, Dong Yu

Zero pronoun recovery and resolution aim at recovering the dropped pronoun and pointing out its anaphoric mentions, respectively.

Multi-Task Learning

On Effective Parallelization of Monte Carlo Tree Search

no code implementations 15 Jun 2020 Anji Liu, Yitao Liang, Ji Liu, Guy Van Den Broeck, Jianshu Chen

Second, and more importantly, we demonstrate how the proposed necessary conditions can be adopted to design more effective parallel MCTS algorithms.

Atari Games

Recurrent Chunking Mechanisms for Long-Text Machine Reading Comprehension

1 code implementation ACL 2020 Hongyu Gong, Yelong Shen, Dian Yu, Jianshu Chen, Dong Yu

In this paper, we study machine reading comprehension (MRC) on long texts, where a model takes as inputs a lengthy document and a question and then extracts a text span from the document as an answer.

Chunking Machine Reading Comprehension +1

Logical Natural Language Generation from Open-Domain Tables

1 code implementation ACL 2020 Wenhu Chen, Jianshu Chen, Yu Su, Zhiyu Chen, William Yang Wang

To facilitate the study of the proposed logical NLG problem, we use the existing TabFact dataset (Chen et al., 2019) featured with a wide range of logical/symbolic inferences as our testbed, and propose new automatic metrics to evaluate the fidelity of generation models w.r.t. logical inference.

Text Generation

Improving Pre-Trained Multilingual Model with Vocabulary Expansion

no code implementations CONLL 2019 Hai Wang, Dian Yu, Kai Sun, Jianshu Chen, Dong Yu

However, in multilingual setting, it is extremely resource-consuming to pre-train a deep language model over large-scale corpora for each language.

Language Modelling Machine Reading Comprehension +5

Teaching Pretrained Models with Commonsense Reasoning: A Preliminary KB-Based Approach

no code implementations 20 Sep 2019 Shiyang Li, Jianshu Chen, Dian Yu

Recently, pretrained language models (e.g., BERT) have achieved great success on many downstream natural language understanding tasks and exhibit a certain level of commonsense reasoning ability.

Few-Shot Learning Logical Reasoning +2

Learning to Recover Sparse Signals

no code implementations NeurIPS Workshop Deep_Invers 2019 Sichen Zhong, Yue Zhao, Jianshu Chen

In compressed sensing, a primary problem to solve is to reconstruct a high dimensional sparse signal from a small number of observations.

reinforcement-learning Reinforcement Learning (RL)

TabFact: A Large-scale Dataset for Table-based Fact Verification

1 code implementation ICLR 2020 Wenhu Chen, Hongmin Wang, Jianshu Chen, Yunkai Zhang, Hong Wang, Shiyang Li, Xiyou Zhou, William Yang Wang

To this end, we construct a large-scale dataset called TabFact with 16k Wikipedia tables as the evidence for 118k human-annotated natural language statements, which are labeled as either ENTAILED or REFUTED.

Fact Checking Fact Verification +3

Stochastic Variance Reduced Primal Dual Algorithms for Empirical Composition Optimization

1 code implementation NeurIPS 2019 Adithya M. Devraj, Jianshu Chen

We consider a generic empirical composition optimization problem, where there are empirical averages present both outside and inside nonlinear loss functions.

From Caesar Cipher to Unsupervised Learning: A New Method for Classifier Parameter Estimation

no code implementations 6 Jun 2019 Yu Liu, Li Deng, Jianshu Chen, Chang Wen Chen

Removing the need for parallel training corpora has practical significance for real-world applications, and it is one of the main goals of unsupervised learning.

Binary Classification General Classification +4

Improving Question Answering with External Knowledge

1 code implementation WS 2019 Xiaoman Pan, Kai Sun, Dian Yu, Jianshu Chen, Heng Ji, Claire Cardie, Dong Yu

We focus on multiple-choice question answering (QA) tasks in subject areas such as science, where we require both broad background knowledge and the facts from the given subject-area reference corpus.

Multiple-choice Question Answering

DREAM: A Challenge Dataset and Models for Dialogue-Based Reading Comprehension

1 code implementation 1 Feb 2019 Kai Sun, Dian Yu, Jianshu Chen, Dong Yu, Yejin Choi, Claire Cardie

DREAM is likely to present significant challenges for existing reading comprehension systems: 84% of answers are non-extractive, 85% of questions require reasoning beyond a single sentence, and 34% of questions also involve commonsense knowledge.

Dialogue Understanding Multiple-choice +1

Unsupervised Speech Recognition via Segmental Empirical Output Distribution Matching

no code implementations ICLR 2019 Chih-Kuan Yeh, Jianshu Chen, Chengzhu Yu, Dong Yu

We consider the problem of training speech recognition systems without using any labeled data, under the assumption that the learner can only access the input utterances and a phoneme language model estimated from a non-overlapping corpus.

Language Modelling speech-recognition +2

Coupled Variational Bayes via Optimization Embedding

1 code implementation NeurIPS 2018 Bo Dai, Hanjun Dai, Niao He, Weiyang Liu, Zhen Liu, Jianshu Chen, Lin Xiao, Le Song

This flexible function class couples the variational distribution with the original parameters in the graphical models, allowing end-to-end learning of the graphical models by back-propagation through the variational distribution.

Variational Inference

Incorporating Structured Commonsense Knowledge in Story Completion

no code implementations 1 Nov 2018 Jiaao Chen, Jianshu Chen, Zhou Yu

The ability to select an appropriate story ending is the first step towards perfect narrative comprehension.

Story Completion

Watch the Unobserved: A Simple Approach to Parallelizing Monte Carlo Tree Search

4 code implementations ICLR 2020 Anji Liu, Jianshu Chen, Mingze Yu, Yu Zhai, Xuewen Zhou, Ji Liu

Monte Carlo Tree Search (MCTS) algorithms have achieved great success on many challenging benchmarks (e.g., Computer Go).

XL-NBT: A Cross-lingual Neural Belief Tracking Framework

1 code implementation EMNLP 2018 Wenhu Chen, Jianshu Chen, Yu Su, Xin Wang, Dong Yu, Xifeng Yan, William Yang Wang

Then, we pre-train a state tracker for the source language as a teacher, which is able to exploit easy-to-access parallel data.

Transfer Learning

M-Walk: Learning to Walk over Graphs using Monte Carlo Tree Search

no code implementations NeurIPS 2018 Yelong Shen, Jianshu Chen, Po-Sen Huang, Yuqing Guo, Jianfeng Gao

In order to effectively train the agent from sparse rewards, we combine MCTS with the neural policy to generate trajectories yielding more positive rewards.

Ranked #42 on Link Prediction on WN18RR (Hits@3 metric)

Knowledge Base Completion Link Prediction +2

SBEED: Convergent Reinforcement Learning with Nonlinear Function Approximation

no code implementations ICML 2018 Bo Dai, Albert Shaw, Lihong Li, Lin Xiao, Niao He, Zhen Liu, Jianshu Chen, Le Song

When function approximation is used, solving the Bellman optimality equation with stability guarantees has remained a major open problem in reinforcement learning for decades.

Q-Learning reinforcement-learning +1

Q-LDA: Uncovering Latent Patterns in Text-based Sequential Decision Processes

no code implementations NeurIPS 2017 Jianshu Chen, Chong Wang, Lin Xiao, Ji He, Lihong Li, Li Deng

In sequential decision making, it is often important and useful for end users to understand the underlying patterns or causes that lead to the corresponding decisions.

Decision Making Q-Learning +2

A Learning-to-Infer Method for Real-Time Power Grid Multi-Line Outage Identification

no code implementations 21 Oct 2017 Yue Zhao, Jianshu Chen, H. Vincent Poor

Identifying a potentially large number of simultaneous line outages in power transmission networks in real time is a computationally hard problem.

Stochastic Variance Reduction Methods for Policy Evaluation

no code implementations ICML 2017 Simon S. Du, Jianshu Chen, Lihong Li, Lin Xiao, Dengyong Zhou

Policy evaluation is a crucial step in many reinforcement-learning procedures, which estimates a value function that predicts states' long-term value under a given policy.

Reinforcement Learning (RL)

Unsupervised Sequence Classification using Sequential Output Statistics

no code implementations NeurIPS 2017 Yu Liu, Jianshu Chen, Li Deng

Although it is harder to optimize in its functional form, a stochastic primal-dual gradient method is developed to effectively solve the problem.

Classification General Classification

Character-level Deep Conflation for Business Data Analytics

2 code implementations 8 Feb 2017 Zhe Gan, P. D. Singh, Ameet Joshi, Xiaodong He, Jianshu Chen, Jianfeng Gao, Li Deng

Connecting different text attributes associated with the same entity (conflation) is important in business data analytics since it could help merge two different tables in a database to provide a more comprehensive profile of an entity.

Unsupervised Learning of Predictors from Unpaired Input-Output Samples

no code implementations 15 Jun 2016 Jianshu Chen, Po-Sen Huang, Xiaodong He, Jianfeng Gao, Li Deng

In particular, we show that with regularization via a generative model, learning with the proposed unsupervised objective function converges to an optimal solution.

Deep Reinforcement Learning with a Combinatorial Action Space for Predicting Popular Reddit Threads

1 code implementation EMNLP 2016 Ji He, Mari Ostendorf, Xiaodong He, Jianshu Chen, Jianfeng Gao, Lihong Li, Li Deng

We introduce an online popularity prediction and tracking task as a benchmark task for reinforcement learning with a combinatorial, natural language action space.

reinforcement-learning Reinforcement Learning (RL)

Deep Reinforcement Learning with a Natural Language Action Space

3 code implementations ACL 2016 Ji He, Jianshu Chen, Xiaodong He, Jianfeng Gao, Lihong Li, Li Deng, Mari Ostendorf

This paper introduces a novel architecture for reinforcement learning with deep neural networks designed to handle state and action spaces characterized by natural language, as found in text-based games.

Q-Learning reinforcement-learning +2

Recurrent Reinforcement Learning: A Hybrid Approach

no code implementations 10 Sep 2015 Xiujun Li, Lihong Li, Jianfeng Gao, Xiaodong He, Jianshu Chen, Li Deng, Ji He

Successful applications of reinforcement learning in real-world problems often require dealing with partially observable states.

reinforcement-learning Reinforcement Learning (RL)

End-to-end Learning of LDA by Mirror-Descent Back Propagation over a Deep Architecture

1 code implementation NeurIPS 2015 Jianshu Chen, Ji He, Yelong Shen, Lin Xiao, Xiaodong He, Jianfeng Gao, Xinying Song, Li Deng

We develop a fully discriminative learning approach for supervised Latent Dirichlet Allocation (LDA) model using Back Propagation (i.e., BP-sLDA), which maximizes the posterior probability of the prediction variable given the input document.

General Classification Topic Models

A Deep Embedding Model for Co-occurrence Learning

no code implementations 11 Apr 2015 Yelong Shen, Ruoming Jin, Jianshu Chen, Xiaodong He, Jianfeng Gao, Li Deng

Co-occurrence data is a common and important information source in many areas, such as word co-occurrence in sentences, friend co-occurrence in social networks, and product co-occurrence in commercial transaction data, and it contains rich correlation and clustering information about the items.


Dictionary Learning over Distributed Models

no code implementations 6 Feb 2014 Jianshu Chen, Zaid J. Towfic, Ali H. Sayed

In this paper, we consider learning dictionary models over a network of agents, where each agent is only in charge of a portion of the dictionary elements.

Collaborative Inference Dictionary Learning

Distributed Policy Evaluation Under Multiple Behavior Strategies

no code implementations 30 Dec 2013 Sergio Valcarcel Macua, Jianshu Chen, Santiago Zazo, Ali H. Sayed

We apply diffusion strategies to develop a fully-distributed cooperative reinforcement learning algorithm in which agents in a network communicate only with their immediate neighbors to improve predictions about their environment.

A Primal-Dual Method for Training Recurrent Neural Networks Constrained by the Echo-State Property

no code implementations 24 Nov 2013 Jianshu Chen, Li Deng

We present an architecture of a recurrent neural network (RNN) with a fully-connected deep neural network (DNN) as its feature extractor.
