no code implementations • 10 Sep 2024 • Kuan Wang, Alexander Bukharin, Haoming Jiang, Qingyu Yin, Zhengyang Wang, Tuo Zhao, Jingbo Shang, Chao Zhang, Bing Yin, Xian Li, Jianshu Chen, Shiyang Li
However, existing models trained on open-source IFT datasets only have the ability to follow instructions from users, and often fail to follow complex roles and rules specified by developers, a.k.a. system prompts.
1 code implementation • 27 Feb 2024 • Xinran Zhao, Hongming Zhang, Xiaoman Pan, Wenlin Yao, Dong Yu, Tongshuang Wu, Jianshu Chen
For an LLM to be trustworthy, its confidence level should be well calibrated with its actual performance.
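As a rough illustration of what "well-calibrated" means here, below is a minimal sketch of expected calibration error (ECE), a standard way to quantify the gap between stated confidence and actual accuracy; the binning scheme and function names are ours, not from the paper.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Bin predictions by confidence; ECE is the bin-weighted average gap
    between mean confidence and empirical accuracy."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            gap = abs(confidences[in_bin].mean() - correct[in_bin].mean())
            ece += in_bin.mean() * gap
    return ece

# A model that answers with 90% confidence but is right only 60% of the time:
print(expected_calibration_error([0.9] * 5, [1, 1, 1, 0, 0]))  # ≈ 0.3
```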
2 code implementations • 15 Feb 2024 • Rui Yang, Xiaoman Pan, Feng Luo, Shuang Qiu, Han Zhong, Dong Yu, Jianshu Chen
We consider the problem of multi-objective alignment of foundation models with human preferences, which is a critical step towards helpful and harmless AI systems.
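The simplest baseline for trading off multiple alignment objectives is linear scalarization of per-objective rewards. The sketch below is only meant to make the setup concrete; it is not the paper's algorithm, and the names and weights are illustrative.

```python
def scalarize(rewards, weights):
    """Collapse per-objective rewards (e.g., helpfulness, harmlessness)
    into one scalar using user-specified preference weights."""
    assert abs(sum(weights) - 1.0) < 1e-9, "weights should sum to 1"
    return sum(w * r for w, r in zip(weights, rewards))

# Weight harmlessness twice as heavily as helpfulness:
print(scalarize(rewards=[0.8, 0.5], weights=[1 / 3, 2 / 3]))  # 0.6
```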
6 code implementations • 15 Nov 2023 • Fuxiao Liu, Xiaoyang Wang, Wenlin Yao, Jianshu Chen, Kaiqiang Song, Sangwoo Cho, Yaser Yacoob, Dong Yu
Recognizing the need for a comprehensive evaluation of LMM chart understanding, we also propose a MultiModal Chart Benchmark (MMC-Benchmark), a comprehensive human-annotated benchmark with nine distinct tasks evaluating reasoning capabilities over charts.
1 code implementation • 30 Sep 2023 • Xuansheng Wu, Wenlin Yao, Jianshu Chen, Xiaoman Pan, Xiaoyang Wang, Ninghao Liu, Dong Yu
In this work, we investigate how the instruction tuning adjusts pre-trained models with a focus on intrinsic changes.
no code implementations • 1 Aug 2023 • Jiaao Chen, Xiaoman Pan, Dian Yu, Kaiqiang Song, Xiaoyang Wang, Dong Yu, Jianshu Chen
We investigate how to elicit compositional generalization capabilities in large language models (LLMs).
Ranked #34 on Math Word Problem Solving on MATH
no code implementations • 8 Jul 2023 • Neeraj Varshney, Wenlin Yao, Hongming Zhang, Jianshu Chen, Dong Yu
Specifically, the detection technique achieves a recall of ~88%, and the mitigation technique successfully mitigates 57.6% of the correctly detected hallucinations.
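(Composed end to end, that is roughly 0.88 × 0.576 ≈ 51% of all hallucinations mitigated, since mitigation applies only to the correctly detected ones.)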
1 code implementation • 24 May 2023 • Keming Lu, Xiaoman Pan, Kaiqiang Song, Hongming Zhang, Dong Yu, Jianshu Chen
In particular, we construct INSTRUCTOPENWIKI, a substantial instruction tuning dataset for Open-world IE enriched with a comprehensive corpus, extensive annotations, and diverse instructions.
no code implementations • 22 May 2023 • Nan Xu, Hongming Zhang, Jianshu Chen
Existing event-centric NLP models often apply only to a pre-defined ontology, which significantly restricts their generalization capabilities.
no code implementations • 19 Feb 2023 • Jianshu Chen
We construct a set of neural logic operators as learnable Horn clauses, which are further forward-chained into a fully differentiable neural architecture (FOLNet).
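To make "forward-chained Horn clauses as differentiable operators" concrete, here is a minimal fuzzy-logic sketch in the same spirit; it is not the FOLNet architecture itself, and all names are ours.

```python
import numpy as np

def soft_and(a, b):   # product t-norm: differentiable conjunction on [0, 1]
    return a * b

def soft_or(a, b):    # probabilistic sum: differentiable disjunction on [0, 1]
    return a + b - a * b

def forward_chain(facts, clause, steps=3):
    """Repeatedly apply one Horn clause head(x,z) <- b1(x,y) AND b2(y,z),
    with predicates stored as [0,1]-valued n x n matrices; the max over
    the shared variable y plays the role of exists-y."""
    head, (b1, b2) = clause
    for _ in range(steps):
        derived = np.max(soft_and(facts[b1][:, :, None],
                                  facts[b2][None, :, :]), axis=1)
        facts[head] = soft_or(facts[head], derived)
    return facts

# grandparent(x,z) <- parent(x,y) AND parent(y,z)
parent = np.zeros((3, 3)); parent[0, 1] = parent[1, 2] = 1.0
facts = forward_chain({"parent": parent, "grandparent": np.zeros((3, 3))},
                      ("grandparent", ("parent", "parent")))
print(facts["grandparent"][0, 2])  # -> 1.0
```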
1 code implementation • 6 Dec 2022 • Pei Chen, Wenlin Yao, Hongming Zhang, Xiaoman Pan, Dian Yu, Dong Yu, Jianshu Chen
However, there has been limited research on the zero-shot KBC setting, where we need to deal with unseen entities and relations that emerge in a constantly growing knowledge base.
no code implementations • 28 Oct 2022 • Xiaoman Pan, Wenlin Yao, Hongming Zhang, Dian Yu, Dong Yu, Jianshu Chen
In this paper, we develop a novel semi-parametric language model architecture, Knowledge-in-Context (KiC), which empowers a parametric text-to-text language model with a knowledge-rich external memory.
Ranked #5 on Question Answering on StoryCloze
1 code implementation • 21 Oct 2022 • Yue Yang, Wenlin Yao, Hongming Zhang, Xiaoyang Wang, Dong Yu, Jianshu Chen
Large-scale pretrained language models have made significant advances in solving downstream language understanding tasks.
Ranked #2 on Visual Commonsense Tests on ViComTe-color
no code implementations • 13 Oct 2022 • Shiyang Li, Jianshu Chen, Yelong Shen, Zhiyu Chen, Xinlu Zhang, Zekun Li, Hong Wang, Jing Qian, Baolin Peng, Yi Mao, Wenhu Chen, Xifeng Yan
Integrating free-text explanations into in-context learning of large language models (LLMs) has been shown to elicit strong reasoning capabilities along with reasonable explanations.
no code implementations • 6 Oct 2022 • Jianyi Zhang, Yiran Chen, Jianshu Chen
Developing neural architectures that are capable of logical reasoning has become increasingly important for a wide range of applications (e.g., natural language processing).
1 code implementation • 1 Oct 2022 • Zhenhailong Wang, Xiaoman Pan, Dian Yu, Dong Yu, Jianshu Chen, Heng Ji
Notably, our proposed Zemi-LARGE outperforms T0-3B by 16% on all seven evaluation tasks while being 3.9x smaller in model size.
no code implementations • 7 Sep 2022 • Yulai Zhao, Jianshu Chen, Simon S. Du
Here, $n$ is the number of pre-training samples and $m$ is the number of samples in the downstream task; typically, $n \gg m$.
1 code implementation • ACL 2022 • Chao Zhao, Wenlin Yao, Dian Yu, Kaiqiang Song, Dong Yu, Jianshu Chen
Comprehending a dialogue requires a model to capture diverse kinds of key information in the utterances, which is either scattered across or only implied in different turns of the conversation.
1 code implementation • ACL 2022 • Xiang Yue, Xiaoman Pan, Wenlin Yao, Dian Yu, Dong Yu, Jianshu Chen
And with our pretrained reader, the entire system improves by up to 4% in exact match.
no code implementations • 1 Feb 2022 • Daoming Lyu, Bo Liu, Jianshu Chen
We consider the problem of multi-task reasoning (MTR), where an agent can solve multiple tasks via (first-order) logic reasoning.
1 code implementation • EMNLP 2021 • Wenlin Yao, Xiaoman Pan, Lifeng Jin, Jianshu Chen, Dian Yu, Dong Yu
We then train a model to identify semantic equivalence between a target word in context and one of its glosses using these aligned inventories; the resulting model exhibits strong transfer capability to many WSD tasks.
no code implementations • ACL 2022 • Kai Sun, Dian Yu, Jianshu Chen, Dong Yu, Claire Cardie
In this paper, we aim to extract commonsense knowledge to improve machine reading comprehension.
1 code implementation • ECCV 2020 • Yiwu Zhong, Li-Wei Wang, Jianshu Chen, Dong Yu, Yin Li
We address the challenging problem of image captioning by revisiting the representation of the image scene graph.
no code implementations • ACL 2020 • Linfeng Song, Kun Xu, Yue Zhang, Jianshu Chen, Dong Yu
Zero pronoun recovery and resolution aim at recovering the dropped pronoun and pointing out its anaphoric mentions, respectively.
no code implementations • 15 Jun 2020 • Anji Liu, Yitao Liang, Ji Liu, Guy Van Den Broeck, Jianshu Chen
Second, and more importantly, we demonstrate how the proposed necessary conditions can be adopted to design more effective parallel MCTS algorithms.
1 code implementation • ACL 2020 • Hongyu Gong, Yelong Shen, Dian Yu, Jianshu Chen, Dong Yu
In this paper, we study machine reading comprehension (MRC) on long texts, where a model takes as inputs a lengthy document and a question and then extracts a text span from the document as an answer.
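For context, the standard decode step in extractive MRC picks the highest-scoring valid (start, end) pair from the model's logits. The sketch below shows only that baseline step, not the paper's chunking mechanism; names are ours.

```python
import numpy as np

def best_span(start_logits, end_logits, max_len=30):
    """Return the (start, end) pair maximizing start + end scores,
    subject to end >= start and a bounded span length."""
    best, best_score = (0, 0), -np.inf
    for s in range(len(start_logits)):
        for e in range(s, min(s + max_len, len(end_logits))):
            score = start_logits[s] + end_logits[e]
            if score > best_score:
                best, best_score = (s, e), score
    return best
```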
1 code implementation • ACL 2020 • Wenhu Chen, Jianshu Chen, Yu Su, Zhiyu Chen, William Yang Wang
To facilitate the study of the proposed logical NLG problem, we use the existing TabFact dataset (Chen et al., 2019), which features a wide range of logical/symbolic inferences, as our testbed, and propose new automatic metrics to evaluate the fidelity of generation models w.r.t. logical inference.
no code implementations • CONLL 2019 • Hai Wang, Dian Yu, Kai Sun, Jianshu Chen, Dong Yu
However, in a multilingual setting, it is extremely resource-intensive to pre-train a deep language model over large-scale corpora for each language.
no code implementations • 20 Sep 2019 • Shiyang Li, Jianshu Chen, Dian Yu
Recently, pretrained language models (e.g., BERT) have achieved great success on many downstream natural language understanding tasks and exhibit a certain level of commonsense reasoning ability.
no code implementations • NeurIPS Workshop Deep_Invers 2019 • Sichen Zhong, Yue Zhao, Jianshu Chen
In compressed sensing, a primary problem to solve is to reconstruct a high dimensional sparse signal from a small number of observations.
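The classic convex baseline for this reconstruction problem is l1-regularized least squares, solvable by iterative soft-thresholding (ISTA); the sketch below shows that baseline, not the paper's method.

```python
import numpy as np

def ista(A, y, lam=0.1, iters=200):
    """Solve min_x 0.5 * ||A x - y||^2 + lam * ||x||_1 to recover a
    sparse x from few measurements y = A x."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2        # 1 / Lipschitz constant
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        z = x - step * A.T @ (A @ x - y)          # gradient step on the quadratic
        x = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # soft threshold
    return x
```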
1 code implementation • ICLR 2020 • Wenhu Chen, Hongmin Wang, Jianshu Chen, Yunkai Zhang, Hong Wang, Shiyang Li, Xiyou Zhou, William Yang Wang
To this end, we construct a large-scale dataset called TabFact with 16k Wikipedia tables as the evidence for 118k human-annotated natural language statements, which are labeled as either ENTAILED or REFUTED.
Ranked #12 on Table-based Fact Verification on TabFact
1 code implementation • NeurIPS 2019 • Adithya M. Devraj, Jianshu Chen
We consider a generic empirical composition optimization problem, where there are empirical averages present both outside and inside nonlinear loss functions.
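In symbols, the problem class described above can be written as follows (notation ours, reconstructed from the sentence rather than copied from the paper), with one empirical average inside the nonlinear losses $f_i$ and another over them:

```latex
\min_{x}\;\; \frac{1}{n}\sum_{i=1}^{n} f_i\!\left(\frac{1}{m}\sum_{j=1}^{m} g_j(x)\right)
```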
no code implementations • 6 Jun 2019 • Yu Liu, Li Deng, Jianshu Chen, Chang Wen Chen
Removing the need for parallel training corpora has practical significance for real-world applications and is one of the main goals of unsupervised learning.
2 code implementations • ACL 2019 • Wenhu Chen, Jianshu Chen, Pengda Qin, Xifeng Yan, William Yang Wang
Semantically controlled neural response generation in limited domains has achieved strong performance.
Ranked #5 on Data-to-Text Generation on MULTIWOZ 2.1
no code implementations • TACL 2019 • Kai Sun, Dian Yu, Jianshu Chen, Dong Yu, Yejin Choi, Claire Cardie
We present DREAM, the first dialogue-based multiple-choice reading comprehension data set.
1 code implementation • CONLL 2019 • Hai Wang, Dian Yu, Kai Sun, Jianshu Chen, Dong Yu, David Mcallester, Dan Roth
Remarkable success has been achieved in the last few years on some limited machine reading comprehension (MRC) tasks.
1 code implementation • WS 2019 • Xiaoman Pan, Kai Sun, Dian Yu, Jianshu Chen, Heng Ji, Claire Cardie, Dong Yu
We focus on multiple-choice question answering (QA) tasks in subject areas such as science, where we require both broad background knowledge and the facts from the given subject-area reference corpus.
1 code implementation • 1 Feb 2019 • Kai Sun, Dian Yu, Jianshu Chen, Dong Yu, Yejin Choi, Claire Cardie
DREAM is likely to present significant challenges for existing reading comprehension systems: 84% of answers are non-extractive, 85% of questions require reasoning beyond a single sentence, and 34% of questions also involve commonsense knowledge.
no code implementations • ICLR 2019 • Chih-Kuan Yeh, Jianshu Chen, Chengzhu Yu, Dong Yu
We consider the problem of training speech recognition systems without using any labeled data, under the assumption that the learner can access only the input utterances and a phoneme language model estimated from a non-overlapping corpus.
1 code implementation • NeurIPS 2018 • Bo Dai, Hanjun Dai, Niao He, Weiyang Liu, Zhen Liu, Jianshu Chen, Lin Xiao, Le Song
This flexible function class couples the variational distribution with the original parameters in the graphical models, allowing end-to-end learning of the graphical models by back-propagation through the variational distribution.
no code implementations • 1 Nov 2018 • Jiaao Chen, Jianshu Chen, Zhou Yu
The ability to select an appropriate story ending is the first step towards perfect narrative comprehension.
4 code implementations • ICLR 2020 • Anji Liu, Jianshu Chen, Mingze Yu, Yu Zhai, Xuewen Zhou, Ji Liu
Monte Carlo Tree Search (MCTS) algorithms have achieved great success on many challenging benchmarks (e.g., Computer Go).
1 code implementation • EMNLP 2018 • Wenhu Chen, Jianshu Chen, Yu Su, Xin Wang, Dong Yu, Xifeng Yan, William Yang Wang
Then, we pre-train a state tracker for the source language as a teacher, which is able to exploit easy-to-access parallel data.
no code implementations • NeurIPS 2018 • Yelong Shen, Jianshu Chen, Po-Sen Huang, Yuqing Guo, Jianfeng Gao
In order to effectively train the agent from sparse rewards, we combine MCTS with the neural policy to generate trajectories yielding more positive rewards.
Ranked #45 on Link Prediction on WN18RR (Hits@3 metric)
no code implementations • ICML 2018 • Bo Dai, Albert Shaw, Lihong Li, Lin Xiao, Niao He, Zhen Liu, Jianshu Chen, Le Song
When function approximation is used, solving the Bellman optimality equation with stability guarantees has remained a major open problem in reinforcement learning for decades.
no code implementations • NeurIPS 2017 • Jianshu Chen, Chong Wang, Lin Xiao, Ji He, Lihong Li, Li Deng
In sequential decision making, it is often important and useful for end users to understand the underlying patterns or causes that lead to the corresponding decisions.
no code implementations • 21 Oct 2017 • Yue Zhao, Jianshu Chen, H. Vincent Poor
Identifying a potentially large number of simultaneous line outages in power transmission networks in real time is a computationally hard problem.
no code implementations • ICML 2017 • Simon S. Du, Jianshu Chen, Lihong Li, Lin Xiao, Dengyong Zhou
Policy evaluation is a crucial step in many reinforcement-learning procedures, which estimates a value function that predicts states' long-term value under a given policy.
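For reference, the value function being estimated is the standard discounted one (a textbook definition, not specific to this paper):

```latex
V^{\pi}(s) \;=\; \mathbb{E}\!\left[\,\sum_{t=0}^{\infty} \gamma^{t} r_t \;\middle|\; s_0 = s,\; a_t \sim \pi(\cdot \mid s_t)\right]
```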
no code implementations • NeurIPS 2017 • Yu Liu, Jianshu Chen, Li Deng
Although it is harder to optimize in its functional form, a stochastic primal-dual gradient method is developed to effectively solve the problem.
2 code implementations • 8 Feb 2017 • Zhe Gan, P. D. Singh, Ameet Joshi, Xiaodong He, Jianshu Chen, Jianfeng Gao, Li Deng
Connecting different text attributes associated with the same entity (conflation) is important in business data analytics, since it can help merge two different tables in a database to provide a more comprehensive profile of an entity.
no code implementations • 15 Jun 2016 • Jianshu Chen, Po-Sen Huang, Xiaodong He, Jianfeng Gao, Li Deng
In particular, we show that with regularization via a generative model, learning with the proposed unsupervised objective function converges to an optimal solution.
1 code implementation • EMNLP 2016 • Ji He, Mari Ostendorf, Xiaodong He, Jianshu Chen, Jianfeng Gao, Lihong Li, Li Deng
We introduce an online popularity prediction and tracking task as a benchmark task for reinforcement learning with a combinatorial, natural language action space.
3 code implementations • ACL 2016 • Ji He, Jianshu Chen, Xiaodong He, Jianfeng Gao, Lihong Li, Li Deng, Mari Ostendorf
This paper introduces a novel architecture for reinforcement learning with deep neural networks designed to handle state and action spaces characterized by natural language, as found in text-based games.
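A minimal sketch of such an architecture: embed the state text and each candidate action text separately, then score each action by an inner product. This follows the spirit of the described design; the class name, layer sizes, and pooling choice are illustrative assumptions.

```python
import torch
import torch.nn as nn

class RelevanceQNet(nn.Module):
    """Q(s, a) as an inner product between a state embedding and an
    action embedding, yielding one Q-value per candidate text action."""
    def __init__(self, vocab=10000, dim=64):
        super().__init__()
        self.state_enc = nn.EmbeddingBag(vocab, dim)   # mean-pooled bag of tokens
        self.action_enc = nn.EmbeddingBag(vocab, dim)

    def forward(self, state_ids, action_ids_list):
        s = self.state_enc(state_ids).squeeze(0)                          # (dim,)
        a = torch.cat([self.action_enc(ids) for ids in action_ids_list])  # (K, dim)
        return a @ s                                                      # (K,) Q-values

# Two candidate actions for one state (token ids are arbitrary here):
q = RelevanceQNet()(torch.tensor([[1, 2, 3]]),
                    [torch.tensor([[4, 5]]), torch.tensor([[6, 7]])])
```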
no code implementations • 10 Sep 2015 • Xiujun Li, Lihong Li, Jianfeng Gao, Xiaodong He, Jianshu Chen, Li Deng, Ji He
Successful applications of reinforcement learning in real-world problems often require dealing with partially observable states.
1 code implementation • NeurIPS 2015 • Jianshu Chen, Ji He, Yelong Shen, Lin Xiao, Xiaodong He, Jianfeng Gao, Xinying Song, Li Deng
We develop a fully discriminative learning approach for supervised Latent Dirichlet Allocation (LDA) model using Back Propagation (i.e., BP-sLDA), which maximizes the posterior probability of the prediction variable given the input document.
no code implementations • 11 Apr 2015 • Yelong Shen, Ruoming Jin, Jianshu Chen, Xiaodong He, Jianfeng Gao, Li Deng
Co-occurrence data is a common and important information source in many areas, such as word co-occurrence in sentences, friend co-occurrence in social networks, and product co-occurrence in commercial transaction data; such data contain rich correlation and clustering information about the items.
no code implementations • 24 Feb 2015 • Hamid Palangi, Li Deng, Yelong Shen, Jianfeng Gao, Xiaodong He, Jianshu Chen, Xinying Song, Rabab Ward
The results show that the proposed method significantly outperforms it on the web document retrieval task.
no code implementations • 6 Feb 2014 • Jianshu Chen, Zaid J. Towfic, Ali H. Sayed
In this paper, we consider learning dictionary models over a network of agents, where each agent is only in charge of a portion of the dictionary elements.
no code implementations • 30 Dec 2013 • Sergio Valcarcel Macua, Jianshu Chen, Santiago Zazo, Ali H. Sayed
We apply diffusion strategies to develop a fully-distributed cooperative reinforcement learning algorithm in which agents in a network communicate only with their immediate neighbors to improve predictions about their environment.
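A schematic of the diffusion idea: each agent first averages its neighbors' parameter estimates (a combination step), then takes a local TD(0) gradient step. This combine-then-adapt sketch only illustrates the pattern; the paper's exact recursion may differ, and all names are ours.

```python
import numpy as np

def diffusion_td_step(theta, phi, r, phi_next, W, alpha=0.05, gamma=0.9):
    """One distributed TD(0) step for K agents with linear values v(s) = phi @ w.
    theta: (K, d) per-agent weights; W: (K, K) row-stochastic combination matrix
    with nonzero entries only between network neighbors."""
    psi = W @ theta                               # combine: neighborhood averaging
    for k in range(theta.shape[0]):               # adapt: local TD(0) update
        td_err = r[k] + gamma * phi_next[k] @ psi[k] - phi[k] @ psi[k]
        psi[k] = psi[k] + alpha * td_err * phi[k]
    return psi
```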
no code implementations • 24 Nov 2013 • Jianshu Chen, Li Deng
We present an architecture of a recurrent neural network (RNN) with a fully-connected deep neural network (DNN) as its feature extractor.
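A minimal sketch of the described layout, with a feed-forward feature extractor feeding a recurrent layer; all sizes are illustrative, and this is not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class DNNFeatureRNN(nn.Module):
    """A fully-connected feature extractor whose outputs are fed,
    frame by frame, into a recurrent layer."""
    def __init__(self, in_dim=40, feat_dim=128, hid_dim=64, out_dim=10):
        super().__init__()
        self.dnn = nn.Sequential(nn.Linear(in_dim, feat_dim), nn.ReLU(),
                                 nn.Linear(feat_dim, feat_dim), nn.ReLU())
        self.rnn = nn.RNN(feat_dim, hid_dim, batch_first=True)
        self.head = nn.Linear(hid_dim, out_dim)

    def forward(self, x):            # x: (batch, time, in_dim)
        h, _ = self.rnn(self.dnn(x))
        return self.head(h)          # (batch, time, out_dim)
```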