Search Results for author: Shulin Cao

Found 24 papers, 17 papers with code

LongBench v2: Towards Deeper Understanding and Reasoning on Realistic Long-context Multitasks

1 code implementation19 Dec 2024 Yushi Bai, Shangqing Tu, Jiajie Zhang, Hao Peng, Xiaozhi Wang, Xin Lv, Shulin Cao, Jiazheng Xu, Lei Hou, Yuxiao Dong, Jie Tang, Juanzi Li

This paper introduces LongBench v2, a benchmark designed to assess the ability of LLMs to handle long-context problems requiring deep understanding and reasoning across real-world multitasks.

8k In-Context Learning +1

AtomR: Atomic Operator-Empowered Large Language Models for Heterogeneous Knowledge Reasoning

1 code implementation25 Nov 2024 Amy Xin, Jinxin Liu, Zijun Yao, Zhicheng Lee, Shulin Cao, Lei Hou, Juanzi Li

Drawing inspiration from the graph modeling of knowledge, AtomR leverages large language models (LLMs) to decompose complex questions into combinations of three atomic knowledge operators, significantly enhancing the reasoning process at both the planning and execution stages.

Hallucination Question Answering +3

LongReward: Improving Long-context Large Language Models with AI Feedback

1 code implementation28 Oct 2024 Jiajie Zhang, Zhongni Hou, Xin Lv, Shulin Cao, Zhenyu Hou, Yilin Niu, Lei Hou, Yuxiao Dong, Ling Feng, Juanzi Li

Though significant advancements have been achieved in developing long-context large language models (LLMs), the compromised quality of LLM-synthesized data for supervised fine-tuning (SFT) often affects the long-context performance of SFT models and leads to inherent limitations.

Offline RL Reinforcement Learning (RL)

LongCite: Enabling LLMs to Generate Fine-grained Citations in Long-context QA

1 code implementation4 Sep 2024 Jiajie Zhang, Yushi Bai, Xin Lv, Wanjun Gu, Danqing Liu, Minhao Zou, Shulin Cao, Lei Hou, Yuxiao Dong, Ling Feng, Juanzi Li

Though current long-context large language models (LLMs) have demonstrated impressive capacities in answering user questions based on extensive text, the lack of citations in their responses makes user verification difficult, leading to concerns about their trustworthiness due to their potential hallucinations.

Question Answering Sentence

Aligning Teacher with Student Preferences for Tailored Training Data Generation

no code implementations27 Jun 2024 Yantao Liu, Zhao Zhang, Zijun Yao, Shulin Cao, Lei Hou, Juanzi Li

Thus, we propose ARTE, dubbed Aligning TeacheR with StudenT PreferencEs, a framework that aligns the teacher model with student preferences to generate tailored training examples for Knowledge Distillation.

In-Context Learning Knowledge Distillation

SeaKR: Self-aware Knowledge Retrieval for Adaptive Retrieval Augmented Generation

1 code implementation27 Jun 2024 Zijun Yao, Weijian Qi, Liangming Pan, Shulin Cao, Linmei Hu, Weichuan Liu, Lei Hou, Juanzi Li

This paper introduces Self-aware Knowledge Retrieval (SeaKR), a novel adaptive RAG model that extracts self-aware uncertainty of LLMs from their internal states.

Question Answering RAG +2

Untangle the KNOT: Interweaving Conflicting Knowledge and Reasoning Skills in Large Language Models

1 code implementation4 Apr 2024 Yantao Liu, Zijun Yao, Xin Lv, Yuchen Fan, Shulin Cao, Jifan Yu, Lei Hou, Juanzi Li

However, knowledge in the document may conflict with the memory of LLMs due to outdated or incorrect knowledge in the LLMs' parameters.

Question Answering

KB-Plugin: A Plug-and-play Framework for Large Language Models to Induce Programs over Low-resourced Knowledge Bases

1 code implementation2 Feb 2024 Jiajie Zhang, Shulin Cao, Linmei Hu, Ling Feng, Lei Hou, Juanzi Li

Secondly, KB-Plugin utilizes abundant annotated data from a rich-resourced KB to train another pluggable module, namely PI plugin, which can help the LLM extract question-relevant schema information from the schema plugin of any KB and utilize this information to induce programs over this KB.

Program induction Self-Supervised Learning

Probabilistic Tree-of-thought Reasoning for Answering Knowledge-intensive Complex Questions

1 code implementation23 Nov 2023 Shulin Cao, Jiajie Zhang, Jiaxin Shi, Xin Lv, Zijun Yao, Qi Tian, Juanzi Li, Lei Hou

During reasoning, for leaf nodes, LLMs choose a more confident answer from Closed-book QA that employs parametric knowledge and Open-book QA that employs retrieved external knowledge, thus eliminating the negative retrieval problem.

Retrieval

VisKoP: Visual Knowledge oriented Programming for Interactive Knowledge Base Question Answering

no code implementations6 Jul 2023 Zijun Yao, Yuanyong Chen, Xin Lv, Shulin Cao, Amy Xin, Jifan Yu, Hailong Jin, Jianjun Xu, Peng Zhang, Lei Hou, Juanzi Li

We present Visual Knowledge oriented Programming platform (VisKoP), a knowledge base question answering (KBQA) system that integrates human into the loop to edit and debug the knowledge base (KB) queries.

Knowledge Base Question Answering Program induction +2

Reasoning over Hierarchical Question Decomposition Tree for Explainable Question Answering

no code implementations24 May 2023 Jiajie Zhang, Shulin Cao, Tingjia Zhang, Xin Lv, Jiaxin Shi, Qi Tian, Juanzi Li, Lei Hou

To facilitate reasoning, we propose a novel two-stage XQA framework, Reasoning over Hierarchical Question Decomposition Tree (RoHT).

Question Answering

GraphQ IR: Unifying the Semantic Parsing of Graph Query Languages with One Intermediate Representation

1 code implementation24 May 2022 Lunyiu Nie, Shulin Cao, Jiaxin Shi, Jiuding Sun, Qi Tian, Lei Hou, Juanzi Li, Jidong Zhai

Subject to the huge semantic gap between natural and formal languages, neural semantic parsing is typically bottlenecked by its complexity of dealing with both input semantics and output syntax.

Few-Shot Learning Semantic Parsing

Schema-Free Dependency Parsing via Sequence Generation

no code implementations28 Jan 2022 Boda Lin, Zijun Yao, Jiaxin Shi, Shulin Cao, Binghao Tang, Si Li, Yong Luo, Juanzi Li, Lei Hou

To remedy these drawbacks, we propose to achieve universal and schema-free Dependency Parsing (DP) via Sequence Generation (SG) DPSG by utilizing only the pre-trained language model (PLM) without any auxiliary structures or parsing algorithms.

Decoder Dependency Parsing +1

Program Transfer for Answering Complex Questions over Knowledge Bases

1 code implementation ACL 2022 Shulin Cao, Jiaxin Shi, Zijun Yao, Xin Lv, Jifan Yu, Lei Hou, Juanzi Li, Zhiyuan Liu, Jinghui Xiao

In this paper, we propose the approach of program transfer, which aims to leverage the valuable program annotations on the rich-resourced KBs as external supervision signals to aid program induction for the low-resourced KBs that lack program annotations.

Program induction Semantic Parsing

OpenKE: An Open Toolkit for Knowledge Embedding

1 code implementation EMNLP 2018 Xu Han, Shulin Cao, Xin Lv, Yankai Lin, Zhiyuan Liu, Maosong Sun, Juanzi Li

We release an open toolkit for knowledge embedding (OpenKE), which provides a unified framework and various fundamental models to embed knowledge graphs into a continuous low-dimensional space.

Information Retrieval Knowledge Graphs +3

Cannot find the paper you are looking for? You can Submit a new open access paper.