Search Results for author: Peitian Zhang

Found 22 papers, 15 papers with code

Does RAG Really Perform Bad For Long-Context Processing?

no code implementations • 17 Feb 2025 • Kun Luo, Zheng Liu, Peitian Zhang, Hongjin Qian, Jun Zhao, Kang Liu

The efficient processing of long context poses a serious challenge for large language models (LLMs).

RAG Retrieval

Search-o1: Agentic Search-Enhanced Large Reasoning Models

1 code implementation • 9 Jan 2025 • Xiaoxi Li, Guanting Dong, Jiajie Jin, Yuyao Zhang, Yujia Zhou, Yutao Zhu, Peitian Zhang, Zhicheng Dou

To address this limitation, we introduce Search-o1, a framework that enhances LRMs with an agentic retrieval-augmented generation (RAG) mechanism and a Reason-in-Documents module for refining retrieved documents.

Code Generation +4
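The agentic retrieval idea above can be illustrated with a toy loop (purely a sketch; the names `reason`, `retrieve`, and `refine` are illustrative, not the paper's API): the reasoner may emit a search action mid-chain, and the retrieved documents are condensed before re-entering the reasoning trace.

```python
# Toy agentic-RAG loop in the spirit of Search-o1 (illustrative only).
CORPUS = {
    "llama": "Llama is a family of open-weight language models.",
    "rope": "RoPE encodes positions by rotating query/key vectors.",
}

def retrieve(query):
    """Return corpus entries whose key appears in the query."""
    return [text for key, text in CORPUS.items() if key in query.lower()]

def refine(docs, query):
    # Stand-in for a Reason-in-Documents step: keep only documents
    # that mention at least one query term.
    words = query.lower().split()
    return " ".join(d for d in docs if any(w in d.lower() for w in words))

def reason(question):
    steps = []
    if "rope" in question.lower():  # the "agent" decides a search is needed
        evidence = refine(retrieve(question), question)
        steps.append(f"[evidence] {evidence}")
    steps.append(f"[answer] based on {len(steps)} retrieval step(s)")
    return steps

print(reason("What does RoPE do?"))
```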

Boosting Long-Context Management via Query-Guided Activation Refilling

no code implementations • 17 Dec 2024 • Hongjin Qian, Zheng Liu, Peitian Zhang, Zhicheng Dou, Defu Lian

ACRE constructs a Bi-layer KV Cache for long contexts, where the layer-1 (L1) cache compactly captures global information, and the layer-2 (L2) cache provides detailed and localized information.

Management
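A minimal sketch of the bi-layer cache idea described above (hypothetical structure, not ACRE's implementation): the L1 layer keeps one compact summary per chunk for a global view, the L2 layer keeps the full per-chunk entries, and a query refills only the most relevant chunks from L2.

```python
# Toy bi-layer cache: L1 = compact global summaries, L2 = detailed entries.
class BiLayerKVCache:
    def __init__(self, chunk_size=4):
        self.chunk_size = chunk_size
        self.l1 = []  # one summary per chunk (global information)
        self.l2 = []  # full token lists per chunk (localized information)

    def ingest(self, tokens):
        for i in range(0, len(tokens), self.chunk_size):
            chunk = tokens[i:i + self.chunk_size]
            self.l2.append(chunk)
            self.l1.append(set(chunk))  # toy "summary": the chunk's vocabulary

    def refill(self, query_terms, top_k=1):
        # Score chunks by overlap between query and L1 summary,
        # then pull detailed L2 entries for the best chunks only.
        order = sorted(
            range(len(self.l1)),
            key=lambda i: len(self.l1[i] & set(query_terms)),
            reverse=True,
        )
        return [self.l2[i] for i in order[:top_k]]

cache = BiLayerKVCache(chunk_size=3)
cache.ingest(["the", "cat", "sat", "on", "the", "mat", "dogs", "bark", "loud"])
print(cache.refill(["dogs", "bark"]))  # -> [['dogs', 'bark', 'loud']]
```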

MemoRAG: Moving towards Next-Gen RAG Via Memory-Inspired Knowledge Discovery

1 code implementation • 9 Sep 2024 • Hongjin Qian, Peitian Zhang, Zheng Liu, Kelong Mao, Zhicheng Dou

Retrieval-Augmented Generation (RAG) leverages retrieval tools to access external databases, thereby enhancing the generation quality of large language models (LLMs) through optimized context.

Memorization Question Answering +2

Compressing Lengthy Context With UltraGist

1 code implementation • 26 May 2024 • Peitian Zhang, Zheng Liu, Shitao Xiao, Ninglu Shao, Qiwei Ye, Zhicheng Dou

Compressing lengthy context is a critical but technically challenging problem.

Few-Shot Learning

Are Long-LLMs A Necessity For Long-Context Tasks?

no code implementations • 24 May 2024 • Hongjin Qian, Zheng Liu, Peitian Zhang, Kelong Mao, Yujia Zhou, Xu Chen, Zhicheng Dou

The learning and deployment of long-LLMs remain challenging despite recent progress.

Extending Llama-3's Context Ten-Fold Overnight

1 code implementation • 30 Apr 2024 • Peitian Zhang, Ninglu Shao, Zheng Liu, Shitao Xiao, Hongjin Qian, Qiwei Ye, Zhicheng Dou

We extend the context length of Llama-3-8B-Instruct from 8K to 80K via QLoRA fine-tuning.

8k Retrieval
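A numeric sketch of the RoPE intuition behind context extension of this kind (purely illustrative; the snippet above does not state whether or how this paper adjusts the RoPE base, and the numbers below are made up): each dimension pair rotates by an angle proportional to position, and long-context recipes commonly enlarge the base so distant positions rotate more slowly and stay distinguishable.

```python
# RoPE rotates dimension pair i at position `pos` by
#   theta = pos * base**(-2*i/d_model).
# A larger base shrinks the angle, slowing rotation for far positions.

def rope_angle(pos, pair_index, d_model, base):
    """Rotation angle applied to dimension pair `pair_index` at `pos`."""
    return pos * base ** (-2 * pair_index / d_model)

a_default = rope_angle(80_000, pair_index=16, d_model=128, base=10_000.0)
a_enlarged = rope_angle(80_000, pair_index=16, d_model=128, base=100_000.0)
print(a_default, a_enlarged)  # the enlarged base yields a smaller angle
```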

From Matching to Generation: A Survey on Generative Information Retrieval

1 code implementation • 23 Apr 2024 • Xiaoxi Li, Jiajie Jin, Yujia Zhou, Yuyao Zhang, Peitian Zhang, Yutao Zhu, Zhicheng Dou

We summarize the advancements in GR regarding model training, document identifiers, incremental learning, downstream task adaptation, multi-modal GR, and generative recommendation, as well as progress in reliable response generation: internal knowledge memorization, external knowledge augmentation, response generation with citations, and personal information assistants.

Incremental Learning Information Retrieval +6

Extensible Embedding: A Flexible Multiplier For LLM's Context Length

no code implementations • 18 Feb 2024 • Ninglu Shao, Shitao Xiao, Zheng Liu, Peitian Zhang

2) Strong sample efficiency of training, which enables the embedding model to be learned in a cost-effective way.

Language Modeling Language Modelling

BGE M3-Embedding: Multi-Lingual, Multi-Functionality, Multi-Granularity Text Embeddings Through Self-Knowledge Distillation

3 code implementations • 5 Feb 2024 • Jianlv Chen, Shitao Xiao, Peitian Zhang, Kun Luo, Defu Lian, Zheng Liu

It can simultaneously perform the three common retrieval functionalities of an embedding model: dense retrieval, multi-vector retrieval, and sparse retrieval, providing a unified model foundation for real-world IR applications.

Retrieval Self-Knowledge Distillation
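The three retrieval functionalities can be illustrated with toy scoring functions (a sketch of the scoring conventions, not the BGE M3 implementation; all vectors and weights below are invented): dense retrieval scores one vector per text, sparse retrieval scores per-term weights, and multi-vector retrieval uses ColBERT-style late interaction.

```python
# Toy versions of the three retrieval modes a single model can expose.

def dense_score(q, d):
    """Dot product between single query/document vectors."""
    return sum(a * b for a, b in zip(q, d))

def sparse_score(q, d):
    """Dot product over shared terms of two term->weight maps."""
    return sum(w * d.get(t, 0.0) for t, w in q.items())

def multi_vector_score(q_vecs, d_vecs):
    """Late interaction: each query vector takes its best document match."""
    return sum(max(dense_score(qv, dv) for dv in d_vecs) for qv in q_vecs)

q_dense, d_dense = [0.6, 0.8], [0.8, 0.6]
q_sparse, d_sparse = {"rope": 1.2, "llama": 0.5}, {"rope": 0.9}
q_multi, d_multi = [[1.0, 0.0], [0.0, 1.0]], [[0.9, 0.1], [0.2, 0.8]]

print(dense_score(q_dense, d_dense))         # 0.96
print(sparse_score(q_sparse, d_sparse))      # 1.08
print(multi_vector_score(q_multi, d_multi))  # ~1.7 (0.9 + 0.8)
```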

Flexibly Scaling Large Language Models Contexts Through Extensible Tokenization

1 code implementation • 15 Jan 2024 • Ninglu Shao, Shitao Xiao, Zheng Liu, Peitian Zhang

Extensible Tokenization stands as middleware between the tokenized context and the LLM, transforming the raw token embeddings into extensible embeddings.

Few-Shot Learning Language Modeling +1

INTERS: Unlocking the Power of Large Language Models in Search with Instruction Tuning

1 code implementation • 12 Jan 2024 • Yutao Zhu, Peitian Zhang, Chenghao Zhang, Yifei Chen, Binyu Xie, Zheng Liu, Ji-Rong Wen, Zhicheng Dou

Despite this, their application to information retrieval (IR) tasks is still challenging due to the infrequent occurrence of many IR-specific concepts in natural language.

Diversity document understanding +3

Long Context Compression with Activation Beacon

1 code implementation • 7 Jan 2024 • Peitian Zhang, Zheng Liu, Shitao Xiao, Ninglu Shao, Qiwei Ye, Zhicheng Dou

In this paper, we propose Activation Beacon, a plug-in module for transformer-based LLMs that targets effective, efficient, and flexible compression of long contexts.

4k document understanding +2
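The compression idea can be sketched numerically (a naive stand-in: the snippet above says the module is learned, whereas the toy below just mean-pools each window of activations into one slot so the cache grows sublinearly with context length):

```python
# Condense each window of `ratio` activation vectors into one compact slot.
def compress_window(activations, ratio):
    """Mean-pool every `ratio` consecutive vectors into a single vector."""
    out = []
    for i in range(0, len(activations), ratio):
        window = activations[i:i + ratio]
        dim = len(window[0])
        out.append([sum(v[j] for v in window) / len(window) for j in range(dim)])
    return out

# 8 two-dimensional "activations" compressed 4x down to 2 slots.
acts = [[float(i), float(-i)] for i in range(8)]
compressed = compress_window(acts, ratio=4)
print(compressed)  # [[1.5, -1.5], [5.5, -5.5]]
```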

LM-Cocktail: Resilient Tuning of Language Models via Model Merging

1 code implementation • 22 Nov 2023 • Shitao Xiao, Zheng Liu, Peitian Zhang, Xingrun Xing

Despite its simplicity, LM-Cocktail is surprisingly effective: the resulting model achieves strong empirical performance across the whole scope of general tasks while preserving a superior capacity in its targeted domain.

Language Modeling Language Modelling +1
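The core model-merging recipe can be sketched as a weighted average of parameters (a minimal sketch; the parameter names and weights below are invented, and the paper's actual weighting strategy is not shown in the snippet above):

```python
# Merge models by averaging their parameters with given mixture weights.
def merge_models(models, weights):
    """Weighted average of same-shaped parameter dicts; weights sum to 1."""
    assert abs(sum(weights) - 1.0) < 1e-9
    merged = {}
    for name in models[0]:
        merged[name] = sum(w * m[name] for m, w in zip(models, weights))
    return merged

base = {"layer.0.w": 1.0, "layer.1.w": -2.0}    # hypothetical base model
tuned = {"layer.0.w": 3.0, "layer.1.w": 0.0}    # hypothetical fine-tuned model
merged = merge_models([base, tuned], weights=[0.5, 0.5])
print(merged)  # {'layer.0.w': 2.0, 'layer.1.w': -1.0}
```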

Retrieve Anything To Augment Large Language Models

1 code implementation • 11 Oct 2023 • Peitian Zhang, Shitao Xiao, Zheng Liu, Zhicheng Dou, Jian-Yun Nie

On the other hand, task-specific retrievers lack the required versatility, hindering their performance across diverse retrieval-augmentation scenarios.

Knowledge Distillation Retrieval

C-Pack: Packed Resources For General Chinese Embeddings

2 code implementations • 14 Sep 2023 • Shitao Xiao, Zheng Liu, Peitian Zhang, Niklas Muennighoff, Defu Lian, Jian-Yun Nie

Along with our resources on general Chinese embedding, we release our data and models for English text embeddings.

Generative Retrieval via Term Set Generation

1 code implementation • 23 May 2023 • Peitian Zhang, Zheng Liu, Yujia Zhou, Zhicheng Dou, Fangchao Liu, Zhao Cao

On top of the term-set DocID, we propose a permutation-invariant decoding algorithm, with which the term set can be generated in any permutation yet will always lead to the corresponding document.

Information Retrieval Natural Questions +1
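The permutation invariance of a term-set DocID can be shown with a toy resolver (illustrative only, not the paper's constrained decoding algorithm; the documents and terms below are made up): because the identifier is a set, any generation order of its terms resolves to the same document.

```python
# Each document is identified by a SET of terms rather than a sequence.
DOCIDS = {
    frozenset({"neural", "news", "recommendation"}): "doc-A",
    frozenset({"dense", "retrieval", "index"}): "doc-B",
}

def decode(terms):
    """Resolve a generated term sequence to a document, order-independently."""
    generated = frozenset(terms)
    matches = [doc for ts, doc in DOCIDS.items() if ts == generated]
    return matches[0] if matches else None

# Two different permutations of the same term set hit the same document.
print(decode(["news", "neural", "recommendation"]))   # doc-A
print(decode(["recommendation", "news", "neural"]))   # doc-A
```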

Hybrid Inverted Index Is a Robust Accelerator for Dense Retrieval

1 code implementation • 11 Oct 2022 • Peitian Zhang, Zheng Liu, Shitao Xiao, Zhicheng Dou, Jing Yao

Based on comprehensive experiments on popular retrieval benchmarks, we verify that clusters and terms indeed complement each other, enabling HI² to achieve lossless retrieval quality with competitive efficiency across various index settings.

Knowledge Distillation Quantization +1
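How clusters and terms can complement each other is easy to sketch (a hypothetical index layout, not the paper's; the cluster assignments and postings below are invented): documents are reachable both through embedding-cluster posting lists and through term posting lists, and the union of the two candidate sets covers documents either route alone would miss.

```python
# Two complementary routes into the same corpus.
cluster_index = {0: {"d1", "d2"}, 1: {"d3"}}          # cluster id -> docs
term_index = {"rope": {"d2", "d4"}, "cache": {"d3"}}  # term -> docs

def retrieve_candidates(query_clusters, query_terms):
    """Union of candidates from cluster and term posting lists."""
    cands = set()
    for c in query_clusters:
        cands |= cluster_index.get(c, set())
    for t in query_terms:
        cands |= term_index.get(t, set())
    return cands

# The term route surfaces d4, which the cluster route alone would miss.
print(sorted(retrieve_candidates(query_clusters=[0], query_terms=["rope"])))
# -> ['d1', 'd2', 'd4']
```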

Ultron: An Ultimate Retriever on Corpus with a Model-based Indexer

no code implementations • 19 Aug 2022 • Yujia Zhou, Jing Yao, Zhicheng Dou, Ledell Wu, Peitian Zhang, Ji-Rong Wen

In order to unify these two stages, we explore a model-based indexer for document retrieval.

Retrieval

Learning to Select Historical News Articles for Interaction based Neural News Recommendation

no code implementations • 13 Oct 2021 • Peitian Zhang, Zhicheng Dou, Jing Yao

The key to personalized news recommendation is to match the user's interests with the candidate news precisely and efficiently.

News Recommendation
