Search Results for author: Hansi Zeng

Found 18 papers, 10 papers with code

Can Pre-training Indicators Reliably Predict Fine-tuning Outcomes of LLMs?

no code implementations16 Apr 2025 Hansi Zeng, Kai Hui, Honglei Zhuang, Zhen Qin, Zhenrui Yue, Hamed Zamani, Dana Alon

While metrics available during pre-training, such as perplexity, correlate well with model performance in scaling-law studies, their predictive capacity at a fixed model size remains unclear, hindering effective model selection and development.
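For background, perplexity is the exponentiated mean negative log-likelihood a model assigns to held-out tokens. A minimal sketch of computing it with a Hugging Face causal LM (the checkpoint name is illustrative; any causal LM works the same way):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative checkpoint, not one from the paper.
name = "gpt2"
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name)
model.eval()

text = "Perplexity is the exponentiated mean negative log-likelihood."
ids = tok(text, return_tensors="pt").input_ids

with torch.no_grad():
    # Passing labels=input_ids returns the mean cross-entropy over tokens.
    loss = model(ids, labels=ids).loss

print(f"perplexity = {torch.exp(loss).item():.2f}")
```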

Model Selection

Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning

3 code implementations12 Mar 2025 Bowen Jin, Hansi Zeng, Zhenrui Yue, Jinsung Yoon, Sercan Arik, Dong Wang, Hamed Zamani, Jiawei Han

Efficiently acquiring external knowledge and up-to-date information is essential for effective reasoning and text generation in large language models (LLMs).

Question Answering RAG +3

ATEB: Evaluating and Improving Advanced NLP Tasks for Text Embedding Models

no code implementations24 Feb 2025 Simeng Han, Frank Palma Gomez, Tu Vu, Zefei Li, Daniel Cer, Hansi Zeng, Chris Tar, Arman Cohan, Gustavo Hernandez Abrego

We introduce a new benchmark designed to assess and highlight the limitations of embedding models trained on existing information retrieval data mixtures with respect to advanced capabilities, including factuality, safety, instruction following, reasoning, and document-level understanding.

Information Retrieval Instruction Following +3

Scaling Sparse and Dense Retrieval in Decoder-Only LLMs

2 code implementations21 Feb 2025 Hansi Zeng, Julian Killingback, Hamed Zamani

Scaling large language models (LLMs) has shown great potential for improving retrieval model performance; however, previous studies have mainly focused on dense retrieval trained with contrastive loss (CL), neglecting the scaling behavior of other retrieval paradigms and optimization techniques, such as sparse retrieval and knowledge distillation (KD).
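For context, the contrastive loss (CL) used to train dense retrievers is typically an InfoNCE objective over in-batch negatives. A minimal PyTorch sketch (batch size, embedding dimension, and temperature are illustrative assumptions):

```python
import torch
import torch.nn.functional as F

def info_nce(q, d, temperature=0.05):
    """In-batch-negative contrastive loss for dense retrieval.

    q: (B, H) query embeddings; d: (B, H) embeddings of each query's
    positive document. Every other document in the batch is a negative.
    """
    q = F.normalize(q, dim=-1)
    d = F.normalize(d, dim=-1)
    logits = q @ d.T / temperature      # (B, B) similarity matrix
    targets = torch.arange(q.size(0))   # positives lie on the diagonal
    return F.cross_entropy(logits, targets)

loss = info_nce(torch.randn(8, 128), torch.randn(8, 128))
```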

Decoder Knowledge Distillation +1

Hypencoder: Hypernetworks for Information Retrieval

no code implementations7 Feb 2025 Julian Killingback, Hansi Zeng, Hamed Zamani

To produce this small neural network, we use a hypernetwork, a network that produces the weights of other networks, as our query encoder.
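Conceptually, the query encoder emits the weights of a small per-query scorer that is then applied to document representations. A hedged sketch of the idea (layer shapes and dimensions are assumptions, not the paper's configuration):

```python
import torch
import torch.nn as nn

class HyperQueryEncoder(nn.Module):
    """Toy hypernetwork: maps a query embedding to the weights of a
    one-hidden-layer scorer applied to document embeddings."""

    def __init__(self, q_dim=768, d_dim=768, hidden=32):
        super().__init__()
        self.d_dim, self.hidden = d_dim, hidden
        n_params = d_dim * hidden + hidden + hidden  # W1, b1, w2
        self.hyper = nn.Linear(q_dim, n_params)

    def forward(self, q_emb, doc_embs):
        theta = self.hyper(q_emb)  # flat parameter vector for this query
        i = self.d_dim * self.hidden
        W1 = theta[:i].view(self.hidden, self.d_dim)
        b1 = theta[i:i + self.hidden]
        w2 = theta[i + self.hidden:]
        h = torch.relu(doc_embs @ W1.T + b1)  # (N, hidden)
        return h @ w2                         # one relevance score per doc

scores = HyperQueryEncoder()(torch.randn(768), torch.randn(10, 768))
```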

Information Retrieval Instruction Following +2

Inference Scaling for Long-Context Retrieval Augmented Generation

no code implementations6 Oct 2024 Zhenrui Yue, Honglei Zhuang, Aijun Bai, Kai Hui, Rolf Jagerman, Hansi Zeng, Zhen Qin, Dong Wang, Xuanhui Wang, Michael Bendersky

Our observations reveal that increasing inference computation leads to nearly linear gains in RAG performance when optimally allocated, a relationship we describe as the inference scaling laws for RAG.
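A toy illustration of fitting such a scaling relation, regressing performance on log inference compute (the data points and functional form below are made up purely for illustration, not results from the paper):

```python
import numpy as np

# Hypothetical (inference FLOPs, RAG accuracy) observations.
flops = np.array([1e9, 1e10, 1e11, 1e12, 1e13])
acc = np.array([0.41, 0.48, 0.56, 0.62, 0.69])

# Linear trend in log-compute.
slope, intercept = np.polyfit(np.log10(flops), acc, deg=1)
print(f"acc ~= {slope:.3f} * log10(FLOPs) + {intercept:.3f}")
```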

In-Context Learning RAG +2

Simulating Task-Oriented Dialogues with State Transition Graphs and Large Language Models

1 code implementation23 Apr 2024 Chris Samarinas, Pracha Promthaw, Atharva Nijasure, Hansi Zeng, Julian Killingback, Hamed Zamani

In our experiments, using graph-guided response simulations leads to significant improvements in intent classification, slot filling, and response relevance compared to naive single-prompt simulated conversations.
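As a hedged illustration of the idea, a state transition graph can be encoded as an adjacency map and random-walked to produce a dialogue skeleton for an LLM to fill in; the states and transitions below are hypothetical, not the paper's actual graphs:

```python
import random

# Hypothetical restaurant-booking domain graph.
graph = {
    "greet": ["ask_cuisine"],
    "ask_cuisine": ["ask_area", "recommend"],
    "ask_area": ["recommend"],
    "recommend": ["confirm_booking", "ask_cuisine"],
    "confirm_booking": ["end"],
}

def sample_dialogue_path(graph, start="greet", end="end"):
    """Random walk over the graph yielding a dialogue-state skeleton."""
    path, state = [start], start
    while state != end:
        state = random.choice(graph[state])
        path.append(state)
    return path

print(sample_dialogue_path(graph))
```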

Conversational Question Answering Dialogue State Tracking +8

Planning Ahead in Generative Retrieval: Guiding Autoregressive Generation through Simultaneous Decoding

1 code implementation22 Apr 2024 Hansi Zeng, Chen Luo, Hamed Zamani

This paper introduces PAG, a novel optimization and decoding approach that guides autoregressive generation of document identifiers in generative retrieval models through simultaneous decoding.
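As generic background (not PAG's simultaneous-decoding scheme itself), generative retrieval typically constrains autoregressive decoding so that only prefixes of real document identifiers can be emitted, commonly via a prefix trie over docid token sequences. A minimal sketch:

```python
def build_trie(docids):
    """docids: list of token-id tuples, one per document."""
    root = {}
    for seq in docids:
        node = root
        for tok in seq:
            node = node.setdefault(tok, {})
    return root

def allowed_next_tokens(trie, prefix):
    """Tokens the decoder may emit after `prefix` (used to mask logits)."""
    node = trie
    for tok in prefix:
        node = node.get(tok, {})
    return list(node.keys())

trie = build_trie([(5, 2, 9), (5, 2, 7), (5, 3, 1)])
print(allowed_next_tokens(trie, (5, 2)))  # -> [9, 7]
```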

Information Retrieval Retrieval

Towards Unified Multi-Modal Personalization: Large Vision-Language Models for Generative Recommendation and Beyond

1 code implementation15 Mar 2024 Tianxin Wei, Bowen Jin, Ruirui Li, Hansi Zeng, Zhengyang Wang, Jianhui Sun, Qingyu Yin, Hanqing Lu, Suhang Wang, Jingrui He, Xianfeng Tang

Developing a universal model that can effectively harness heterogeneous resources and respond to a wide range of personalized needs has been a longstanding community aspiration.

Explanation Generation Image Generation

Scalable and Effective Generative Information Retrieval

3 code implementations15 Nov 2023 Hansi Zeng, Chen Luo, Bowen Jin, Sheikh Muhammad Sarwar, Tianxin Wei, Hamed Zamani

This paper represents an important milestone in generative retrieval research by showing, for the first time, that generative retrieval models can be trained to perform effectively on large-scale standard retrieval benchmarks.

Information Retrieval Retrieval

Language Models As Semantic Indexers

1 code implementation11 Oct 2023 Bowen Jin, Hansi Zeng, Guoyin Wang, Xiusi Chen, Tianxin Wei, Ruirui Li, Zhengyang Wang, Zheng Li, Yang Li, Hanqing Lu, Suhang Wang, Jiawei Han, Xianfeng Tang

A semantic identifier (ID) is an important concept in information retrieval that aims to preserve the semantics of objects, such as documents and items, inside their IDs.
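One common way to obtain semantic IDs (generic background, not necessarily this paper's method) is hierarchical k-means over document embeddings, so that ID prefixes encode coarse-to-fine semantics. A toy sketch:

```python
import numpy as np
from sklearn.cluster import KMeans

def semantic_ids(embs, k=4, depth=2, prefix=()):
    """Recursively cluster embeddings; each doc's ID is its cluster path."""
    ids = {i: prefix for i in range(len(embs))}
    if depth == 0 or len(embs) < k:
        return ids
    labels = KMeans(n_clusters=k, n_init=10).fit_predict(embs)
    for c in range(k):
        members = np.where(labels == c)[0]
        sub = semantic_ids(embs[members], k, depth - 1, prefix + (c,))
        for local, path in sub.items():
            ids[members[local]] = path
    return ids

ids = semantic_ids(np.random.rand(100, 32))
print(ids[0])  # e.g. (2, 1): coarse cluster 2, sub-cluster 1
```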

Contrastive Learning Information Retrieval +2

Soft Prompt Decoding for Multilingual Dense Retrieval

no code implementations15 May 2023 Zhiqi Huang, Hansi Zeng, Hamed Zamani, James Allan

In this work, we explore a Multilingual Information Retrieval (MLIR) task, where the collection includes documents in multiple languages.

Cross-Lingual Information Retrieval Knowledge Distillation +1

A Personalized Dense Retrieval Framework for Unified Information Access

1 code implementation26 Apr 2023 Hansi Zeng, Surya Kallumadi, Zaid Alibadi, Rodrigo Nogueira, Hamed Zamani

Developing a universal model that can efficiently and effectively respond to a wide range of information access requests -- from retrieval to recommendation to question answering -- has been a long-standing goal in the information retrieval community.

Information Retrieval Question Answering +1

Curriculum Learning for Dense Retrieval Distillation

1 code implementation28 Apr 2022 Hansi Zeng, Hamed Zamani, Vishwa Vinay

Recent work has shown that more effective dense retrieval models can be obtained by distilling ranking knowledge from an existing base re-ranking model.
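Ranking distillation of this kind is often implemented with a Margin-MSE objective, matching the student's positive-negative score margin to a frozen teacher's; a minimal sketch of that common formulation (not necessarily the exact loss used in this paper):

```python
import torch
import torch.nn.functional as F

def margin_mse(student_pos, student_neg, teacher_pos, teacher_neg):
    """Match the student's score margin to the (frozen) teacher's margin."""
    return F.mse_loss(student_pos - student_neg,
                      teacher_pos - teacher_neg)

loss = margin_mse(torch.randn(16), torch.randn(16),
                  torch.randn(16), torch.randn(16))
```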

Knowledge Distillation Passage Retrieval +2

Understanding the Effectiveness of Reviews in E-commerce Top-N Recommendation

1 code implementation17 Jun 2021 Zhichao Xu, Hansi Zeng, Qingyao Ai

We find that models utilizing only review information cannot achieve better performance than the vanilla implicit-feedback matrix factorization method.
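For reference, a vanilla implicit-feedback matrix factorization baseline scores a user-item pair with a dot product of learned embeddings, commonly trained with a BPR-style pairwise loss; a minimal sketch (hyperparameters illustrative):

```python
import torch
import torch.nn as nn

class BPRMF(nn.Module):
    """Vanilla matrix factorization for implicit feedback (BPR-style)."""

    def __init__(self, n_users, n_items, dim=64):
        super().__init__()
        self.user = nn.Embedding(n_users, dim)
        self.item = nn.Embedding(n_items, dim)

    def forward(self, u, pos, neg):
        # Push the interacted item's score above a sampled negative's.
        s_pos = (self.user(u) * self.item(pos)).sum(-1)
        s_neg = (self.user(u) * self.item(neg)).sum(-1)
        return -torch.log(torch.sigmoid(s_pos - s_neg)).mean()

model = BPRMF(1000, 5000)
loss = model(torch.tensor([0, 1]), torch.tensor([10, 20]),
             torch.tensor([30, 40]))
```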
