Search Results for author: Zhan Su

Found 8 papers, 8 papers with code

Aligning Query Representation with Rewritten Query and Relevance Judgments in Conversational Search

1 code implementation • 29 Jul 2024 • Fengran Mo, Chen Qu, Kelong Mao, Yihong Wu, Zhan Su, Kaiyu Huang, Jian-Yun Nie

In this paper, we leverage both rewritten queries and relevance judgments in the conversational search data to train a better query representation model.

Conversational Search
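
A minimal sketch of the alignment idea in the entry above, assuming a combined objective: pull the session-based query embedding toward the embedding of its rewritten, self-contained query, and score judged-relevant passages above negatives. The encoder outputs, loss weighting, and function names below are illustrative assumptions, not the paper's released code.

```python
# Hypothetical sketch: align a conversational query representation with
# (a) the embedding of its rewritten query and (b) relevance judgments.
import torch
import torch.nn.functional as F

def alignment_loss(session_q, rewritten_q, pos_passage, neg_passages, alpha=0.5):
    """session_q, rewritten_q, pos_passage: [d]; neg_passages: [k, d]."""
    # (a) pull the session query toward the rewritten query
    rewrite_loss = 1.0 - F.cosine_similarity(session_q, rewritten_q, dim=0)

    # (b) contrastive loss from relevance judgments: positive passage vs. negatives
    candidates = torch.cat([pos_passage.unsqueeze(0), neg_passages], dim=0)  # [k+1, d]
    scores = candidates @ session_q                                          # [k+1]
    rank_loss = F.cross_entropy(scores.unsqueeze(0), torch.tensor([0]))

    return alpha * rewrite_loss + (1 - alpha) * rank_loss

# Toy usage with random vectors standing in for encoder outputs.
d = 8
loss = alignment_loss(torch.randn(d), torch.randn(d),
                      torch.randn(d), torch.randn(3, d))
print(loss.item())
```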

Mixture of Latent Experts Using Tensor Products

1 code implementation • 26 May 2024 • Zhan Su, Fengran Mo, Prayag Tiwari, Benyou Wang, Jian-Yun Nie, Jakob Grue Simonsen

For the routing function, we tailor two innovative routing functions according to granularity: TensorPoly-I routes to each rank within the entangled tensor, while TensorPoly-II offers a finer-grained routing approach targeting each order of the entangled tensor.

Language Modelling • Multi-Task Learning • +1
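
A rough sketch of rank-level routing in the spirit of the TensorPoly-I description above, assuming the module's weight update is assembled from R rank-1 components and a learned router weights each rank per input. The class name, shapes, and initialization are illustrative assumptions, not the released implementation.

```python
# Hypothetical sketch of rank-level routing: the update is a sum of R rank-1
# components u_r v_r^T, and a router produces per-example weights over ranks.
import torch
import torch.nn as nn

class RankRoutedLayer(nn.Module):
    def __init__(self, d_in, d_out, rank):
        super().__init__()
        self.U = nn.Parameter(torch.randn(rank, d_in) * 0.02)   # rank-1 factors (in)
        self.V = nn.Parameter(torch.randn(rank, d_out) * 0.02)  # rank-1 factors (out)
        self.router = nn.Linear(d_in, rank)                     # routes to each rank

    def forward(self, x):                                       # x: [batch, d_in]
        gates = torch.softmax(self.router(x), dim=-1)           # [batch, rank]
        # project the input through each rank-1 component: (x . u_r) * v_r
        per_rank = torch.einsum('bi,ri->br', x, self.U)         # [batch, rank]
        out = torch.einsum('br,br,ro->bo', gates, per_rank, self.V)
        return out                                              # [batch, d_out]

layer = RankRoutedLayer(d_in=16, d_out=8, rank=4)
print(layer(torch.randn(2, 16)).shape)   # torch.Size([2, 8])
```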

Towards Modular LLMs by Building and Reusing a Library of LoRAs

1 code implementation • 18 May 2024 • Oleksiy Ostapenko, Zhan Su, Edoardo Maria Ponti, Laurent Charlin, Nicolas Le Roux, Matheus Pereira, Lucas Caccia, Alessandro Sordoni

The growing number of parameter-efficient adaptations of a base large language model (LLM) calls for studying whether we can reuse such trained adapters to improve performance for new tasks.

Language Modelling • Large Language Model
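
A hedged sketch of the "build a library and reuse it" idea from the entry above: keep task-specific LoRA factors in a library, then combine them (here by simple averaging over a selected subset, which is only one of several possible reuse rules) to initialize adaptation to a new task. The storage format, merge rule, and random stand-ins for trained factors are assumptions.

```python
# Hypothetical sketch: maintain a library of per-task LoRA adapters and reuse
# them for a new task by merging their low-rank weight updates.
import numpy as np

class LoRAAdapter:
    def __init__(self, d, r, rng):
        # Random stand-ins for trained factors (real LoRA initializes B to zero
        # before training; here we pretend training has already happened).
        self.A = rng.normal(scale=0.02, size=(r, d))   # down-projection
        self.B = rng.normal(scale=0.02, size=(d, r))   # up-projection

    def delta(self):
        return self.B @ self.A                          # low-rank weight update

rng = np.random.default_rng(0)
library = {f"task_{i}": LoRAAdapter(d=32, r=4, rng=rng) for i in range(3)}

# Reuse: average the deltas of a chosen subset of library adapters as a
# starting point for a new task, instead of adapting the base model from scratch.
selected = ["task_0", "task_2"]
merged_delta = np.mean([library[name].delta() for name in selected], axis=0)

W_base = rng.normal(size=(32, 32))
W_new_task_init = W_base + merged_delta
print(W_new_task_init.shape)
```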

Language Modeling Using Tensor Trains

1 code implementation • 7 May 2024 • Zhan Su, Yuqin Zhou, Fengran Mo, Jakob Grue Simonsen

We propose a novel tensor network language model based on the simplest tensor network (i.e., tensor trains), called the Tensor Train Language Model (TTLM).

Language Modelling
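
A minimal numerical sketch of scoring sequences with a tensor-train (matrix-product) parameterization, as referenced in the entry above: each token selects a small core matrix, and the product of the selected cores is contracted with boundary vectors. The dimensions, random initialization, and softmax readout are assumptions for illustration; there is no training loop here.

```python
# Hypothetical sketch of a tensor-train (matrix-product) language model score:
# each vocabulary item indexes a [bond, bond] core, and a sequence's score is
# the contraction of the selected cores between two boundary vectors.
import numpy as np

rng = np.random.default_rng(0)
vocab_size, bond_dim = 5, 3

cores = rng.normal(scale=0.5, size=(vocab_size, bond_dim, bond_dim))
left = rng.normal(size=bond_dim)
right = rng.normal(size=bond_dim)

def tt_score(token_ids):
    """Contract the train of cores selected by the token sequence."""
    state = left
    for t in token_ids:
        state = state @ cores[t]          # [bond] @ [bond, bond] -> [bond]
    return float(state @ right)

def next_token_distribution(prefix_ids):
    """Softmax over the scores of each candidate next token."""
    scores = np.array([tt_score(prefix_ids + [v]) for v in range(vocab_size)])
    exp = np.exp(scores - scores.max())
    return exp / exp.sum()

print(next_token_distribution([1, 3, 0]))
```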

History-Aware Conversational Dense Retrieval

1 code implementation • 30 Jan 2024 • Fengran Mo, Chen Qu, Kelong Mao, Tianyu Zhu, Zhan Su, Kaiyu Huang, Jian-Yun Nie

To address the aforementioned issues, we propose a History-Aware Conversational Dense Retrieval (HAConvDR) system, which incorporates two ideas: context-denoised query reformulation and automatic mining of supervision signals based on the actual impact of historical turns.

Conversational Search • Information Retrieval • +1
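
A very rough sketch of the "impact of historical turns" idea mentioned above: score each past turn by how much appending it changes retrieval of a judged-relevant passage, then keep only helpful turns when reformulating the query. The scoring rule, the stub encoder, and the concatenation-based reformulation are assumptions, not the HAConvDR system itself.

```python
# Hypothetical sketch: mine supervision from the impact of historical turns by
# measuring how much each turn improves retrieval of a judged-relevant passage,
# then keep only the helpful turns as denoised context for reformulation.
import numpy as np

def encode(text):
    """Stub encoder: deterministic random vector per string (illustrative only)."""
    seed = abs(hash(text)) % (2**32)
    return np.random.default_rng(seed).normal(size=16)

def retrieval_score(query_texts, positive_passage):
    q = np.mean([encode(t) for t in query_texts], axis=0)
    p = encode(positive_passage)
    return float(q @ p / (np.linalg.norm(q) * np.linalg.norm(p)))

history = ["what is dense retrieval", "how is it trained", "what about chit-chat"]
current = "does it work for conversations"
positive = "conversational dense retrieval trains a query encoder on sessions"

base = retrieval_score([current], positive)
impacts = {turn: retrieval_score([current, turn], positive) - base for turn in history}
useful_turns = [t for t, gain in impacts.items() if gain > 0]   # pseudo supervision
reformulated_query = " ".join(useful_turns + [current])          # denoised context
print(impacts, reformulated_query, sep="\n")
```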

Multi-Head Adapter Routing for Cross-Task Generalization

1 code implementation • NeurIPS 2023 • Lucas Caccia, Edoardo Ponti, Zhan Su, Matheus Pereira, Nicolas Le Roux, Alessandro Sordoni

We find that routing is most beneficial during multi-task pre-training rather than during few-shot adaptation, and propose MHR-μ, which discards routing and fine-tunes the average of the pre-trained adapters on each downstream task.

parameter-efficient fine-tuning
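
A short sketch of the MHR-μ recipe as stated in the entry above: after multi-task pre-training, drop the router, average the pre-trained adapters, and fine-tune that single averaged adapter separately on each downstream task. The adapter shape and the placeholder "fine-tuning" loop are assumptions.

```python
# Hypothetical sketch of MHR-mu as described above: discard routing, average
# the adapters from multi-task pre-training, and fine-tune the average per task.
import numpy as np

rng = np.random.default_rng(0)
num_pretrained_adapters, d = 4, 16

# Adapters obtained from multi-task pre-training (stubbed as random matrices).
pretrained = [rng.normal(scale=0.02, size=(d, d)) for _ in range(num_pretrained_adapters)]

# No routing at adaptation time: start from the uniform average of the adapters.
averaged_adapter = np.mean(pretrained, axis=0)

def finetune(adapter, task_gradient, lr=0.1, steps=3):
    """Placeholder fine-tuning loop: a few gradient-like updates per task."""
    for _ in range(steps):
        adapter = adapter - lr * task_gradient(adapter)
    return adapter

# One separate copy is fine-tuned for each downstream task.
fake_task_gradients = [lambda W: 0.01 * W, lambda W: -0.01 * W]
per_task_adapters = [finetune(averaged_adapter.copy(), g) for g in fake_task_gradients]
print(per_task_adapters[0].shape, per_task_adapters[1].shape)
```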

A Generalized Language Model in Tensor Space

1 code implementation • 31 Jan 2019 • Lipeng Zhang, Peng Zhang, Xindian Ma, Shuqin Gu, Zhan Su, Dawei Song

Theoretically, we prove that such tensor representation is a generalization of the n-gram language model.

Language Modelling • Tensor Decomposition • +1
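
One way to read the "generalization of the n-gram language model" claim above (the paper's exact construction and notation may differ): write the probability of a length-n window as a contraction of an order-n tensor with one-hot word vectors; filling the tensor with empirical n-gram probabilities recovers the n-gram model, while a low-rank factorization of the same tensor gives a generalized, parameter-shared model.

```latex
% Illustrative reading, not necessarily the paper's exact notation.
\[
p(w_1, \dots, w_n)
  \;=\; \mathcal{A} \times_1 e_{w_1} \times_2 e_{w_2} \cdots \times_n e_{w_n}
  \;=\; \mathcal{A}_{w_1 w_2 \cdots w_n},
\]
\[
\text{n-gram case: } \mathcal{A}_{i_1 \cdots i_n} = \hat{p}(i_1, \dots, i_n)
\quad\text{(empirical estimates)};
\qquad
\text{generalized case: } \mathcal{A} \approx \sum_{r=1}^{R}
  \lambda_r \, a^{(1)}_r \otimes \cdots \otimes a^{(n)}_r .
\]
```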

A Quantum Many-body Wave Function Inspired Language Modeling Approach

1 code implementation • 28 Aug 2018 • Peng Zhang, Zhan Su, Lipeng Zhang, Benyou Wang, Dawei Song

The recently proposed quantum language model (QLM) aims at a principled approach to modeling term dependencies by applying quantum probability theory.

Language Modelling • Question Answering • +2
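
A toy sketch of the quantum-probability ingredient referred to above: represent a text as a density matrix built from unit-norm term (or term-dependency) vectors, so that matching can use the trace rule of quantum measurement rather than independent term probabilities. The vectors, mixture weights, and scoring rule below are illustrative assumptions.

```python
# Toy sketch of a quantum-probability text representation: build a density
# matrix rho = sum_i p_i |v_i><v_i| from unit-norm term vectors, then score a
# query projector via the trace rule tr(rho P).
import numpy as np

rng = np.random.default_rng(0)
d = 6

def unit(v):
    return v / np.linalg.norm(v)

# Term (or term-dependency) vectors and their mixture weights for one document.
term_vectors = [unit(rng.normal(size=d)) for _ in range(4)]
weights = np.array([0.4, 0.3, 0.2, 0.1])                 # p_i, sums to 1

rho = sum(p * np.outer(v, v) for p, v in zip(weights, term_vectors))
assert np.isclose(np.trace(rho), 1.0)                    # valid density matrix

# Score a query "event" (projector onto a query vector) by tr(rho P).
q = unit(rng.normal(size=d))
projector = np.outer(q, q)
score = float(np.trace(rho @ projector))
print(score)
```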
