1 code implementation • 29 Jul 2024 • Fengran Mo, Chen Qu, Kelong Mao, Yihong Wu, Zhan Su, Kaiyu Huang, Jian-Yun Nie
In this paper, we leverage both rewritten queries and relevance judgments in conversational search data to train a better query representation model.
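One possible reading of how the two signals could be combined (an illustrative sketch only, not the authors' implementation; the loss weighting, the cosine distillation term, and all function and variable names such as `training_loss` are assumptions): a distillation term pulls the session encoding toward the rewritten-query encoding, while a contrastive term uses the judged passages.

```python
# Toy sketch (not the paper's code): a rewrite-distillation loss plus a
# relevance-based contrastive loss for a conversational query encoder.
import torch
import torch.nn.functional as F

def training_loss(session_emb, rewrite_emb, pos_passage_emb, neg_passage_embs, alpha=0.5):
    """session_emb:      (d,)   encoding of the full conversational context
       rewrite_emb:      (d,)   encoding of the rewritten query (teacher signal)
       pos_passage_emb:  (d,)   passage judged relevant
       neg_passage_embs: (n, d) passages judged non-relevant or in-batch negatives"""
    # 1) Distillation: make the session encoding mimic the rewritten-query encoding.
    distill = 1.0 - F.cosine_similarity(session_emb, rewrite_emb, dim=0)

    # 2) Contrastive ranking with relevance judgments (dot-product scores).
    pos_score = session_emb @ pos_passage_emb                       # scalar
    neg_scores = neg_passage_embs @ session_emb                     # (n,)
    logits = torch.cat([pos_score.unsqueeze(0), neg_scores])        # (n+1,)
    rank = F.cross_entropy(logits.unsqueeze(0), torch.tensor([0]))  # positive is index 0

    return alpha * distill + (1 - alpha) * rank

# Example with random embeddings:
d = 8
loss = training_loss(torch.randn(d), torch.randn(d), torch.randn(d), torch.randn(4, d))
```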
1 code implementation • 26 May 2024 • Zhan Su, Fengran Mo, Prayag Tiwari, Benyou Wang, Jian-Yun Nie, Jakob Grue Simonsen
For the \textit{routing function}, we tailor two routing functions to different granularities: \texttt{TensorPoly-I}, which routes to each rank within the entangled tensor, and \texttt{TensorPoly-II}, which offers finer-grained routing that targets each order of the entangled tensor.
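As one possible reading of the two granularities (an illustrative sketch, not the TensorPoly implementation; the tensor shapes, the softmax routing, and the einsum mixing are all assumptions): TensorPoly-I learns a single routing distribution over the R ranks shared by all orders, while TensorPoly-II learns an independent routing distribution for every order.

```python
# Toy illustration of coarse vs. fine routing over a tensor-train ("entangled")
# parameter factorization with N orders and rank R (hypothetical shapes).
import torch
import torch.nn as nn

N, R, d = 3, 4, 16                                 # orders, TT-rank, slice dim (toy sizes)
cores = nn.Parameter(torch.randn(N, R, d))         # per-order, per-rank parameter slices

# TensorPoly-I (coarse): one routing distribution over the R ranks, shared by all orders.
route_I = torch.softmax(nn.Parameter(torch.zeros(R)), dim=0)       # (R,)
params_I = torch.einsum('r,nrd->nd', route_I, cores)               # mix ranks per order

# TensorPoly-II (fine): an independent routing distribution over ranks for every order.
route_II = torch.softmax(nn.Parameter(torch.zeros(N, R)), dim=-1)  # (N, R)
params_II = torch.einsum('nr,nrd->nd', route_II, cores)            # order-specific mixing

print(params_I.shape, params_II.shape)  # both (N, d): one mixed slice per order
```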
1 code implementation • 18 May 2024 • Oleksiy Ostapenko, Zhan Su, Edoardo Maria Ponti, Laurent Charlin, Nicolas Le Roux, Matheus Pereira, Lucas Caccia, Alessandro Sordoni
The growing number of parameter-efficient adaptations of a base large language model (LLM) calls for studying whether we can reuse such trained adapters to improve performance for new tasks.
1 code implementation • 7 May 2024 • Zhan Su, Yuqin Zhou, Fengran Mo, Jakob Grue Simonsen
We propose a novel tensor network language model based on the simplest tensor network (i.e., tensor trains), called the Tensor Train Language Model (TTLM).
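A minimal numpy sketch of the underlying idea as we read it (not the released TTLM code; the shared core, the boundary vectors, and the toy sizes are assumptions): a sequence is scored by contracting a shared tensor-train core with each word in turn, which yields an RNN-like recurrence over a rank-R vector.

```python
# Toy tensor-train sequence scorer: contract a shared TT core with one-hot word
# inputs, left to right.
import numpy as np

V, R = 10, 4                              # toy vocabulary size and TT-rank
rng = np.random.default_rng(0)
core = rng.normal(size=(R, V, R)) * 0.1   # shared TT core: (rank_in, word, rank_out)
h0 = np.zeros(R); h0[0] = 1.0             # left boundary vector

def sequence_score(word_ids):
    h = h0
    for w in word_ids:
        h = h @ core[:, w, :]             # (R,) x (R, R) -> (R,)
    return h.sum()                        # close the chain with an all-ones boundary

print(sequence_score([3, 1, 4, 1, 5]))
```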
1 code implementation • 30 Jan 2024 • Fengran Mo, Chen Qu, Kelong Mao, Tianyu Zhu, Zhan Su, Kaiyu Huang, Jian-Yun Nie
To address the aforementioned issues, we propose a History-Aware Conversational Dense Retrieval (HAConvDR) system, which incorporates two ideas: context-denoised query reformulation and automatic mining of supervision signals based on the actual impact of historical turns.
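A hedged toy sketch of what "impact-based" mining of supervision signals could look like (our paraphrase, not the HAConvDR procedure; `retrieve_rank` and the labeling rule are hypothetical): each historical turn is labeled by whether adding it to the current query helps or hurts the retrieval rank of the judged-relevant passage.

```python
# Toy mining of pseudo-labels for historical turns based on their retrieval impact.
def mine_turn_labels(current_query, history_turns, gold_passage_id, retrieve_rank):
    """retrieve_rank(query_text, passage_id) -> rank of that passage (lower is better)."""
    base_rank = retrieve_rank(current_query, gold_passage_id)
    labels = {}
    for turn in history_turns:
        rank_with_turn = retrieve_rank(turn + " " + current_query, gold_passage_id)
        if rank_with_turn < base_rank:
            labels[turn] = "positive"      # this turn carries useful context
        elif rank_with_turn > base_rank:
            labels[turn] = "negative"      # this turn adds noise
        else:
            labels[turn] = "neutral"
    return labels
```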
1 code implementation • NeurIPS 2023 • Lucas Caccia, Edoardo Ponti, Zhan Su, Matheus Pereira, Nicolas Le Roux, Alessandro Sordoni
We find that routing is most beneficial during multi-task pre-training rather than during few-shot adaptation and propose $\texttt{MHR}$-$\mu$, which discards routing and fine-tunes the average of the pre-trained adapters on each downstream task.
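A minimal sketch of the "discard routing, average the adapters" step (assuming, purely for illustration, that each adapter is a dict of tensors with identical keys and shapes; `average_adapters` and the toy LoRA-style keys are hypothetical):

```python
# Average a set of pre-trained adapters parameter-wise; the result initialises
# fine-tuning on each downstream task.
import torch

def average_adapters(adapters):
    """adapters: list of state dicts from multi-task pre-training."""
    avg = {}
    for key in adapters[0]:
        avg[key] = torch.stack([a[key] for a in adapters]).mean(dim=0)
    return avg

# Example with two toy adapters:
a1 = {"lora_A": torch.randn(4, 8), "lora_B": torch.randn(8, 4)}
a2 = {"lora_A": torch.randn(4, 8), "lora_B": torch.randn(8, 4)}
init = average_adapters([a1, a2])
```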
1 code implementation • 31 Jan 2019 • Lipeng Zhang, Peng Zhang, Xindian Ma, Shuqin Gu, Zhan Su, Dawei Song
Theoretically, we prove that such a tensor representation is a generalization of the n-gram language model.
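A compact way to state the claim (our paraphrase in hypothetical notation; $T$ and the factorized form are illustrative, not the paper's exact formulation): a sentence is scored by an order-$N$ tensor indexed by its words, and an n-gram model is the special case where that tensor factorizes into local conditional tables.

```latex
\[
  p(w_1,\dots,w_N) \;\propto\; T_{w_1 w_2 \cdots w_N},
  \qquad
  T^{\text{(n-gram)}}_{w_1 \cdots w_N}
    \;=\; \prod_{t=1}^{N} p\!\left(w_t \mid w_{t-n+1},\dots,w_{t-1}\right).
\]
```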
1 code implementation • 28 Aug 2018 • Peng Zhang, Zhan Su, Lipeng Zhang, Benyou Wang, Dawei Song
The recently proposed quantum language model (QLM) aims to model term dependencies in a principled way by applying quantum probability theory.
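A toy illustration of the core object in such models, the density matrix (a sketch only; the original QLM estimates its density matrices differently, via maximum likelihood, and `density_matrix` and the weights here are assumptions): text is represented as a weighted mixture of rank-1 projectors built from unit-norm term vectors, so term dependencies live in the off-diagonal structure of the matrix.

```python
# Build a density matrix as a convex mixture of projectors |v><v| over term vectors.
import numpy as np

def density_matrix(term_vectors, weights):
    """term_vectors: (k, d) unit-norm vectors; weights: (k,) nonnegative, summing to 1."""
    d = term_vectors.shape[1]
    rho = np.zeros((d, d))
    for v, w in zip(term_vectors, weights):
        rho += w * np.outer(v, v)           # mixture of rank-1 projectors
    return rho                              # symmetric, positive semidefinite, trace 1

rng = np.random.default_rng(0)
vecs = rng.normal(size=(3, 5))
vecs /= np.linalg.norm(vecs, axis=1, keepdims=True)
rho = density_matrix(vecs, np.array([0.5, 0.3, 0.2]))
print(np.trace(rho))                        # ~1.0
```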