no code implementations • LREC 2022 • Shi Yu, Clara Ponchard, Roland Trouville, Sergio Hassid, Didier Demolin
Aerodynamic processes underlie the characteristics of the acoustic signal of speech sounds.
1 code implementation • 25 Feb 2024 • Xinze Li, Zhenghao Liu, Chenyan Xiong, Shi Yu, Yukun Yan, Shuo Wang, Ge Yu
It finetunes the compression plugin module and uses the representations of gist tokens to emulate the raw prompts in the vanilla language model.
1 code implementation • 21 Feb 2024 • Zhipeng Xu, Zhenghao Liu, Yibin Liu, Chenyan Xiong, Yukun Yan, Shuo Wang, Shi Yu, Zhiyuan Liu, Ge Yu
Retrieval Augmented Generation (RAG) has introduced a new paradigm for Large Language Models (LLMs), aiding in the resolution of knowledge-intensive tasks.
1 code implementation • 27 Aug 2023 • Zhenghao Liu, Sen Mei, Chenyan Xiong, Xiaohua Li, Shi Yu, Zhiyuan Liu, Yu Gu, Ge Yu
TASTE alleviates the cold start problem by representing long-tail items using full-text modeling and bringing the benefits of pretrained language models to recommendation systems.
1 code implementation • 31 May 2023 • Xinze Li, Zhenghao Liu, Chenyan Xiong, Shi Yu, Yu Gu, Zhiyuan Liu, Ge Yu
SANTA proposes two pretraining methods to make language models structure-aware and learn effective representations for structured data: 1) Structured Data Alignment, which utilizes the natural alignment relations between structured data and unstructured data for structure-aware pretraining.
1 code implementation • 27 May 2023 • Zichun Yu, Chenyan Xiong, Shi Yu, Zhiyuan Liu
Retrieval augmentation can aid language models (LMs) in knowledge-intensive tasks by supplying them with external information.
no code implementations • 24 May 2023 • Shi Yu, Chenghao Fan, Chenyan Xiong, David Jin, Zhiyuan Liu, Zhenghao Liu
Common IR pipelines are typically cascade systems that may involve multiple rankers and/or fusion models to integrate different information step-by-step.
1 code implementation • 12 Apr 2023 • Si Sun, Yida Lu, Shi Yu, Xiangyang Li, Zhonghua Li, Zhao Cao, Zhiyuan Liu, Deiming Ye, Jie Bao
Moreover, the dataset is split into disjoint base and novel classes, allowing DR models to be continuously trained on ample data from base classes and a few samples from novel classes.
1 code implementation • 4 May 2022 • Xiaomeng Hu, Shi Yu, Chenyan Xiong, Zhenghao Liu, Zhiyuan Liu, Ge Yu
In this paper, we identify and study the two mismatches between pre-training and ranking fine-tuning: the training schema gap regarding the differences in training objectives and model architectures, and the task knowledge gap considering the discrepancy between the knowledge needed in ranking and that learned during pre-training.
no code implementations • 19 May 2021 • Haoran Wang, Shi Yu
Machine Learning (ML) has been embraced as a powerful tool by the financial industry, with notable applications across various domains, including investment management.
1 code implementation • 10 May 2021 • Shi Yu, Zhenghao Liu, Chenyan Xiong, Tao Feng, Zhiyuan Liu
In this paper, we present a Conversational Dense Retrieval system, ConvDR, that learns contextualized embeddings for multi-turn conversational queries and retrieves documents solely using embedding dot products.
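Retrieval "solely using embedding dot products" can be illustrated with a minimal sketch. This is not ConvDR's implementation: the function names `dot` and `retrieve`, the toy two-dimensional embeddings, and the document IDs are all hypothetical, and a real system would obtain the embeddings from trained query and document encoders.

```python
from typing import Dict, List, Tuple


def dot(u: List[float], v: List[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def retrieve(
    query_emb: List[float],
    doc_embs: Dict[str, List[float]],
    k: int = 2,
) -> List[Tuple[str, float]]:
    """Return the k documents with the highest dot-product score."""
    scored = [(doc_id, dot(query_emb, emb)) for doc_id, emb in doc_embs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy embeddings (hypothetical); in practice these come from encoders.
doc_embs = {"d1": [1.0, 0.0], "d2": [0.0, 1.0], "d3": [0.5, 0.5]}
query_emb = [1.0, 0.25]  # stands in for the embedding of a conversational query

top = retrieve(query_emb, doc_embs, k=2)
top_ids = [doc_id for doc_id, _ in top]
print(top_ids)
```

Because scoring reduces to inner products, the document embeddings can be computed once offline and searched with a nearest-neighbor index at query time.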
no code implementations • 18 Dec 2020 • Jerry Zikun Chen, Shi Yu, Haoran Wang
Query reformulation aims to alter noisy or ambiguous text sequences into coherent ones closer to natural language questions.
3 code implementations • 3 Nov 2020 • Chenyan Xiong, Zhenghao Liu, Si Sun, Zhuyun Dai, Kaitao Zhang, Shi Yu, Zhiyuan Liu, Hoifung Poon, Jianfeng Gao, Paul Bennett
Neural rankers based on deep pretrained language models (LMs) have been shown to improve many information retrieval benchmarks.
no code implementations • 4 Oct 2020 • Shi Yu, Haoran Wang, Chaosheng Dong
Our approach allows the learner to continuously estimate real-time risk preferences using concurrent observed portfolios and market price data.
1 code implementation • 9 Jun 2020 • Shi Yu, Jiahua Liu, Jingqin Yang, Chenyan Xiong, Paul Bennett, Jianfeng Gao, Zhiyuan Liu
Conversational query rewriting aims to reformulate a concise conversational query into a fully specified, context-independent query that can be effectively handled by existing information retrieval systems.
1 code implementation • 22 Apr 2020 • Shi Yu
When eigenvalue decomposition is applied to the quadratic term matrix $Q$ of a linear equality-constrained quadratic program (EQP), there exists a linear mapping that projects optimal solutions between the new EQP formulation, where $Q$ is diagonalized, and the original formulation.
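The mapping can be sketched as follows, assuming the EQP takes the standard form with a symmetric $Q$. The symbols $c$, $A$, $b$, $V$, $\Lambda$, and $y$ are notation introduced here for illustration and may differ from the paper's.

$$
\begin{aligned}
&\min_{x}\ \tfrac{1}{2} x^\top Q x + c^\top x \quad \text{s.t. } A x = b, \qquad Q = V \Lambda V^\top,\quad V^\top V = I,\\
&y = V^\top x \ \Longrightarrow\ \min_{y}\ \tfrac{1}{2} y^\top \Lambda y + (V^\top c)^\top y \quad \text{s.t. } (A V)\, y = b,\\
&x^\star = V y^\star .
\end{aligned}
$$

Substituting $x = V y$ gives $x^\top Q x = y^\top \Lambda y$, so the two problems share the same optimal value and their solutions are related by the orthogonal matrix $V$.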
no code implementations • 17 Feb 2020 • Shi Yu, Yuxin Chen, Hussain Zaidi
Our main novel contribution is the discussion of uncertainty measures for BERT, where three different approaches are systematically compared on real problems.