Search Results for author: Xiaolong Huang

Found 10 papers, 8 papers with code

Synthetic Data (Almost) from Scratch: Generalized Instruction Tuning for Language Models

no code implementations • 20 Feb 2024 • Haoran Li, Qingxiu Dong, Zhengyang Tang, Chaojun Wang, Xingxing Zhang, Haoyang Huang, Shaohan Huang, Xiaolong Huang, Zeqiang Huang, Dongdong Zhang, Yuxian Gu, Xin Cheng, Xun Wang, Si-Qing Chen, Li Dong, Wei Lu, Zhifang Sui, Benyou Wang, Wai Lam, Furu Wei

We introduce Generalized Instruction Tuning (called GLAN), a general and scalable method for instruction tuning of Large Language Models (LLMs).

Instruction Following, Logical Reasoning, +1

Multilingual E5 Text Embeddings: A Technical Report

1 code implementation • 8 Feb 2024 • Liang Wang, Nan Yang, Xiaolong Huang, Linjun Yang, Rangan Majumder, Furu Wei

This technical report presents the training methodology and evaluation results of the open-source multilingual E5 text embedding models, released in mid-2023.

One Step Learning, One Step Review

1 code implementation • 19 Jan 2024 • Xiaolong Huang, Qiankun Li, Xueran Li, Xuesong Gao

Visual fine-tuning has garnered significant attention with the rise of pre-trained vision models.

Image Classification, Instance Segmentation, +3

Improving Text Embeddings with Large Language Models

1 code implementation • 31 Dec 2023 • Liang Wang, Nan Yang, Xiaolong Huang, Linjun Yang, Rangan Majumder, Furu Wei

In this paper, we introduce a novel and simple method for obtaining high-quality text embeddings using only synthetic data and less than 1k training steps.

Large Search Model: Redefining Search Stack in the Era of LLMs

no code implementations • 23 Oct 2023 • Liang Wang, Nan Yang, Xiaolong Huang, Linjun Yang, Rangan Majumder, Furu Wei

Modern search engines are built on a stack of different components, including query understanding, retrieval, multi-stage ranking, and question answering, among others.

Language Modelling, Large Language Model, +3

Text Embeddings by Weakly-Supervised Contrastive Pre-training

1 code implementation • 7 Dec 2022 • Liang Wang, Nan Yang, Xiaolong Huang, Binxing Jiao, Linjun Yang, Daxin Jiang, Rangan Majumder, Furu Wei

This paper presents E5, a family of state-of-the-art text embeddings that transfer well to a wide range of tasks.

Ranked #11 on Only Connect Walls Dataset Task 1 (Grouping) on OCW (using extra training data)

Only Connect Walls Dataset Task 1 (Grouping), Retrieval

Effective and Efficient Query-aware Snippet Extraction for Web Search

1 code implementation • 17 Oct 2022 • Jingwei Yi, Fangzhao Wu, Chuhan Wu, Xiaolong Huang, Binxing Jiao, Guangzhong Sun, Xing Xie

In this paper, we propose an effective query-aware webpage snippet extraction method named DeepQSE, aiming to select a few sentences which can best summarize the webpage content in the context of input query.

Sentence

2nd Place Solution to Google Universal Image Embedding

1 code implementation • 17 Oct 2022 • Xiaolong Huang, Qiankun Li

Image representations are a critical building block of computer vision applications.

Fine-Grained Image Classification

LexMAE: Lexicon-Bottlenecked Pretraining for Large-Scale Retrieval

1 code implementation • 31 Aug 2022 • Tao Shen, Xiubo Geng, Chongyang Tao, Can Xu, Xiaolong Huang, Binxing Jiao, Linjun Yang, Daxin Jiang

In large-scale retrieval, the lexicon-weighting paradigm, learning weighted sparse representations in vocabulary space, has shown promising results with high quality and low latency.

Language Modelling, Passage Retrieval, +1

SimLM: Pre-training with Representation Bottleneck for Dense Passage Retrieval

1 code implementation • 6 Jul 2022 • Liang Wang, Nan Yang, Xiaolong Huang, Binxing Jiao, Linjun Yang, Daxin Jiang, Rangan Majumder, Furu Wei

SimLM employs a simple bottleneck architecture that learns to compress the passage information into a dense vector through self-supervised pre-training.

Language Modelling, Passage Retrieval, +1
