Search Results for author: Jian Jiao

Found 32 papers, 17 papers with code

AnnoLLM: Making Large Language Models to Be Better Crowdsourced Annotators

2 code implementations 29 Mar 2023 Xingwei He, Zhenghao Lin, Yeyun Gong, A-Long Jin, Hang Zhang, Chen Lin, Jian Jiao, Siu Ming Yiu, Nan Duan, Weizhu Chen

Many natural language processing (NLP) tasks rely on labeled data to train machine learning models with high performance.

Information Retrieval Retrieval

Rho-1: Not All Tokens Are What You Need

2 code implementations 11 Apr 2024 Zhenghao Lin, Zhibin Gou, Yeyun Gong, Xiao Liu, Yelong Shen, Ruochen Xu, Chen Lin, Yujiu Yang, Jian Jiao, Nan Duan, Weizhu Chen

After fine-tuning, Rho-1-1B and 7B achieved state-of-the-art results of 40.6% and 51.8% on the MATH dataset, respectively, matching DeepSeekMath with only 3% of the pretraining tokens.

Continual Pretraining Language Modelling +1

HittER: Hierarchical Transformers for Knowledge Graph Embeddings

2 code implementations EMNLP 2021 Sanxing Chen, Xiaodong Liu, Jianfeng Gao, Jian Jiao, Ruofei Zhang, Yangfeng Ji

Our proposed model consists of two different Transformer blocks: the bottom block extracts features of each entity-relation pair in the local neighborhood of the source entity and the top block aggregates the relational information from outputs of the bottom block.

 Ranked #1 on Link Prediction on FB15k-237 (Hit@10 metric)

Knowledge Graph Embeddings Link Prediction +2

PROD: Progressive Distillation for Dense Retrieval

1 code implementation 27 Sep 2022 Zhenghao Lin, Yeyun Gong, Xiao Liu, Hang Zhang, Chen Lin, Anlei Dong, Jian Jiao, Jingwen Lu, Daxin Jiang, Rangan Majumder, Nan Duan

It is common that a better teacher model results in a bad student via distillation due to the nonnegligible gap between teacher and student.

Knowledge Distillation Natural Questions +1

Taming Sparsely Activated Transformer with Stochastic Experts

1 code implementation ICLR 2022 Simiao Zuo, Xiaodong Liu, Jian Jiao, Young Jin Kim, Hany Hassan, Ruofei Zhang, Tuo Zhao, Jianfeng Gao

While most ongoing research focuses on improving SAMs by exploring methods of routing inputs to experts, our analysis reveals that such research might not lead to the solution we expect, i.e., the commonly used routing methods based on gating mechanisms do not work better than randomly routing inputs to experts.

Machine Translation Translation

Efficient Long Sequence Modeling via State Space Augmented Transformer

1 code implementation 15 Dec 2022 Simiao Zuo, Xiaodong Liu, Jian Jiao, Denis Charles, Eren Manavoglu, Tuo Zhao, Jianfeng Gao

Specifically, we augment an SSM into the bottom layer of SPADE, and we employ efficient local attention methods for the other layers.

Computational Efficiency Language Modelling +2

TwinBERT: Distilling Knowledge to Twin-Structured BERT Models for Efficient Retrieval

2 code implementations 14 Feb 2020 Wenhao Lu, Jian Jiao, Ruofei Zhang

Experimental results showed that the inference time was significantly reduced and, for the first time, kept to around 20 ms on CPUs, while most of the performance gain from the fine-tuned BERT-Base model was retained.

Retrieval

GalaXC: Graph Neural Networks with Labelwise Attention for Extreme Classification

1 code implementation The Web Conference 2021 Deepak Saini, Arnav Kumar Jain, Kushal Dave, Jian Jiao, Amit Singh, Ruofei Zhang, Manik Varma

An efficient end-to-end implementation of GalaXC is presented that could be trained on a dataset with 50M labels and 97M training documents in less than 100 hours on 4×V100 GPUs.

Classification Product Recommendation

Dual-Alignment Pre-training for Cross-lingual Sentence Embedding

1 code implementation 16 May 2023 Ziheng Li, Shaohan Huang, Zihan Zhang, Zhi-Hong Deng, Qiang Lou, Haizhen Huang, Jian Jiao, Furu Wei, Weiwei Deng, Qi Zhang

Recent studies have shown that dual encoder models trained with the sentence-level translation ranking task are effective methods for cross-lingual sentence embedding.

Language Modelling Sentence +3

AutoHint: Automatic Prompt Optimization with Hint Generation

1 code implementation 13 Jul 2023 Hong Sun, Xue Li, Yinchuan Xu, Youkow Homma, Qi Cao, Min Wu, Jian Jiao, Denis Charles

This paper presents AutoHint, a novel framework for automatic prompt engineering and optimization for Large Language Models (LLM).

In-Context Learning Prompt Engineering +1

Recurrent Binary Embedding for GPU-Enabled Exhaustive Retrieval from Billion-Scale Semantic Vectors

no code implementations 18 Feb 2018 Ying Shan, Jian Jiao, Jie Zhu, JC Mao

Building on top of the powerful concept of semantic learning, this paper proposes a Recurrent Binary Embedding (RBE) model that learns compact representations for real-time retrieval.

Information Retrieval Retrieval

ProphetNet-Ads: A Looking Ahead Strategy for Generative Retrieval Models in Sponsored Search Engine

no code implementations 21 Oct 2020 Weizhen Qi, Yeyun Gong, Yu Yan, Jian Jiao, Bo Shao, Ruofei Zhang, Houqiang Li, Nan Duan, Ming Zhou

We build a dataset from a real-world sponsored search engine and carry out experiments to analyze different generative retrieval models.

Retrieval

KFCNet: Knowledge Filtering and Contrastive Learning Network for Generative Commonsense Reasoning

no code implementations 14 Sep 2021 Haonan Li, Yeyun Gong, Jian Jiao, Ruofei Zhang, Timothy Baldwin, Nan Duan

Pre-trained language models have led to substantial gains over a broad range of natural language processing (NLP) tasks, but have been shown to have limitations for natural language generation tasks with high-quality requirements on the output, such as commonsense generation and ad keyword generation.

Contrastive Learning Text Generation

KFCNet: Knowledge Filtering and Contrastive Learning for Generative Commonsense Reasoning

no code implementations Findings (EMNLP) 2021 Haonan Li, Yeyun Gong, Jian Jiao, Ruofei Zhang, Timothy Baldwin, Nan Duan

Pre-trained language models have led to substantial gains over a broad range of natural language processing (NLP) tasks, but have been shown to have limitations for natural language generation tasks with high-quality requirements on the output, such as commonsense generation and ad keyword generation.

Contrastive Learning Text Generation

A Self-Paced Mixed Distillation Method for Non-Autoregressive Generation

no code implementations 23 May 2022 Weizhen Qi, Yeyun Gong, Yelong Shen, Jian Jiao, Yu Yan, Houqiang Li, Ruofei Zhang, Weizhu Chen, Nan Duan

To further illustrate the commercial value of our approach, we conduct experiments on three generation tasks in real-world advertising applications.

Question Generation Question-Generation +1

CULG: Commercial Universal Language Generation

no code implementations NAACL (ACL) 2022 Haonan Li, Yameng Huang, Yeyun Gong, Jian Jiao, Ruofei Zhang, Timothy Baldwin, Nan Duan

Pre-trained language models (PLMs) have dramatically improved performance for many natural language processing (NLP) tasks in domains such as finance and healthcare.

Marketing Text Generation

The Counterfactual-Shapley Value: Attributing Change in System Metrics

no code implementations 17 Aug 2022 Amit Sharma, Hua Li, Jian Jiao

Specifically, we propose a method to estimate counterfactuals using time-series predictive models and construct an attribution score, CF-Shapley, that is consistent with desirable axioms for attributing an observed change in the output metric.

Attribute counterfactual +1

Metric-guided Distillation: Distilling Knowledge from the Metric to Ranker and Retriever for Generative Commonsense Reasoning

no code implementations 21 Oct 2022 Xingwei He, Yeyun Gong, A-Long Jin, Weizhen Qi, Hang Zhang, Jian Jiao, Bartuer Zhou, Biao Cheng, SM Yiu, Nan Duan

Commonsense generation aims to generate a realistic sentence describing a daily scene under the given concepts, which is very challenging, since it requires models to have relational reasoning and compositional generalization capabilities.

Relational Reasoning Re-Ranking +1

Pre-training Transformers for Knowledge Graph Completion

no code implementations 28 Mar 2023 Sanxing Chen, Hao Cheng, Xiaodong Liu, Jian Jiao, Yangfeng Ji, Jianfeng Gao

Learning transferable representation of knowledge graphs (KGs) is challenging due to the heterogeneous, multi-relational nature of graph structures.

DeepTagger: Knowledge Enhanced Named Entity Recognition for Web-Based Ads Queries

no code implementations 30 Jun 2023 Simiao Zuo, Pengfei Tang, Xinyu Hu, Qiang Lou, Jian Jiao, Denis Charles

For model-free enhancement, we collect unlabeled web queries to augment domain knowledge; and we collect web search results to enrich the information of ads queries.

Data Augmentation named-entity-recognition +2

Evoke: Evoking Critical Thinking Abilities in LLMs via Reviewer-Author Prompt Editing

no code implementations 20 Oct 2023 Xinyu Hu, Pengfei Tang, Simiao Zuo, Zihan Wang, Bowen Song, Qiang Lou, Jian Jiao, Denis Charles

In Evoke, there are two instances of the same LLM: one acts as a reviewer (LLM-Reviewer) and scores the current prompt; the other acts as an author (LLM-Author) and edits the prompt by considering the edit history and the reviewer's feedback.

Logical Fallacy Detection
