Search Results for author: Jian Jiao

Found 32 papers, 17 papers with code

Deep Crossing: Web-Scale Modeling without Manually Crafted Combinatorial Features

2 code implementations • KDD 2016 • Ying Shan, T. Ryan Hoens, Jian Jiao, Haijing Wang, Dong Yu, JC Mao

Manually crafted combinatorial features have been the “secret sauce” behind many successful models.

Feature Engineering

781

Paper
Code

Recurrent Binary Embedding for GPU-Enabled Exhaustive Retrieval from Billion-Scale Semantic Vectors

no code implementations • 18 Feb 2018 • Ying Shan, Jian Jiao, Jie Zhu, JC Mao

Building on top of the powerful concept of semantic learning, this paper proposes a Recurrent Binary Embedding (RBE) model that learns compact representations for real-time retrieval.

Information Retrieval Retrieval

Paper
Add Code

MagicVO: End-to-End Monocular Visual Odometry through Deep Bi-directional Recurrent Convolutional Neural Network

no code implementations • 27 Nov 2018 • Jian Jiao, Jichao Jiao, Yaokai Mo, Weilun Liu, Zhongliang Deng

This paper proposes a new framework to solve the problem of monocular visual odometry, called MagicVO .

Monocular Visual Odometry

Paper
Add Code

TwinBERT: Distilling Knowledge to Twin-Structured BERT Models for Efficient Retrieval

2 code implementations • 14 Feb 2020 • Wenhao Lu, Jian Jiao, Ruofei Zhang

Experimental results showed that the inference time was significantly reduced and was firstly controlled around 20ms on CPUs while at the same time the performance gain from fine-tuned BERT-Base model was mostly retained.

Retrieval

Paper
Code

HittER: Hierarchical Transformers for Knowledge Graph Embeddings

2 code implementations • EMNLP 2021 • Sanxing Chen, Xiaodong Liu, Jianfeng Gao, Jian Jiao, Ruofei Zhang, Yangfeng Ji

Our proposed model consists of two different Transformer blocks: the bottom block extracts features of each entity-relation pair in the local neighborhood of the source entity and the top block aggregates the relational information from outputs of the bottom block.

Ranked #1 on Link Prediction on FB15k-237 (Hit@10 metric)

Knowledge Graph Embeddings Link Prediction +2

105

Paper
Code

ProphetNet-Ads: A Looking Ahead Strategy for Generative Retrieval Models in Sponsored Search Engine

no code implementations • 21 Oct 2020 • Weizhen Qi, Yeyun Gong, Yu Yan, Jian Jiao, Bo Shao, Ruofei Zhang, Houqiang Li, Nan Duan, Ming Zhou

We build a dataset from a real-word sponsored search engine and carry out experiments to analyze different generative retrieval models.

Retrieval

Paper
Add Code

GLGE: A New General Language Generation Evaluation Benchmark

1 code implementation • Findings (ACL) 2021 • Dayiheng Liu, Yu Yan, Yeyun Gong, Weizhen Qi, Hang Zhang, Jian Jiao, Weizhu Chen, Jie Fu, Linjun Shou, Ming Gong, Pengcheng Wang, Jiusheng Chen, Daxin Jiang, Jiancheng Lv, Ruofei Zhang, Winnie Wu, Ming Zhou, Nan Duan

Multi-task benchmarks such as GLUE and SuperGLUE have driven great progress of pretraining and transfer learning in Natural Language Processing (NLP).

Natural Language Understanding Text Generation +1

Paper
Code

An Enhanced Knowledge Injection Model for Commonsense Generation

no code implementations • COLING 2020 • Zhihao Fan, Yeyun Gong, Zhongyu Wei, Siyuan Wang, Yameng Huang, Jian Jiao, Xuanjing Huang, Nan Duan, Ruofei Zhang

Commonsense generation aims at generating plausible everyday scenario description based on a set of provided concepts.

Position

Paper
Add Code

BANG: Bridging Autoregressive and Non-autoregressive Generation with Large Scale Pretraining

1 code implementation • 31 Dec 2020 • Weizhen Qi, Yeyun Gong, Jian Jiao, Yu Yan, Weizhu Chen, Dayiheng Liu, Kewen Tang, Houqiang Li, Jiusheng Chen, Ruofei Zhang, Ming Zhou, Nan Duan

In this paper, we propose BANG, a new pretraining model to Bridge the gap between Autoregressive (AR) and Non-autoregressive (NAR) Generation.

Dialogue Generation Question Generation +1

Paper
Code

Mask Attention Networks: Rethinking and Strengthen Transformer

1 code implementation • NAACL 2021 • Zhihao Fan, Yeyun Gong, Dayiheng Liu, Zhongyu Wei, Siyuan Wang, Jian Jiao, Nan Duan, Ruofei Zhang, Xuanjing Huang

We therefore introduce a new layer named dynamic mask attention network (DMAN) with a learnable mask matrix which is able to model localness adaptively.

Ranked #11 on Machine Translation on WMT2014 English-German

Abstractive Text Summarization Machine Translation +2

Paper
Code

GalaXC: Graph Neural Networks with Labelwise Attention for Extreme Classification

1 code implementation • The Web Conference 2021 • Deepak Saini, Arnav Kumar Jain, Kushal Dave, Jian Jiao, Amit Singh, Ruofei Zhang and Manik Varma

An efficient end-to-end implementation of GalaXC is presented that could be trained on a dataset with 50M labels and 97M training documents in less than 100 hours on 4×V100 GPUs.

Classification Product Recommendation

Paper
Code

KFCNet: Knowledge Filtering and Contrastive Learning Network for Generative Commonsense Reasoning

no code implementations • 14 Sep 2021 • Haonan Li, Yeyun Gong, Jian Jiao, Ruofei Zhang, Timothy Baldwin, Nan Duan

Pre-trained language models have led to substantial gains over a broad range of natural language processing (NLP) tasks, but have been shown to have limitations for natural language generation tasks with high-quality requirements on the output, such as commonsense generation and ad keyword generation.

Contrastive Learning Text Generation

Paper
Add Code

Taming Sparsely Activated Transformer with Stochastic Experts

1 code implementation • ICLR 2022 • Simiao Zuo, Xiaodong Liu, Jian Jiao, Young Jin Kim, Hany Hassan, Ruofei Zhang, Tuo Zhao, Jianfeng Gao

While most on-going research focuses on improving SAMs models by exploring methods of routing inputs to experts, our analysis reveals that such research might not lead to the solution we expect, i. e., the commonly-used routing methods based on gating mechanisms do not work better than randomly routing inputs to experts.

Machine Translation Translation

Paper
Code

PromptBERT: Improving BERT Sentence Embeddings with Prompts

1 code implementation • 12 Jan 2022 • Ting Jiang, Jian Jiao, Shaohan Huang, Zihan Zhang, Deqing Wang, Fuzhen Zhuang, Furu Wei, Haizhen Huang, Denvy Deng, Qi Zhang

We propose PromptBERT, a novel contrastive learning method for learning better sentence representation.

Contrastive Learning Denoising +6

316

Paper
Code

A Self-Paced Mixed Distillation Method for Non-Autoregressive Generation

no code implementations • 23 May 2022 • Weizhen Qi, Yeyun Gong, Yelong Shen, Jian Jiao, Yu Yan, Houqiang Li, Ruofei Zhang, Weizhu Chen, Nan Duan

To further illustrate the commercial value of our approach, we conduct experiments on three generation tasks in real-world advertisements applications.

Question Generation Question-Generation +1

Paper
Add Code

NGAME: Negative Mining-aware Mini-batching for Extreme Classification

no code implementations • 10 Jul 2022 • Kunal Dahiya, Nilesh Gupta, Deepak Saini, Akshay Soni, Yajun Wang, Kushal Dave, Jian Jiao, Gururaj K, Prasenjit Dey, Amit Singh, Deepesh Hada, Vidit Jain, Bhawna Paliwal, Anshul Mittal, Sonu Mehta, Ramachandran Ramjee, Sumeet Agarwal, Purushottam Kar, Manik Varma

This paper identifies that memory overheads of popular negative mining techniques often force mini-batch sizes to remain small and slow training down.

Classification TAG

Paper
Add Code

The Counterfactual-Shapley Value: Attributing Change in System Metrics

no code implementations • 17 Aug 2022 • Amit Sharma, Hua Li, Jian Jiao

Specifically, we propose a method to estimate counterfactuals using time-series predictive models and construct an attribution score, CF-Shapley, that is consistent with desirable axioms for attributing an observed change in the output metric.

Attribute counterfactual +1

Paper
Add Code

PROD: Progressive Distillation for Dense Retrieval

1 code implementation • 27 Sep 2022 • Zhenghao Lin, Yeyun Gong, Xiao Liu, Hang Zhang, Chen Lin, Anlei Dong, Jian Jiao, Jingwen Lu, Daxin Jiang, Rangan Majumder, Nan Duan

It is common that a better teacher model results in a bad student via distillation due to the nonnegligible gap between teacher and student.

Knowledge Distillation Natural Questions +1

Paper
Code

Metric-guided Distillation: Distilling Knowledge from the Metric to Ranker and Retriever for Generative Commonsense Reasoning

no code implementations • 21 Oct 2022 • Xingwei He, Yeyun Gong, A-Long Jin, Weizhen Qi, Hang Zhang, Jian Jiao, Bartuer Zhou, Biao Cheng, SM Yiu, Nan Duan

Commonsense generation aims to generate a realistic sentence describing a daily scene under the given concepts, which is very challenging, since it requires models to have relational reasoning and compositional generalization capabilities.

Relational Reasoning Re-Ranking +1

Paper
Add Code

Efficient Long Sequence Modeling via State Space Augmented Transformer

1 code implementation • 15 Dec 2022 • Simiao Zuo, Xiaodong Liu, Jian Jiao, Denis Charles, Eren Manavoglu, Tuo Zhao, Jianfeng Gao

Specifically, we augment a SSM into the bottom layer of SPADE, and we employ efficient local attention methods for the other layers.

Computational Efficiency Language Modelling +2

Paper
Code

CAPSTONE: Curriculum Sampling for Dense Retrieval with Document Expansion

1 code implementation • 18 Dec 2022 • Xingwei He, Yeyun Gong, A-Long Jin, Hang Zhang, Anlei Dong, Jian Jiao, Siu Ming Yiu, Nan Duan

The dual-encoder has become the de facto architecture for dense retrieval.

Passage Retrieval Retrieval

Paper
Code

Pre-training Transformers for Knowledge Graph Completion

no code implementations • 28 Mar 2023 • Sanxing Chen, Hao Cheng, Xiaodong Liu, Jian Jiao, Yangfeng Ji, Jianfeng Gao

Learning transferable representation of knowledge graphs (KGs) is challenging due to the heterogeneous, multi-relational nature of graph structures.

Paper
Add Code

AnnoLLM: Making Large Language Models to Be Better Crowdsourced Annotators

2 code implementations • 29 Mar 2023 • Xingwei He, Zhenghao Lin, Yeyun Gong, A-Long Jin, Hang Zhang, Chen Lin, Jian Jiao, Siu Ming Yiu, Nan Duan, Weizhu Chen

Many natural language processing (NLP) tasks rely on labeled data to train machine learning models with high performance.

Information Retrieval Retrieval

1,779

Paper
Code

AR-Diffusion: Auto-Regressive Diffusion Model for Text Generation

1 code implementation • NeurIPS 2023 • Tong Wu, Zhihao Fan, Xiao Liu, Yeyun Gong, Yelong Shen, Jian Jiao, Hai-Tao Zheng, Juntao Li, Zhongyu Wei, Jian Guo, Nan Duan, Weizhu Chen

Diffusion models have gained significant attention in the realm of image generation due to their exceptional performance.

Common Sense Reasoning Denoising +4

616

Paper
Code

Dual-Alignment Pre-training for Cross-lingual Sentence Embedding

1 code implementation • 16 May 2023 • Ziheng Li, Shaohan Huang, Zihan Zhang, Zhi-Hong Deng, Qiang Lou, Haizhen Huang, Jian Jiao, Furu Wei, Weiwei Deng, Qi Zhang

Recent studies have shown that dual encoder models trained with the sentence-level translation ranking task are effective methods for cross-lingual sentence embedding.

Language Modelling Sentence +3

Paper
Code

DeepTagger: Knowledge Enhanced Named Entity Recognition for Web-Based Ads Queries

no code implementations • 30 Jun 2023 • Simiao Zuo, Pengfei Tang, Xinyu Hu, Qiang Lou, Jian Jiao, Denis Charles

For model-free enhancement, we collect unlabeled web queries to augment domain knowledge; and we collect web search results to enrich the information of ads queries.

Data Augmentation named-entity-recognition +2

Paper
Add Code

AutoHint: Automatic Prompt Optimization with Hint Generation

1 code implementation • 13 Jul 2023 • Hong Sun, Xue Li, Yinchuan Xu, Youkow Homma, Qi Cao, Min Wu, Jian Jiao, Denis Charles

This paper presents AutoHint, a novel framework for automatic prompt engineering and optimization for Large Language Models (LLM).

In-Context Learning Prompt Engineering +1

Paper
Code

Evoke: Evoking Critical Thinking Abilities in LLMs via Reviewer-Author Prompt Editing

no code implementations • 20 Oct 2023 • Xinyu Hu, Pengfei Tang, Simiao Zuo, Zihan Wang, Bowen Song, Qiang Lou, Jian Jiao, Denis Charles

In Evoke, there are two instances of a same LLM: one as a reviewer (LLM-Reviewer), it scores the current prompt; the other as an author (LLM-Author), it edits the prompt by considering the edit history and the reviewer's feedback.

Logical Fallacy Detection

Paper
Add Code

Slightly Shift New Classes to Remember Old Classes for Video Class-Incremental Learning

no code implementations • 1 Apr 2024 • Jian Jiao, Yu Dai, Hefei Mei, Heqian Qiu, Chuanyang Gong, Shiyuan Tang, Xinpeng Hao, Hongliang Li

So we propose SNRO, which slightly shifts the features of new classes to remember old classes.

Class Incremental Learning Incremental Learning

Paper
Add Code

Rho-1: Not All Tokens Are What You Need

2 code implementations • 11 Apr 2024 • Zhenghao Lin, Zhibin Gou, Yeyun Gong, Xiao Liu, Yelong Shen, Ruochen Xu, Chen Lin, Yujiu Yang, Jian Jiao, Nan Duan, Weizhu Chen

After fine-tuning, Rho-1-1B and 7B achieved state-of-the-art results of 40. 6% and 51. 8% on MATH dataset, respectively - matching DeepSeekMath with only 3% of the pretraining tokens.

Continual Pretraining Language Modelling +1

218

Paper
Code

CULG: Commercial Universal Language Generation

no code implementations • NAACL (ACL) 2022 • Haonan Li, Yameng Huang, Yeyun Gong, Jian Jiao, Ruofei Zhang, Timothy Baldwin, Nan Duan

Pre-trained language models (PLMs) have dramatically improved performance for many natural language processing (NLP) tasks in domains such as finance and healthcare.

Marketing Text Generation

Paper
Add Code

KFCNet: Knowledge Filtering and Contrastive Learning for Generative Commonsense Reasoning

no code implementations • Findings (EMNLP) 2021 • Haonan Li, Yeyun Gong, Jian Jiao, Ruofei Zhang, Timothy Baldwin, Nan Duan

Contrastive Learning Text Generation

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.