Search Results for author: Jiafeng Guo

Found 120 papers, 56 papers with code

Class-Incremental Few-Shot Event Detection

no code implementations • 2 Apr 2024 • Kailin Zhao, Xiaolong Jin, Long Bai, Jiafeng Guo, Xueqi Cheng

Therefore, this paper proposes a new task, called class-incremental few-shot event detection.

Event Detection Few-Shot Learning +1

Multi-granular Adversarial Attacks against Black-box Neural Ranking Models

no code implementations • 2 Apr 2024 • Yu-An Liu, Ruqing Zhang, Jiafeng Guo, Maarten de Rijke, Yixing Fan, Xueqi Cheng

However, limiting perturbations to a single level of granularity may reduce the flexibility of adversarial examples, thereby diminishing the potential threat of the attack.

Adversarial Attack Decision Making +2

Selective Temporal Knowledge Graph Reasoning

no code implementations • 2 Apr 2024 • Zhongni Hou, Xiaolong Jin, Zixuan Li, Long Bai, Jiafeng Guo, Xueqi Cheng

Temporal Knowledge Graph (TKG), which characterizes temporally evolving facts in the form of (subject, relation, object, timestamp), has attracted much attention recently.

Are Large Language Models Good at Utility Judgments?

1 code implementation • 28 Mar 2024 • Hengran Zhang, Ruqing Zhang, Jiafeng Guo, Maarten de Rijke, Yixing Fan, Xueqi Cheng

Retrieval-augmented generation (RAG) is considered to be a promising approach to alleviate the hallucination issue of large language models (LLMs), and it has received widespread attention from researchers recently.

Answer Generation Benchmarking +4

Listwise Generative Retrieval Models via a Sequential Learning Process

no code implementations • 19 Mar 2024 • Yubao Tang, Ruqing Zhang, Jiafeng Guo, Maarten de Rijke, Wei Chen, Xueqi Cheng

Specifically, we view the generation of a ranked docid list as a sequence learning process: at each step we learn a subset of parameters that maximizes the corresponding generation likelihood of the $i$-th docid given the (preceding) top $i-1$ docids.

Retrieval

KnowCoder: Coding Structured Knowledge into LLMs for Universal Information Extraction

no code implementations • 12 Mar 2024 • Zixuan Li, Yutao Zeng, Yuxin Zuo, Weicheng Ren, Wenxuan Liu, Miao Su, Yucan Guo, Yantao Liu, Xiang Li, Zhilei Hu, Long Bai, Wei Li, Yidan Liu, Pan Yang, Xiaolong Jin, Jiafeng Guo, Xueqi Cheng

After instruction tuning, KnowCoder further exhibits strong generalization ability on unseen schemas and achieves gains of up to 12.5% and 21.9% over state-of-the-art baselines under the zero-shot setting and the low-resource setting, respectively.

Code Generation Language Modelling +2

CorpusBrain++: A Continual Generative Pre-Training Framework for Knowledge-Intensive Language Tasks

no code implementations • 26 Feb 2024 • Jiafeng Guo, Changjiang Zhou, Ruqing Zhang, Jiangui Chen, Maarten de Rijke, Yixing Fan, Xueqi Cheng

Very recently, a pre-trained generative retrieval model for KILTs, named CorpusBrain, was proposed and reached new state-of-the-art retrieval performance.

Retrieval

MORE: Multi-mOdal REtrieval Augmented Generative Commonsense Reasoning

no code implementations • 21 Feb 2024 • Wanqing Cui, Keping Bi, Jiafeng Guo, Xueqi Cheng

Because commonsense information is recorded far less often than it actually occurs, language models pre-trained on text generation have difficulty learning sufficient commonsense knowledge.

Retrieval Text Generation +1

When Do LLMs Need Retrieval Augmentation? Mitigating LLMs' Overconfidence Helps Retrieval Augmentation

1 code implementation • 18 Feb 2024 • Shiyu Ni, Keping Bi, Jiafeng Guo, Xueqi Cheng

This motivates us to enhance the LLMs' ability to perceive their knowledge boundaries to help RA.

Retrieval

A Unified Causal View of Instruction Tuning

no code implementations • 9 Feb 2024 • Lu Chen, Wei Huang, Ruqing Zhang, Wei Chen, Jiafeng Guo, Xueqi Cheng

The key idea is to learn task-required causal factors and only use those to make predictions for a given task.

Reproducibility Analysis and Enhancements for Multi-Aspect Dense Retriever with Aspect Learning

1 code implementation • 8 Jan 2024 • Keping Bi, Xiaojie Sun, Jiafeng Guo, Xueqi Cheng

MADRAL was evaluated on proprietary data and its code was not released, making it challenging to validate its effectiveness on other datasets.

Retrieval

Perturbation-Invariant Adversarial Training for Neural Ranking Models: Improving the Effectiveness-Robustness Trade-Off

no code implementations • 16 Dec 2023 • Yu-An Liu, Ruqing Zhang, Mingkun Zhang, Wei Chen, Maarten de Rijke, Jiafeng Guo, Xueqi Cheng

We decompose the robust ranking error into two components, i.e., a natural ranking error for effectiveness evaluation and a boundary ranking error for assessing adversarial robustness.

Adversarial Robustness Information Retrieval

RIGHT: Retrieval-augmented Generation for Mainstream Hashtag Recommendation

1 code implementation • 16 Dec 2023 • Run-Ze Fan, Yixing Fan, Jiangui Chen, Jiafeng Guo, Ruqing Zhang, Xueqi Cheng

Automatic mainstream hashtag recommendation aims to accurately provide users with concise and popular topical hashtags before publication.

Retrieval

A Multi-Granularity-Aware Aspect Learning Model for Multi-Aspect Dense Retrieval

1 code implementation • 5 Dec 2023 • Xiaojie Sun, Keping Bi, Jiafeng Guo, Sihui Yang, Qishen Zhang, Zhongyi Liu, Guannan Zhang, Xueqi Cheng

Dense retrieval methods have mostly focused on unstructured text, and less attention has been paid to structured data with various aspects, e.g., products with aspects such as category and brand.

Language Modelling Retrieval +1

Retrieval-Augmented Code Generation for Universal Information Extraction

no code implementations • 6 Nov 2023 • Yucan Guo, Zixuan Li, Xiaolong Jin, Yantao Liu, Yutao Zeng, Wenxuan Liu, Xiang Li, Pan Yang, Long Bai, Jiafeng Guo, Xueqi Cheng

Therefore, in this paper, we propose a universal retrieval-augmented code generation framework based on LLMs, called Code4UIE, for IE tasks.

Code Generation In-Context Learning +1

CAME: Competitively Learning a Mixture-of-Experts Model for First-stage Retrieval

no code implementations • 6 Nov 2023 • Yinqiong Cai, Yixing Fan, Keping Bi, Jiafeng Guo, Wei Chen, Ruqing Zhang, Xueqi Cheng

The first-stage retrieval aims to retrieve a subset of candidate documents from a huge collection both effectively and efficiently.

Retrieval

CIR at the NTCIR-17 ULTRE-2 Task

no code implementations • 18 Oct 2023 • Lulu Yu, Keping Bi, Jiafeng Guo, Xueqi Cheng

The Chinese Academy of Sciences Information Retrieval team (CIR) participated in the NTCIR-17 ULTRE-2 task.

Information Retrieval Position +1

From Relevance to Utility: Evidence Retrieval with Feedback for Fact Verification

1 code implementation • 18 Oct 2023 • Hengran Zhang, Ruqing Zhang, Jiafeng Guo, Maarten de Rijke, Yixing Fan, Xueqi Cheng

We argue that, rather than relevance, for FV we need to focus on the utility that a claim verifier derives from the retrieved evidence.

Fact Verification Retrieval

A Comparative Study of Training Objectives for Clarification Facet Generation

1 code implementation • 1 Oct 2023 • Shiyu Ni, Keping Bi, Jiafeng Guo, Xueqi Cheng

In this paper, we conduct a systematic comparative study of various types of training objectives, which differ in whether they are permutation-invariant, whether they conduct sequential prediction, and whether they can control the number of output facets.

Text Generation

Nested Event Extraction upon Pivot Element Recognition

1 code implementation • 22 Sep 2023 • Weicheng Ren, Zixuan Li, Xiaolong Jin, Long Bai, Miao Su, Yantao Liu, Saiping Guan, Jiafeng Guo, Xueqi Cheng

Since existing NEE datasets (e.g., Genia11) are limited to specific domains and contain a narrow range of event types with nested structures, we systematically categorize nested events in the generic domain and construct a new NEE dataset, called ACE2005-Nest.

Event Extraction

ProtoEM: A Prototype-Enhanced Matching Framework for Event Relation Extraction

no code implementations • 22 Sep 2023 • Zhilei Hu, Zixuan Li, Daozhu Xu, Long Bai, Cheng Jin, Xiaolong Jin, Jiafeng Guo, Xueqi Cheng

To comprehensively understand their intrinsic semantics, in this paper, we obtain prototype representations for each type of event relation and propose a Prototype-Enhanced Matching (ProtoEM) framework for the joint extraction of multiple kinds of event relations.

Event Relation Extraction Relation +1

Continual Learning for Generative Retrieval over Dynamic Corpora

no code implementations • 29 Aug 2023 • Jiangui Chen, Ruqing Zhang, Jiafeng Guo, Maarten de Rijke, Wei Chen, Yixing Fan, Xueqi Cheng

We put forward a novel Continual-LEarner for generatiVE Retrieval (CLEVER) model and make two major contributions to continual learning for GR: (i) To encode new documents into docids with low computational cost, we present Incremental Product Quantization, which updates a partial quantization codebook according to two adaptive thresholds; and (ii) To memorize new documents for querying without forgetting previous knowledge, we propose a memory-augmented learning mechanism, to form meaningful connections between old and new documents.

Continual Learning Quantization +1

Inducing Causal Structure for Abstractive Text Summarization

1 code implementation • 24 Aug 2023 • Lu Chen, Ruqing Zhang, Wei Huang, Wei Chen, Jiafeng Guo, Xueqi Cheng

The key idea is to reformulate the Variational Auto-encoder (VAE) to fit the joint distribution of the document and summary variables from the training corpus.

Abstractive Text Summarization

Pre-training with Aspect-Content Text Mutual Prediction for Multi-Aspect Dense Retrieval

1 code implementation • 22 Aug 2023 • Xiaojie Sun, Keping Bi, Jiafeng Guo, Xinyu Ma, Fan Yixing, Hongyu Shan, Qishen Zhang, Zhongyi Liu

Extensive experiments on two real-world datasets (product and mini-program search) show that our approach can outperform competitive baselines, both those that treat aspect values as classes and those that apply the same MLM to aspect and content strings.

Language Modelling Masked Language Modeling +1

L^2R: Lifelong Learning for First-stage Retrieval with Backward-Compatible Representations

1 code implementation • 22 Aug 2023 • Yinqiong Cai, Keping Bi, Yixing Fan, Jiafeng Guo, Wei Chen, Xueqi Cheng

First-stage retrieval is a critical task that aims to retrieve relevant document candidates from a large-scale collection.

Retrieval

Black-box Adversarial Attacks against Dense Retrieval Models: A Multi-view Contrastive Learning Method

no code implementations • 19 Aug 2023 • Yu-An Liu, Ruqing Zhang, Jiafeng Guo, Maarten de Rijke, Wei Chen, Yixing Fan, Xueqi Cheng

The AREA task is meant to trick DR models into retrieving a target document that is outside the initial set of candidate documents retrieved by the DR model in response to a query.

Adversarial Attack Attribute +2

On the Robustness of Generative Retrieval Models: An Out-of-Distribution Perspective

no code implementations • 22 Jun 2023 • Yu-An Liu, Ruqing Zhang, Jiafeng Guo, Wei Chen, Xueqi Cheng

Recently, generative retrieval, which retrieves documents by directly generating their identifiers, has been gaining increasing attention in the information retrieval (IR) field.

Information Retrieval Retrieval

Semantic-Enhanced Differentiable Search Index Inspired by Learning Strategies

no code implementations • 24 May 2023 • Yubao Tang, Ruqing Zhang, Jiafeng Guo, Jiangui Chen, Zuowei Zhu, Shuaiqiang Wang, Dawei Yin, Xueqi Cheng

Specifically, we assign each document an Elaborative Description based on the query generation technique, which is more meaningful than a string of integers in the original DSI; and (2) For the associations between a document and its identifier, we take inspiration from Rehearsal Strategies in human learning.

Memorization Retrieval

Semantic Structure Enhanced Event Causality Identification

no code implementations • 22 May 2023 • Zhilei Hu, Zixuan Li, Xiaolong Jin, Long Bai, Saiping Guan, Jiafeng Guo, Xueqi Cheng

This is a very challenging task, because causal relations are usually expressed by implicit associations between events.

Event Causality Identification

Few-shot Link Prediction on N-ary Facts

no code implementations • 10 May 2023 • Jiyao Wei, Saiping Guan, Xiaolong Jin, Jiafeng Guo, Xueqi Cheng

Thus, we introduce a new task, Few-Shot Link Prediction on Hyper-relational Facts (FSLPHFs).

Attribute Knowledge Graphs +3

Visual Transformation Telling

no code implementations • 3 May 2023 • Xin Hong, Yanyan Lan, Liang Pang, Jiafeng Guo, Xueqi Cheng

In this paper, we propose a new visual reasoning task, called Visual Transformation Telling (VTT).

Dense Video Captioning Visual Reasoning +1

Visual Reasoning: from State to Transformation

1 code implementation • 2 May 2023 • Xin Hong, Yanyan Lan, Liang Pang, Jiafeng Guo, Xueqi Cheng

Such state-driven visual reasoning has limitations in reflecting the ability to infer the dynamics between different states, which has been shown to be equally important for human cognition in Piaget's theory.

Visual Question Answering (VQA) Visual Reasoning

Topic-oriented Adversarial Attacks against Black-box Neural Ranking Models

1 code implementation • 28 Apr 2023 • Yu-An Liu, Ruqing Zhang, Jiafeng Guo, Maarten de Rijke, Wei Chen, Yixing Fan, Xueqi Cheng

In this paper, we focus on a more general type of perturbation and introduce the topic-oriented adversarial ranking attack task against NRMs, which aims to find an imperceptible perturbation that can promote a target document in ranking for a group of queries with the same topic.

Information Retrieval Retrieval

A Unified Generative Retriever for Knowledge-Intensive Language Tasks via Prompt Learning

1 code implementation • 28 Apr 2023 • Jiangui Chen, Ruqing Zhang, Jiafeng Guo, Maarten de Rijke, Yiqun Liu, Yixing Fan, Xueqi Cheng

Learning task-specific retrievers that return relevant contexts at an appropriate level of semantic granularity, such as a document retriever, passage retriever, sentence retriever, and entity retriever, may help to achieve better performance on the end-to-end task.

Retrieval Sentence

Ensemble Ranking Model with Multiple Pretraining Strategies for Web Search

no code implementations • 18 Feb 2023 • Xiaojie Sun, Lulu Yu, Yiting Wang, Keping Bi, Jiafeng Guo

Then we fine-tune several pre-trained models and train an ensemble model to aggregate all the predictions from various pre-trained models with human-annotation data in the fine-tuning stage.

Learning-To-Rank

Feature-Enhanced Network with Hybrid Debiasing Strategies for Unbiased Learning to Rank

no code implementations • 15 Feb 2023 • Lulu Yu, Yiting Wang, Xiaojie Sun, Keping Bi, Jiafeng Guo

Unbiased learning to rank (ULTR) aims to mitigate various biases existing in user clicks, such as position bias, trust bias, and presentation bias, and to learn an effective ranker.

Learning-To-Rank

Rich Event Modeling for Script Event Prediction

1 code implementation • 16 Dec 2022 • Long Bai, Saiping Guan, Zixuan Li, Jiafeng Guo, Xiaolong Jin, Xueqi Cheng

Fundamentally, it is based on the proposed rich event description, which enriches the existing ones with three kinds of important information, namely, the senses of verbs, extra semantic roles, and types of participants.

Visual Named Entity Linking: A New Dataset and A Baseline

1 code implementation • 9 Nov 2022 • Wenxiang Sun, Yixing Fan, Jiafeng Guo, Ruqing Zhang, Xueqi Cheng

Since each entity often contains rich visual and textual information in KBs, we propose three different sub-tasks, i.e., visual to visual entity linking (V2VEL), visual to textual entity linking (V2TEL), and visual to visual-textual entity linking (V2VTEL).

Entity Linking Image Retrieval +3

LegoNet: A Fast and Exact Unlearning Architecture

no code implementations • 28 Oct 2022 • Sihao Yu, Fei Sun, Jiafeng Guo, Ruqing Zhang, Xueqi Cheng

However, such a strategy typically leads to a loss in model performance, which poses the challenge of increasing unlearning efficiency while maintaining acceptable performance.

Machine Unlearning Representation Learning

HiSMatch: Historical Structure Matching based Temporal Knowledge Graph Reasoning

no code implementations • 18 Oct 2022 • Zixuan Li, Zhongni Hou, Saiping Guan, Xiaolong Jin, Weihua Peng, Long Bai, Yajuan Lyu, Wei Li, Jiafeng Guo, Xueqi Cheng

This is actually a matching task between a query and candidate entities based on their historical structures, which reflect behavioral trends of the entities at different timestamps.

Relation

CofeNet: Context and Former-Label Enhanced Net for Complicated Quotation Extraction

1 code implementation • COLING 2022 • Yequan Wang, Xiang Li, Aixin Sun, Xuying Meng, Huaming Liao, Jiafeng Guo

CofeNet is able to extract complicated quotations with components of variable lengths and complicated structures.

Certified Robustness to Word Substitution Ranking Attack for Neural Ranking Models

1 code implementation • 14 Sep 2022 • Chen Wu, Ruqing Zhang, Jiafeng Guo, Wei Chen, Yixing Fan, Maarten de Rijke, Xueqi Cheng

A ranking model is said to be Certified Top-$K$ Robust on a ranked list when it is guaranteed to keep documents that are out of the top $K$ away from the top $K$ under any attack.

Information Retrieval Retrieval

Hard Negatives or False Negatives: Correcting Pooling Bias in Training Neural Ranking Models

no code implementations • 12 Sep 2022 • Yinqiong Cai, Jiafeng Guo, Yixing Fan, Qingyao Ai, Ruqing Zhang, Xueqi Cheng

When sampling top-ranked results (excluding the labeled positives) as negatives from a stronger retriever, the performance of the learned NRM becomes even worse.

Information Retrieval Retrieval

Scattered or Connected? An Optimized Parameter-efficient Tuning Approach for Information Retrieval

no code implementations • 21 Aug 2022 • Xinyu Ma, Jiafeng Guo, Ruqing Zhang, Yixing Fan, Xueqi Cheng

Unlike the promising results in NLP, we find that these methods cannot achieve comparable performance to full fine-tuning at both stages when updating less than 1% of the original model parameters.

Information Retrieval Re-Ranking +1

A Contrastive Pre-training Approach to Learn Discriminative Autoencoder for Dense Retrieval

no code implementations • 21 Aug 2022 • Xinyu Ma, Ruqing Zhang, Jiafeng Guo, Yixing Fan, Xueqi Cheng

Empirical results show that our method can significantly outperform the state-of-the-art autoencoder-based language models and other pre-trained models for dense retrieval.

Information Retrieval Retrieval

CorpusBrain: Pre-train a Generative Retrieval Model for Knowledge-Intensive Language Tasks

1 code implementation • 16 Aug 2022 • Jiangui Chen, Ruqing Zhang, Jiafeng Guo, Yiqun Liu, Yixing Fan, Xueqi Cheng

We show that a strong generative retrieval model can be learned with a set of adequately designed pre-training tasks, and be adopted to improve a variety of downstream KILT tasks with further fine-tuning.

Retrieval

Evaluating Interpolation and Extrapolation Performance of Neural Retrieval Models

1 code implementation • 25 Apr 2022 • Jingtao Zhan, Xiaohui Xie, Jiaxin Mao, Yiqun Liu, Jiafeng Guo, Min Zhang, Shaoping Ma

For example, representation-based retrieval models perform almost as well as interaction-based retrieval models in terms of interpolation but not extrapolation.

Retrieval

Pre-train a Discriminative Text Encoder for Dense Retrieval via Contrastive Span Prediction

1 code implementation • 22 Apr 2022 • Xinyu Ma, Jiafeng Guo, Ruqing Zhang, Yixing Fan, Xueqi Cheng

Therefore, in this work, we propose to drop the decoder and introduce a novel contrastive span prediction task to pre-train the encoder alone.

Contrastive Learning Information Retrieval +2

GERE: Generative Evidence Retrieval for Fact Verification

1 code implementation • 12 Apr 2022 • Jiangui Chen, Ruqing Zhang, Jiafeng Guo, Yixing Fan, Xueqi Cheng

This classical approach has clear drawbacks as follows: i) a large document index as well as a complicated search process is required, leading to considerable memory and computational overhead; ii) independent scoring paradigms fail to capture the interactions among documents and sentences in ranking; iii) a fixed number of sentences are selected to form the final evidence set.

Claim Verification Fact Verification +2

PRADA: Practical Black-Box Adversarial Attacks against Neural Ranking Models

no code implementations • 4 Apr 2022 • Chen Wu, Ruqing Zhang, Jiafeng Guo, Maarten de Rijke, Yixing Fan, Xueqi Cheng

We focus on the decision-based black-box attack setting, where attackers cannot directly access the model information and can only query the target model to obtain the rank positions of a partial retrieved list.

Document Ranking Information Retrieval +1

Complex Evolutional Pattern Learning for Temporal Knowledge Graph Reasoning

1 code implementation • ACL 2022 • Zixuan Li, Saiping Guan, Xiaolong Jin, Weihua Peng, Yajuan Lyu, Yong Zhu, Long Bai, Wei Li, Jiafeng Guo, Xueqi Cheng

Furthermore, these models are all trained offline, so they cannot adapt well to subsequent changes in evolutional patterns.

A Re-Balancing Strategy for Class-Imbalanced Classification Based on Instance Difficulty

no code implementations • CVPR 2022 • Sihao Yu, Jiafeng Guo, Ruqing Zhang, Yixing Fan, Zizhen Wang, Xueqi Cheng

When the weights of the majority classes are reduced, such instances become more difficult to learn and consequently hurt the overall performance.

imbalanced classification

What is Event Knowledge Graph: A Survey

1 code implementation • 31 Dec 2021 • Saiping Guan, Xueqi Cheng, Long Bai, Fujun Zhang, Zixuan Li, Yutao Zeng, Xiaolong Jin, Jiafeng Guo

Besides entity-centric knowledge, usually organized as a Knowledge Graph (KG), events are also an essential kind of knowledge in the world, which has triggered the emergence of event-centric knowledge representation forms such as the Event KG (EKG).

Question Answering Text Generation

Pre-training Methods in Information Retrieval

no code implementations • 27 Nov 2021 • Yixing Fan, Xiaohui Xie, Yinqiong Cai, Jia Chen, Xinyu Ma, Xiangsheng Li, Ruqing Zhang, Jiafeng Guo

The core of information retrieval (IR) is to identify relevant information from large-scale resources and return it as a ranked list to respond to the user's information need.

Information Retrieval Re-Ranking +1

Interpreting Dense Retrieval as Mixture of Topics

no code implementations • 27 Nov 2021 • Jingtao Zhan, Jiaxin Mao, Yiqun Liu, Jiafeng Guo, Min Zhang, Shaoping Ma

Dense Retrieval (DR) reaches state-of-the-art results in first-stage retrieval, but little is known about the mechanisms that contribute to its success.

Retrieval

Learning Discrete Representations via Constrained Clustering for Effective and Efficient Dense Retrieval

4 code implementations • 12 Oct 2021 • Jingtao Zhan, Jiaxin Mao, Yiqun Liu, Jiafeng Guo, Min Zhang, Shaoping Ma

However, the efficiency of most existing DR models is limited by the large memory cost of storing dense vectors and the time-consuming nearest neighbor search (NNS) in vector space.

Constrained Clustering Information Retrieval +3

Piecing and Chipping: An effective solution for the information-erasing view generation in Self-supervised Learning

no code implementations • 29 Sep 2021 • Jingwei Liu, Yi Gu, Shentong Mo, Zhun Sun, Shumin Han, Jiafeng Guo, Xueqi Cheng

In self-supervised learning frameworks, deep networks are optimized to align different views of an instance that contain similar visual semantic information.

Data Augmentation Self-Supervised Learning

Integrating Deep Event-Level and Script-Level Information for Script Event Prediction

1 code implementation • EMNLP 2021 • Long Bai, Saiping Guan, Jiafeng Guo, Zixuan Li, Xiaolong Jin, Xueqi Cheng

In this paper, we propose a Transformer-based model, called MCPredictor, which integrates deep event-level and script-level information for script event prediction.

Toward the Understanding of Deep Text Matching Models for Information Retrieval

no code implementations • 16 Aug 2021 • Lijuan Chen, Yanyan Lan, Liang Pang, Jiafeng Guo, Xueqi Cheng

We further extend these constraints to the semantic settings, which are shown to be better satisfied for all the deep text matching models.

Information Retrieval Retrieval +2

FedMatch: Federated Learning Over Heterogeneous Question Answering Data

2 code implementations • 11 Aug 2021 • Jiangui Chen, Ruqing Zhang, Jiafeng Guo, Yixing Fan, Xueqi Cheng

A possible solution to this dilemma is a new approach known as federated learning, which is a privacy-preserving machine learning technique over distributed datasets.

Federated Learning Privacy Preserving +1

Jointly Optimizing Query Encoder and Product Quantization to Improve Retrieval Performance

5 code implementations • 2 Aug 2021 • Jingtao Zhan, Jiaxin Mao, Yiqun Liu, Jiafeng Guo, Min Zhang, Shaoping Ma

Compared with previous DR models that use brute-force search, JPQ almost matches the best retrieval performance with 30x compression on index size.

Information Retrieval Quantization +1

A Discriminative Semantic Ranker for Question Retrieval

no code implementations • 18 Jul 2021 • Yinqiong Cai, Yixing Fan, Jiafeng Guo, Ruqing Zhang, Yanyan Lan, Xueqi Cheng

However, these methods often lose the discriminative power of term-based methods, and thus introduce noise during retrieval and hurt recall performance.

Question Answering Re-Ranking +1

Search from History and Reason for Future: Two-stage Reasoning on Temporal Knowledge Graphs

no code implementations • ACL 2021 • Zixuan Li, Xiaolong Jin, Saiping Guan, Wei Li, Jiafeng Guo, Yuanzhuo Wang, Xueqi Cheng

Specifically, at the clue searching stage, CluSTeR learns a beam search policy via reinforcement learning (RL) to induce multiple clues from historical facts.

Knowledge Graphs Reinforcement Learning (RL)

Link Prediction on N-ary Relational Data Based on Relatedness Evaluation

1 code implementation • 21 Apr 2021 • Saiping Guan, Xiaolong Jin, Jiafeng Guo, Yuanzhuo Wang, Xueqi Cheng

However, they mainly focus on link prediction on binary relational data, where facts are usually represented as triples in the form of (head entity, relation, tail entity).

Knowledge Graphs Link Prediction

Temporal Knowledge Graph Reasoning Based on Evolutional Representation Learning

1 code implementation • 21 Apr 2021 • Zixuan Li, Xiaolong Jin, Wei Li, Saiping Guan, Jiafeng Guo, HuaWei Shen, Yuanzhuo Wang, Xueqi Cheng

To capture these properties effectively and efficiently, we propose a novel Recurrent Evolution network based on Graph Convolution Network (GCN), called RE-GCN, which learns the evolutional representations of entities and relations at each timestamp by modeling the KG sequence recurrently.

Representation Learning

B-PROP: Bootstrapped Pre-training with Representative Words Prediction for Ad-hoc Retrieval

1 code implementation • 20 Apr 2021 • Xinyu Ma, Jiafeng Guo, Ruqing Zhang, Yixing Fan, Yingyan Li, Xueqi Cheng

The basic idea of PROP is to construct the representative words prediction (ROP) task for pre-training inspired by the query likelihood model.

Information Retrieval Language Modelling +1

Optimizing Dense Retrieval Model Training with Hard Negatives

4 code implementations • 16 Apr 2021 • Jingtao Zhan, Jiaxin Mao, Yiqun Liu, Jiafeng Guo, Min Zhang, Shaoping Ma

ADORE replaces the widely-adopted static hard negative sampling method with a dynamic one to directly optimize the ranking performance.

Information Retrieval Representation Learning +1

Sketch and Customize: A Counterfactual Story Generator

1 code implementation • 2 Apr 2021 • Changying Hao, Liang Pang, Yanyan Lan, Yan Wang, Jiafeng Guo, Xueqi Cheng

In the sketch stage, a skeleton is extracted by removing words that conflict with the counterfactual condition from the original ending.

counterfactual Text Generation

Semantic Models for the First-stage Retrieval: A Comprehensive Review

1 code implementation • 8 Mar 2021 • Jiafeng Guo, Yinqiong Cai, Yixing Fan, Fei Sun, Ruqing Zhang, Xueqi Cheng

We believe it is the right time to survey the current status, learn from existing methods, and gain some insights for future development.

Re-Ranking Retrieval +1

A Linguistic Study on Relevance Modeling in Information Retrieval

no code implementations • 1 Mar 2021 • Yixing Fan, Jiafeng Guo, Xinyu Ma, Ruqing Zhang, Yanyan Lan, Xueqi Cheng

We employ 16 linguistic tasks to probe a unified retrieval model over these three retrieval tasks to answer this question.

Information Retrieval Natural Language Understanding +2

Learning to Truncate Ranked Lists for Information Retrieval

no code implementations • 25 Feb 2021 • Chen Wu, Ruqing Zhang, Jiafeng Guo, Yixing Fan, Yanyan Lan, Xueqi Cheng

One is a widely adopted metric such as F1, which acts as a balanced objective, and the other is the best F1 under some minimal recall constraint, which represents a typical objective in professional search.

Information Retrieval Retrieval

Dynamic-K Recommendation with Personalized Decision Boundary

no code implementations • 25 Dec 2020 • Yan Gao, Jiafeng Guo, Yanyan Lan, Huaming Liao

The ranking objective is the same as existing methods, i.e., to create a ranking list of items according to users' interests.

Event Coreference Resolution with their Paraphrases and Argument-aware Embeddings

no code implementations • COLING 2020 • Yutao Zeng, Xiaolong Jin, Saiping Guan, Jiafeng Guo, Xueqi Cheng

To resolve event coreference, existing methods usually calculate the similarities between event mentions and between specific kinds of event arguments.

coreference-resolution Event Coreference Resolution

Transformation Driven Visual Reasoning

1 code implementation • CVPR 2021 • Xin Hong, Yanyan Lan, Liang Pang, Jiafeng Guo, Xueqi Cheng

Following this definition, a new dataset, namely TRANCE, is constructed on the basis of CLEVR, including three levels of settings, i.e., Basic (single-step transformation), Event (multi-step transformation), and View (multi-step transformation with variant views).

Attribute Visual Question Answering (VQA) +1

PROP: Pre-training with Representative Words Prediction for Ad-hoc Retrieval

1 code implementation • 20 Oct 2020 • Xinyu Ma, Jiafeng Guo, Ruqing Zhang, Yixing Fan, Xiang Ji, Xueqi Cheng

Recently, pre-trained language representation models such as BERT have shown great success when fine-tuned on downstream tasks including information retrieval (IR).

Information Retrieval Language Modelling +1

Beyond Language: Learning Commonsense from Images for Reasoning

1 code implementation • Findings of the Association for Computational Linguistics 2020 • Wanqing Cui, Yanyan Lan, Liang Pang, Jiafeng Guo, Xueqi Cheng

This paper proposes a novel approach to learn commonsense from images, instead of limited raw texts or costly constructed knowledge bases, for the commonsense reasoning problem in NLP.

Language Modelling Question Answering

Query Understanding via Intent Description Generation

1 code implementation • 25 Aug 2020 • Ruqing Zhang, Jiafeng Guo, Yixing Fan, Yanyan Lan, Xue-Qi Cheng

To address this new task, we propose a novel Contrastive Generation model, CtrsGen for short, to generate the intent description by contrasting the relevant documents with the irrelevant documents given a query.

Clustering Information Retrieval +1

Continual Domain Adaptation for Machine Reading Comprehension

no code implementations • 25 Aug 2020 • Lixin Su, Jiafeng Guo, Ruqing Zhang, Yixing Fan, Yanyan Lan, Xue-Qi Cheng

To tackle such a challenge, in this work, we introduce the Continual Domain Adaptation (CDA) task for MRC.

Continual Learning Domain Adaptation +2

Ranking Enhanced Dialogue Generation

no code implementations • 13 Aug 2020 • Changying Hao, Liang Pang, Yanyan Lan, Fei Sun, Jiafeng Guo, Xue-Qi Cheng

To tackle this problem, we propose a Ranking Enhanced Dialogue generation framework in this paper.

Dialogue Generation Response Generation

On the Relation between Quality-Diversity Evaluation and Distribution-Fitting Goal in Text Generation

no code implementations • ICML 2020 • Jianing Li, Yanyan Lan, Jiafeng Guo, Xue-Qi Cheng

We prove that under certain conditions, a linear combination of quality and diversity constitutes a divergence metric between the generated distribution and the real distribution.

Relation Text Generation

NeuInfer: Knowledge Inference on N-ary Facts

no code implementations ACL 2020 Saiping Guan, Xiaolong Jin, Jiafeng Guo, Yuanzhuo Wang, Xue-Qi Cheng

It aims to infer an unknown element in a partial fact consisting of the primary triple coupled with any number of its auxiliary description(s).

Attribute Descriptive +1

Match$^2$: A Matching over Matching Model for Similar Question Identification

no code implementations 21 Jun 2020 Zizhen Wang, Yixing Fan, Jiafeng Guo, Liu Yang, Ruqing Zhang, Yanyan Lan, Xue-Qi Cheng, Hui Jiang, Xiaozhao Wang

However, it has long been a challenge to properly measure the similarity between two questions due to the inherent variation of natural language, i.e., there could be different ways to ask the same question, or different questions sharing similar expressions.

Community Question Answering

IART: Intent-aware Response Ranking with Transformers in Information-seeking Conversation Systems

1 code implementation 3 Feb 2020 Liu Yang, Minghui Qiu, Chen Qu, Cen Chen, Jiafeng Guo, Yongfeng Zhang, W. Bruce Croft, Haiqing Chen

We also perform case studies and analysis of learned user intent and its impact on response ranking in information-seeking conversations to provide interpretation of results.

Representation Learning

Neural or Statistical: An Empirical Study on Language Models for Chinese Input Recommendation on Mobile

no code implementations 9 Jul 2019 Hainan Zhang, Yanyan Lan, Jiafeng Guo, Jun Xu, Xue-Qi Cheng

Chinese input recommendation plays an important role in alleviating human cost in typing Chinese words, especially in the scenario of mobile applications.

Language Modelling

MatchZoo: A Learning, Practicing, and Developing System for Neural Text Matching

1 code implementation 24 May 2019 Jiafeng Guo, Yixing Fan, Xiang Ji, Xue-Qi Cheng

Text matching is the core problem in many natural language processing (NLP) tasks, such as information retrieval, question answering, and conversation.

Information Retrieval Question Answering +2

Controlling Risk of Web Question Answering

no code implementations 24 May 2019 Lixin Su, Jiafeng Guo, Yixing Fan, Yanyan Lan, Xue-Qi Cheng

Web question answering (QA) has become an indispensable component in modern search systems, which can significantly improve users' search experience by providing a direct answer to users' information need.

Machine Reading Comprehension Question Answering

Outline Generation: Understanding the Inherent Content Structure of Documents

no code implementations 24 May 2019 Ruqing Zhang, Jiafeng Guo, Yixing Fan, Yanyan Lan, Xue-Qi Cheng

To generate a sound outline, an ideal OG model should be able to capture three levels of coherence, namely the coherence between context paragraphs, that between a section and its heading, and that between context headings.

Structured Prediction

Tailored Sequence to Sequence Models to Different Conversation Scenarios

no code implementations ACL 2018 Hainan Zhang, Yanyan Lan, Jiafeng Guo, Jun Xu, Xue-Qi Cheng

In this paper, we propose two tailored optimization criteria for Seq2Seq to different conversation scenarios, i.e., the maximum generated likelihood for the specific-requirement scenario, and the conditional value-at-risk for the diverse-requirement scenario.

Dialogue Generation Response Generation

Modeling Diverse Relevance Patterns in Ad-hoc Retrieval

2 code implementations SIGIR '18 2018 Yixing Fan, Jiafeng Guo, Yanyan Lan, Jun Xu, ChengXiang Zhai, Xue-Qi Cheng

The local matching layer focuses on producing a set of local relevance signals by modeling the semantic matching between a query and each passage of a document.

Retrieval

Response Ranking with Deep Matching Networks and External Knowledge in Information-seeking Conversation Systems

1 code implementation 1 May 2018 Liu Yang, Minghui Qiu, Chen Qu, Jiafeng Guo, Yongfeng Zhang, W. Bruce Croft, Jun Huang, Haiqing Chen

Our models and research findings provide new insights on how to utilize external knowledge with deep neural models for response selection and have implications for the design of the next generation of information-seeking conversation systems.

Knowledge Distillation Retrieval +1

A Tree Search Algorithm for Sequence Labeling

1 code implementation 29 Apr 2018 Yadi Lao, Jun Xu, Yanyan Lan, Jiafeng Guo, Sheng Gao, Xue-Qi Cheng

Inspired by the success and methodology of AlphaGo Zero, MM-Tag formalizes the problem of sequence tagging with a Monte Carlo tree search (MCTS) enhanced Markov decision process (MDP) model, in which the time steps correspond to the positions of words in a sentence from left to right, and each action corresponds to assigning a tag to a word.

Chunking Decision Making +4

MQGrad: Reinforcement Learning of Gradient Quantization in Parameter Server

no code implementations 22 Apr 2018 Guoxin Cui, Jun Xu, Wei Zeng, Yanyan Lan, Jiafeng Guo, Xue-Qi Cheng

One of the most significant bottlenecks in training large-scale machine learning models on a parameter server (PS) is the communication overhead, because model gradients must be exchanged frequently between the workers and servers during the training iterations.

BIG-bench Machine Learning Quantization +2

Unbiased Learning to Rank with Unbiased Propensity Estimation

1 code implementation 16 Apr 2018 Qingyao Ai, Keping Bi, Cheng Luo, Jiafeng Guo, W. Bruce Croft

We find that the problem of estimating a propensity model from click data is a dual problem of unbiased learning to rank.

Learning-To-Rank

Learning a Deep Listwise Context Model for Ranking Refinement

1 code implementation 16 Apr 2018 Qingyao Ai, Keping Bi, Jiafeng Guo, W. Bruce Croft

Specifically, we employ a recurrent neural network to sequentially encode the top results using their feature vectors, learn a local context model and use it to re-rank the top results.

Information Retrieval Learning-To-Rank +1

aNMM: Ranking Short Answer Texts with Attention-Based Neural Matching Model

1 code implementation 5 Jan 2018 Liu Yang, Qingyao Ai, Jiafeng Guo, W. Bruce Croft

As an alternative to question answering methods based on feature engineering, deep learning approaches such as convolutional neural networks (CNNs) and Long Short-Term Memory Models (LSTMs) have recently been proposed for semantic matching of questions and answers.

Feature Engineering Question Answering

Locally Smoothed Neural Networks

1 code implementation 22 Nov 2017 Liang Pang, Yanyan Lan, Jun Xu, Jiafeng Guo, Xue-Qi Cheng

The main idea is to represent the weight matrix of the locally connected layer as the product of the kernel and the smoother, where the kernel is shared over different local receptive fields, and the smoother is for determining the importance and relations of different local receptive fields.

Face Verification Question Answering +1

A Deep Investigation of Deep IR Models

no code implementations 24 Jul 2017 Liang Pang, Yanyan Lan, Jiafeng Guo, Jun Xu, Xue-Qi Cheng

Therefore, it is necessary to identify the difference between automatically learned features by deep IR models and hand-crafted features used in traditional learning to rank approaches.

Information Retrieval Learning-To-Rank +1

MatchZoo: A Toolkit for Deep Text Matching

1 code implementation 23 Jul 2017 Yixing Fan, Liang Pang, Jianpeng Hou, Jiafeng Guo, Yanyan Lan, Xue-Qi Cheng

In recent years, deep neural models have been widely adopted for text matching tasks, such as question answering and information retrieval, showing improved performance as compared with previous methods.

Ad-Hoc Information Retrieval Information Retrieval +3

Spherical Paragraph Model

no code implementations 18 Jul 2017 Ruqing Zhang, Jiafeng Guo, Yanyan Lan, Jun Xu, Xue-Qi Cheng

Representing texts as fixed-length vectors is central to many language processing tasks.

Representation Learning Sentiment Analysis

A Study of MatchPyramid Models on Ad-hoc Retrieval

1 code implementation 15 Jun 2016 Liang Pang, Yanyan Lan, Jiafeng Guo, Jun Xu, Xue-Qi Cheng

Although ad-hoc retrieval can also be formalized as a text matching task, few deep models have been tested on it.

Machine Translation Paraphrase Identification +4

Match-SRNN: Modeling the Recursive Matching Structure with Spatial RNN

1 code implementation 15 Apr 2016 Shengxian Wan, Yanyan Lan, Jun Xu, Jiafeng Guo, Liang Pang, Xue-Qi Cheng

In this paper, we propose to view the generation of the global interaction between two texts as a recursive process, i.e., the interaction of two texts at each position is a composition of the interactions between their prefixes as well as the word-level interaction at the current position.

Position

Semantic Regularities in Document Representations

no code implementations 24 Mar 2016 Fei Sun, Jiafeng Guo, Yanyan Lan, Jun Xu, Xue-Qi Cheng

Recent work has shown that distributed word representations are good at capturing linguistic regularities in language.

Text Matching as Image Recognition

7 code implementations 20 Feb 2016 Liang Pang, Yanyan Lan, Jiafeng Guo, Jun Xu, Shengxian Wan, Xue-Qi Cheng

An effective way is to extract meaningful matching patterns from words, phrases, and sentences to produce the matching score.

Ad-Hoc Information Retrieval Text Matching

A Deep Architecture for Semantic Matching with Multiple Positional Sentence Representations

1 code implementation 26 Nov 2015 Shengxian Wan, Yanyan Lan, Jiafeng Guo, Jun Xu, Liang Pang, Xue-Qi Cheng

Our model has several advantages: (1) By using Bi-LSTM, rich context of the whole sentence is leveraged to capture the contextualized local information in each positional sentence representation; (2) By matching with multiple positional sentence representations, it is flexible to aggregate different important contextualized local information in a sentence to support the matching; (3) Experiments on different tasks such as question answering and sentence completion demonstrate the superiority of our model.

Information Retrieval Question Answering +3

Stochastic Rank Aggregation

no code implementations 26 Sep 2013 Shuzi Niu, Yanyan Lan, Jiafeng Guo, Xue-Qi Cheng

Traditional rank aggregation methods are deterministic, and can be categorized into explicit and implicit methods depending on whether rank information is explicitly or implicitly utilized.
