Search Results for author: Chenyan Xiong

Found 56 papers, 37 papers with code

Distantly-Supervised Dense Retrieval Enables Open-Domain Question Answering without Evidence Annotation

1 code implementation EMNLP 2021 Chen Zhao, Chenyan Xiong, Jordan Boyd-Graber, Hal Daumé III

This paper investigates whether models can learn to find evidence from a large corpus, with only distant supervision from answer labels for model training, thereby generating no additional annotation cost.

Open-Domain Question Answering

Dimension Reduction for Efficient Dense Retrieval via Conditional Autoencoder

1 code implementation 6 May 2022 Zhenghao Liu, Han Zhang, Chenyan Xiong, Zhiyuan Liu, Yu Gu, Xiaohua Li

To reduce the embedding dimensions of dense retrieval, this paper proposes a Conditional Autoencoder (ConAE) to compress the high-dimensional embeddings to maintain the same embedding distribution and better recover the ranking features.

Dimensionality Reduction Information Retrieval

P^3 Ranker: Mitigating the Gaps between Pre-training and Ranking Fine-tuning with Prompt-based Learning and Pre-finetuning

1 code implementation 4 May 2022 Xiaomeng Hu, Shi Yu, Chenyan Xiong, Zhenghao Liu, Zhiyuan Liu, Ge Yu

In this paper, we identify and study the two mismatches between pre-training and ranking fine-tuning: the training schema gap regarding the differences in training objectives and model architectures, and the task knowledge gap considering the discrepancy between the knowledge needed in ranking and that learned during pre-training.

Pretraining Text Encoders with Adversarial Mixture of Training Signal Generators

1 code implementation ICLR 2022 Yu Meng, Chenyan Xiong, Payal Bajaj, Saurabh Tiwary, Paul Bennett, Jiawei Han, Xia Song

We present a new framework AMOS that pretrains text encoders with an Adversarial learning curriculum via a Mixture Of Signals from multiple auxiliary generators.

Neural Approaches to Conversational Information Retrieval

no code implementations 13 Jan 2022 Jianfeng Gao, Chenyan Xiong, Paul Bennett, Nick Craswell

A conversational information retrieval (CIR) system is an information retrieval (IR) system with a conversational interface which allows users to interact with the system to seek information via multi-turn conversations of natural language, in spoken or written form.

Information Retrieval Natural Language Processing

Zero-Shot Dense Retrieval with Momentum Adversarial Domain Invariant Representations

no code implementations Findings (ACL) 2022 Ji Xin, Chenyan Xiong, Ashwin Srinivasan, Ankita Sharma, Damien Jose, Paul N. Bennett

Dense retrieval (DR) methods conduct text retrieval by first encoding texts in the embedding space and then matching them by nearest neighbor search.

Representation Learning

Zero-Shot Dense Retrieval with Momentum Adversarial Domain Invariant Representation

no code implementations 29 Sep 2021 Ji Xin, Chenyan Xiong, Ashwin Srinivasan, Ankita Sharma, Damien Jose, Paul N. Bennett

Dense retrieval (DR) methods conduct text retrieval by first encoding texts in the embedding space and then matching them by nearest neighbor search.

Representation Learning

Improving Query Representations for Dense Retrieval with Pseudo Relevance Feedback

2 code implementations 30 Aug 2021 HongChien Yu, Chenyan Xiong, Jamie Callan

This paper proposes ANCE-PRF, a new query encoder that uses pseudo relevance feedback (PRF) to improve query representations for dense retrieval.

More Robust Dense Retrieval with Contrastive Dual Learning

1 code implementation 16 Jul 2021 Yizhi Li, Zhenghao Liu, Chenyan Xiong, Zhiyuan Liu

With contrastive learning, the dual training objective of DANCE learns more tailored representations for queries and documents to keep the embedding space smooth and uniform, improving the ranking performance of DANCE on the MS MARCO document retrieval task.

Contrastive Learning Information Retrieval

Few-Shot Conversational Dense Retrieval

1 code implementation 10 May 2021 Shi Yu, Zhenghao Liu, Chenyan Xiong, Tao Feng, Zhiyuan Liu

In this paper, we present a Conversational Dense Retrieval system, ConvDR, that learns contextualized embeddings for multi-turn conversational queries and retrieves documents solely using embedding dot products.

Conversational Search

Complex Factoid Question Answering with a Free-Text Knowledge Graph

no code implementations 23 Mar 2021 Chen Zhao, Chenyan Xiong, Xin Qian, Jordan Boyd-Graber

DELFT's advantage comes from both the high coverage of its free-text knowledge graph (more than double that of DBpedia relations) and the novel graph neural network which reasons on the rich but noisy free-text evidence.

Graph Question Answering Question Answering +1

Data Augmentation for Abstractive Query-Focused Multi-Document Summarization

1 code implementation 2 Mar 2021 Ramakanth Pasunuru, Asli Celikyilmaz, Michel Galley, Chenyan Xiong, Yizhe Zhang, Mohit Bansal, Jianfeng Gao

The progress in Query-focused Multi-Document Summarization (QMDS) has been limited by the lack of sufficient large-scale high-quality training datasets.

Data Augmentation Document Summarization +1

COCO-LM: Correcting and Contrasting Text Sequences for Language Model Pretraining

2 code implementations NeurIPS 2021 Yu Meng, Chenyan Xiong, Payal Bajaj, Saurabh Tiwary, Paul Bennett, Jiawei Han, Xia Song

The first token-level task, Corrective Language Modeling, is to detect and correct tokens replaced by the auxiliary model, in order to better capture token-level semantics.

Contrastive Learning Language Modelling +1

OpenMatch: An Open Source Library for Neu-IR Research

1 code implementation 30 Jan 2021 Zhenghao Liu, Kaitao Zhang, Chenyan Xiong, Zhiyuan Liu, Maosong Sun

OpenMatch is a Python-based library that serves for Neural Information Retrieval (Neu-IR) research.

Document Ranking Information Retrieval

Pretrain Knowledge-Aware Language Models

no code implementations 1 Jan 2021 Corbin L Rosset, Chenyan Xiong, Minh Phan, Xia Song, Paul N. Bennett, Saurabh Tiwary

Rather, we simply signal the existence of entities to the input of the transformer in pretraining, with an entity-extended tokenizer; and at the output, with an additional entity prediction task.

Language Modelling Pretrained Language Models +1

Few-Shot Text Ranking with Meta Adapted Synthetic Weak Supervision

1 code implementation ACL 2021 Si Sun, Yingzhuo Qian, Zhenghao Liu, Chenyan Xiong, Kaitao Zhang, Jie Bao, Zhiyuan Liu, Paul Bennett

To democratize the benefits of Neu-IR, this paper presents MetaAdaptRank, a domain adaptive learning method that generalizes Neu-IR models from label-rich source domains to few-shot target domains.

Information Retrieval Learning-To-Rank

Towards Interpretable Natural Language Understanding with Explanations as Latent Variables

1 code implementation NeurIPS 2020 Wangchunshu Zhou, Jinyi Hu, Hanlin Zhang, Xiaodan Liang, Maosong Sun, Chenyan Xiong, Jian Tang

In this paper, we develop a general framework for interpretable natural language understanding that requires only a small set of human annotated explanations for training.

Explanation Generation Natural Language Understanding

Text Classification Using Label Names Only: A Language Model Self-Training Approach

1 code implementation EMNLP 2020 Yu Meng, Yunyi Zhang, Jiaxin Huang, Chenyan Xiong, Heng Ji, Chao Zhang, Jiawei Han

In this paper, we explore the potential of only using the label name of each class to train classification models on unlabeled data, without using any labeled documents.

Classification Document Classification +4

Approximate Nearest Neighbor Negative Contrastive Learning for Dense Text Retrieval

4 code implementations ICLR 2021 Lee Xiong, Chenyan Xiong, Ye Li, Kwok-Fung Tang, Jialin Liu, Paul Bennett, Junaid Ahmed, Arnold Overwijk

In this paper, we identify that the main bottleneck is in the training mechanisms, where the negative instances used in training are not representative of the irrelevant documents in testing.

Contrastive Learning Passage Retrieval +1

Proceedings of the KG-BIAS Workshop 2020 at AKBC 2020

no code implementations 18 Jun 2020 Edgar Meij, Tara Safavi, Chenyan Xiong, Gianluca Demartini, Miriam Redi, Fatma Özcan

The KG-BIAS 2020 workshop touches on biases and how they surface in knowledge graphs (KGs), biases in the source data that is used to create KGs, methods for measuring or remediating bias in KGs, but also identifying other biases such as how and which languages are represented in automatically constructed KGs or how personal KGs might incur inherent biases.

Knowledge Graphs

Few-Shot Generative Conversational Query Rewriting

1 code implementation 9 Jun 2020 Shi Yu, Jiahua Liu, Jingqin Yang, Chenyan Xiong, Paul Bennett, Jianfeng Gao, Zhiyuan Liu

Conversational query rewriting aims to reformulate a concise conversational query to a fully specified, context-independent query that can be effectively handled by existing information retrieval systems.

Information Retrieval Self-Supervised Learning +1

Capturing Global Informativeness in Open Domain Keyphrase Extraction

2 code implementations 28 Apr 2020 Si Sun, Zhenghao Liu, Chenyan Xiong, Zhiyuan Liu, Jie Bao

Open-domain KeyPhrase Extraction (KPE) aims to extract keyphrases from documents without domain or quality restrictions, e.g., web pages with variant domains and qualities.

Chunking Informativeness +1

Selective Weak Supervision for Neural Information Retrieval

1 code implementation 28 Jan 2020 Kaitao Zhang, Chenyan Xiong, Zhenghao Liu, Zhiyuan Liu

This paper democratizes neural information retrieval to scenarios where large scale relevance training signals are not available.

Information Retrieval Learning-To-Rank

TaxoExpan: Self-supervised Taxonomy Expansion with Position-Enhanced Graph Neural Network

2 code implementations 26 Jan 2020 Jiaming Shen, Zhihong Shen, Chenyan Xiong, Chi Wang, Kuansan Wang, Jiawei Han

Taxonomies consist of machine-interpretable semantics and provide valuable knowledge for many web applications.

Product Recommendation

Open Domain Web Keyphrase Extraction Beyond Language Modeling

2 code implementations IJCNLP 2019 Lee Xiong, Chuan Hu, Chenyan Xiong, Daniel Campos, Arnold Overwijk

This paper studies keyphrase extraction in real-world scenarios where documents are from diverse domains and have variant content quality.

Keyphrase Extraction Language Modelling

Fine-grained Fact Verification with Kernel Graph Attention Network

1 code implementation ACL 2020 Zhenghao Liu, Chenyan Xiong, Maosong Sun, Zhiyuan Liu

Fact Verification requires fine-grained natural language inference capability that finds subtle clues to identify the syntactically and semantically correct but not well-supported claims.

Fact Verification Graph Attention +1

Explore Entity Embedding Effectiveness in Entity Retrieval

no code implementations 28 Aug 2019 Zhenghao Liu, Chenyan Xiong, Maosong Sun, Zhiyuan Liu

Entity embedding learns lots of semantic information from the knowledge graph and represents entities with a low-dimensional representation, which provides an opportunity to establish interactions between query related entities and candidate entities for entity retrieval.

Entity Retrieval Learning-To-Rank

Latent Relation Language Models

no code implementations 21 Aug 2019 Hiroaki Hayashi, Zecong Hu, Chenyan Xiong, Graham Neubig

In this paper, we propose Latent Relation Language Models (LRLMs), a class of language models that parameterizes the joint distribution over the words in a document and the entities that occur therein via knowledge graph relations.

Language Modelling

Neural Document Expansion with User Feedback

1 code implementation 8 Aug 2019 Yue Yin, Chenyan Xiong, Cheng Luo, Zhiyuan Liu

This paper presents a neural document expansion approach (NeuDEF) that enriches document representations for neural ranking models.

Generic Intent Representation in Web Search

no code implementations 24 Jul 2019 Hongfei Zhang, Xia Song, Chenyan Xiong, Corby Rosset, Paul N. Bennett, Nick Craswell, Saurabh Tiwary

This paper presents GEneric iNtent Encoder (GEN Encoder) which learns a distributed representation space for user intent in search.

Multi-Task Learning

An Axiomatic Approach to Regularizing Neural Ranking Models

no code implementations 15 Apr 2019 Corby Rosset, Bhaskar Mitra, Chenyan Xiong, Nick Craswell, Xia Song, Saurabh Tiwary

The training of these models involves a search for appropriate parameter values based on large quantities of labeled examples.

Information Retrieval

Consistency and Variation in Kernel Neural Ranking Model

no code implementations 27 Sep 2018 Mary Arpita Pyreddy, Varshini Ramaseshan, Narendra Nath Joshi, Zhuyun Dai, Chenyan Xiong, Jamie Callan, Zhiyuan Liu

This paper studies the consistency of the kernel-based neural ranking model K-NRM, a recent state-of-the-art neural IR model, which is important for reproducible research and deployment in the industry.

Word Embeddings

Automatic Event Salience Identification

1 code implementation EMNLP 2018 Zhengzhong Liu, Chenyan Xiong, Teruko Mitamura, Eduard Hovy

Our analyses demonstrate that our neural model captures interesting connections between salience and discourse unit relations (e.g., scripts and frame structures).

Towards Better Text Understanding and Retrieval through Kernel Entity Salience Modeling

no code implementations 3 May 2018 Chenyan Xiong, Zhengzhong Liu, Jamie Callan, Tie-Yan Liu

The salience model also improves ad hoc search accuracy, providing effective ranking features by modeling the salience of query entities in candidate documents.

Convolutional Neural Networks for Soft Matching N-Grams in Ad-hoc Search

no code implementations WSDM 2018 Zhuyun Dai, Chenyan Xiong, Jamie Callan, Zhiyuan Liu

This paper presents Conv-KNRM, a Convolutional Kernel-based Neural Ranking Model that models n-gram soft matches for ad-hoc search.

Learning-To-Rank

End-to-End Neural Ad-hoc Ranking with Kernel Pooling

1 code implementation 20 Jun 2017 Chenyan Xiong, Zhuyun Dai, Jamie Callan, Zhiyuan Liu, Russell Power

Given a query and a set of documents, K-NRM uses a translation matrix that models word-level similarities via word embeddings, a new kernel-pooling technique that uses kernels to extract multi-level soft match features, and a learning-to-rank layer that combines those features into the final ranking score.

Document Ranking Learning-To-Rank +2

Word-Entity Duet Representations for Document Ranking

no code implementations 20 Jun 2017 Chenyan Xiong, Jamie Callan, Tie-Yan Liu

This paper presents a word-entity duet framework for utilizing knowledge bases in ad-hoc retrieval.

Document Ranking Learning-To-Rank
