Search Results for author: Ji Xin

Found 19 papers, 8 papers with code

Bag-of-Words Baselines for Semantic Code Search

no code implementations ACL (NLP4Prog) 2021 Xinyu Zhang, Ji Xin, Andrew Yates, Jimmy Lin

The task of semantic code search is to retrieve code snippets from a source code corpus based on an information need expressed in natural language.

Code Search · Information Retrieval +2
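The bag-of-words idea behind these baselines can be sketched in a few lines: represent both the natural-language query and each code snippet as weighted term vectors and rank by lexical similarity. The sketch below uses scikit-learn's TF-IDF vectorizer on a toy corpus; the tokenization, corpus, and query are illustrative assumptions, not the paper's exact setup.

```python
# Bag-of-words retrieval sketch for code search (illustrative: the corpus,
# tokenization, and query are assumptions, not the paper's setup).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

code_corpus = [
    "def read_file(path): return open(path).read()",
    "def sort_list(xs): return sorted(xs)",
    "def http_get(url): import requests; return requests.get(url).text",
]

vectorizer = TfidfVectorizer(token_pattern=r"[A-Za-z_]+")  # naive identifier tokens
doc_vectors = vectorizer.fit_transform(code_corpus)

query = "read the contents of a file given its path"
query_vector = vectorizer.transform([query])

# Rank snippets by cosine similarity between the query and each code vector.
scores = cosine_similarity(query_vector, doc_vectors).ravel()
for idx in scores.argsort()[::-1]:
    print(f"{scores[idx]:.3f}  {code_corpus[idx]}")
```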

Simple and Effective Unsupervised Redundancy Elimination to Compress Dense Vectors for Passage Retrieval

no code implementations EMNLP 2021 Xueguang Ma, Minghan Li, Kai Sun, Ji Xin, Jimmy Lin

Recent work has shown that dense passage retrieval techniques achieve better ranking accuracy in open-domain question answering compared to sparse retrieval techniques such as BM25, but at the cost of large space and memory requirements.

Open-Domain Question Answering · Passage Retrieval +2
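One simple, unsupervised way to eliminate redundancy in dense vectors is linear dimensionality reduction such as PCA fitted on the passage embeddings; whether this matches the paper's exact compression operators is an assumption here, and the embeddings below are random placeholders.

```python
# Sketch: shrink dense passage vectors with PCA, then search in the reduced space.
# PCA is used as a stand-in for unsupervised redundancy elimination; the paper's
# exact operators may differ. Embeddings are random placeholders.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
passage_embeddings = rng.normal(size=(5_000, 768)).astype("float32")  # toy corpus
query_embedding = rng.normal(size=(1, 768)).astype("float32")

pca = PCA(n_components=128)                      # 768 -> 128 dims (~6x smaller index)
compressed_corpus = pca.fit_transform(passage_embeddings)
compressed_query = pca.transform(query_embedding)

# Inner-product search in the compressed space.
scores = compressed_corpus @ compressed_query.ravel()
top_k = np.argsort(-scores)[:10]
print("top-10 passage ids:", top_k)
```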

Building an Efficiency Pipeline: Commutativity and Cumulativeness of Efficiency Operators for Transformers

no code implementations 31 Jul 2022 Ji Xin, Raphael Tang, Zhiying Jiang, YaoLiang Yu, Jimmy Lin

There exists a wide variety of efficiency methods for natural language processing (NLP) tasks, such as pruning, distillation, dynamic inference, quantization, etc.

Quantization
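The commutativity question the paper studies (does the order in which two efficiency operators are applied matter?) can be illustrated on a toy weight matrix; the pruning and quantization functions below are simplified stand-ins, not the paper's operators.

```python
# Toy illustration of operator commutativity: does prune(quantize(W)) equal
# quantize(prune(W))? The operators below are simplified stand-ins.
import numpy as np

def prune(w, keep_ratio=0.5):
    """Zero out the smallest-magnitude weights."""
    threshold = np.quantile(np.abs(w), 1 - keep_ratio)
    return np.where(np.abs(w) >= threshold, w, 0.0)

def quantize(w, n_bits=4):
    """Uniformly quantize weights to 2**n_bits levels."""
    scale = np.abs(w).max() / (2 ** (n_bits - 1) - 1)
    return np.round(w / scale) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64))

a = quantize(prune(w))
b = prune(quantize(w))
print("operators commute on this matrix:", np.allclose(a, b))
```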

Few-Shot Non-Parametric Learning with Deep Latent Variable Model

no code implementations 23 Jun 2022 Zhiying Jiang, Yiqin Dai, Ji Xin, Ming Li, Jimmy Lin

Most real-world problems that machine learning algorithms are expected to solve involve 1) unknown data distributions, 2) little domain-specific knowledge, and 3) datasets with limited annotation.

Classification · Image Classification

Certified Error Control of Candidate Set Pruning for Two-Stage Relevance Ranking

1 code implementation 19 May 2022 Minghan Li, Xinyu Zhang, Ji Xin, Hongyang Zhang, Jimmy Lin

For example, on MS MARCO Passage v1, our method yields an average candidate set size of 27 out of 1,000, which increases the reranking speed by about 37 times, while MRR@10 stays above a pre-specified value of 0.38 with about 90% empirical coverage; the empirical baselines fail to provide such a guarantee.

Computational Efficiency · Information Retrieval +1
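In a two-stage pipeline, a cheap first-stage retriever produces candidates and an expensive reranker scores only those that survive pruning. A minimal sketch of threshold-based pruning follows; the calibration that certifies the error level is the paper's contribution and is not reproduced here, and the scorers below are placeholders.

```python
# Sketch of two-stage ranking with candidate set pruning: keep only first-stage
# candidates whose score clears a threshold, then rerank the survivors.
# The certified threshold calibration (the paper's contribution) is omitted;
# the retriever scores, reranker, and threshold below are placeholders.
from typing import Callable, List, Tuple

def two_stage_rank(
    query: str,
    candidates: List[Tuple[str, float]],          # (passage, first-stage score)
    rerank: Callable[[str, str], float],          # expensive second-stage scorer
    prune_threshold: float,
) -> List[Tuple[str, float]]:
    pruned = [(p, s) for p, s in candidates if s >= prune_threshold]
    reranked = [(p, rerank(query, p)) for p, _ in pruned]
    return sorted(reranked, key=lambda x: x[1], reverse=True)

# Toy usage: a fake reranker that prefers passages sharing words with the query.
def toy_rerank(query: str, passage: str) -> float:
    return float(len(set(query.split()) & set(passage.split())))

candidates = [("passage about cats", 3.2), ("passage about dogs", 1.1), ("cats and dogs", 2.5)]
print(two_stage_rank("cats", candidates, toy_rerank, prune_threshold=2.0))
```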

Zero-Shot Dense Retrieval with Momentum Adversarial Domain Invariant Representation

no code implementations 29 Sep 2021 Ji Xin, Chenyan Xiong, Ashwin Srinivasan, Ankita Sharma, Damien Jose, Paul N. Bennett

Dense retrieval (DR) methods conduct text retrieval by first encoding texts in the embedding space and then matching them by nearest neighbor search.

Representation Learning · Retrieval +1
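The encode-then-match pattern described above can be sketched directly: embed corpus and query into the same vector space and rank passages by inner product. The encoder below is a deterministic placeholder rather than a trained dual-encoder.

```python
# Dense retrieval sketch: encode corpus and query into vectors, then match by
# nearest-neighbor (maximum inner product) search. The "encoder" is a
# placeholder; a real system would use a trained dual-encoder model.
import numpy as np
import zlib

def toy_encode(texts, dim=128):
    """Deterministic stand-in for a learned text encoder: sum of per-token
    pseudo-random vectors seeded by a hash of the token."""
    out = np.zeros((len(texts), dim))
    for i, text in enumerate(texts):
        for token in text.lower().split():
            rng = np.random.default_rng(zlib.crc32(token.encode()))
            out[i] += rng.normal(size=dim)
    # L2-normalize so inner product equals cosine similarity.
    return out / (np.linalg.norm(out, axis=1, keepdims=True) + 1e-9)

corpus = ["dense retrieval encodes text into vectors",
          "bm25 is a sparse retrieval baseline",
          "nearest neighbor search matches query and passage vectors"]
index = toy_encode(corpus)

query_vec = toy_encode(["how does dense retrieval match vectors"])
scores = index @ query_vec.ravel()          # exact nearest-neighbor by inner product
print("ranking:", list(np.argsort(-scores)))
```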

The Art of Abstention: Selective Prediction and Error Regularization for Natural Language Processing

1 code implementation ACL 2021 Ji Xin, Raphael Tang, YaoLiang Yu, Jimmy Lin

To fill this void in the literature, we study selective prediction for NLP, comparing different models and confidence estimators.
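Selective prediction lets a classifier abstain when its confidence estimator falls below a threshold, trading coverage for accuracy. A minimal sketch using maximum softmax probability as the confidence estimator; the threshold and logits are illustrative values, not taken from the paper.

```python
# Selective prediction sketch: predict only when the confidence estimator
# (here, max softmax probability) clears a threshold; otherwise abstain.
# Threshold and logits are illustrative values, not from the paper.
import numpy as np

def softmax(logits):
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def selective_predict(logits, threshold=0.9):
    probs = softmax(logits)
    confidence = probs.max(axis=-1)
    labels = probs.argmax(axis=-1)
    # -1 marks an abstention.
    return np.where(confidence >= threshold, labels, -1), confidence

logits = np.array([[4.0, 0.5, 0.1],    # confident -> predict class 0
                   [1.0, 0.9, 0.8]])   # uncertain -> abstain
preds, conf = selective_predict(logits)
print(preds, conf.round(3))
```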

BERxiT: Early Exiting for BERT with Better Fine-Tuning and Extension to Regression

1 code implementation EACL 2021 Ji Xin, Raphael Tang, YaoLiang Yu, Jimmy Lin

The slow speed of BERT has motivated much research on accelerating its inference, and the early exiting idea has been proposed to make trade-offs between model quality and efficiency.

regression
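Early exiting attaches a lightweight classifier to intermediate layers and stops inference as soon as an intermediate prediction is confident enough, so easy inputs never reach the later layers. The sketch below uses prediction entropy as the exit criterion (as in DeeBERT); the layers and exit heads are placeholders, not the released code.

```python
# Early-exit inference sketch: run layers one at a time and return as soon as an
# intermediate classifier is confident enough. Layers and classifiers are
# placeholders standing in for transformer blocks and per-layer exit heads.
import numpy as np

def early_exit_inference(hidden, layers, exit_heads, entropy_threshold=0.3):
    """layers[i] and exit_heads[i] are callables; lower entropy = more confident."""
    for i, (layer, head) in enumerate(zip(layers, exit_heads)):
        hidden = layer(hidden)
        probs = head(hidden)
        entropy = -np.sum(probs * np.log(probs + 1e-9))
        if entropy < entropy_threshold:          # confident enough: exit early
            return probs.argmax(), i + 1         # prediction, layers actually used
    return probs.argmax(), len(layers)           # fell through to the final layer

# Toy usage with 4 identical dummy layers and 2-class exit heads.
rng = np.random.default_rng(0)
W = rng.normal(size=(8, 8))
layers = [lambda h: np.tanh(h @ W)] * 4

def head(h):
    logits = h[:2]
    e = np.exp(logits - logits.max())
    return e / e.sum()

exit_heads = [head] * 4
pred, used = early_exit_inference(rng.normal(size=8), layers, exit_heads)
print(f"predicted class {pred} after {used} of {len(layers)} layers")
```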

Inserting Information Bottlenecks for Attribution in Transformers

1 code implementation Findings of the Association for Computational Linguistics 2020 Zhiying Jiang, Raphael Tang, Ji Xin, Jimmy Lin

We show the effectiveness of our method in terms of attribution and the ability to provide insight into how information flows through layers.

Showing Your Work Doesn't Always Work

1 code implementation ACL 2020 Raphael Tang, Jaejun Lee, Ji Xin, Xinyu Liu, Yao-Liang Yu, Jimmy Lin

In natural language processing, a recently popular line of work explores how to best report the experimental results of neural networks.

DeeBERT: Dynamic Early Exiting for Accelerating BERT Inference

3 code implementations ACL 2020 Ji Xin, Raphael Tang, Jaejun Lee, Yao-Liang Yu, Jimmy Lin

Large-scale pre-trained language models such as BERT have brought significant improvements to NLP applications.

Exploiting Token and Path-based Representations of Code for Identifying Security-Relevant Commits

no code implementations 15 Nov 2019 Achyudh Ram, Ji Xin, Meiyappan Nagappan, Yao-Liang Yu, Rocío Cabrera Lozoya, Antonino Sabetta, Jimmy Lin

Public vulnerability databases such as CVE and NVD account for only 60% of security vulnerabilities present in open-source projects, and are known to suffer from inconsistent quality.

What Part of the Neural Network Does This? Understanding LSTMs by Measuring and Dissecting Neurons

no code implementations IJCNLP 2019 Ji Xin, Jimmy Lin, Yao-Liang Yu

Memory neurons of long short-term memory (LSTM) networks encode and process information in powerful yet mysterious ways.
