Search Results for author: Zhen Qin

Found 89 papers, 29 papers with code

Optimizing Compound Retrieval Systems

no code implementations • 16 Apr 2025 • Harrie Oosterhuis, Rolf Jagerman, Zhen Qin, Xuanhui Wang

We focus on the optimization of compound retrieval system design, which uniquely involves learning where to apply the component models and how to aggregate their predictions into a final ranking.

Information Retrieval Re-Ranking +1

Can Pre-training Indicators Reliably Predict Fine-tuning Outcomes of LLMs?

no code implementations • 16 Apr 2025 • Hansi Zeng, Kai Hui, Honglei Zhuang, Zhen Qin, Zhenrui Yue, Hamed Zamani, Dana Alon

While metrics available during pre-training, such as perplexity, correlate well with model performance in scaling-law studies, their predictive capacity at a fixed model size remains unclear, hindering effective model selection and development.

Model Selection

Rankers, Judges, and Assistants: Towards Understanding the Interplay of LLMs in Information Retrieval Evaluation

no code implementations • 24 Mar 2025 • Krisztian Balog, Donald Metzler, Zhen Qin

Large language models (LLMs) are increasingly integral to information retrieval (IR), powering ranking, evaluation, and AI-assisted content creation.

Information Retrieval

Vertical Federated Learning in Practice: The Good, the Bad, and the Ugly

no code implementations • 12 Feb 2025 • Zhaomin Wu, Zhen Qin, Junyi Hou, Haodong Zhao, Qinbin Li, Bingsheng He, Lixin Fan

Based on these observations, we outline key research directions aimed at bridging the gap between current VFL research and real-world applications.

Privacy Preserving Vertical Federated Learning

LLM Alignment as Retriever Optimization: An Information Retrieval Perspective

no code implementations • 6 Feb 2025 • Bowen Jin, Jinsung Yoon, Zhen Qin, Ziqi Wang, Wei Xiong, Yu Meng, Jiawei Han, Sercan O. Arik

In this work, we introduce a novel direct optimization approach for LLM alignment by drawing on established Information Retrieval (IR) principles.

Information Retrieval Misinformation +2

Tensor Product Attention Is All You Need

1 code implementation • 11 Jan 2025 • Yifan Zhang, Yifeng Liu, Huizhuo Yuan, Zhen Qin, Yang Yuan, Quanquan Gu, Andrew Chi-Chih Yao

Scaling language models to handle longer input sequences typically necessitates large key-value (KV) caches, resulting in substantial memory overhead during inference.

All Language Modeling +1

SeMi: When Imbalanced Semi-Supervised Learning Meets Mining Hard Examples

no code implementations • 10 Jan 2025 • Yin Wang, Zixuan Wang, Hao Lu, Zhen Qin, Hailiang Zhao, Guanjie Cheng, Ge Su, Li Kuang, Mengchu Zhou, Shuiguang Deng

This method distinguishes the entropy differences among the logits of hard and easy examples, thereby identifying hard examples and increasing the utility of unlabeled data, better addressing the imbalance problem in class-imbalanced semi-supervised learning (CISSL).

Deep Non-rigid Structure-from-Motion Revisited: Canonicalization and Sequence Modeling

no code implementations • 10 Dec 2024 • Hui Deng, Jiawei Shi, Zhen Qin, Yiran Zhong, Yuchao Dai

In this paper, we revisit deep NRSfM from two perspectives to address the limitations of current deep NRSfM methods: (1) canonicalization and (2) sequence modeling.

Scaling Image Tokenizers with Grouped Spherical Quantization

1 code implementation • 3 Dec 2024 • Jiangtao Wang, Zhen Qin, Yifan Zhang, Vincent Tao Hu, Björn Ommer, Rania Briq, Stefan Kesselheim

Vision tokenizers have gained a lot of traction due to their scalability and compactness; previous works depend on old-school GAN-based hyperparameters, biased comparisons, and a lack of comprehensive analysis of the scaling behaviours.

Quantization

Personalized Federated Fine-Tuning for LLMs via Data-Driven Heterogeneous Model Architectures

1 code implementation • 28 Nov 2024 • YiCheng Zhang, Zhen Qin, Zhaomin Wu, Shuiguang Deng

Furthermore, we develop a reverse selection-based expert assignment (RSEA) strategy, which enables data-driven model architecture adjustment during fine-tuning by allowing domain experts to select clients that best align with their knowledge domains.

Federated Learning

Robust Low-rank Tensor Train Recovery

no code implementations • 19 Oct 2024 • Zhen Qin, Zhihui Zhu

We first establish the $\ell_1/\ell_2$-restricted isometry property (RIP) for Gaussian measurement operators, demonstrating that the information in the TT format tensor can be preserved using a number of measurements that grows linearly with $N$.

Federated Data-Efficient Instruction Tuning for Large Language Models

no code implementations • 14 Oct 2024 • Zhen Qin, Zhaomin Wu, Bingsheng He, Shuiguang Deng

Instruction tuning helps improve pretrained large language models (LLMs) in terms of their responsiveness to human instructions, which benefits from diversified instruction data.

Federated Learning

Integrating Planning into Single-Turn Long-Form Text Generation

no code implementations • 8 Oct 2024 • Yi Liang, You Wu, Honglei Zhuang, Li Chen, Jiaming Shen, Yiling Jia, Zhen Qin, Sumit Sanghai, Xuanhui Wang, Carl Yang, Michael Bendersky

To overcome the scarcity of training data for these intermediate steps, we leverage LLMs to generate synthetic intermediate writing data such as outlines, key information and summaries from existing full articles.

Long-Form Text Generation

Inference Scaling for Long-Context Retrieval Augmented Generation

no code implementations • 6 Oct 2024 • Zhenrui Yue, Honglei Zhuang, Aijun Bai, Kai Hui, Rolf Jagerman, Hansi Zeng, Zhen Qin, Dong Wang, Xuanhui Wang, Michael Bendersky

Our observations reveal that increasing inference computation leads to nearly linear gains in RAG performance when optimally allocated, a relationship we describe as the inference scaling laws for RAG.

In-Context Learning RAG +1

Building Math Agents with Multi-Turn Iterative Preference Learning

no code implementations • 4 Sep 2024 • Wei Xiong, Chengshuai Shi, Jiaming Shen, Aviv Rosenberg, Zhen Qin, Daniele Calandriello, Misha Khalman, Rishabh Joshi, Bilal Piot, Mohammad Saleh, Chi Jin, Tong Zhang, Tianqi Liu

Recent studies have shown that large language models' (LLMs) mathematical problem-solving capabilities can be enhanced by integrating external tools, such as code interpreters, and employing multi-turn Chain-of-Thought (CoT) reasoning.

GSM8K Math +3

LAMPO: Large Language Models as Preference Machines for Few-shot Ordinal Classification

no code implementations • 6 Aug 2024 • Zhen Qin, Junru Wu, Jiaming Shen, Tianqi Liu, Xuanhui Wang

We introduce LAMPO, a novel paradigm that leverages Large Language Models (LLMs) for solving few-shot multi-class ordinal classification tasks.

Hate Speech Detection Ordinal Classification

Multilingual Fine-Grained News Headline Hallucination Detection

no code implementations • 22 Jul 2024 • Jiaming Shen, Tianqi Liu, Jialu Liu, Zhen Qin, Jay Pavagadhi, Simon Baumgartner, Michael Bendersky

In this study, we introduce the first multilingual, fine-grained news headline hallucination detection dataset that contains over 11 thousand pairs in 5 languages, each annotated with detailed hallucination types by experts.

Hallucination Headline Generation +1

Boosting Reward Model with Preference-Conditional Multi-Aspect Synthetic Data Generation

no code implementations • 22 Jul 2024 • Jiaming Shen, Ran Xu, Yennie Jun, Zhen Qin, Tianqi Liu, Carl Yang, Yi Liang, Simon Baumgartner, Michael Bendersky

Unlike traditional methods, which generate two responses before obtaining the preference label, RMBoost first generates one response and selects a preference label, and then generates the second, more (or less) preferred response conditioned on the pre-selected preference label and the first response.

Synthetic Data Generation

The Synergy between Data and Multi-Modal Large Language Models: A Survey from Co-Development Perspective

1 code implementation • 11 Jul 2024 • Zhen Qin, Daoyuan Chen, WenHao Zhang, Liuyi Yao, Yilun Huang, Bolin Ding, Yaliang Li, Shuiguang Deng

As LLMs and MLLMs rely on vast amounts of model parameters and data to achieve emergent capabilities, the importance of data is receiving increasingly widespread attention and recognition.

Reliable Confidence Intervals for Information Retrieval Evaluation Using Generative A.I.

no code implementations • 2 Jul 2024 • Harrie Oosterhuis, Rolf Jagerman, Zhen Qin, Xuanhui Wang, Michael Bendersky

In this work, we propose two methods based on prediction-powered inference and conformal risk control that utilize computer-generated relevance annotations to place reliable confidence intervals (CIs) around IR evaluation metrics.

Information Retrieval Retrieval

Scaling Laws for Linear Complexity Language Models

1 code implementation • 24 Jun 2024 • Xuyang Shen, Dong Li, Ruitao Leng, Zhen Qin, Weigao Sun, Yiran Zhong

In this study, we present the scaling laws for linear complexity language models to establish a foundation for their scalability.

Information Retrieval Retrieval

Computational and Statistical Guarantees for Tensor-on-Tensor Regression with Tensor Train Decomposition

no code implementations • 10 Jun 2024 • Zhen Qin, Zhihui Zhu

However, the exponential growth in tensor complexity poses challenges for storage and computation in ToT regression.

Computational Efficiency regression

You Only Scan Once: Efficient Multi-dimension Sequential Modeling with LightNet

no code implementations • 31 May 2024 • Zhen Qin, Yuxin Mao, Xuyang Shen, Dong Li, Jing Zhang, Yuchao Dai, Yiran Zhong

Linear attention mechanisms have gained prominence in causal language models due to their linear computational complexity and enhanced speed.

Image Classification Image Generation +2

TAVGBench: Benchmarking Text to Audible-Video Generation

1 code implementation • 22 Apr 2024 • Yuxin Mao, Xuyang Shen, Jing Zhang, Zhen Qin, Jinxing Zhou, Mochu Xiang, Yiran Zhong, Yuchao Dai

To support research in this field, we have developed a comprehensive Text to Audible-Video Generation Benchmark (TAVGBench), which contains over 1.7 million clips with a total duration of 11.8 thousand hours.

Benchmarking Contrastive Learning +1

HGRN2: Gated Linear RNNs with State Expansion

4 code implementations • 11 Apr 2024 • Zhen Qin, Songlin Yang, Weixuan Sun, Xuyang Shen, Dong Li, Weigao Sun, Yiran Zhong

Hierarchically gated linear RNN (HGRN) has demonstrated competitive training speed and performance in language modeling while offering efficient inference.

Image Classification Language Modeling +1

Linear Attention Sequence Parallelism

1 code implementation • 3 Apr 2024 • Weigao Sun, Zhen Qin, Dong Li, Xuyang Shen, Yu Qiao, Yiran Zhong

However, for linear sequence modeling methods like linear attention, existing SP approaches do not take advantage of their right-product-first feature, resulting in sub-optimal communication efficiency and usability.


LiPO: Listwise Preference Optimization through Learning-to-Rank

1 code implementation • 2 Feb 2024 • Tianqi Liu, Zhen Qin, Junru Wu, Jiaming Shen, Misha Khalman, Rishabh Joshi, Yao Zhao, Mohammad Saleh, Simon Baumgartner, Jialu Liu, Peter J. Liu, Xuanhui Wang

In this work, we formulate LM alignment as a listwise ranking problem and describe the LiPO framework, where the policy can potentially learn more effectively from a ranked list of plausible responses given the prompt.

Learning-To-Rank

CO2: Efficient Distributed Training with Full Communication-Computation Overlap

1 code implementation • 29 Jan 2024 • Weigao Sun, Zhen Qin, Weixuan Sun, Shidi Li, Dong Li, Xuyang Shen, Yu Qiao, Yiran Zhong

CO2 attains high scalability even on extensive multi-node clusters constrained by very limited communication bandwidth.

Lightning Attention-2: A Free Lunch for Handling Unlimited Sequence Lengths in Large Language Models

1 code implementation • 9 Jan 2024 • Zhen Qin, Weigao Sun, Dong Li, Xuyang Shen, Weixuan Sun, Yiran Zhong

With its ability to process tokens with linear computational complexity, linear attention can, in theory, handle sequences of unlimited length without sacrificing speed, i.e., maintaining a constant training speed for various sequence lengths with a fixed memory consumption.

Guaranteed Nonconvex Factorization Approach for Tensor Train Recovery

no code implementations • 5 Jan 2024 • Zhen Qin, Michael B. Wakin, Zhihui Zhu

We first delve into the TT factorization problem and establish the local linear convergence of Riemannian gradient descent (RGD).

Convergence Analysis for Learning Orthonormal Deep Linear Neural Networks

no code implementations • 24 Nov 2023 • Zhen Qin, Xuwei Tan, Zhihui Zhu

Enforcing orthonormal or isometric property for the weight matrices has been shown to enhance the training of deep neural networks by mitigating gradient exploding/vanishing and increasing the robustness of the learned networks.

Can Query Expansion Improve Generalization of Strong Cross-Encoder Rankers?

no code implementations • 15 Nov 2023 • Minghan Li, Honglei Zhuang, Kai Hui, Zhen Qin, Jimmy Lin, Rolf Jagerman, Xuanhui Wang, Michael Bendersky

In this paper, we re-examine this conclusion and raise the following question: Can query expansion improve generalization of strong cross-encoder rankers?

Instruction Following Language Modelling +2

Accelerating Toeplitz Neural Network with Constant-time Inference Complexity

1 code implementation • 15 Nov 2023 • Zhen Qin, Yiran Zhong

On the other hand, State Space Models (SSMs) achieve lower performance than TNNs in language modeling but offer the advantage of constant inference complexity.

Language Modeling Language Modelling +1

Predicting Text Preference Via Structured Comparative Reasoning

no code implementations • 14 Nov 2023 • Jing Nathan Yan, Tianqi Liu, Justin T Chiu, Jiaming Shen, Zhen Qin, Yue Yu, Yao Zhao, Charu Lakshmanan, Yair Kurzion, Alexander M. Rush, Jialu Liu, Michael Bendersky

Comparative reasoning plays a crucial role in text preference prediction; however, large language models (LLMs) often demonstrate inconsistencies in their reasoning.

Hallucination Retrieval

Explanation-aware Soft Ensemble Empowers Large Language Model In-context Learning

no code implementations • 13 Nov 2023 • Yue Yu, Jiaming Shen, Tianqi Liu, Zhen Qin, Jing Nathan Yan, Jialu Liu, Chao Zhang, Michael Bendersky

To fully unleash the power of explanations, we propose EASE, an Explanation-Aware Soft Ensemble framework to empower in-context learning with LLMs.

In-Context Learning Language Modeling +3

Convergence of Sign-based Random Reshuffling Algorithms for Nonconvex Optimization

no code implementations • 24 Oct 2023 • Zhen Qin, Zhishuai Liu, Pan Xu

Yet, existing analyses of signSGD rely on assuming that data are sampled with replacement in each iteration, contradicting the practical implementation where data are randomly reshuffled and sequentially fed into the algorithm.

PaRaDe: Passage Ranking using Demonstrations with Large Language Models

no code implementations • 22 Oct 2023 • Andrew Drozdov, Honglei Zhuang, Zhuyun Dai, Zhen Qin, Razieh Rahimi, Xuanhui Wang, Dana Alon, Mohit Iyyer, Andrew McCallum, Donald Metzler, Kai Hui

Recent studies show that large language models (LLMs) can be instructed to effectively perform zero-shot passage re-ranking, in which the results of a first stage retrieval method, such as BM25, are rated and reordered to improve relevance.

Passage Ranking Passage Re-Ranking +6

Beyond Yes and No: Improving Zero-Shot LLM Rankers via Scoring Fine-Grained Relevance Labels

no code implementations • 21 Oct 2023 • Honglei Zhuang, Zhen Qin, Kai Hui, Junru Wu, Le Yan, Xuanhui Wang, Michael Bendersky

We propose to incorporate fine-grained relevance labels into the prompt for LLM rankers, enabling them to better differentiate among documents with different levels of relevance to the query and thus derive a more accurate ranking.

Resisting Backdoor Attacks in Federated Learning via Bidirectional Elections and Individual Perspective

1 code implementation • 28 Sep 2023 • Zhen Qin, Feiyi Chen, Chen Zhi, Xueqiang Yan, Shuiguang Deng

Existing approaches defend against backdoor attacks in federated learning (FL) mainly through a) mitigating the impact of infected models, or b) excluding infected models.

Federated Learning

All-pairs Consistency Learning for Weakly Supervised Semantic Segmentation

1 code implementation • 8 Aug 2023 • Weixuan Sun, Yanhao Zhang, Zhen Qin, Zheyuan Liu, Lin Cheng, Fanyi Wang, Yiran Zhong, Nick Barnes

Given a pair of augmented views, our approach regularizes the activation intensities between the two views, while also ensuring that the affinity across regions within each view remains consistent.

All Object Localization +2

TransNormerLLM: A Faster and Better Large Language Model with Improved TransNormer

2 code implementations • 27 Jul 2023 • Zhen Qin, Dong Li, Weigao Sun, Weixuan Sun, Xuyang Shen, Xiaodong Han, Yunshen Wei, Baohong Lv, Xiao Luo, Yu Qiao, Yiran Zhong

TransNormerLLM evolves from the previous linear attention architecture TransNormer by making advanced modifications that include positional embedding, linear attention acceleration, gating mechanisms, tensor normalization, and inference acceleration and stabilization.

Language Modeling Language Modelling +1

Exploring Transformer Extrapolation

no code implementations • 19 Jul 2023 • Zhen Qin, Yiran Zhong, Hui Deng

While these methods perform well on a variety of corpora, the conditions for length extrapolation have yet to be investigated.

Language Modeling Language Modelling

Linearized Relative Positional Encoding

no code implementations • 18 Jul 2023 • Zhen Qin, Weixuan Sun, Kaiyue Lu, Hui Deng, Dongxu Li, Xiaodong Han, Yuchao Dai, Lingpeng Kong, Yiran Zhong

Meanwhile, it emphasizes a general paradigm for designing a broader class of relative positional encoding methods that are applicable to linear transformers.

Image Classification Language Modeling +3

Large Language Models are Effective Text Rankers with Pairwise Ranking Prompting

no code implementations • 30 Jun 2023 • Zhen Qin, Rolf Jagerman, Kai Hui, Honglei Zhuang, Junru Wu, Le Yan, Jiaming Shen, Tianqi Liu, Jialu Liu, Donald Metzler, Xuanhui Wang, Michael Bendersky

Ranking documents using Large Language Models (LLMs) by directly feeding the query and candidate documents into the prompt is an interesting and practical problem.

Learning to Rank when Grades Matter

no code implementations • 14 Jun 2023 • Le Yan, Zhen Qin, Gil Shamir, Dong Lin, Xuanhui Wang, Mike Bendersky

In this paper, we conduct a rigorous study of learning to rank with grades, where both ranking performance and grade prediction performance are important.

Learning-To-Rank Prediction

Query Expansion by Prompting Large Language Models

no code implementations • 5 May 2023 • Rolf Jagerman, Honglei Zhuang, Zhen Qin, Xuanhui Wang, Michael Bendersky

Query expansion is a widely used technique to improve the recall of search systems.

Towards Disentangling Relevance and Bias in Unbiased Learning to Rank

no code implementations • 28 Dec 2022 • Yunan Zhang, Le Yan, Zhen Qin, Honglei Zhuang, Jiaming Shen, Xuanhui Wang, Michael Bendersky, Marc Najork

We give both theoretical analysis and empirical results to show the negative effects on the relevance tower due to such a correlation.

Learning-To-Rank

Regression Compatible Listwise Objectives for Calibrated Ranking with Binary Relevance

no code implementations • 2 Nov 2022 • Aijun Bai, Rolf Jagerman, Zhen Qin, Le Yan, Pratyush Kar, Bing-Rong Lin, Xuanhui Wang, Michael Bendersky, Marc Najork

As Learning-to-Rank (LTR) approaches primarily seek to improve ranking quality, their output scores are not scale-calibrated by design.

Learning-To-Rank regression

Proportionate Recursive Maximum Correntropy Criterion Adaptive Filtering Algorithms and their Performance Analysis

no code implementations • 22 Oct 2022 • Zhen Qin, Jun Tao, Le Yang, Ming Jiang

Motivated by the success of our recently proposed proportionate recursive least squares (PRLS) algorithm for sparse system identification, we propose to introduce the proportionate updating (PU) mechanism into the RMCC, leading to two sparsity-aware RMCC algorithms: the proportionate recursive MCC (PRMCC) algorithm and the combinational PRMCC (CPRMCC) algorithm.

The Devil in Linear Transformer

1 code implementation • 19 Oct 2022 • Zhen Qin, Xiaodong Han, Weixuan Sun, Dongxu Li, Lingpeng Kong, Nick Barnes, Yiran Zhong

In this paper, we examine existing kernel-based linear transformers and identify two key issues that lead to such performance gaps: 1) unbounded gradients in the attention computation adversely impact the convergence of linear transformer models; 2) attention dilution which trivially distributes attention scores over long sequences while neglecting neighbouring structures.

Language Modeling Language Modelling +1

Linear Video Transformer with Feature Fixation

no code implementations • 15 Oct 2022 • Kaiyue Lu, Zexiang Liu, Jianyuan Wang, Weixuan Sun, Zhen Qin, Dong Li, Xuyang Shen, Hui Deng, Xiaodong Han, Yuchao Dai, Yiran Zhong

Therefore, we propose a feature fixation module to reweight the feature importance of the query and key before computing linear attention.

Feature Importance Video Classification

RankT5: Fine-Tuning T5 for Text Ranking with Ranking Losses

no code implementations • 12 Oct 2022 • Honglei Zhuang, Zhen Qin, Rolf Jagerman, Kai Hui, Ji Ma, Jing Lu, Jianmo Ni, Xuanhui Wang, Michael Bendersky

Recently, substantial progress has been made in text ranking based on pretrained language models such as BERT.

Decoder

A Validation Approach to Over-parameterized Matrix and Image Recovery

no code implementations • 21 Sep 2022 • Lijun Ding, Zhen Qin, Liwei Jiang, Jinxin Zhou, Zhihui Zhu

This paper studies the problem of recovering a low-rank matrix from several noisy random linear measurements.

Image Restoration

Neural Architecture Search on Efficient Transformers and Beyond

no code implementations • 28 Jul 2022 • Zexiang Liu, Dong Li, Kaiyue Lu, Zhen Qin, Weixuan Sun, Jiacheng Xu, Yiran Zhong

To address this issue, we propose a new framework to find optimal architectures for efficient Transformers with the neural architecture search (NAS) technique.

Computational Efficiency Image Classification +2

Error Analysis of Tensor-Train Cross Approximation

no code implementations • 9 Jul 2022 • Zhen Qin, Alexander Lidiak, Zhexuan Gong, Gongguo Tang, Michael B. Wakin, Zhihui Zhu

Tensor train decomposition is widely used in machine learning and quantum physics due to its concise representation of high-dimensional tensors, overcoming the curse of dimensionality.

Vicinity Vision Transformer

1 code implementation • 21 Jun 2022 • Weixuan Sun, Zhen Qin, Hui Deng, Jianyuan Wang, Yi Zhang, Kaihao Zhang, Nick Barnes, Stan Birchfield, Lingpeng Kong, Yiran Zhong

Based on this observation, we present a Vicinity Attention that introduces a locality bias to vision transformers with linear complexity.

Ranked #294 on Image Classification on ImageNet (Top 1 Accuracy metric)

Image Classification

cosFormer: Rethinking Softmax in Attention

3 code implementations • ICLR 2022 • Zhen Qin, Weixuan Sun, Hui Deng, Dongxu Li, Yunshen Wei, Baohong Lv, Junjie Yan, Lingpeng Kong, Yiran Zhong

As one of its core components, the softmax attention helps to capture long-range dependencies yet prohibits its scale-up due to the quadratic space and time complexity to the sequence length.

Ranked #6 on D4RL

D4RL Language Modeling +2

Transformer Memory as a Differentiable Search Index

1 code implementation • 14 Feb 2022 • Yi Tay, Vinh Q. Tran, Mostafa Dehghani, Jianmo Ni, Dara Bahri, Harsh Mehta, Zhen Qin, Kai Hui, Zhe Zhao, Jai Gupta, Tal Schuster, William W. Cohen, Donald Metzler

In this paper, we demonstrate that information retrieval can be accomplished with a single Transformer, in which all information about the corpus is encoded in the parameters of the model.

Information Retrieval Retrieval

Rank4Class: A Ranking Formulation for Multiclass Classification

no code implementations • 17 Dec 2021 • Nan Wang, Zhen Qin, Le Yan, Honglei Zhuang, Xuanhui Wang, Michael Bendersky, Marc Najork

Multiclass classification (MCC) is a fundamental machine learning problem of classifying each instance into one of a predefined set of classes.

Classification Image Classification +4

Improving Neural Ranking via Lossless Knowledge Distillation

no code implementations • 30 Sep 2021 • Zhen Qin, Le Yan, Yi Tay, Honglei Zhuang, Xuanhui Wang, Michael Bendersky, Marc Najork

We explore a novel perspective of knowledge distillation (KD) for learning to rank (LTR), and introduce Self-Distilled neural Rankers (SDR), where student rankers are parameterized identically to their teachers.

Knowledge Distillation Learning-To-Rank

Rank4Class: Examining Multiclass Classification through the Lens of Learning to Rank

no code implementations • 29 Sep 2021 • Nan Wang, Zhen Qin, Le Yan, Honglei Zhuang, Xuanhui Wang, Michael Bendersky, Marc Najork

We further demonstrate that the most popular MCC architecture in deep learning can be mathematically formulated as a LTR pipeline equivalently, with a specific set of choices in terms of ranking model architecture and loss function.

Image Classification Information Retrieval +4

Are Pretrained Convolutions Better than Pretrained Transformers?

1 code implementation • ACL 2021 • Yi Tay, Mostafa Dehghani, Jai Prakash Gupta, Vamsi Aribandi, Dara Bahri, Zhen Qin, Donald Metzler

In the context of language models, are convolutional models competitive with Transformers when pre-trained?

Are Pre-trained Convolutions Better than Pre-trained Transformers?

1 code implementation • 7 May 2021 • Yi Tay, Mostafa Dehghani, Jai Gupta, Dara Bahri, Vamsi Aribandi, Zhen Qin, Donald Metzler

In the context of language models, are convolutional models competitive with Transformers when pre-trained?

OmniNet: Omnidirectional Representations from Transformers

1 code implementation • 1 Mar 2021 • Yi Tay, Mostafa Dehghani, Vamsi Aribandi, Jai Gupta, Philip Pham, Zhen Qin, Dara Bahri, Da-Cheng Juan, Donald Metzler

In OmniNet, instead of maintaining a strictly horizontal receptive field, each token is allowed to attend to all tokens in the entire network.

de-en Few-Shot Learning +4

Neural Rankers are hitherto Outperformed by Gradient Boosted Decision Trees

no code implementations • ICLR 2021 • Zhen Qin, Le Yan, Honglei Zhuang, Yi Tay, Rama Kumar Pasumarthi, Xuanhui Wang, Michael Bendersky, Marc Najork

We first validate this concern by showing that most recent neural LTR models are, by a large margin, inferior to the best publicly available Gradient Boosted Decision Trees (GBDT) in terms of their reported ranking accuracy on benchmark datasets.

Learning-To-Rank

DeepKeyGen: A Deep Learning-based Stream Cipher Generator for Medical Image Encryption and Decryption

no code implementations • 21 Dec 2020 • Yi Ding, Fuyuan Tan, Zhen Qin, Mingsheng Cao, Kim-Kwang Raymond Choo, Zhiguang Qin

In this paper, a novel deep learning-based key generation network (DeepKeyGen) is proposed as a stream cipher generator to generate the private key, which can then be used for encrypting and decrypting of medical images.

Generative Adversarial Network

Do RNN and LSTM have Long Memory?

1 code implementation • ICML 2020 • Jingyu Zhao, Feiqing Huang, Jia Lv, Yanjie Duan, Zhen Qin, Guodong Li, Guangjian Tian

The LSTM network was proposed to overcome the difficulty in learning long-term dependencies, and has made significant advances in applications.

Non-Clicks Mean Irrelevant? Propensity Ratio Scoring As a Correction

no code implementations • 18 May 2020 • Nan Wang, Zhen Qin, Xuanhui Wang, Hongning Wang

Recent advances in unbiased learning to rank (LTR) count on Inverse Propensity Scoring (IPS) to eliminate bias in implicit feedback.

Learning-To-Rank

Multi-Task Learning for Email Search Ranking with Auxiliary Query Clustering

no code implementations • 15 Sep 2018 • Jiaming Shen, Maryam Karimzadehgan, Michael Bendersky, Zhen Qin, Donald Metzler

In this paper, we study how to obtain query type in an unsupervised fashion and how to incorporate this information into query-dependent ranking models.

Clustering Multi-Task Learning +1

An Online Learned Elementary Grouping Model for Multi-target Tracking

no code implementations • CVPR 2014 • Xiaojing Chen, Zhen Qin, Le An, Bir Bhanu

We introduce an online approach to learn possible elementary groups (groups that contain only two targets) for inferring high-level context that can be used to improve multi-target tracking in a data-association based framework.

Efficient Online Bootstrapping for Large Scale Learning

no code implementations • 18 Dec 2013 • Zhen Qin, Vaclav Petricek, Nikos Karampatziakis, Lihong Li, John Langford

Bootstrapping is a useful technique for estimating the uncertainty of a predictor, for example, confidence intervals for prediction.

Prediction
