Search Results for author: Leonid Boytsov

Found 19 papers, 11 papers with code

A Curious Case of Remarkable Resilience to Gradient Attacks via Fully Convolutional and Differentiable Front End with a Skip Connection

no code implementations • 26 Feb 2024 • Leonid Boytsov, Ameya Joshi, Filipe Condessa

By training them using a small learning rate for about one epoch, we obtained models that retained the accuracy of the backbone classifier while being unusually resistant to gradient attacks including APGD and FAB-T attacks from the AutoAttack package, which we attributed to gradient masking.

Adversarial Robustness

InPars-Light: Cost-Effective Unsupervised Training of Efficient Rankers

no code implementations • 8 Jan 2023 • Leonid Boytsov, Preksha Patel, Vivek Sourabh, Riddhi Nisar, Sayani Kundu, Ramya Ramanathan, Eric Nyberg

Unlike InPars, InPars-light uses 7x-100x smaller ranking models and only a freely available language model BLOOM, which -- as we found out -- produced more accurate rankers compared to a proprietary GPT-3 model.

Language Modelling Re-Ranking +1

Understanding Performance of Long-Document Ranking Models through Comprehensive Evaluation and Leaderboarding

3 code implementations • 4 Jul 2022 • Leonid Boytsov, David Akinpelu, Tianyi Lin, Fangwei Gao, Yutian Zhao, Jeffrey Huang, Eric Nyberg

Most other models had poor zero-shot performance (sometimes at a random baseline level) but outstripped MaxP by as much as 13-28% after finetuning.

Benchmarking Document Ranking

Smooth-Reduce: Leveraging Patches for Improved Certified Robustness

no code implementations • 12 May 2022 • Ameya Joshi, Minh Pham, Minsu Cho, Leonid Boytsov, Filipe Condessa, J. Zico Kolter, Chinmay Hegde

Randomized smoothing (RS) has been shown to be a fast, scalable technique for certifying the robustness of deep neural network classifiers.

The Impact of Cross-Lingual Adjustment of Contextual Word Representations on Zero-Shot Transfer

no code implementations • 13 Apr 2022 • Pavel Efimov, Leonid Boytsov, Elena Arslanova, Pavel Braslavski

Our study reproduced gains in NLI for four languages, showed improved NER, XSR, and cross-lingual QA results in three languages (though some cross-lingual QA gains were not statistically significant), while mono-lingual QA performance never improved and sometimes degraded.

Continual Learning Machine Reading Comprehension +4

Empirical robustification of pre-trained classifiers

no code implementations • ICML Workshop AML 2021 • Mohammad Sadegh Norouzzadeh, Wan-Yi Lin, Leonid Boytsov, Leslie Rice, Huan Zhang, Filipe Condessa, J Zico Kolter

Most pre-trained classifiers, though they may work extremely well on the domain they were trained upon, are not trained in a robust fashion, and therefore are sensitive to adversarial attacks.

Denoising Image Reconstruction +1

A Systematic Evaluation of Transfer Learning and Pseudo-labeling with BERT-based Ranking Models

1 code implementation • 4 Mar 2021 • Iurii Mokrii, Leonid Boytsov, Pavel Braslavski

Due to high annotation costs, making the best use of existing human-created training data is an important research direction.

Transfer Learning

Exploring Classic and Neural Lexical Translation Models for Information Retrieval: Interpretability, Effectiveness, and Efficiency Benefits

2 code implementations • 12 Feb 2021 • Leonid Boytsov, Zico Kolter

We study the utility of the lexical translation model (IBM Model 1) for English text retrieval, in particular, its neural variants that are trained end-to-end.

Document Ranking Information Retrieval +3

Traditional IR rivals neural models on the MS MARCO Document Ranking Leaderboard

2 code implementations • 15 Dec 2020 • Leonid Boytsov

This short document describes a traditional IR system that achieved MRR@100 equal to 0.298 on the MS MARCO Document Ranking leaderboard (on 2020-12-06).

Document Ranking Re-Ranking
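
For readers unfamiliar with the metric quoted in the entry above, the following is a minimal sketch of how MRR@k is commonly computed. It is not the leaderboard's official evaluation script. The function names and the dictionary layout of `runs` and `qrels` are illustrative assumptions.

```python
from typing import Dict, Sequence, Set


def reciprocal_rank_at_k(ranked_ids: Sequence[str],
                         relevant_ids: Set[str],
                         k: int = 100) -> float:
    """Return 1/rank of the first relevant document in the top k, else 0."""
    for rank, doc_id in enumerate(ranked_ids[:k], start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0


def mrr_at_k(runs: Dict[str, Sequence[str]],
             qrels: Dict[str, Set[str]],
             k: int = 100) -> float:
    """Average the reciprocal rank over all judged queries.

    runs  : query id -> ranked list of document ids (hypothetical layout)
    qrels : query id -> set of relevant document ids (hypothetical layout)
    A query with no returned documents contributes 0.
    """
    if not qrels:
        return 0.0
    total = sum(reciprocal_rank_at_k(runs.get(qid, []), rel, k)
                for qid, rel in qrels.items())
    return total / len(qrels)
```

Under this definition, a value of 0.298 corresponds to the first relevant document appearing, on average, at roughly rank three or four.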

Flexible retrieval with NMSLIB and FlexNeuART

2 code implementations • EMNLP (NLPOSS) 2020 • Leonid Boytsov, Eric Nyberg

Our objective is to introduce to the NLP community an existing k-NN search library NMSLIB, a new retrieval toolkit FlexNeuART, as well as their integration capabilities.

Re-Ranking Retrieval

SberQuAD -- Russian Reading Comprehension Dataset: Description and Analysis

no code implementations • 20 Dec 2019 • Pavel Efimov, Andrey Chertok, Leonid Boytsov, Pavel Braslavski

SberQuAD -- a large-scale analog of Stanford SQuAD in the Russian language -- is a valuable resource that has not been properly presented to the scientific community.

Question Answering Reading Comprehension

Pruning Algorithms for Low-Dimensional Non-metric k-NN Search: A Case Study

no code implementations • 8 Oct 2019 • Leonid Boytsov, Eric Nyberg

We consider two known data-driven approaches to extend these rules to non-metric spaces: TriGen and a piece-wise linear approximation of the pruning rule.

Retrieval

Accurate and Fast Retrieval for Complex Non-metric Data via Neighborhood Graphs

no code implementations • 8 Oct 2019 • Leonid Boytsov, Eric Nyberg

We demonstrate that a graph-based search algorithm -- relying on the construction of an approximate neighborhood graph -- can directly work with challenging non-metric and/or non-symmetric distances without resorting to metric-space mapping and/or distance symmetrization, which, in turn, lead to substantial performance degradation.

graph construction Retrieval

Non-Metric Space Library Manual

2 code implementations • 22 Aug 2015 • Bilegsaikhan Naidan, Leonid Boytsov, Yury Malkov, David Novak

This document covers a library for fast similarity (k-NN) search.

Permutation Search Methods are Efficient, Yet Faster Search is Possible

1 code implementation • 10 Jun 2015 • Bilegsaikhan Naidan, Leonid Boytsov, Eric Nyberg

The underpinning assumption is that, for both metric and non-metric spaces, the distance between permutations is a good proxy for the distance between original points.

Retrieval
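
The assumption stated in the entry above can be illustrated with a small sketch. Each point is represented by the ranks of a few pivots ordered by distance to it, and two such permutations are compared with the Spearman footrule. The pivot choice, the Euclidean distance, and all function names here are illustrative assumptions, not the paper's code or the NMSLIB API.

```python
import math
from typing import Callable, List, Sequence

Point = Sequence[float]


def pivot_permutation(point: Point,
                      pivots: Sequence[Point],
                      dist: Callable[[Point, Point], float] = math.dist
                      ) -> List[int]:
    """Return ranks[i] = position of pivot i when pivots are sorted
    by increasing distance to `point` (0 = closest pivot)."""
    order = sorted(range(len(pivots)), key=lambda i: dist(point, pivots[i]))
    ranks = [0] * len(pivots)
    for rank, pivot_idx in enumerate(order):
        ranks[pivot_idx] = rank
    return ranks


def footrule(perm_a: Sequence[int], perm_b: Sequence[int]) -> int:
    """Spearman footrule: sum of absolute rank differences."""
    return sum(abs(a - b) for a, b in zip(perm_a, perm_b))


# Hypothetical pivots and points, chosen only for illustration.
pivots = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
near_a = pivot_permutation((1.0, 2.0), pivots)
near_b = pivot_permutation((2.0, 3.0), pivots)
far_c = pivot_permutation((9.0, 1.0), pivots)
```

In this toy setup the two nearby points receive identical permutations, while the distant point receives a different one. That is the sense in which the permutation distance serves as a proxy for the original distance.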

SIMD Compression and the Intersection of Sorted Integers

4 code implementations • 24 Jan 2014 • Daniel Lemire, Leonid Boytsov, Nathan Kurz

We can use the SIMD instructions available in common processors to boost the speed of integer compression schemes.

Information Retrieval Databases Performance

Decoding billions of integers per second through vectorization

2 code implementations • 10 Sep 2012 • Daniel Lemire, Leonid Boytsov

In many important applications -- such as search engines and relational database systems -- data is stored in the form of arrays of integers.
