no code implementations • 26 Feb 2024 • Leonid Boytsov, Ameya Joshi, Filipe Condessa
By training them with a small learning rate for about one epoch, we obtained models that retained the accuracy of the backbone classifier while being unusually resistant to gradient attacks, including the APGD and FAB-T attacks from the AutoAttack package; we attributed this resistance to gradient masking.
no code implementations • 8 Jan 2023 • Leonid Boytsov, Preksha Patel, Vivek Sourabh, Riddhi Nisar, Sayani Kundu, Ramya Ramanathan, Eric Nyberg
Unlike InPars, InPars-light uses 7x-100x smaller ranking models and only the freely available BLOOM language model, which -- as we found out -- produced more accurate rankers than a proprietary GPT-3 model.
3 code implementations • 4 Jul 2022 • Leonid Boytsov, David Akinpelu, Tianyi Lin, Fangwei Gao, Yutian Zhao, Jeffrey Huang, Eric Nyberg
Most other models had poor zero-shot performance (sometimes at a random-baseline level) but outstripped MaxP by as much as 13-28% after finetuning.
no code implementations • 12 May 2022 • Ameya Joshi, Minh Pham, Minsu Cho, Leonid Boytsov, Filipe Condessa, J. Zico Kolter, Chinmay Hegde
Randomized smoothing (RS) has been shown to be a fast, scalable technique for certifying the robustness of deep neural network classifiers.
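As a hedged illustration of the idea behind randomized smoothing (not the certification procedure from this paper), the sketch below builds a smoothed classifier by majority vote over Gaussian-perturbed copies of the input; the base classifier, noise level, and sample count are all toy assumptions.

```python
import random

def smoothed_predict(base_classifier, x, sigma=0.25, n_samples=1000, rng=None):
    """Majority-vote prediction of a randomized-smoothing classifier:
    classify n_samples Gaussian-perturbed copies of x and return the
    most frequent class label."""
    rng = rng or random.Random(0)  # fixed seed for reproducibility
    votes = {}
    for _ in range(n_samples):
        noisy = [xi + rng.gauss(0.0, sigma) for xi in x]
        label = base_classifier(noisy)
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)

# Toy linear "classifier": sign of the first coordinate (an assumption,
# standing in for a deep network).
clf = lambda v: int(v[0] > 0)
print(smoothed_predict(clf, [1.0, -2.0]))  # point far from the boundary -> 1
```

Certified robustness in the RS literature comes from bounding how much this majority vote can change under bounded input perturbations; the sketch shows only the prediction side.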
no code implementations • 13 Apr 2022 • Pavel Efimov, Leonid Boytsov, Elena Arslanova, Pavel Braslavski
Our study reproduced gains in NLI for four languages, showed improved NER, XSR, and cross-lingual QA results in three languages (though some cross-lingual QA gains were not statistically significant), while mono-lingual QA performance never improved and sometimes degraded.
no code implementations • ICML Workshop AML 2021 • Mohammad Sadegh Norouzzadeh, Wan-Yi Lin, Leonid Boytsov, Leslie Rice, Huan Zhang, Filipe Condessa, J. Zico Kolter
Most pre-trained classifiers, though they may work extremely well on the domain they were trained on, are not trained robustly and are therefore sensitive to adversarial attacks.
1 code implementation • 4 Mar 2021 • Iurii Mokrii, Leonid Boytsov, Pavel Braslavski
Due to high annotation costs, making the best use of existing human-created training data is an important research direction.
2 code implementations • 12 Feb 2021 • Leonid Boytsov, Zico Kolter
We study the utility of the lexical translation model (IBM Model 1) for English text retrieval, in particular, its neural variants that are trained end-to-end.
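For context, the classical (non-neural) IBM Model 1 referenced here is trained with EM over parallel text; the minimal sketch below estimates translation probabilities t(tgt | src) from toy sentence pairs. The parallel data and iteration count are assumptions for illustration, not the paper's setup.

```python
from collections import defaultdict

def model1_em(pairs, iterations=10):
    """Estimate IBM Model 1 translation probabilities t(tgt | src)
    from (source_tokens, target_tokens) sentence pairs via EM."""
    tgt_vocab = {w for _, t in pairs for w in t}
    uniform = 1.0 / len(tgt_vocab)
    t = defaultdict(lambda: uniform)  # t[(tgt_word, src_word)]
    for _ in range(iterations):
        count = defaultdict(float)  # expected co-occurrence counts
        total = defaultdict(float)  # normalizers per source word
        for src, tgt in pairs:
            for tw in tgt:
                norm = sum(t[(tw, sw)] for sw in src)
                for sw in src:
                    frac = t[(tw, sw)] / norm  # soft alignment weight
                    count[(tw, sw)] += frac
                    total[sw] += frac
        t = defaultdict(lambda: uniform,
                        {k: v / total[k[1]] for k, v in count.items()})
    return t

# Toy parallel corpus (hypothetical data for illustration).
pairs = [(["the", "dog"], ["le", "chien"]),
         (["the", "cat"], ["le", "chat"])]
t = model1_em(pairs)
# EM concentrates mass on co-occurring pairs, e.g. t("le" | "the") grows.
```

The neural variants studied in the paper replace these table lookups with learned embeddings but keep the same translation-probability structure.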
2 code implementations • 15 Dec 2020 • Leonid Boytsov
This short document describes a traditional IR system that achieved MRR@100 equal to 0.298 on the MS MARCO Document Ranking leaderboard (on 2020-12-06).
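For readers unfamiliar with the leaderboard metric, MRR@100 averages, over queries, the reciprocal rank of the first relevant document within the top 100 results. A minimal sketch with hypothetical run and qrel data:

```python
def mrr_at_k(ranked_results, relevant, k=100):
    """Mean reciprocal rank truncated at depth k: for each query, take
    1/rank of the first relevant document in the top k (0 if none),
    then average over all queries."""
    total = 0.0
    for qid, docs in ranked_results.items():
        rel = relevant.get(qid, set())
        for rank, doc in enumerate(docs[:k], start=1):
            if doc in rel:
                total += 1.0 / rank
                break
    return total / len(ranked_results)

# Hypothetical system output and relevance judgments.
runs = {"q1": ["d3", "d1", "d7"],   # first relevant document at rank 2
        "q2": ["d9", "d2", "d4"]}   # first relevant document at rank 1
qrels = {"q1": {"d1"}, "q2": {"d9"}}
print(mrr_at_k(runs, qrels))  # (1/2 + 1/1) / 2 = 0.75
```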
2 code implementations • EMNLP (NLPOSS) 2020 • Leonid Boytsov, Eric Nyberg
Our objective is to introduce to the NLP community an existing k-NN search library NMSLIB, a new retrieval toolkit FlexNeuART, as well as their integration capabilities.
no code implementations • 20 Dec 2019 • Pavel Efimov, Andrey Chertok, Leonid Boytsov, Pavel Braslavski
SberQuAD -- a large-scale, Russian-language analog of Stanford SQuAD -- is a valuable resource that has not been properly presented to the scientific community.
Ranked #1 on Question Answering on SberQuAD
no code implementations • 8 Oct 2019 • Leonid Boytsov, Eric Nyberg
We consider two known data-driven approaches to extend these rules to non-metric spaces: TriGen and a piece-wise linear approximation of the pruning rule.
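The metric-space pruning rule that TriGen and the piecewise-linear approximation generalize is the classic triangle-inequality bound; a minimal sketch of that baseline rule (not of TriGen itself) follows, with toy distance values as assumptions.

```python
def can_prune(d_qp, a, b, radius):
    """Classic metric-space pruning with a pivot p: if every point x in a
    partition satisfies a <= d(p, x) <= b, the triangle inequality gives
    the lower bound  max(d(q,p) - b, a - d(q,p), 0) <= d(q, x),
    so the whole partition can be skipped when the bound exceeds the
    search radius."""
    lower_bound = max(d_qp - b, a - d_qp, 0.0)
    return lower_bound > radius

# Query far from a partition whose points lie within distance [0, 2] of p:
print(can_prune(5.0, 0.0, 2.0, 1.0))  # True  -> partition skipped
# Partition that may still contain answers:
print(can_prune(5.0, 4.0, 6.0, 1.0))  # False -> must be visited
```

In non-metric spaces the triangle inequality no longer holds, which is why the paper resorts to data-driven replacements for this bound.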
no code implementations • 8 Oct 2019 • Leonid Boytsov, Eric Nyberg
We demonstrate that a graph-based search algorithm -- relying on the construction of an approximate neighborhood graph -- can work directly with challenging non-metric and/or non-symmetric distances without resorting to metric-space mapping and/or distance symmetrization, both of which lead to substantial performance degradation.
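The core search primitive on such a graph is greedy descent toward the query; a hedged toy sketch (1-D points and a hand-built neighbor graph, both assumptions for illustration) shows why it needs no metric properties -- only a distance function to minimize.

```python
def greedy_graph_search(graph, dist, query, start):
    """Greedy descent on a neighborhood graph: repeatedly move to the
    neighbor closest to the query; stop at a local minimum.  Works with
    any distance function, metric or not."""
    current = start
    current_d = dist(query, current)
    while True:
        best, best_d = current, current_d
        for nb in graph[current]:
            d = dist(query, nb)
            if d < best_d:
                best, best_d = nb, d
        if best == current:          # no neighbor improves: local minimum
            return current, current_d
        current, current_d = best, best_d

# Toy 1-D dataset: each point linked to neighbors within distance 1.
points = [0.0, 1.0, 2.0, 3.0, 4.0]
graph = {p: [x for x in points if 0 < abs(x - p) <= 1.0] for p in points}
dist = lambda a, b: abs(a - b)
result = greedy_graph_search(graph, dist, 2.2, 0.0)  # finds 2.0
```

Real implementations (e.g. SW-graph/HNSW-style indices) add multi-entry restarts and priority queues, but the descent step is the same.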
2 code implementations • 22 Aug 2015 • Bilegsaikhan Naidan, Leonid Boytsov, Yury Malkov, David Novak
This document covers a library for fast similarity (k-NN) search.
1 code implementation • 10 Jun 2015 • Bilegsaikhan Naidan, Leonid Boytsov, Eric Nyberg
The underpinning assumption is that, for both metric and non-metric spaces, the distance between permutations is a good proxy for the distance between original points.
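A minimal sketch of that proxy idea (with toy 1-D points and pivots as assumptions): rank the pivots by distance to each point, then compare the resulting permutations with the Spearman footrule.

```python
def pivot_permutation(x, pivots, dist):
    """Rank pivot indices by their distance to x; this permutation is
    x's compact proxy representation."""
    return sorted(range(len(pivots)), key=lambda i: dist(x, pivots[i]))

def footrule(perm_a, perm_b):
    """Spearman footrule: sum of |position differences| of each pivot
    across the two permutations; small values suggest nearby points."""
    pos_a = {p: i for i, p in enumerate(perm_a)}
    pos_b = {p: i for i, p in enumerate(perm_b)}
    return sum(abs(pos_a[p] - pos_b[p]) for p in pos_a)

# Toy 1-D space (an assumption for illustration).
dist = lambda a, b: abs(a - b)
pivots = [0.0, 5.0, 10.0]
q, close, far = 1.5, 1.0, 9.0
p_q = pivot_permutation(q, pivots, dist)
p_close = pivot_permutation(close, pivots, dist)
p_far = pivot_permutation(far, pivots, dist)
# The point nearer to q also has the more similar pivot permutation.
```

The paper's point is precisely when this proxy assumption holds or breaks, especially in non-metric spaces.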
4 code implementations • 24 Jan 2014 • Daniel Lemire, Leonid Boytsov, Nathan Kurz
We can use the SIMD instructions available in common processors to boost the speed of integer compression schemes.
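The scheme that SIMD accelerates is binary packing: storing each integer in a fixed small number of bits instead of a full 32-bit word. A scalar, pure-Python sketch of the packing itself (the vectorization is the paper's contribution and is not shown):

```python
def bitpack(values, bits):
    """Pack each integer into `bits` bits inside one big Python int --
    a scalar stand-in for the word-aligned layout SIMD operates on."""
    packed = 0
    for i, v in enumerate(values):
        assert 0 <= v < (1 << bits), "value does not fit in the bit budget"
        packed |= v << (i * bits)
    return packed

def bitunpack(packed, bits, count):
    """Recover `count` integers of width `bits` from the packed word."""
    mask = (1 << bits) - 1
    return [(packed >> (i * bits)) & mask for i in range(count)]

vals = [3, 7, 1, 6, 0, 5]
packed = bitpack(vals, 3)  # 6 values in 18 bits instead of 6 * 32 bits
assert bitunpack(packed, 3, len(vals)) == vals  # lossless round trip
```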
Information Retrieval • Databases • Performance
1 code implementation • NeurIPS 2013 • Leonid Boytsov, Bilegsaikhan Naidan
Our focus is on approximate nearest neighbor retrieval in metric and non-metric spaces.
2 code implementations • 10 Sep 2012 • Daniel Lemire, Leonid Boytsov
In many important applications -- such as search engines and relational database systems -- data is stored in the form of arrays of integers.
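Such integer arrays (e.g. sorted posting lists) are typically delta-coded before compression, since the small gaps between successive values need far fewer bits than the raw values; a minimal sketch with hypothetical posting-list data:

```python
def delta_encode(sorted_ints):
    """Store a sorted integer array as its first value followed by the
    successive gaps; the small gaps compress much better than raw values."""
    return [sorted_ints[0]] + [b - a for a, b in zip(sorted_ints, sorted_ints[1:])]

def delta_decode(deltas):
    """Invert delta coding with a running prefix sum."""
    out, acc = [], 0
    for d in deltas:
        acc += d
        out.append(acc)
    return out

postings = [1000, 1003, 1010, 1025, 1026]  # hypothetical posting list
gaps = delta_encode(postings)              # [1000, 3, 7, 15, 1]
assert delta_decode(gaps) == postings      # lossless round trip
```

The gaps would then be fed to a bit-packing or byte-oriented codec; the paper's contribution is making both steps fast.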