1 code implementation • 21 Oct 2024 • Zhuoming Chen, Ranajoy Sadhukhan, Zihao Ye, Yang Zhou, Jianyu Zhang, Niklas Nolte, Yuandong Tian, Matthijs Douze, Leon Bottou, Zhihao Jia, Beidi Chen
Large language models (LLMs) with long context windows have gained significant attention.
no code implementations • ICCV 2023 • Vivien Cabannes, Leon Bottou, Yann LeCun, Randall Balestriero
Third, it provides a proper active learning framework yielding low-cost solutions to annotate datasets, arguably bridging the gap between theory and practice of active learning that is based on simple-to-answer-by-non-experts queries of semantic relationships between inputs.
no code implementations • 7 Apr 2022 • Randall Balestriero, Leon Bottou, Yann LeCun
The optimal amount of data augmentation (DA) or weight decay found from cross-validation leads to disastrous model performance on some classes; e.g., on ImageNet with a ResNet50, the "barn spider" classification test accuracy falls from $68\%$ to $46\%$ merely by introducing random crop DA during training.
no code implementations • 29 Sep 2021 • Maxwell Goldstein, Leon Bottou, Rob Fergus
Contemporary ranking systems that are based on win/loss history, such as Elo or TrueSkill, represent each player using a scalar estimate of ability (plus variance, in the latter case).
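As a rough illustration of the scalar-rating idea, the following is a minimal sketch of the textbook Elo update, not the method proposed in the paper. The function names, the K-factor of 32, and the 400-point scale are conventional choices assumed here for illustration.

```python
def elo_expected(rating_a, rating_b):
    """Expected score of player A against player B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def elo_update(rating_a, rating_b, score_a, k=32.0):
    """Return updated (rating_a, rating_b) after one game.

    score_a is 1.0 if A won, 0.0 if A lost, 0.5 for a draw.
    k is an assumed K-factor controlling the update size.
    """
    expected_a = elo_expected(rating_a, rating_b)
    delta = k * (score_a - expected_a)
    # Each player is summarized by a single scalar; the update is zero-sum.
    return rating_a + delta, rating_b - delta


# Two equally rated players; A wins.
new_a, new_b = elo_update(1500.0, 1500.0, score_a=1.0)
```

Because each player is reduced to one number, this kind of model cannot represent non-transitive matchups, which is the limitation the sentence above alludes to.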
no code implementations • 17 Jun 2021 • Alexander Peysakhovich, Anna Klimovskaia Susmel, Leon Bottou
Dot product embeddings take a graph and construct vectors for nodes such that dot products between two vectors give the strength of the edge.
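A minimal sketch of what this means in practice, assuming node vectors are stored as rows of a NumPy array. The helper name `edge_strengths` and the example values are hypothetical; how the vectors are learned is not shown.

```python
import numpy as np


def edge_strengths(embeddings):
    """Return the matrix of pairwise dot products between node vectors.

    Entry (i, j) is the modeled strength of the edge between nodes i and j.
    """
    return embeddings @ embeddings.T


# Three nodes embedded in two dimensions (illustrative values only).
node_vectors = np.array([
    [1.0, 0.0],
    [1.0, 1.0],
    [0.0, 2.0],
])
strengths = edge_strengths(node_vectors)
```

Here nodes 0 and 2 have orthogonal vectors, so the modeled edge between them has strength zero, while nodes 1 and 2 share a direction and get a positive strength.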
2 code implementations • 22 Feb 2021 • Benjamin Aubin, Agnieszka Słowik, Martin Arjovsky, Leon Bottou, David Lopez-Paz
There is an increasing interest in algorithms to learn invariant correlations across training environments.
no code implementations • ICLR 2020 • Aaron Defazio, Leon Bottou
In this work, we describe a set of rules for the design and initialization of well-conditioned neural networks, guided by the goal of naturally balancing the diagonal blocks of the Hessian at the start of training.
no code implementations • ICLR 2019 • Utku Evci, Nicolas Le Roux, Pablo Castro, Leon Bottou
Finally, we show that the units selected by the best performing scoring functions are somewhat consistent over the course of training, implying the dead parts of the network appear during the stages of training.
1 code implementation • 5 Jun 2018 • Rachel Ward, Xiaoxia Wu, Leon Bottou
Adaptive gradient methods such as AdaGrad and its variants update the stepsize in stochastic gradient descent on the fly according to the gradients received along the way; such methods have gained widespread use in large-scale optimization for their ability to converge robustly, without the need to fine-tune the stepsize schedule.
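The on-the-fly stepsize update can be sketched as follows. This is a minimal example of one scalar-stepsize AdaGrad variant, in which a single accumulator of squared gradient norms sets the stepsize. It is an assumption that this variant matches the one analyzed in the paper. The function name, default values, and the quadratic test objective are illustrative, and the gradient is deterministic here for simplicity.

```python
import numpy as np


def adagrad_norm(grad_fn, x0, eta=1.0, b0=0.1, steps=200):
    """Gradient descent with a single adaptive stepsize eta / b_t.

    b_t^2 accumulates the squared norms of all gradients seen so far,
    so the stepsize shrinks automatically without a hand-tuned schedule.
    """
    x = np.asarray(x0, dtype=float).copy()
    b_sq = b0 ** 2
    for _ in range(steps):
        g = grad_fn(x)
        b_sq += float(g @ g)            # accumulate squared gradient norm
        x -= (eta / np.sqrt(b_sq)) * g  # step with the adapted stepsize
    return x


def quadratic_grad(x):
    """Gradient of f(x) = 0.5 * ||x||^2, used only as a toy objective."""
    return x


x_final = adagrad_norm(quadratic_grad, x0=[3.0, -4.0])
```

On this toy objective the iterate approaches the minimizer at the origin even though `eta` and `b0` were not tuned to the problem.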
no code implementations • 21 Dec 2017 • Leon Bottou, Martin Arjovsky, David Lopez-Paz, Maxime Oquab
Learning algorithms for implicit generative models can optimize a variety of criteria that measure how the data distribution differs from the implicit model distribution, including the Wasserstein distance, the Energy distance, and the Maximum Mean Discrepancy criterion.
no code implementations • ICLR 2018 • Levent Sagun, Utku Evci, V. Ugur Guney, Yann Dauphin, Leon Bottou
In particular, we present a case that links the two observations: small- and large-batch gradient descent appear to converge to different basins of attraction, but we show that they are in fact connected through their flat region and so belong to the same basin.
no code implementations • 22 Nov 2016 • Levent Sagun, Leon Bottou, Yann LeCun
We look at the eigenvalues of the Hessian of a loss function before and after training.
no code implementations • CVPR 2015 • Maxime Oquab, Leon Bottou, Ivan Laptev, Josef Sivic
Successful visual object recognition methods typically rely on training datasets containing lots of richly annotated images.
no code implementations • 2 Oct 2014 • Alekh Agarwal, Leon Bottou
This paper presents a lower bound for optimizing a finite sum of $n$ functions, where each function is $L$-smooth and the sum is $\mu$-strongly convex.
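For reference, the setting can be written out using the standard definitions of smoothness and strong convexity. The symbols $F$, $f_i$, $x$, and $y$ are notation assumed here, and the objective is written as a plain sum to follow the wording above; some treatments normalize by $1/n$ instead.

$$
F(x) = \sum_{i=1}^{n} f_i(x), \qquad
\|\nabla f_i(x) - \nabla f_i(y)\| \le L \,\|x - y\| \quad \text{for all } i, x, y,
$$

$$
F(y) \ge F(x) + \langle \nabla F(x),\, y - x \rangle + \frac{\mu}{2}\,\|y - x\|^2 \quad \text{for all } x, y.
$$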
no code implementations • 16 Sep 2014 • Patrice Simard, David Chickering, Aparna Lakshmiratan, Denis Charles, Leon Bottou, Carlos Garcia Jurado Suarez, David Grangier, Saleema Amershi, Johan Verwey, Jina Suh
Based on the machine's output, the teacher can revise the definition of the task or make it more precise.
1 code implementation • CVPR 2014 • Maxime Oquab, Leon Bottou, Ivan Laptev, Josef Sivic
We show that despite differences in image statistics and tasks in the two datasets, the transferred representation leads to significantly improved results for object and action classification, outperforming the current state of the art on Pascal VOC 2007 and 2012 datasets.
no code implementations • 4 Nov 2013 • Dhruv Mahajan, S. Sathiya Keerthi, S. Sundararajan, Leon Bottou
The method has strong convergence properties.
no code implementations • 31 Oct 2013 • Dhruv Mahajan, Nikunj Agrawal, S. Sathiya Keerthi, S. Sundararajan, Leon Bottou
In this paper we give a novel approach to the distributed training of linear classifiers (involving smooth losses and L2 regularization) that is designed to reduce the total communication costs.
no code implementations • 30 Oct 2013 • Alekh Agarwal, Leon Bottou, Miroslav Dudik, John Langford
We leverage the same observation to build a generic strategy for parallelizing learning algorithms.
1 code implementation • 2 Mar 2011 • Ronan Collobert, Jason Weston, Leon Bottou, Michael Karlen, Koray Kavukcuoglu, Pavel Kuksa
We propose a unified neural network architecture and learning algorithm that can be applied to various natural language processing tasks, including part-of-speech tagging, chunking, named entity recognition, and semantic role labeling.
no code implementations • 9 Feb 2011 • Leon Bottou
This observation suggests a conceptual continuity between algebraically rich inference systems, such as logical or probabilistic inference, and simple manipulations, such as the mere concatenation of trainable learning systems.