1 code implementation • 17 Oct 2024 • Patrik Okanovic, Andreas Kirsch, Jannes Kasper, Torsten Hoefler, Andreas Krause, Nezihe Merve Gürel
We introduce MODEL SELECTOR, a framework for label-efficient selection of pretrained classifiers.
no code implementations • 4 Sep 2024 • Andreas Kirsch
Epistemic uncertainty is crucial for safety-critical applications and out-of-distribution detection tasks.
no code implementations • 1 Jul 2024 • Minh Nguyen, Andrew Baker, Clement Neo, Allen Roush, Andreas Kirsch, Ravid Shwartz-Ziv
Large Language Models (LLMs) generate text by sampling the next token from a probability distribution over the vocabulary at each decoding step.
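A minimal sketch of that decoding step, assuming a vector of vocabulary logits and standard temperature-scaled softmax sampling (names and values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_next_token(logits: np.ndarray, temperature: float = 1.0) -> int:
    """Sample a token id from the softmax over vocabulary logits."""
    z = logits / temperature            # temperature < 1 sharpens, > 1 flattens
    z = z - z.max()                     # subtract max for numerical stability
    probs = np.exp(z) / np.exp(z).sum()
    return int(rng.choice(len(probs), p=probs))

# Hypothetical logits over a 5-token vocabulary.
next_id = sample_next_token(np.array([2.0, 1.0, 0.5, -1.0, -3.0]), temperature=0.8)
```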
no code implementations • 17 Jun 2024 • Muhammed Razzak, Andreas Kirsch, Yarin Gal
Recently, transductive learning methods, which leverage holdout sets during training, have gained popularity for their potential to improve speed, accuracy, and fairness in machine learning models.
1 code implementation • 15 Jun 2024 • David Brandfonbrener, Hanlin Zhang, Andreas Kirsch, Jonathan Richard Schwarz, Sham Kakade
Selecting high-quality data for pre-training is crucial in shaping the downstream task performance of language models.
no code implementations • 9 Jan 2024 • Andreas Kirsch
At its core, this thesis aims to enhance the practicality of deep learning by improving the label and training efficiency of deep learning models.
1 code implementation • 17 Apr 2023 • Freddie Bickford Smith, Andreas Kirsch, Sebastian Farquhar, Yarin Gal, Adam Foster, Tom Rainforth
Information-theoretic approaches to active learning have traditionally focused on maximising the information gathered about the model parameters, most commonly by optimising the BALD score.
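For reference, the BALD score and the prediction-oriented alternative studied here (EPIG) are commonly written as follows, where $p(\theta \mid \mathcal{D})$ is the parameter posterior and $p_*(x_*)$ a test-input distribution:

```latex
\mathrm{BALD}(x) = \mathrm{I}[y; \theta \mid x]
  = \mathrm{H}[y \mid x] - \mathbb{E}_{p(\theta \mid \mathcal{D})}\!\left[\mathrm{H}[y \mid x, \theta]\right],
\qquad
\mathrm{EPIG}(x) = \mathbb{E}_{p_*(x_*)}\!\left[\mathrm{I}[y; y_* \mid x, x_*]\right].
```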
1 code implementation • 26 Mar 2023 • Andreas Kirsch
Unfortunately, neither the GraNd score at initialization nor the input norm surpasses random pruning in performance.
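For concreteness, a minimal sketch of the two pruning scores being compared, computed for a toy classifier (the model, data, and single-initialization shortcut are illustrative assumptions; GraNd is usually averaged over several initializations):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy data and a freshly initialised model (scores are computed "at initialization").
torch.manual_seed(0)
X = torch.randn(256, 20)          # hypothetical inputs
y = torch.randint(0, 2, (256,))   # hypothetical labels
model = nn.Sequential(nn.Linear(20, 32), nn.ReLU(), nn.Linear(32, 2))

def grand_score(x, t):
    """Per-example gradient-norm (GraNd-style) score: ||grad_theta loss(x, t)||_2."""
    loss = F.cross_entropy(model(x.unsqueeze(0)), t.unsqueeze(0))
    grads = torch.autograd.grad(loss, list(model.parameters()))
    return torch.sqrt(sum(g.pow(2).sum() for g in grads)).item()

grand = torch.tensor([grand_score(x, t) for x, t in zip(X, y)])
input_norm = X.norm(dim=1)                      # the input-norm baseline
keep = int(0.5 * len(X))                        # prune 50% of the data
kept_by_grand = grand.topk(keep).indices        # keep the highest-scoring examples
kept_at_random = torch.randperm(len(X))[:keep]  # random pruning baseline
```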
2 code implementations • 17 Feb 2023 • Andreas Kirsch
This approach is compatible with a wide range of machine learning models, including regular and Bayesian deep learning models and non-differentiable models such as random forests.
no code implementations • 23 Jan 2023 • Andreas Kirsch
One commonly used technique for active learning is BatchBALD, which uses Bayesian neural networks to find the most informative points to label in a pool set.
no code implementations • CVPR 2023 • Jishnu Mukhoti, Andreas Kirsch, Joost van Amersfoort, Philip H.S. Torr, Yarin Gal
Reliable uncertainty from deterministic single forward-pass models is sought after because conventional methods of uncertainty quantification are computationally expensive.
1 code implementation • 1 Aug 2022 • Andreas Kirsch, Yarin Gal
Recently proposed methods in data subset selection, that is, active learning and active sampling, use Fisher information, Hessians, similarity matrices based on gradients, and gradient lengths to estimate how informative data is for a model's training.
1 code implementation • 15 Jul 2022 • Dustin Tran, Jeremiah Liu, Michael W. Dusenberry, Du Phan, Mark Collier, Jie Ren, Kehang Han, Zi Wang, Zelda Mariet, Huiyi Hu, Neil Band, Tim G. J. Rudner, Karan Singhal, Zachary Nado, Joost van Amersfoort, Andreas Kirsch, Rodolphe Jenatton, Nithum Thain, Honglin Yuan, Kelly Buchanan, Kevin Murphy, D. Sculley, Yarin Gal, Zoubin Ghahramani, Jasper Snoek, Balaji Lakshminarayanan
A recent trend in artificial intelligence is the use of pretrained models for language and vision tasks, which have achieved extraordinary performance but also exhibit puzzling failures.
2 code implementations • 14 Jun 2022 • Sören Mindermann, Jan Brauner, Muhammed Razzak, Mrinank Sharma, Andreas Kirsch, Winnie Xu, Benedikt Höltgen, Aidan N. Gomez, Adrien Morisot, Sebastian Farquhar, Yarin Gal
However, most computation and time are wasted on redundant and noisy points that have already been learnt or are not learnable.
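The selection rule this line of work proposes scores each point by its reducible loss: the model's current loss minus the loss under a model trained on a separate holdout set. A minimal sketch (the model, holdout model, and batch names are placeholders):

```python
import torch
import torch.nn.functional as F

def select_informative(model, holdout_model, xb, yb, k):
    """Keep the k points whose loss is high under the current model (not yet
    learnt) but low under a holdout-trained model (still learnable and not
    hopelessly noisy), i.e. the points with the largest reducible loss."""
    with torch.no_grad():
        train_loss = F.cross_entropy(model(xb), yb, reduction="none")
        irreducible = F.cross_entropy(holdout_model(xb), yb, reduction="none")
    reducible = train_loss - irreducible
    return reducible.topk(k).indices
```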
no code implementations • 18 May 2022 • Andreas Kirsch, Jannik Kossen, Yarin Gal
The proposed evaluation settings are more realistic than previously suggested ones, building on work by Wen et al. (2021) and Osband et al. (2022), and focus on evaluating the performance of approximate BNNs in an online supervised setting.
1 code implementation • 3 Feb 2022 • Andreas Kirsch, Yarin Gal
Several recent works find empirically that the average test error of deep neural networks can be estimated via the prediction disagreement of models, which does not require labels.
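A minimal sketch of that label-free estimate, assuming class predictions from two independently trained models on the same unlabeled pool:

```python
import numpy as np

def disagreement_rate(preds_a: np.ndarray, preds_b: np.ndarray) -> float:
    """Fraction of unlabeled inputs on which two independently trained models
    predict different classes; per the works cited above, this rate tracks the
    average test error without requiring any labels."""
    return float(np.mean(preds_a != preds_b))

# Hypothetical predicted class ids from two models on the same unlabeled pool.
preds_a = np.array([0, 1, 1, 2, 0, 1])
preds_b = np.array([0, 1, 2, 2, 0, 0])
error_estimate = disagreement_rate(preds_a, preds_b)
```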
3 code implementations • NeurIPS 2021 • Andrew Jesson, Panagiotis Tigas, Joost van Amersfoort, Andreas Kirsch, Uri Shalit, Yarin Gal
We introduce causal, Bayesian acquisition functions grounded in information theory that bias data acquisition towards regions with overlapping support to maximize sample efficiency for learning personalized treatment effects.
no code implementations • 6 Jul 2021 • Sören Mindermann, Muhammed Razzak, Winnie Xu, Andreas Kirsch, Mrinank Sharma, Adrien Morisot, Aidan N. Gomez, Sebastian Farquhar, Jan Brauner, Yarin Gal
We introduce Goldilocks Selection, a technique for faster model training which selects a sequence of training points that are "just right".
2 code implementations • 22 Jun 2021 • Andreas Kirsch, Sebastian Farquhar, Parmida Atighehchian, Andrew Jesson, Frederic Branchaud-Charron, Yarin Gal
We examine a simple stochastic strategy for adapting well-known single-point acquisition functions to allow batch active learning.
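One simple variant in this spirit, sketched under the assumption that single-point acquisition scores (e.g. BALD) have already been computed for the pool: sample the batch from a softmax over the scores instead of taking the top-k.

```python
import numpy as np

rng = np.random.default_rng(0)

def stochastic_batch_acquire(scores: np.ndarray, batch_size: int,
                             temperature: float = 1.0) -> np.ndarray:
    """Sample a batch of pool indices without replacement, with probabilities
    given by a softmax over the single-point scores. Small temperatures
    approach deterministic top-k; larger temperatures add diversity."""
    z = scores / temperature
    z = z - z.max()                      # numerical stability
    probs = np.exp(z) / np.exp(z).sum()
    return rng.choice(len(scores), size=batch_size, replace=False, p=probs)

# Hypothetical single-point acquisition scores for a small pool.
batch = stochastic_batch_acquire(np.array([0.9, 0.1, 0.4, 0.8, 0.2]), batch_size=2)
```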
no code implementations • 22 Jun 2021 • Andreas Kirsch, Yarin Gal
A practical notation can convey valuable intuitions and concisely express new ideas.
no code implementations • 22 Jun 2021 • Andreas Kirsch, Tom Rainforth, Yarin Gal
Expanding on MacKay (1992), we argue that conventional model-based methods for active learning - like BALD - have a fundamental shortfall: they fail to directly account for the test-time distribution of the input variables.
4 code implementations • 23 Feb 2021 • Jishnu Mukhoti, Andreas Kirsch, Joost van Amersfoort, Philip H. S. Torr, Yarin Gal
Reliable uncertainty from deterministic single forward-pass models is sought after because conventional methods of uncertainty quantification are computationally expensive.
no code implementations • 10 Jan 2021 • Andreas Kirsch, Yarin Gal
We develop BatchEvaluationBALD, a new acquisition function for deep Bayesian active learning, as an extension of BatchBALD that takes into account an evaluation set of unlabeled data, for example, the pool set.
no code implementations • 1 Jan 2021 • Andreas Kirsch, Clare Lyle, Yarin Gal
The Information Bottleneck principle offers a mechanism to explain how deep neural networks train and generalize, as well as a regularized objective with which to train models.
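In its standard form, the Information Bottleneck objective trades off compression of the input $X$ in a representation $Z$ against prediction of the target $Y$:

```latex
\min_{p(z \mid x)} \; \mathrm{I}[X; Z] - \beta \, \mathrm{I}[Z; Y]
```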
no code implementations • 27 Mar 2020 • Andreas Kirsch, Clare Lyle, Yarin Gal
The Information Bottleneck principle offers a mechanism to explain how deep neural networks train and generalize, as well as a regularized objective with which to train models.
3 code implementations • NeurIPS 2019 • Andreas Kirsch, Joost van Amersfoort, Yarin Gal
We develop BatchBALD, a tractable approximation to the mutual information between a batch of points and model parameters, which we use as an acquisition function to select multiple informative points jointly for the task of deep Bayesian active learning.
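In standard notation, with $\mathcal{D}$ the training data and $\theta$ the model parameters, the BatchBALD acquisition function for a candidate batch $x_{1:B}$ is:

```latex
a_{\mathrm{BatchBALD}}(x_{1:B})
  = \mathrm{I}[y_{1:B}; \theta \mid x_{1:B}, \mathcal{D}]
  = \mathrm{H}[y_{1:B} \mid x_{1:B}, \mathcal{D}]
    - \mathbb{E}_{p(\theta \mid \mathcal{D})}\!\left[\sum_{b=1}^{B} \mathrm{H}[y_b \mid x_b, \theta]\right]
```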
1 code implementation • 26 Sep 2017 • Andreas Kirsch
The OpenAI Gym provides researchers and enthusiasts with simple-to-use environments for reinforcement learning.
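For illustration, a minimal interaction loop with the classic Gym API (the environment name and step count are arbitrary; newer Gymnasium releases return additional values from reset and step):

```python
import gym

env = gym.make("CartPole-v1")
observation = env.reset()
for _ in range(100):
    action = env.action_space.sample()                   # a random policy
    observation, reward, done, info = env.step(action)   # classic Gym step signature
    if done:
        observation = env.reset()
env.close()
```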