Search Results for author: Thomas Hofmann

Found 65 papers, 23 papers with code

Phenomenology of Double Descent in Finite-Width Neural Networks

no code implementations ICLR 2022 Sidak Pal Singh, Aurelien Lucchi, Thomas Hofmann, Bernhard Schölkopf

'Double descent' delineates the generalization behaviour of models depending on the regime they belong to: under- or over-parameterized.

Generalization Through The Lens Of Leave-One-Out Error

2 code implementations ICLR 2022 Gregor Bachmann, Thomas Hofmann, Aurélien Lucchi

Despite the tremendous empirical success of deep learning models in solving various learning tasks, our theoretical understanding of their generalization ability is very limited.

Generalization Bounds Transfer Learning

FIGARO: Generating Symbolic Music with Fine-Grained Artistic Control

1 code implementation 26 Jan 2022 Dimitri von Rütte, Luca Biggio, Yannic Kilcher, Thomas Hofmann

Generating music with deep neural networks has been an area of active research in recent years.

Music Generation

How to Query Language Models?

1 code implementation 4 Aug 2021 Leonard Adolphs, Shehzaad Dhuliawala, Thomas Hofmann

We apply this approach of querying by example to the LAMA probe and obtain substantial improvements of up to 37.8% for BERT-large on the T-REx data when providing only 10 demonstrations, even outperforming a baseline that queries the model with up to 40 paraphrases of the question.

Analytic Insights into Structure and Rank of Neural Network Hessian Maps

no code implementations NeurIPS 2021 Sidak Pal Singh, Gregor Bachmann, Thomas Hofmann

Moreover, we demonstrate that our bounds remain faithful as an estimate of the numerical Hessian rank, for a larger class of models such as rectified and hyperbolic tangent networks.

Precise characterization of the prior predictive distribution of deep ReLU networks

no code implementations NeurIPS 2021 Lorenzo Noci, Gregor Bachmann, Kevin Roth, Sebastian Nowozin, Thomas Hofmann

Recent works on Bayesian neural networks (BNNs) have highlighted the need to better understand the implications of using Gaussian priors in combination with the compositional structure of the network architecture.

Disentangling the Roles of Curation, Data-Augmentation and the Prior in the Cold Posterior Effect

no code implementations NeurIPS 2021 Lorenzo Noci, Kevin Roth, Gregor Bachmann, Sebastian Nowozin, Thomas Hofmann

The dataset curation hypothesis of Aitchison (2020): we show empirically that the CPE does not arise in a real curated data set but can be produced in a controlled experiment with varying curation strength.

Data Augmentation

Vanishing Curvature and the Power of Adaptive Methods in Randomly Initialized Deep Networks

no code implementations 7 Jun 2021 Antonio Orvieto, Jonas Kohler, Dario Pavllo, Thomas Hofmann, Aurelien Lucchi

This paper revisits the so-called vanishing gradient phenomenon, which commonly occurs in deep randomly initialized neural networks.

Uniform Convergence, Adversarial Spheres and a Simple Remedy

no code implementations 7 May 2021 Gregor Bachmann, Seyed-Mohsen Moosavi-Dezfooli, Thomas Hofmann

By considering a specific dataset, it was observed that a neural network completely misclassifies a projection of the training data (adversarial set), rendering any existing generalization bound based on uniform convergence vacuous.

Learning Generative Models of Textured 3D Meshes from Real-World Images

1 code implementation ICCV 2021 Dario Pavllo, Jonas Kohler, Thomas Hofmann, Aurelien Lucchi

Recent advances in differentiable rendering have sparked an interest in learning generative models of textured 3D meshes from image collections.

Pose Estimation

Generative Minimization Networks: Training GANs Without Competition

no code implementations 23 Mar 2021 Paulina Grnarova, Yannic Kilcher, Kfir Y. Levy, Aurelien Lucchi, Thomas Hofmann

Among known problems experienced by practitioners is the lack of convergence guarantees or convergence to a non-optimum cycle.

SwissDial: Parallel Multidialectal Corpus of Spoken Swiss German

no code implementations 21 Mar 2021 Pelin Dogan-Schönberger, Julian Mäder, Thomas Hofmann

Swiss German is a dialect continuum whose natively acquired dialects significantly differ from the formal variety of the language.

Speech Synthesis

Revisiting the Role of Euler Numerical Integration on Acceleration and Stability in Convex Optimization

no code implementations 23 Feb 2021 Peiyuan Zhang, Antonio Orvieto, Hadi Daneshmand, Thomas Hofmann, Roy Smith

Viewing optimization methods as numerical integrators for ordinary differential equations (ODEs) provides a thought-provoking modern framework for studying accelerated first-order optimizers.

Numerical Integration

Batch normalization provably avoids ranks collapse for randomly initialised deep networks

no code implementations NeurIPS 2020 Hadi Daneshmand, Jonas Kohler, Francis Bach, Thomas Hofmann, Aurelien Lucchi

Randomly initialized neural networks are known to become harder to train with increasing depth, unless architectural enhancements like residual connections and batch normalization are used.

Convolutional Generation of Textured 3D Meshes

1 code implementation NeurIPS 2020 Dario Pavllo, Graham Spinks, Thomas Hofmann, Marie-Francine Moens, Aurelien Lucchi

A key contribution of our work is the encoding of the mesh and texture as 2D representations, which are semantically aligned and can be easily modeled by a 2D convolutional GAN.

BERT as a Teacher: Contextual Embeddings for Sequence-Level Reward

no code implementations 5 Mar 2020 Florian Schmidt, Thomas Hofmann

Measuring the quality of a generated sequence against a set of references is a central problem in many learning frameworks, be it to compute a score, to assign a reward, or to perform discrimination.


Batch Normalization Provably Avoids Rank Collapse for Randomly Initialised Deep Networks

no code implementations 3 Mar 2020 Hadi Daneshmand, Jonas Kohler, Francis Bach, Thomas Hofmann, Aurelien Lucchi

Randomly initialized neural networks are known to become harder to train with increasing depth, unless architectural enhancements like residual connections and batch normalization are used.

Controlling Style and Semantics in Weakly-Supervised Image Generation

1 code implementation ECCV 2020 Dario Pavllo, Aurelien Lucchi, Thomas Hofmann

We propose a weakly-supervised approach for conditional image generation of complex scenes where a user has fine control over objects appearing in the scene.

Conditional Image Generation

Mixing of Stochastic Accelerated Gradient Descent

no code implementations 31 Oct 2019 Peiyuan Zhang, Hadi Daneshmand, Thomas Hofmann

We study the mixing properties for stochastic accelerated gradient descent (SAGD) on least-squares regression.

Stochastic Optimization

Adversarial Training Generalizes Data-dependent Spectral Norm Regularization

no code implementations 25 Sep 2019 Kevin Roth, Yannic Kilcher, Thomas Hofmann

We establish a theoretical link between adversarial training and operator norm regularization for deep neural networks.

LeDeepChef: Deep Reinforcement Learning Agent for Families of Text-Based Games

no code implementations 4 Sep 2019 Leonard Adolphs, Thomas Hofmann

We, however, consider the task of designing an agent that not only succeeds in a single game but also performs well across a whole family of games sharing the same theme.

Atari Games Hierarchical Reinforcement Learning +2

Autoregressive Text Generation Beyond Feedback Loops

1 code implementation IJCNLP 2019 Florian Schmidt, Stephan Mandt, Thomas Hofmann

Autoregressive state transitions, where predictions are conditioned on past predictions, are the predominant choice for both deterministic and stochastic sequential models.

Text Generation

Cosmological N-body simulations: a challenge for scalable generative models

1 code implementation 15 Aug 2019 Nathanaël Perraudin, Ankit Srivastava, Aurelien Lucchi, Tomasz Kacprzak, Thomas Hofmann, Alexandre Réfrégier

Our results show that the proposed model produces samples of high visual quality, although the statistical analysis reveals that capturing rare features in the data poses significant problems for the generative models.

Cosmological constraints with deep learning from KiDS-450 weak lensing maps

no code implementations 7 Jun 2019 Janis Fluri, Tomasz Kacprzak, Aurelien Lucchi, Alexandre Refregier, Adam Amara, Thomas Hofmann, Aurel Schneider

We present the cosmological results with a CNN from the KiDS-450 tomographic weak lensing dataset, constraining the total matter density $\Omega_m$, the fluctuation amplitude $\sigma_8$, and the intrinsic alignment amplitude $A_{\rm{IA}}$.

Cosmology and Nongalactic Astrophysics

Adversarial Training is a Form of Data-dependent Operator Norm Regularization

no code implementations NeurIPS 2020 Kevin Roth, Yannic Kilcher, Thomas Hofmann

We establish a theoretical link between adversarial training and operator norm regularization for deep neural networks.

Evaluating GANs via Duality

no code implementations ICLR 2019 Paulina Grnarova, Kfir Y. Levy, Aurelien Lucchi, Nathanael Perraudin, Thomas Hofmann, Andreas Krause

Generative Adversarial Networks (GANs) have shown great results in accurately modeling complex distributions, but their training is known to be difficult due to instabilities caused by a challenging minimax optimization problem.

The Odds are Odd: A Statistical Test for Detecting Adversarial Examples

1 code implementation 13 Feb 2019 Kevin Roth, Yannic Kilcher, Thomas Hofmann

We investigate conditions under which test statistics exist that can reliably detect examples that have been adversarially manipulated in a white-box attack.

A domain agnostic measure for monitoring and evaluating GANs

1 code implementation NeurIPS 2019 Paulina Grnarova, Kfir Y. Levy, Aurelien Lucchi, Nathanael Perraudin, Ian Goodfellow, Thomas Hofmann, Andreas Krause

Evaluations are essential for: (i) relative assessment of different models and (ii) monitoring the progress of a single model throughout training.

Learning and Evaluating Sparse Interpretable Sentence Embeddings

no code implementations WS 2018 Valentin Trifonov, Octavian-Eugen Ganea, Anna Potapenko, Thomas Hofmann

Previous research on word embeddings has shown that sparse representations, which can be either learned on top of existing dense embeddings or obtained through model constraints during training time, have the benefit of increased interpretability properties: to some degree, each dimension can be understood by a human and associated with a recognizable feature in the data.

Sentence Embedding Sentence-Embedding +1

Cosmological constraints from noisy convergence maps through deep learning

no code implementations 23 Jul 2018 Janis Fluri, Tomasz Kacprzak, Aurelien Lucchi, Alexandre Refregier, Adam Amara, Thomas Hofmann

We find that, for a shape noise level corresponding to 8.53 galaxies/arcmin$^2$ and a smoothing scale of $\sigma_s = 2.34$ arcmin, the network is able to generate 45% tighter constraints.

Cosmology and Nongalactic Astrophysics

A Distributed Second-Order Algorithm You Can Trust

no code implementations ICML 2018 Celestine Dünner, Aurelien Lucchi, Matilde Gargiani, An Bian, Thomas Hofmann, Martin Jaggi

Due to the rapid growth of data and computational resources, distributed optimization has become an active research area in recent years.

Distributed Optimization Second-order methods

Deep State Space Models for Unconditional Word Generation

no code implementations NeurIPS 2018 Florian Schmidt, Thomas Hofmann

Autoregressive feedback is considered a necessity for successful unconditional text generation using stochastic sequence models.

Text Generation Variational Inference

Zero-Shot Dual Machine Translation

1 code implementation 25 May 2018 Lierni Sestorain, Massimiliano Ciaramita, Christian Buck, Thomas Hofmann

Our method also obtains improvements in the setting where a small amount of parallel data for the zero-shot language pair is available.

Machine Translation Translation

Hyperbolic Neural Networks

3 code implementations NeurIPS 2018 Octavian-Eugen Ganea, Gary Bécigneul, Thomas Hofmann

However, the representational power of hyperbolic geometry is not yet on par with Euclidean geometry, mostly because of the absence of corresponding hyperbolic neural network layers.

Graph Representation Learning Natural Language Inference +1

Adversarially Robust Training through Structured Gradient Regularization

no code implementations 22 May 2018 Kevin Roth, Aurelien Lucchi, Sebastian Nowozin, Thomas Hofmann

We propose a novel data-dependent structured gradient regularizer to increase the robustness of neural networks vis-a-vis adversarial perturbations.

Local Saddle Point Optimization: A Curvature Exploitation Approach

1 code implementation 15 May 2018 Leonard Adolphs, Hadi Daneshmand, Aurelien Lucchi, Thomas Hofmann

Gradient-based optimization methods are the most popular choice for finding local optima for classical minimization and saddle point problems.

Hyperbolic Entailment Cones for Learning Hierarchical Embeddings

2 code implementations ICML 2018 Octavian-Eugen Ganea, Gary Bécigneul, Thomas Hofmann

Learning graph representations via low-dimensional embeddings that preserve relevant network properties is an important class of problems in machine learning.

Graph Embedding Hypernym Discovery +1

Escaping Saddles with Stochastic Gradients

no code implementations ICML 2018 Hadi Daneshmand, Jonas Kohler, Aurelien Lucchi, Thomas Hofmann

We analyze the variance of stochastic gradients along negative curvature directions in certain non-convex machine learning models and show that stochastic gradients exhibit a strong component along these directions.

Fast cosmic web simulations with generative adversarial networks

no code implementations 27 Jan 2018 Andres C. Rodriguez, Tomasz Kacprzak, Aurelien Lucchi, Adam Amara, Raphael Sgier, Janis Fluri, Thomas Hofmann, Alexandre Réfrégier

Computational models of the underlying physical processes, such as classical N-body simulations, are extremely resource intensive, as they track the action of gravity in an expanding universe using billions of particles as tracers of the cosmic matter distribution.

The best defense is a good offense: Countering black box attacks by predicting slightly wrong labels

no code implementations 15 Nov 2017 Yannic Kilcher, Thomas Hofmann

Black-box attacks on machine learning models occur when an attacker, despite having no access to the inner workings of a model, can successfully craft an attack by means of model theft.

Semantic Interpolation in Implicit Models

no code implementations ICLR 2018 Yannic Kilcher, Aurelien Lucchi, Thomas Hofmann

In implicit models, one often interpolates between sampled points in latent space.

Flexible Prior Distributions for Deep Generative Models

no code implementations ICLR 2018 Yannic Kilcher, Aurelien Lucchi, Thomas Hofmann

We consider the problem of training generative models with deep neural networks as generators, i.e., to map latent codes to data points.

Parametrizing filters of a CNN with a GAN

no code implementations ICLR 2018 Yannic Kilcher, Gary Becigneul, Thomas Hofmann

It is commonly agreed that the use of relevant invariances as a good statistical bias is important in machine learning.

Generator Reversal

no code implementations 28 Jul 2017 Yannic Kilcher, Aurélien Lucchi, Thomas Hofmann

We consider the problem of training generative models with deep neural networks as generators, i.e., to map latent codes to data points.

Learning Aerial Image Segmentation from Online Maps

1 code implementation 21 Jul 2017 Pascal Kaiser, Jan Dirk Wegner, Aurelien Lucchi, Martin Jaggi, Thomas Hofmann, Konrad Schindler

We adapt a state-of-the-art CNN architecture for semantic segmentation of buildings and roads in aerial images, and compare its performance when using different training data sets, ranging from manually labeled, pixel-accurate ground truth of the same city to automatic training data derived from OpenStreetMap data from distant locations.

General Classification Semantic Segmentation

Cosmological model discrimination with Deep Learning

no code implementations 17 Jul 2017 Jorit Schmelzle, Aurelien Lucchi, Tomasz Kacprzak, Adam Amara, Raphael Sgier, Alexandre Réfrégier, Thomas Hofmann

We find that our implementation of DCNN outperforms the skewness and kurtosis statistics, especially for high noise levels.

Accelerated Dual Learning by Homotopic Initialization

no code implementations 13 Jun 2017 Hadi Daneshmand, Hamed Hassani, Thomas Hofmann

Gradient descent and coordinate descent are well understood in terms of their asymptotic behavior, but less so in a transient regime often used for approximations in machine learning.

Stabilizing Training of Generative Adversarial Networks through Regularization

1 code implementation NeurIPS 2017 Kevin Roth, Aurelien Lucchi, Sebastian Nowozin, Thomas Hofmann

Deep generative models based on Generative Adversarial Networks (GANs) have demonstrated impressive sample quality, but in order to work they require a careful choice of architecture, parameter initialization, and hyper-parameters.

Image Generation

Deep Joint Entity Disambiguation with Local Neural Attention

3 code implementations EMNLP 2017 Octavian-Eugen Ganea, Thomas Hofmann

We propose a novel deep learning model for joint document-level entity disambiguation, which leverages learned neural representations.

Entity Disambiguation

A Semi-supervised Framework for Image Captioning

1 code implementation 16 Nov 2016 Wenhu Chen, Aurelien Lucchi, Thomas Hofmann

Here we propose a novel way of using such textual data by artificially generating missing visual information.

Image Captioning Word Embeddings

Fully Character-Level Neural Machine Translation without Explicit Segmentation

2 code implementations TACL 2017 Jason Lee, Kyunghyun Cho, Thomas Hofmann

We observe that on CS-EN, FI-EN and RU-EN, the quality of the multilingual character-level translation even surpasses the models specifically trained on that language pair alone, both in terms of BLEU score and human judgment.

Machine Translation Translation

DynaNewton - Accelerating Newton's Method for Machine Learning

no code implementations 20 May 2016 Hadi Daneshmand, Aurelien Lucchi, Thomas Hofmann

Solutions on this path are tracked such that the minimizer of the previous objective is guaranteed to be within the quadratic convergence region of the next objective to be optimized.

Starting Small -- Learning with Adaptive Sample Sizes

no code implementations 9 Mar 2016 Hadi Daneshmand, Aurelien Lucchi, Thomas Hofmann

For many machine learning problems, data is abundant and it may be prohibitive to make multiple passes through the full training set.

Probabilistic Bag-Of-Hyperlinks Model for Entity Linking

1 code implementation 8 Sep 2015 Octavian-Eugen Ganea, Marina Ganea, Aurelien Lucchi, Carsten Eickhoff, Thomas Hofmann

We demonstrate the accuracy of our approach on a wide range of benchmark datasets, showing that it matches, and in many cases outperforms, existing state-of-the-art methods.

Entity Disambiguation Entity Linking +3

Variance Reduced Stochastic Gradient Descent with Neighbors

no code implementations NeurIPS 2015 Thomas Hofmann, Aurelien Lucchi, Simon Lacoste-Julien, Brian McWilliams

As a by-product, we provide a unified convergence analysis for a family of variance reduction algorithms, which we call memorization algorithms.

A Variance Reduced Stochastic Newton Method

no code implementations 28 Mar 2015 Aurelien Lucchi, Brian McWilliams, Thomas Hofmann

Quasi-Newton methods are widely used in practice for convex loss minimization problems.

Communication-Efficient Distributed Dual Coordinate Ascent

no code implementations NeurIPS 2014 Martin Jaggi, Virginia Smith, Martin Takáč, Jonathan Terhorst, Sanjay Krishnan, Thomas Hofmann, Michael I. Jordan

Communication remains the most significant bottleneck in the performance of distributed optimization algorithms for large-scale machine learning.

Distributed Optimization

Probabilistic Latent Semantic Analysis

3 code implementations 23 Jan 2013 Thomas Hofmann

Probabilistic Latent Semantic Analysis is a novel statistical technique for the analysis of two-mode and co-occurrence data, which has applications in information retrieval and filtering, natural language processing, machine learning from text, and in related areas.

Information Retrieval
