no code implementations • 7 Nov 2024 • Han Yang, Sotiris Anagnostidis, Enis Simsar, Thomas Hofmann
The proposed method comprises three modules: an Identity Net, a Shading Net, and a Harmonization Net.
1 code implementation • 26 Oct 2024 • Amir Joudaki, Thomas Hofmann
Understanding how neural networks transform input data across layers is fundamental to unraveling their learning and generalization capabilities.
1 code implementation • 14 Oct 2024 • Daniel Gareev, Thomas Hofmann, Ezhilmathi Krishnasamy, Tiago Pimentel
Traditional methods, such as top-$k$ and top-$\pi$, apply local normalisation to the model's output distribution, which can distort it.
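For context, a minimal numpy sketch of the kind of local normalisation that top-$k$ performs (illustrative only; the function name and toy distribution are not from the paper):

```python
import numpy as np

def top_k_renormalise(probs: np.ndarray, k: int) -> np.ndarray:
    """Keep the k most probable tokens and renormalise locally.

    Mass outside the top-k is discarded, which is the distortion of the
    model's output distribution mentioned above.
    """
    out = np.zeros_like(probs)
    top = np.argsort(probs)[-k:]   # indices of the k largest probabilities
    out[top] = probs[top]
    return out / out.sum()         # local renormalisation

# toy example: a 5-token vocabulary
p = np.array([0.50, 0.25, 0.15, 0.07, 0.03])
print(top_k_renormalise(p, k=2))   # -> [0.667, 0.333, 0., 0., 0.]
```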
1 code implementation • 8 Oct 2024 • Stefan Stefanache, Lluís Pastor Pérez, Julen Costa Watanabe, Ernesto Sanchez Tejedor, Thomas Hofmann, Enis Simsar
Evaluating diffusion-based image-editing models is a crucial task in the field of Generative AI.
no code implementations • 10 Sep 2024 • Piera Riccio, Georgina Curto, Thomas Hofmann, Nuria Oliver
At a time when the influence of generative Artificial Intelligence on visual arts is a highly debated topic, we draw attention to a more subtle phenomenon: the algorithmic censorship of artistic nudity online.
no code implementations • 23 Jul 2024 • Giulia Lanzillotta, Sidak Pal Singh, Benjamin F. Grewe, Thomas Hofmann
We classify existing continual learning algorithms based on the approximation used, and we assess the practical effects of this distinction in common continual learning settings. Additionally, we study optimal continual learning objectives in the case of local polynomial approximations, and we provide examples of existing algorithms that implement these optimal objectives.
no code implementations • 24 Jun 2024 • Sidak Pal Singh, Linara Adilova, Michael Kamp, Asja Fischer, Bernhard Schölkopf, Thomas Hofmann
In this work, we take a step towards understanding it by providing a model of how the loss landscape needs to behave topographically for LMC (or the lack thereof) to manifest.
1 code implementation • 6 Jun 2024 • Pietro Lesci, Clara Meister, Thomas Hofmann, Andreas Vlachos, Tiago Pimentel
Understanding memorisation in language models has practical and societal implications, e.g., studying models' training dynamics or preventing copyright infringements.
1 code implementation • 29 May 2024 • Bobby He, Lorenzo Noci, Daniele Paliotta, Imanol Schlag, Thomas Hofmann
Outlier features are well known to emerge during standard transformer training and have the undesirable effect of hindering quantisation in afflicted models.
no code implementations • 21 Apr 2024 • Maria Mihaela Trusca, Wolf Nuyts, Jonathan Thomm, Robert Honig, Thomas Hofmann, Tinne Tuytelaars, Marie-Francine Moens
Current diffusion models create photorealistic images given a text prompt as input but struggle to correctly bind attributes mentioned in the text to the right objects in the image.
1 code implementation • 11 Apr 2024 • Anton Schäfer, Shauli Ravfogel, Thomas Hofmann, Tiago Pimentel, Imanol Schlag
In controlled experiments on perfectly equivalent cloned languages, we observe that the existence of a predominant language during training boosts the performance of less frequent languages and leads to stronger alignment of model representations across languages.
1 code implementation • 9 Apr 2024 • Anton Schäfer, Thomas Hofmann, Imanol Schlag, Tiago Pimentel
In this paper, we study the impact of near duplicate subwords on LM training efficiency.
no code implementations • 12 Mar 2024 • Sidak Pal Singh, Bobby He, Thomas Hofmann, Bernhard Schölkopf
We propose a fresh take on understanding the mechanisms of neural networks by analyzing the rich directional structure of optimization trajectories, represented by their pointwise parameters.
no code implementations • 27 Feb 2024 • Lorenzo Noci, Alexandru Meterez, Thomas Hofmann, Antonio Orvieto
Recently, there has been growing evidence that if the width and depth of a neural network are scaled toward the so-called rich feature learning limit ($\mu$P and its depth extension), then some hyperparameters -- such as the learning rate -- exhibit transfer from small to very large models.
1 code implementation • 22 Feb 2024 • Dimitri von Rütte, Sotiris Anagnostidis, Gregor Bachmann, Thomas Hofmann
Concept guidance has emerged as a cheap and simple way to control the behavior of language models by probing their hidden representations for concept vectors and using them to perturb activations at inference time.
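As a rough illustration of the activation-perturbation idea (a hedged sketch: `model`, the layer path, and the concept vector are placeholders, and the paper's actual probing and guidance procedure may differ):

```python
import torch

def make_concept_hook(concept_vec: torch.Tensor, alpha: float):
    """Return a forward hook that shifts a layer's output along a concept direction."""
    def hook(module, inputs, output):
        # assumes output has shape (batch, seq_len, hidden); the normalised
        # concept vector is broadcast over batch and sequence positions
        return output + alpha * concept_vec / concept_vec.norm()
    return hook

# usage sketch: attach to one transformer block of a (hypothetical) `model`;
# concept_vec would come from a linear probe trained on hidden states
# handle = model.transformer.h[10].register_forward_hook(
#     make_concept_hook(concept_vec, alpha=4.0))
# ... run generation ...
# handle.remove()
```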
1 code implementation • 12 Feb 2024 • Alexander Theus, Olin Geimer, Friedrich Wicke, Thomas Hofmann, Sotiris Anagnostidis, Sidak Pal Singh
Structural pruning of neural networks conventionally relies on identifying and discarding less important neurons, a practice often resulting in significant accuracy loss that necessitates subsequent fine-tuning efforts.
no code implementations • 5 Feb 2024 • Kai Lion, Lorenzo Noci, Thomas Hofmann, Gregor Bachmann
The multi-modal nature of neural loss landscapes is often considered to be the main driver behind the empirical success of deep ensembles.
1 code implementation • 29 Jan 2024 • Michael Hersche, Francesco Di Stefano, Thomas Hofmann, Abu Sebastian, Abbas Rahimi
Abstract reasoning is a cornerstone of human intelligence, and replicating it with artificial intelligence (AI) presents an ongoing challenge.
no code implementations • 15 Dec 2023 • Gul Sena Altintas, Gregor Bachmann, Lorenzo Noci, Thomas Hofmann
Linear mode-connectivity (LMC) (or lack thereof) is one of the intriguing characteristics of neural network loss landscapes.
no code implementations • 14 Dec 2023 • Enis Simsar, Alessio Tonioni, Yongqin Xian, Thomas Hofmann, Federico Tombari
A significant challenge within this domain is localized editing, where specific areas of an image are modified without affecting the rest of the content.
1 code implementation • 3 Dec 2023 • Yuhui Ding, Antonio Orvieto, Bobby He, Thomas Hofmann
Graph neural networks based on iterative one-hop message passing have been shown to struggle in harnessing the information from distant nodes effectively.
Ranked #2 on Graph Classification on CIFAR10 100k
no code implementations • 10 Nov 2023 • Elior Benarous, Sotiris Anagnostidis, Luca Biggio, Thomas Hofmann
In this study, we investigate how neural networks exhibit shape bias during training on synthetic datasets, serving as an indicator of the synthetic data quality.
no code implementations • 6 Nov 2023 • Sotiris Anagnostidis, Gregor Bachmann, Imanol Schlag, Thomas Hofmann
This leads to the notion of a `compute-optimal' model, i.e. a model that optimally allocates a given compute budget during training to maximize performance.
1 code implementation • 3 Nov 2023 • Bobby He, Thomas Hofmann
A simple design recipe for deep Transformers is to compose identical building blocks.
1 code implementation • 9 Oct 2023 • Moritz Imfeld, Jacopo Graldi, Marco Giordano, Thomas Hofmann, Sotiris Anagnostidis, Sidak Pal Singh
Fusion is a technique for merging multiple independently-trained neural networks in order to combine their capabilities.
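In its simplest form, fusion amounts to parameter averaging, sketched below purely for intuition; the methods studied in this line of work first align corresponding neurons across models before merging, which this toy snippet skips:

```python
import torch

def average_fuse(state_dicts):
    """Naively fuse networks with identical architectures by averaging parameters.

    Toy illustration only: without first aligning corresponding neurons
    across models, plain averaging usually degrades accuracy.
    """
    fused = {}
    for name in state_dicts[0]:
        fused[name] = torch.stack([sd[name].float() for sd in state_dicts]).mean(dim=0)
    return fused

# usage sketch (models assumed to share the same architecture):
# fused_sd = average_fuse([model_a.state_dict(), model_b.state_dict()])
# model_fused.load_state_dict(fused_sd)
```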
no code implementations • 2 Oct 2023 • Giulia Lanzillotta, Sidak Pal Singh, Benjamin F. Grewe, Thomas Hofmann
Deep learning has proved to be a successful paradigm for solving many challenges in machine learning.
1 code implementation • 20 Sep 2023 • Aleksandar Stanić, Dylan Ashley, Oleg Serikov, Louis Kirsch, Francesco Faccio, Jürgen Schmidhuber, Thomas Hofmann, Imanol Schlag
We introduce an experimental protocol that enables model comparisons based on equivalent compute, measured in accelerator hours.
no code implementations • NeurIPS 2023 • Lorenzo Noci, Chuning Li, Mufan Bill Li, Bobby He, Thomas Hofmann, Chris Maddison, Daniel M. Roy
Motivated by the success of Transformers, we study the covariance matrix of a modified Softmax-based attention model with skip connections in the proportional limit of infinite-depth-and-width.
1 code implementation • NeurIPS 2023 • Gregor Bachmann, Sotiris Anagnostidis, Thomas Hofmann
We show that the performance of MLPs drastically improves with scale (95% on CIFAR10, 82% on CIFAR100, 58% on ImageNet ReaL), highlighting that a lack of inductive bias can indeed be compensated for.
no code implementations • 4 Jun 2023 • Alexandros Delitzas, Maria Parelli, Nikolas Hars, Georgios Vlassis, Sotirios Anagnostidis, Gregor Bachmann, Thomas Hofmann
Training models to apply common-sense linguistic knowledge and visual concepts from 2D images to 3D scene understanding is a promising direction that researchers have only recently started to explore.
no code implementations • NeurIPS 2023 • Sotiris Anagnostidis, Dario Pavllo, Luca Biggio, Lorenzo Noci, Aurelien Lucchi, Thomas Hofmann
Autoregressive Transformers adopted in Large Language Models (LLMs) are hard to scale to long sequences.
no code implementations • 16 May 2023 • Sidak Pal Singh, Thomas Hofmann, Bernhard Schölkopf
While Convolutional Neural Networks (CNNs) have long been investigated, applied, and theorized, we aim to provide a slightly different perspective on their nature -- through the lens of their Hessian maps.
1 code implementation • 12 Apr 2023 • Maria Parelli, Alexandros Delitzas, Nikolas Hars, Georgios Vlassis, Sotirios Anagnostidis, Gregor Bachmann, Thomas Hofmann
Training models to apply linguistic knowledge and visual concepts from 2D images to 3D world understanding is a promising direction that researchers have only recently started to explore.
1 code implementation • CVPR 2023 • Sanghwan Kim, Lorenzo Noci, Antonio Orvieto, Thomas Hofmann
In contrast to the natural capability of humans to learn new tasks in a sequential fashion, neural networks are known to suffer from catastrophic forgetting, where the model's performance on old tasks drops dramatically after it is optimized for a new task.
1 code implementation • 23 Feb 2023 • Felix Sarnthein, Gregor Bachmann, Sotiris Anagnostidis, Thomas Hofmann
In this work, we investigate the implicit regularization induced by teacher-student learning dynamics in self-distillation.
no code implementations • 22 Nov 2022 • Sotiris Anagnostidis, Arne Thomsen, Tomasz Kacprzak, Tilman Tröster, Luca Biggio, Alexandre Refregier, Thomas Hofmann
In this work, we aim to improve upon two-point statistics by employing a PointNet-like neural network to regress the values of the cosmological parameters directly from point cloud data.
no code implementations • 25 Oct 2022 • Sotiris Anagnostidis, Gregor Bachmann, Lorenzo Noci, Thomas Hofmann
While such a memorization capacity seems worrisome, in this work we show that under training protocols that include data augmentation, neural networks learn to memorize entirely random labels in a benign way, i.e. they learn embeddings that lead to highly non-trivial performance under nearest neighbour probing.
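A minimal sketch of the kind of nearest-neighbour probing referred to above (assumed setup: frozen embeddings extracted from the trained network plus the true labels; names are illustrative):

```python
from sklearn.neighbors import KNeighborsClassifier

def knn_probe(train_emb, train_labels, test_emb, test_labels, k=5):
    """Evaluate how useful learned embeddings are via k-NN classification.

    train_emb / test_emb: (n, d) arrays of frozen network embeddings;
    labels are the *true* labels, even if the network was trained on random ones.
    """
    clf = KNeighborsClassifier(n_neighbors=k)
    clf.fit(train_emb, train_labels)
    return clf.score(test_emb, test_labels)   # accuracy of the probe

# e.g. acc = knn_probe(emb_train, y_train, emb_test, y_test, k=5)
```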
1 code implementation • 21 Oct 2022 • Leonard Adolphs, Michelle Chen Huebscher, Christian Buck, Sertan Girgin, Olivier Bachem, Massimiliano Ciaramita, Thomas Hofmann
Neural retrieval models have superseded classic bag-of-words methods such as BM25 as the retrieval framework of choice.
no code implementations • ICCV 2023 • Sotiris Anagnostidis, Aurelien Lucchi, Thomas Hofmann
Accurately predicting road networks from satellite images requires a global understanding of the network topology.
1 code implementation • 19 Jul 2022 • Piera Riccio, Bill Psomas, Francesco Galati, Francisco Escolano, Thomas Hofmann, Nuria Oliver
Augmented Reality or AR filters on selfies have become very popular on social media platforms for a variety of applications, including marketing, entertainment and aesthetics.
no code implementations • 27 May 2022 • Gregor Bachmann, Lorenzo Noci, Thomas Hofmann
While data augmentation has been empirically recognized as one of the main drivers of this effect, a theoretical account of its role is largely missing.
no code implementations • ICLR 2022 • Sidak Pal Singh, Aurelien Lucchi, Thomas Hofmann, Bernhard Schölkopf
`Double descent' delineates the generalization behaviour of models depending on the regime they belong to: under- or over-parameterized.
1 code implementation • ICLR 2022 • Gregor Bachmann, Thomas Hofmann, Aurélien Lucchi
Despite the tremendous empirical success of deep learning models to solve various learning tasks, our theoretical understanding of their generalization ability is very limited.
2 code implementations • 26 Jan 2022 • Dimitri von Rütte, Luca Biggio, Yannic Kilcher, Thomas Hofmann
Generating music with deep neural networks has been an area of active research in recent years.
no code implementations • 2 Jan 2022 • Enea Monzio Compagnoni, Anna Scampicchio, Luca Biggio, Antonio Orvieto, Thomas Hofmann, Josef Teichmann
Many finance, physics, and engineering phenomena are modeled by continuous-time dynamical systems driven by highly irregular (stochastic) inputs.
no code implementations • 1 Sep 2021 • Leonard Adolphs, Benjamin Boerschinger, Christian Buck, Michelle Chen Huebscher, Massimiliano Ciaramita, Lasse Espeholt, Thomas Hofmann, Yannic Kilcher, Sascha Rothe, Pier Giuseppe Sessa, Lierni Sestorain Saralegui
This paper presents first successful steps in designing search agents that learn meta-strategies for iterative query refinement in information-seeking tasks.
1 code implementation • 4 Aug 2021 • Leonard Adolphs, Shehzaad Dhuliawala, Thomas Hofmann
We apply this approach of querying by example to the LAMA probe and obtain substantial improvements of up to 37.8% for BERT-large on the T-REx data when providing only 10 demonstrations--even outperforming a baseline that queries the model with up to 40 paraphrases of the question.
no code implementations • NeurIPS 2021 • Sidak Pal Singh, Gregor Bachmann, Thomas Hofmann
Moreover, we demonstrate that our bounds remain faithful as an estimate of the numerical Hessian rank, for a larger class of models such as rectified and hyperbolic tangent networks.
no code implementations • NeurIPS 2021 • Lorenzo Noci, Gregor Bachmann, Kevin Roth, Sebastian Nowozin, Thomas Hofmann
Recent works on Bayesian neural networks (BNNs) have highlighted the need to better understand the implications of using Gaussian priors in combination with the compositional structure of the network architecture.
no code implementations • NeurIPS 2021 • Lorenzo Noci, Kevin Roth, Gregor Bachmann, Sebastian Nowozin, Thomas Hofmann
We test the dataset curation hypothesis of Aitchison (2020): we show empirically that the cold posterior effect (CPE) does not arise in a real curated data set, but can be produced in a controlled experiment with varying curation strength.
no code implementations • 7 Jun 2021 • Antonio Orvieto, Jonas Kohler, Dario Pavllo, Thomas Hofmann, Aurelien Lucchi
This paper revisits the so-called vanishing gradient phenomenon, which commonly occurs in deep randomly initialized neural networks.
no code implementations • 7 May 2021 • Gregor Bachmann, Seyed-Mohsen Moosavi-Dezfooli, Thomas Hofmann
For a specific dataset, it was observed that a neural network completely misclassifies a projection of the training data (the adversarial set), rendering any existing generalization bound based on uniform convergence vacuous.
1 code implementation • ICCV 2021 • Dario Pavllo, Jonas Kohler, Thomas Hofmann, Aurelien Lucchi
Recent advances in differentiable rendering have sparked an interest in learning generative models of textured 3D meshes from image collections.
no code implementations • 23 Mar 2021 • Paulina Grnarova, Yannic Kilcher, Kfir Y. Levy, Aurelien Lucchi, Thomas Hofmann
Among the known problems experienced by practitioners are the lack of convergence guarantees and convergence to non-optimal cycles.
no code implementations • 21 Mar 2021 • Pelin Dogan-Schönberger, Julian Mäder, Thomas Hofmann
Swiss German is a dialect continuum whose natively acquired dialects significantly differ from the formal variety of the language.
no code implementations • 23 Feb 2021 • Peiyuan Zhang, Antonio Orvieto, Hadi Daneshmand, Thomas Hofmann, Roy Smith
Viewing optimization methods as numerical integrators for ordinary differential equations (ODEs) provides a thought-provoking modern framework for studying accelerated first-order optimizers.
no code implementations • NeurIPS 2020 • Hadi Daneshmand, Jonas Kohler, Francis Bach, Thomas Hofmann, Aurelien Lucchi
Randomly initialized neural networks are known to become harder to train with increasing depth, unless architectural enhancements like residual connections and batch normalization are used.
1 code implementation • NeurIPS 2020 • Dario Pavllo, Graham Spinks, Thomas Hofmann, Marie-Francine Moens, Aurelien Lucchi
A key contribution of our work is the encoding of the mesh and texture as 2D representations, which are semantically aligned and can be easily modeled by a 2D convolutional GAN.
no code implementations • 5 Mar 2020 • Florian Schmidt, Thomas Hofmann
Measuring the quality of a generated sequence against a set of references is a central problem in many learning frameworks, be it to compute a score, to assign a reward, or to perform discrimination.
no code implementations • 3 Mar 2020 • Hadi Daneshmand, Jonas Kohler, Francis Bach, Thomas Hofmann, Aurelien Lucchi
Randomly initialized neural networks are known to become harder to train with increasing depth, unless architectural enhancements like residual connections and batch normalization are used.
1 code implementation • ECCV 2020 • Dario Pavllo, Aurelien Lucchi, Thomas Hofmann
We propose a weakly-supervised approach for conditional image generation of complex scenes where a user has fine control over objects appearing in the scene.
no code implementations • 31 Oct 2019 • Peiyuan Zhang, Hadi Daneshmand, Thomas Hofmann
We study the mixing properties for stochastic accelerated gradient descent (SAGD) on least-squares regression.
no code implementations • 25 Sep 2019 • Kevin Roth, Yannic Kilcher, Thomas Hofmann
We establish a theoretical link between adversarial training and operator norm regularization for deep neural networks.
no code implementations • 4 Sep 2019 • Leonard Adolphs, Thomas Hofmann
We, however, consider the task of designing an agent that not only succeeds in a single game but performs well across a whole family of games sharing the same theme.
1 code implementation • IJCNLP 2019 • Florian Schmidt, Stephan Mandt, Thomas Hofmann
Autoregressive state transitions, where predictions are conditioned on past predictions, are the predominant choice for both deterministic and stochastic sequential models.
1 code implementation • 15 Aug 2019 • Nathanaël Perraudin, Ankit Srivastava, Aurelien Lucchi, Tomasz Kacprzak, Thomas Hofmann, Alexandre Réfrégier
Our results show that the proposed model produces samples of high visual quality, although the statistical analysis reveals that capturing rare features in the data poses significant problems for the generative models.
no code implementations • 7 Jun 2019 • Janis Fluri, Tomasz Kacprzak, Aurelien Lucchi, Alexandre Refregier, Adam Amara, Thomas Hofmann, Aurel Schneider
We present the cosmological results with a CNN from the KiDS-450 tomographic weak lensing dataset, constraining the total matter density $\Omega_m$, the fluctuation amplitude $\sigma_8$, and the intrinsic alignment amplitude $A_{\rm{IA}}$.
Cosmology and Nongalactic Astrophysics
no code implementations • NeurIPS 2020 • Kevin Roth, Yannic Kilcher, Thomas Hofmann
We establish a theoretical link between adversarial training and operator norm regularization for deep neural networks.
no code implementations • ICLR 2019 • Paulina Grnarova, Kfir Y. Levy, Aurelien Lucchi, Nathanael Perraudin, Thomas Hofmann, Andreas Krause
Generative Adversarial Networks (GANs) have shown great results in accurately modeling complex distributions, but their training is known to be difficult due to instabilities caused by a challenging minimax optimization problem.
no code implementations • ICLR 2019 • Yannic Kilcher, Gary Bécigneul, Thomas Hofmann
We develop our method for fully-connected as well as convolutional layers.
1 code implementation • 13 Feb 2019 • Kevin Roth, Yannic Kilcher, Thomas Hofmann
We investigate conditions under which test statistics exist that can reliably detect examples that have been adversarially manipulated in a white-box attack.
1 code implementation • NeurIPS 2019 • Paulina Grnarova, Kfir Y. Levy, Aurelien Lucchi, Nathanael Perraudin, Ian Goodfellow, Thomas Hofmann, Andreas Krause
Evaluations are essential for: (i) relative assessment of different models and (ii) monitoring the progress of a single model throughout training.
no code implementations • WS 2018 • Valentin Trifonov, Octavian-Eugen Ganea, Anna Potapenko, Thomas Hofmann
Previous research on word embeddings has shown that sparse representations, which can either be learned on top of existing dense embeddings or obtained through model constraints at training time, offer increased interpretability: to some degree, each dimension can be understood by a human and associated with a recognizable feature in the data.
1 code implementation • CONLL 2018 • Nikolaos Kolitsas, Octavian-Eugen Ganea, Thomas Hofmann
Entity Linking (EL) is an essential task for semantic text understanding and information extraction.
Ranked #1 on Entity Linking on OKE-2015
no code implementations • 23 Jul 2018 • Janis Fluri, Tomasz Kacprzak, Aurelien Lucchi, Alexandre Refregier, Adam Amara, Thomas Hofmann
We find that, for a shape noise level corresponding to 8.53 galaxies/arcmin$^2$ and the smoothing scale of $\sigma_s = 2.34$ arcmin, the network is able to generate 45% tighter constraints.
Cosmology and Nongalactic Astrophysics
no code implementations • ICML 2018 • Celestine Dünner, Aurelien Lucchi, Matilde Gargiani, An Bian, Thomas Hofmann, Martin Jaggi
Due to the rapid growth of data and computational resources, distributed optimization has become an active research area in recent years.
no code implementations • NeurIPS 2018 • Florian Schmidt, Thomas Hofmann
Autoregressive feedback is considered a necessity for successful unconditional text generation using stochastic sequence models.
no code implementations • 27 May 2018 • Jonas Kohler, Hadi Daneshmand, Aurelien Lucchi, Ming Zhou, Klaus Neymeyr, Thomas Hofmann
Normalization techniques such as Batch Normalization have been applied successfully for training deep neural networks.
1 code implementation • 25 May 2018 • Lierni Sestorain, Massimiliano Ciaramita, Christian Buck, Thomas Hofmann
Our method also obtains improvements in the setting where a small amount of parallel data for the zero-shot language pair is available.
3 code implementations • NeurIPS 2018 • Octavian-Eugen Ganea, Gary Bécigneul, Thomas Hofmann
However, the representational power of hyperbolic geometry is not yet on par with Euclidean geometry, mostly because of the absence of corresponding hyperbolic neural network layers.
no code implementations • 22 May 2018 • Kevin Roth, Aurelien Lucchi, Sebastian Nowozin, Thomas Hofmann
We propose a novel data-dependent structured gradient regularizer to increase the robustness of neural networks vis-a-vis adversarial perturbations.
1 code implementation • 15 May 2018 • Leonard Adolphs, Hadi Daneshmand, Aurelien Lucchi, Thomas Hofmann
Gradient-based optimization methods are the most popular choice for finding local optima for classical minimization and saddle point problems.
3 code implementations • ICML 2018 • Octavian-Eugen Ganea, Gary Bécigneul, Thomas Hofmann
Learning graph representations via low-dimensional embeddings that preserve relevant network properties is an important class of problems in machine learning.
Ranked #1 on Link Prediction on WordNet
no code implementations • ICML 2018 • Hadi Daneshmand, Jonas Kohler, Aurelien Lucchi, Thomas Hofmann
We analyze the variance of stochastic gradients along negative curvature directions in certain non-convex machine learning models and show that stochastic gradients exhibit a strong component along these directions.
no code implementations • 27 Jan 2018 • Andres C. Rodriguez, Tomasz Kacprzak, Aurelien Lucchi, Adam Amara, Raphael Sgier, Janis Fluri, Thomas Hofmann, Alexandre Réfrégier
Computational models of the underlying physical processes, such as classical N-body simulations, are extremely resource intensive, as they track the action of gravity in an expanding universe using billions of particles as tracers of the cosmic matter distribution.
no code implementations • 15 Nov 2017 • Yannic Kilcher, Thomas Hofmann
Black-Box attacks on machine learning models occur when an attacker, despite having no access to the inner workings of a model, can successfully craft an attack by means of model theft.
no code implementations • ICLR 2018 • Yannic Kilcher, Gary Becigneul, Thomas Hofmann
It is commonly agreed that the use of relevant invariances as a statistical bias is important in machine learning.
no code implementations • ICLR 2018 • Yannic Kilcher, Aurelien Lucchi, Thomas Hofmann
In implicit models, one often interpolates between sampled points in latent space.
no code implementations • ICLR 2018 • Yannic Kilcher, Aurelien Lucchi, Thomas Hofmann
We consider the problem of training generative models with deep neural networks as generators, i.e. to map latent codes to data points.
no code implementations • 28 Jul 2017 • Yannic Kilcher, Aurélien Lucchi, Thomas Hofmann
We consider the problem of training generative models with deep neural networks as generators, i.e. to map latent codes to data points.
2 code implementations • 21 Jul 2017 • Pascal Kaiser, Jan Dirk Wegner, Aurelien Lucchi, Martin Jaggi, Thomas Hofmann, Konrad Schindler
We adapt a state-of-the-art CNN architecture for semantic segmentation of buildings and roads in aerial images, and compare its performance when using different training data sets, ranging from manually labeled, pixel-accurate ground truth of the same city to automatic training data derived from OpenStreetMap data from distant locations.
no code implementations • 17 Jul 2017 • Jorit Schmelzle, Aurelien Lucchi, Tomasz Kacprzak, Adam Amara, Raphael Sgier, Alexandre Réfrégier, Thomas Hofmann
We find that our implementation of a DCNN outperforms the skewness and kurtosis statistics, especially for high noise levels.
no code implementations • 13 Jun 2017 • Hadi Daneshmand, Hamed Hassani, Thomas Hofmann
Gradient descent and coordinate descent are well understood in terms of their asymptotic behavior, but less so in a transient regime often used for approximations in machine learning.
1 code implementation • ICLR 2018 • Paulina Grnarova, Kfir Y. Levy, Aurelien Lucchi, Thomas Hofmann, Andreas Krause
We consider the problem of training generative models with a Generative Adversarial Network (GAN).
1 code implementation • NeurIPS 2017 • Kevin Roth, Aurelien Lucchi, Sebastian Nowozin, Thomas Hofmann
Deep generative models based on Generative Adversarial Networks (GANs) have demonstrated impressive sample quality but in order to work they require a careful choice of architecture, parameter initialization, and selection of hyper-parameters.
3 code implementations • EMNLP 2017 • Octavian-Eugen Ganea, Thomas Hofmann
We propose a novel deep learning model for joint document-level entity disambiguation, which leverages learned neural representations.
Ranked #4 on Entity Disambiguation on WNED-CWEB
1 code implementation • 7 Mar 2017 • Jan Deriu, Aurelien Lucchi, Valeria De Luca, Aliaksei Severyn, Simon Müller, Mark Cieliebak, Thomas Hofmann, Martin Jaggi
This paper presents a novel approach for multi-lingual sentiment classification in short texts.
1 code implementation • 16 Nov 2016 • Wenhu Chen, Aurelien Lucchi, Thomas Hofmann
We here propose a novel way of using such textual data by artificially generating missing visual information.
2 code implementations • TACL 2017 • Jason Lee, Kyunghyun Cho, Thomas Hofmann
We observe that on CS-EN, FI-EN and RU-EN, the quality of the multilingual character-level translation even surpasses the models specifically trained on that language pair alone, both in terms of BLEU score and human judgment.
no code implementations • 20 May 2016 • Hadi Daneshmand, Aurelien Lucchi, Thomas Hofmann
Solutions on this path are tracked such that the minimizer of the previous objective is guaranteed to be within the quadratic convergence region of the next objective to be optimized.
no code implementations • 9 Mar 2016 • Hadi Daneshmand, Aurelien Lucchi, Thomas Hofmann
For many machine learning problems, data is abundant and it may be prohibitive to make multiple passes through the full training set.
1 code implementation • 8 Sep 2015 • Octavian-Eugen Ganea, Marina Ganea, Aurelien Lucchi, Carsten Eickhoff, Thomas Hofmann
We demonstrate the accuracy of our approach on a wide range of benchmark datasets, showing that it matches, and in many cases outperforms, existing state-of-the-art methods.
no code implementations • NeurIPS 2015 • Thomas Hofmann, Aurelien Lucchi, Simon Lacoste-Julien, Brian McWilliams
As a side-product we provide a unified convergence analysis for a family of variance reduction algorithms, which we call memorization algorithms.
no code implementations • 28 Mar 2015 • Aurelien Lucchi, Brian McWilliams, Thomas Hofmann
Quasi-Newton methods are widely used in practice for convex loss minimization problems.
no code implementations • NeurIPS 2014 • Martin Jaggi, Virginia Smith, Martin Takáč, Jonathan Terhorst, Sanjay Krishnan, Thomas Hofmann, Michael I. Jordan
Communication remains the most significant bottleneck in the performance of distributed optimization algorithms for large-scale machine learning.
3 code implementations • 23 Jan 2013 • Thomas Hofmann
Probabilistic Latent Semantic Analysis is a novel statistical technique for the analysis of two-mode and co-occurrence data, which has applications in information retrieval and filtering, natural language processing, machine learning from text, and in related areas.
no code implementations • Advances in Neural Information Processing Systems 2002 • Stuart Andrews, Ioannis Tsochantaridis, Thomas Hofmann
This paper presents two new formulations of multiple-instance learning as a maximum margin problem.