Search Results for author: Richard Nock

Found 61 papers, 12 papers with code

On Modulating the Gradient for Meta-Learning

1 code implementation ECCV 2020 Christian Simon, Piotr Koniusz, Richard Nock, Mehrtash Harandi

Inspired by optimization techniques, we propose a novel meta-learning algorithm with gradient modulation to encourage fast-adaptation of neural networks in the absence of abundant data.

Meta-Learning

Boosting gets full Attention for Relational Learning

no code implementations 22 Feb 2024 Mathieu Guillame-Bert, Richard Nock

Second, what has been learned propagates back bottom-up via attention and aggregation mechanisms, progressively crafting new features that ultimately complete the set of observation features over which a single tree is learned; boosting's iteration clock is then incremented and new class residuals are computed.

Relational Reasoning

Tempered Calculus for ML: Application to Hyperbolic Model Embedding

no code implementations 6 Feb 2024 Richard Nock, Ehsan Amid, Frank Nielsen, Alexander Soen, Manfred K. Warmuth

Most mathematical distortions used in ML are fundamentally integral in nature: $f$-divergences, Bregman divergences, (regularized) optimal transport distances, integral probability metrics, geodesic distances, etc.

The Tempered Hilbert Simplex Distance and Its Application To Non-linear Embeddings of TEMs

no code implementations 22 Nov 2023 Ehsan Amid, Frank Nielsen, Richard Nock, Manfred K. Warmuth

Tempered Exponential Measures (TEMs) are a parametric generalization of the exponential family of distributions maximizing the tempered entropy function among positive measures subject to a probability normalization of their power densities.

Optimal Transport with Tempered Exponential Measures

no code implementations 7 Sep 2023 Ehsan Amid, Frank Nielsen, Richard Nock, Manfred K. Warmuth

In the field of optimal transport, two prominent subfields face each other: (i) unregularized optimal transport, "à la Kantorovich", which leads to extremely sparse plans but with algorithms that scale poorly, and (ii) entropic-regularized optimal transport, "à la Sinkhorn-Cuturi", which gets near-linear approximation algorithms but leads to maximally un-sparse plans.
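
As a concrete point of reference for the entropic-regularized side of this contrast, below is a minimal plain Sinkhorn iteration in NumPy; it is a sketch of the standard Sinkhorn-Cuturi scheme, not of the tempered variant the paper introduces, and the cost matrix, marginals and regularization strength are illustrative assumptions.

```python
import numpy as np

# Toy marginals and ground cost (illustrative only).
rng = np.random.default_rng(0)
a = np.ones(5) / 5                   # source marginal
b = np.ones(7) / 7                   # target marginal
C = rng.random((5, 7))               # ground cost matrix

eps = 0.05                           # entropic regularization strength
K = np.exp(-C / eps)                 # Gibbs kernel

# Sinkhorn iterations: alternate scalings so the plan matches both marginals.
u = np.ones_like(a)
for _ in range(500):
    v = b / (K.T @ u)
    u = a / (K @ v)

P = u[:, None] * K * v[None, :]      # entropic-regularized transport plan (dense)
print(P.sum(axis=1), P.sum(axis=0))  # ~a and ~b
print("regularized transport cost:", (P * C).sum())
```

Note that the resulting plan is dense ("maximally un-sparse"), in contrast with the sparse vertex solutions of the unregularized Kantorovich problem.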

Generative Forests

no code implementations 7 Aug 2023 Richard Nock, Mathieu Guillame-Bert

Tabular data represents one of the most prevalent forms of data.

Imputation Text Generation

Smoothly Giving up: Robustness for Simple Models

no code implementations 17 Feb 2023 Tyler Sypherd, Nathan Stromberg, Richard Nock, Visar Berisha, Lalitha Sankar

There is a growing need for models that are interpretable and have reduced energy and computational cost (e.g., in health care analytics and federated learning).

Federated Learning regression

LegendreTron: Uprising Proper Multiclass Loss Learning

no code implementations 27 Jan 2023 Kevin Lam, Christian Walder, Spiridon Penev, Richard Nock

Existing methods do this by fitting an inverse canonical link function which monotonically maps $\mathbb{R}$ to $[0, 1]$ to estimate probabilities for binary problems.
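
For the binary case described above, the textbook instance of such an inverse canonical link is the sigmoid, which monotonically maps a real-valued score to a probability in $(0, 1)$. The snippet below only illustrates that fixed mapping (the logistic-loss/sigmoid pairing is an assumption about the simplest case, not the learned links of LegendreTron):

```python
import numpy as np

def sigmoid(z):
    # Inverse canonical link of the logistic (log) loss: maps R to (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

scores = np.array([-3.0, -0.5, 0.0, 0.5, 3.0])  # real-valued model outputs
probs = sigmoid(scores)                          # class-1 probability estimates
print(probs)  # monotone in the score and bounded in (0, 1)
```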

Clustering above Exponential Families with Tempered Exponential Measures

no code implementations 4 Nov 2022 Ehsan Amid, Richard Nock, Manfred Warmuth

The link with exponential families has allowed $k$-means clustering to be generalized to a wide variety of data generating distributions in exponential families and clustering distortions among Bregman divergences.

Clustering
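
A minimal sketch of the Bregman hard-clustering scheme that this line of work builds on, here with the (generalized) KL divergence on positive vectors; the data, divergence choice and guards are illustrative assumptions, not the tempered construction of the paper.

```python
import numpy as np

def gen_kl(x, y):
    # Bregman divergence generated by the negative Shannon entropy (I-divergence).
    return np.sum(x * np.log(x / y) - x + y, axis=-1)

def bregman_kmeans(X, k, iters=50, seed=0):
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        # Assign each point to the center with the smallest Bregman divergence.
        d = np.stack([gen_kl(X, c) for c in centers], axis=1)
        labels = d.argmin(axis=1)
        # The minimizer of any Bregman divergence over a cluster is its mean.
        centers = np.stack([X[labels == j].mean(axis=0) if np.any(labels == j)
                            else centers[j] for j in range(k)])
    return labels, centers

# Toy data: rows are strictly positive (near-)probability vectors.
rng = np.random.default_rng(1)
X = rng.dirichlet([2.0, 2.0, 2.0], size=60) + 1e-6
labels, centers = bregman_kmeans(X, k=3)
print(labels[:10], centers.round(3))
```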

What killed the Convex Booster ?

no code implementations 19 May 2022 Yishay Mansour, Richard Nock, Robert C. Williamson

A landmark negative result of Long and Servedio established a worst-case spectacular failure of a supervised learning trio (loss, algorithm, model) otherwise praised for its high precision machinery.

Fair Wrapping for Black-box Predictions

1 code implementation 31 Jan 2022 Alexander Soen, Ibrahim Alabdulmohsin, Sanmi Koyejo, Yishay Mansour, Nyalleng Moorosi, Richard Nock, Ke Sun, Lexing Xie

We introduce a new family of techniques to post-process ("wrap") a black-box classifier in order to reduce its bias.

Fairness

Generative Trees: Adversarial and Copycat

no code implementations 26 Jan 2022 Richard Nock, Mathieu Guillame-Bert

While Generative Adversarial Networks (GANs) achieve spectacular results on unstructured data like images, there is still a gap on tabular data, for which state-of-the-art supervised learning still largely favours decision tree (DT)-based models.

Imputation

Manifold Learning Benefits GANs

no code implementations CVPR 2022 Yao Ni, Piotr Koniusz, Richard Hartley, Richard Nock

In our design, the manifold learning and coding steps are intertwined with layers of the discriminator, with the goal of attracting intermediate feature representations onto manifolds.

Denoising

Being Properly Improper

no code implementations 18 Jun 2021 Tyler Sypherd, Richard Nock, Lalitha Sankar

Hence, optimizing a proper loss function on twisted data could perilously lead the learning algorithm towards the twisted posterior, rather than to the desired clean posterior.

Fair Densities via Boosting the Sufficient Statistics of Exponential Families

1 code implementation 1 Dec 2020 Alexander Soen, Hisham Husain, Richard Nock

Furthermore, when the weak learners are specified to be decision trees, the sufficient statistics of the learned distribution can be examined to provide clues on sources of (un)fairness.

Fairness

All your loss are belong to Bayes

1 code implementation NeurIPS 2020 Christian Walder, Richard Nock

Loss functions are a cornerstone of machine learning and the starting point of most algorithms.

Gaussian Processes

Cumulant-free closed-form formulas for some common (dis)similarities between densities of an exponential family

no code implementations 5 Mar 2020 Frank Nielsen, Richard Nock

It is well-known that the Bhattacharyya, Hellinger, Kullback-Leibler, $\alpha$-divergences, and Jeffreys' divergences between densities belonging to a same exponential family have generic closed-form formulas relying on the strictly convex and real-analytic cumulant function characterizing the exponential family.
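
To make that well-known closed form concrete (before the cumulant-free alternatives the paper develops): for two members of the same exponential family with natural parameters $\theta_1, \theta_2$ and cumulant $F$, one has $\mathrm{KL}(p_{\theta_1} \| p_{\theta_2}) = F(\theta_2) - F(\theta_1) - (\theta_2 - \theta_1)^\top \nabla F(\theta_1)$. The check below uses the Bernoulli family, an illustrative choice.

```python
import numpy as np

# Bernoulli as an exponential family: natural parameter theta = log(p / (1 - p)),
# cumulant F(theta) = log(1 + exp(theta)), gradient F'(theta) = sigmoid(theta) = p.
F = lambda t: np.log1p(np.exp(t))
dF = lambda t: 1.0 / (1.0 + np.exp(-t))

def kl_via_cumulant(p1, p2):
    t1, t2 = np.log(p1 / (1 - p1)), np.log(p2 / (1 - p2))
    # KL(p_theta1 || p_theta2) equals the Bregman divergence B_F(theta2 : theta1).
    return F(t2) - F(t1) - (t2 - t1) * dF(t1)

def kl_direct(p1, p2):
    return p1 * np.log(p1 / p2) + (1 - p1) * np.log((1 - p1) / (1 - p2))

print(kl_via_cumulant(0.3, 0.7), kl_direct(0.3, 0.7))  # identical values
```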

Generalised Lipschitz Regularisation Equals Distributional Robustness

no code implementations 11 Feb 2020 Zac Cranko, Zhan Shi, Xinhua Zhang, Richard Nock, Simon Kornblith

The problem of adversarial examples has highlighted the need for a theory of regularisation that is general enough to apply to exotic function classes, such as universal approximators.

Supervised Learning: No Loss No Cry

no code implementations ICML 2020 Richard Nock, Aditya Krishna Menon

In detail, we cast {\sc SLIsotron} as learning a loss from a family of composite square losses.

Boosted and Differentially Private Ensembles of Decision Trees

no code implementations 26 Jan 2020 Richard Nock, Wilko Henecka

To address this, we craft a new parameterized proper loss, called the M$\alpha$-loss, which, as we show, allows finely tuning the tradeoff in the complete spectrum of sensitivity vs boosting guarantees.

Disentangled behavioural representations

1 code implementation NeurIPS 2019 Amir Dezfouli, Hassan Ashtiani, Omar Ghattas, Richard Nock, Peter Dayan, Cheng Soon Ong

Individual characteristics in human decision-making are often quantified by fitting a parametric cognitive model to subjects' behavior and then studying differences between them in the associated parameter space.

Decision Making

A Primal-Dual link between GANs and Autoencoders

no code implementations NeurIPS 2019 Hisham Husain, Richard Nock, Robert C. Williamson

First, we find that the $f$-GAN and WAE objectives partake in a primal-dual relationship and are equivalent under some assumptions, which then allows us to explicate the success of WAE.

Generalization Bounds

Certifying Distributional Robustness using Lipschitz Regularisation

no code implementations 25 Sep 2019 Zac Cranko, Zhan Shi, Xinhua Zhang, Simon Kornblith, Richard Nock

Distributional robust risk (DRR) minimisation has arisen as a flexible and effective framework for machine learning.

Proper-Composite Loss Functions in Arbitrary Dimensions

no code implementations 19 Feb 2019 Zac Cranko, Robert C. Williamson, Richard Nock

The study of a machine learning problem is in many ways difficult to separate from the study of the loss function being used.

Density Estimation

Adversarial Networks and Autoencoders: The Primal-Dual Relationship and Generalization Bounds

no code implementations 3 Feb 2019 Hisham Husain, Richard Nock, Robert C. Williamson

First, we find that the $f$-GAN and WAE objectives partake in a primal-dual relationship and are equivalent under some assumptions, which then allows us to explicate the success of WAE.

Generalization Bounds

New Tricks for Estimating Gradients of Expectations

no code implementations 31 Jan 2019 Christian J. Walder, Paul Roussel, Richard Nock, Cheng Soon Ong, Masashi Sugiyama

We introduce a family of pairwise stochastic gradient estimators for gradients of expectations, which are related to the log-derivative trick, but involve pairwise interactions between samples.

Representation Learning of Compositional Data

2 code implementations NeurIPS 2018 Marta Avalos, Richard Nock, Cheng Soon Ong, Julien Rouar, Ke Sun

Our approach combines the benefits of the log-ratio transformation from compositional data analysis and exponential family PCA.

Representation Learning
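
A minimal sketch of the log-ratio half of that combination, using the centered log-ratio (CLR) transform from compositional data analysis; the exponential family PCA step is omitted and the small pseudo-count for zeros is an assumption.

```python
import numpy as np

def clr(X, pseudo_count=1e-6):
    # Centered log-ratio transform: log of each part over the row's geometric mean,
    # mapping compositions (rows summing to 1) to an unconstrained Euclidean space.
    X = X + pseudo_count
    logX = np.log(X)
    return logX - logX.mean(axis=1, keepdims=True)

# Toy compositions, e.g. relative abundances of four parts.
X = np.array([[0.10, 0.20, 0.30, 0.40],
              [0.25, 0.25, 0.25, 0.25]])
Z = clr(X)
print(Z, Z.sum(axis=1))  # each transformed row sums to ~0
```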

The Bregman chord divergence

no code implementations 22 Oct 2018 Frank Nielsen, Richard Nock

Distances are fundamental primitives whose choice significantly impacts the performances of algorithms in machine learning and signal processing.

Lipschitz Networks and Distributional Robustness

no code implementations 4 Sep 2018 Zac Cranko, Simon Kornblith, Zhan Shi, Richard Nock

Robust risk minimisation has several advantages: it has been studied with regards to improving the generalisation properties of models and robustness to adversarial perturbation.

Hyperparameter Learning for Conditional Kernel Mean Embeddings with Rademacher Complexity Bounds

1 code implementation 1 Sep 2018 Kelvin Hsu, Richard Nock, Fabio Ramos

Conditional kernel mean embeddings are nonparametric models that encode conditional expectations in a reproducing kernel Hilbert space.
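
A minimal numerical sketch of the estimator behind such models, under the standard formulation in which the embedding of $Y \mid X = x$ weights the training outputs by $(K + n\lambda I)^{-1} k(x)$; the RBF kernel, bandwidth and regularizer below are illustrative assumptions rather than the hyperparameters the paper learns.

```python
import numpy as np

def rbf(A, B, gamma=5.0):
    # Gaussian RBF kernel matrix between the rows of A and the rows of B.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 1))
Y = np.sin(3 * X[:, 0]) + 0.1 * rng.standard_normal(200)

lam = 1e-3
K = rbf(X, X)
w = np.linalg.solve(K + len(X) * lam * np.eye(len(X)), rbf(X, np.array([[0.5]])))

# The weights w encode the conditional mean embedding of Y given X = 0.5:
# E[g(Y) | X = 0.5] is approximated by sum_i w_i g(Y_i) for functions g in the RKHS.
print(float(w[:, 0] @ Y), np.sin(3 * 0.5))  # estimated vs. noise-free conditional mean
```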

Variational Network Inference: Strong and Stable with Concrete Support

no code implementations ICML 2018 Amir Dezfouli, Edwin Bonilla, Richard Nock

Traditional methods for the discovery of latent network structures are limited in two ways: they either assume that all the signal comes from the network (i.e., there is no source of signal outside the network) or they place constraints on the network parameters to ensure model or algorithmic stability.

Integral Privacy for Sampling

1 code implementation 13 Jun 2018 Hisham Husain, Zac Cranko, Richard Nock

Privacy enforces an information theoretic barrier on approximation, and we show how to reach this barrier with guarantees on the approximation of the target non private density.

Density Estimation Fairness

Monge blunts Bayes: Hardness Results for Adversarial Training

no code implementations 8 Jun 2018 Zac Cranko, Aditya Krishna Menon, Richard Nock, Cheng Soon Ong, Zhan Shi, Christian Walder

A key feature of our result is that it holds for all proper losses, and for a popular subset of these, the optimisation of this central measure appears to be independent of the loss.

Boosted Density Estimation Remastered

no code implementations 22 Mar 2018 Zac Cranko, Richard Nock

There has recently been a steady increase in the number of iterative approaches to density estimation.

Density Estimation Generative Adversarial Network

Entity Resolution and Federated Learning get a Federated Resolution

no code implementations 11 Mar 2018 Richard Nock, Stephen Hardy, Wilko Henecka, Hamish Ivey-Law, Giorgio Patrini, Guillaume Smith, Brian Thorne

In our experiments, we modify a simple token-based entity resolution algorithm so that it indeed aims to avoid matching rows belonging to different classes, and we evaluate it in the setting where entity resolution relies on noisy data, which is very relevant to real-world domains.

Entity Resolution Federated Learning +1

Private federated learning on vertically partitioned data via entity resolution and additively homomorphic encryption

no code implementations 29 Nov 2017 Stephen Hardy, Wilko Henecka, Hamish Ivey-Law, Richard Nock, Giorgio Patrini, Guillaume Smith, Brian Thorne

Our results bring a clear and strong support for federated learning: under reasonable assumptions on the number and magnitude of entity resolution's mistakes, it can be extremely beneficial to carry out federated learning in the setting where each peer's data provides a significant uplift to the other.

Entity Resolution Federated Learning +1

On $w$-mixtures: Finite convex combinations of prescribed component distributions

no code implementations 2 Aug 2017 Frank Nielsen, Richard Nock

The information geometry induced by the Bregman generator set to the Shannon negentropy on this space yields a dually flat space called the mixture family manifold.

f-GANs in an Information Geometric Nutshell

1 code implementation NeurIPS 2017 Richard Nock, Zac Cranko, Aditya Krishna Menon, Lizhen Qu, Robert C. Williamson

In this paper, we unveil a broad class of distributions for which such convergence happens --- namely, deformed exponential families, a wide superset of exponential families --- and show tight connections with the three other key GAN parameters: loss, game and architecture.

Evolving a Vector Space with any Generating Set

no code implementations 10 Apr 2017 Richard Nock, Frank Nielsen

In Valiant's model of evolution, a class of representations is evolvable iff a polynomial-time process of random mutations guided by selection converges with high probability to a representation as $\epsilon$-close as desired from the optimal one, for any required $\epsilon>0$.

Semi-parametric Network Structure Discovery Models

no code implementations 27 Feb 2017 Amir Dezfouli, Edwin V. Bonilla, Richard Nock

We propose a network structure discovery model for continuous observations that generalizes linear causal models by incorporating a Gaussian process (GP) prior on a network-independent component, and random sparsity and weight matrices as the network-dependent parameters.

Uncertainty Quantification Variational Inference

A series of maximum entropy upper bounds of the differential entropy

no code implementations 9 Dec 2016 Frank Nielsen, Richard Nock

We present a series of closed-form maximum entropy upper bounds for the differential entropy of a continuous univariate random variable and study the properties of that series.

BIG-bench Machine Learning
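
The best-known member of such a series is the Gaussian bound: among all densities with variance $\sigma^2$, the Gaussian maximizes differential entropy, so $h(X) \le \frac{1}{2}\log(2\pi e \sigma^2)$. A quick numerical check on a uniform variable, chosen purely for illustration:

```python
import numpy as np

def gaussian_entropy_bound(var):
    # Maximum-entropy upper bound on differential entropy given the variance.
    return 0.5 * np.log(2 * np.pi * np.e * var)

# Uniform on [0, 1]: differential entropy is exactly 0 nats, variance is 1/12.
h_uniform = 0.0
print(h_uniform, "<=", gaussian_entropy_bound(1.0 / 12.0))  # 0 <= ~0.176
```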

On Regularizing Rademacher Observation Losses

no code implementations NeurIPS 2016 Richard Nock

It has recently been shown that supervised learning linear classifiers with two of the most popular losses, the logistic and square loss, is equivalent to optimizing an equivalent loss over sufficient statistics about the class: Rademacher observations (rados).

Entity Resolution
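
As a rough sketch of the object in question, assuming the rado definition used in this line of work, $\pi_\sigma = \frac{1}{2}\sum_i (\sigma_i + y_i) x_i$ for a sign vector $\sigma$ (so only examples with $\sigma_i = y_i$ contribute, and $\sigma = y$ recovers the mean operator $\sum_i y_i x_i$):

```python
import numpy as np

def rado(X, y, sigma):
    # Rademacher observation for sign vector sigma (assumed definition):
    # pi_sigma = 0.5 * sum_i (sigma_i + y_i) * x_i; terms with sigma_i != y_i vanish.
    return 0.5 * ((sigma + y)[:, None] * X).sum(axis=0)

rng = np.random.default_rng(0)
X = rng.standard_normal((6, 3))
y = np.array([1.0, -1.0, 1.0, 1.0, -1.0, -1.0])

print(rado(X, y, sigma=y))                           # equals the mean operator sum_i y_i x_i
print(rado(X, y, sigma=rng.choice([-1.0, 1.0], 6)))  # a random rado
```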

Large Margin Nearest Neighbor Classification using Curved Mahalanobis Distances

no code implementations 22 Sep 2016 Frank Nielsen, Boris Muzellec, Richard Nock

We consider the supervised classification problem of machine learning in Cayley-Klein projective geometries: We show how to learn a curved Mahalanobis metric distance corresponding to either the hyperbolic geometry or the elliptic geometry using the Large Margin Nearest Neighbor (LMNN) framework.

BIG-bench Machine Learning Classification +1

Tsallis Regularized Optimal Transport and Ecological Inference

1 code implementation 15 Sep 2016 Boris Muzellec, Richard Nock, Giorgio Patrini, Frank Nielsen

We also present the first application of optimal transport to the problem of ecological inference, that is, the reconstruction of joint distributions from their marginals, a problem of large interest in the social sciences.

Making Deep Neural Networks Robust to Label Noise: a Loss Correction Approach

2 code implementations CVPR 2017 Giorgio Patrini, Alessandro Rozza, Aditya Menon, Richard Nock, Lizhen Qu

We present a theoretically grounded approach to train deep neural networks, including recurrent networks, subject to class-dependent label noise.

Ranked #2 on Image Classification on Clothing1M (using clean data) (using extra training data)

Learning with noisy labels Noise Estimation
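
A minimal sketch of the forward-correction idea, one of the two corrections proposed in this approach: with a known (or estimated) noise-transition matrix $T$, the model's clean-class probabilities are pushed through $T$ before the negative log-likelihood against the noisy label is computed. The toy, network-free setup below is purely illustrative.

```python
import numpy as np

def forward_corrected_nll(probs_clean, noisy_label, T):
    # T[i, j] = P(noisy label = j | true label = i).
    # Predicted distribution over *noisy* labels, then the usual negative log-likelihood.
    probs_noisy = T.T @ probs_clean
    return -np.log(probs_noisy[noisy_label])

# Toy 3-class example: the model is fairly sure the clean class is 0,
# and the noise process flips class 0 to class 1 with probability 0.3.
probs_clean = np.array([0.80, 0.15, 0.05])
T = np.array([[0.7, 0.3, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])

print(forward_corrected_nll(probs_clean, noisy_label=1, T=T))  # noise-aware loss
print(-np.log(probs_clean[1]))                                 # uncorrected loss is larger
```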

A scaled Bregman theorem with applications

no code implementations NeurIPS 2016 Richard Nock, Aditya Krishna Menon, Cheng Soon Ong

Experiments on each of these domains validate the analyses and suggest that the scaled Bregman theorem might be a worthy addition to the popular handful of Bregman divergence properties that have been pervasive in machine learning.

BIG-bench Machine Learning Clustering

Fast $(1+ε)$-approximation of the Löwner extremal matrices of high-dimensional symmetric matrices

no code implementations 6 Apr 2016 Frank Nielsen, Richard Nock

Matrix data sets are common nowadays, for instance in biomedical imaging, where the Diffusion Tensor Magnetic Resonance Imaging (DT-MRI) modality produces data sets of 3D symmetric positive definite matrices anchored at voxel positions, capturing the anisotropic diffusion properties of water molecules in biological tissues.

Clustering

Fast Learning from Distributed Datasets without Entity Matching

no code implementations 13 Mar 2016 Giorgio Patrini, Richard Nock, Stephen Hardy, Tiberio Caetano

Our goal is to learn a classifier in the cross product space of the two domains, in the hard case in which no shared ID is available -- e.g., due to anonymization.

Entity Resolution

Loss factorization, weakly supervised learning and label noise robustness

no code implementations 8 Feb 2016 Giorgio Patrini, Frank Nielsen, Richard Nock, Marcello Carioni

We prove that the empirical risk of most well-known loss functions factors into a linear term aggregating all labels with a term that is label free, and can further be expressed by sums of the loss.

Generalization Bounds Weakly-supervised Learning
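
A quick numerical check of this factorization for the logistic loss: since $\log(1+e^{-z}) = \frac{1}{2}[\log(1+e^{-z}) + \log(1+e^{z})] - \frac{z}{2}$, the empirical risk splits into a label-free even part plus a linear term in the mean operator $\mu = \frac{1}{m}\sum_i y_i x_i$. The synthetic data below is purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
m, d = 50, 4
X = rng.standard_normal((m, d))
y = rng.choice([-1.0, 1.0], m)
w = rng.standard_normal(d)

z = X @ w                                    # label-free scores w . x_i
risk = np.mean(np.log1p(np.exp(-y * z)))     # logistic empirical risk

# Factorization: label-free even part + linear term aggregating all labels.
label_free = 0.5 * np.mean(np.log1p(np.exp(-z)) + np.log1p(np.exp(z)))
mean_operator = (y[:, None] * X).mean(axis=0)        # (1/m) * sum_i y_i x_i
factored = label_free - 0.5 * w @ mean_operator

print(risk, factored)  # identical up to floating-point error
```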

k-variates++: more pluses in the k-means++

no code implementations 3 Feb 2016 Richard Nock, Raphaël Canyasse, Roksana Boreli, Frank Nielsen

For either of the specific frameworks considered here, or for the differential privacy setting, there are little to no prior results on the direct application of k-means++ and its approximation bounds --- state-of-the-art contenders appear to be significantly more complex and/or display less favorable (approximation) properties.

Clustering

Learning Games and Rademacher Observations Losses

no code implementations 16 Dec 2015 Richard Nock

We first show that this unexpected equivalence can actually be generalized to other example/rado losses, with necessary and sufficient conditions for the equivalence, exemplified on five losses that bear popular names in various fields: exponential (boosting), mean-variance (finance), Linear Hinge (on-line learning), ReLU (deep learning), and unhinged (statistics).

Rademacher Observations, Private Data, and Boosting

no code implementations 9 Feb 2015 Richard Nock, Giorgio Patrini, Arik Friedman

We show that rados comply with various privacy requirements that make them good candidates for machine learning in a privacy framework.

(Almost) No Label No Cry

no code implementations NeurIPS 2014 Giorgio Patrini, Richard Nock, Paul Rivera, Tiberio Caetano

In Learning with Label Proportions (LLP), the objective is to learn a supervised classifier when, instead of labels, only label proportions for bags of observations are known.

Generalization Bounds Privacy Preserving +1

Further heuristics for $k$-means: The merge-and-split heuristic and the $(k,l)$-means

no code implementations 23 Jun 2014 Frank Nielsen, Richard Nock

This novel heuristic can improve Hartigan's $k$-means when it has converged to a local minimum.

Clustering

Optimal interval clustering: Application to Bregman clustering and statistical mixture learning

no code implementations 11 Mar 2014 Frank Nielsen, Richard Nock

We present a generic dynamic programming method to compute the optimal clustering of $n$ scalar elements into $k$ pairwise disjoint intervals.

Clustering Model Selection
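
A minimal sketch of that dynamic program for the classical squared-error case (exact 1D $k$-means on sorted scalars), the simplest instance of the generic Bregman setting the paper covers; the $O(n^2 k)$ recursion below trades speed for clarity.

```python
import numpy as np

def optimal_interval_clustering(values, k):
    # Exact clustering of sorted scalars into k contiguous intervals,
    # minimizing the within-cluster sum of squared deviations.
    x = np.sort(np.asarray(values, dtype=float))
    n = len(x)
    s1 = np.concatenate([[0.0], np.cumsum(x)])        # prefix sums
    s2 = np.concatenate([[0.0], np.cumsum(x * x)])    # prefix sums of squares

    def cost(i, j):
        # Squared-error cost of the interval x[i..j], inclusive, 0-indexed.
        s = s1[j + 1] - s1[i]
        return (s2[j + 1] - s2[i]) - s * s / (j - i + 1)

    INF = float("inf")
    D = np.full((k + 1, n + 1), INF)   # D[m, j]: best cost of the first j points in m intervals
    D[0, 0] = 0.0
    for m in range(1, k + 1):
        for j in range(m, n + 1):
            D[m, j] = min(D[m - 1, i] + cost(i, j - 1) for i in range(m - 1, j))
    return D[k, n]

print(optimal_interval_clustering([1, 2, 2, 9, 10, 20, 21], k=3))  # three obvious groups
```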

On the Efficient Minimization of Classification Calibrated Surrogates

no code implementations NeurIPS 2008 Richard Nock, Frank Nielsen

Bartlett et al. (2006) recently proved that a ground condition for convex surrogates, classification calibration, ties together the minimization of the surrogate and classification risks, and left the algorithmic questions about the minimization of these surrogates as an important open problem.

Classification General Classification
