Search Results for author: Adrian Weller

Found 72 papers, 23 papers with code

Diverse, Global and Amortised Counterfactual Explanations for Uncertainty Estimates

no code implementations 5 Dec 2021 Dan Ley, Umang Bhatt, Adrian Weller

To interpret uncertainty estimates from differentiable probabilistic models, recent work has proposed generating a single Counterfactual Latent Uncertainty Explanation (CLUE) for a given data point where the model is uncertain, identifying a single, on-manifold change to the input such that the model becomes more certain in its prediction.

Exploring Alignment of Representations with Human Perception

no code implementations 29 Nov 2021 Vedant Nanda, Ayan Majumdar, Camila Kolling, John P. Dickerson, Krishna P. Gummadi, Bradley C. Love, Adrian Weller

We argue that a valuable perspective on when a model learns good representations is that inputs that are mapped to similar representations by the model should be perceived similarly by humans.

Data Augmentation Self-Supervised Learning

Towards Principled Disentanglement for Domain Generalization

1 code implementation 27 Nov 2021 Hanlin Zhang, Yi-Fan Zhang, Weiyang Liu, Adrian Weller, Bernhard Schölkopf, Eric P. Xing

To tackle this challenge, we first formalize the OOD generalization problem as constrained optimization, called Disentanglement-constrained Domain Generalization (DDG).

Domain Generalization

PolyViT: Co-training Vision Transformers on Images, Videos and Audio

no code implementations 25 Nov 2021 Valerii Likhosherstov, Anurag Arnab, Krzysztof Choromanski, Mario Lucic, Yi Tay, Adrian Weller, Mostafa Dehghani

Can we train a single transformer model capable of processing multiple modalities and datasets, whilst sharing almost all of its learnable parameters?

Audio Classification

Iterative Teaching by Label Synthesis

no code implementations NeurIPS 2021 Weiyang Liu, Zhen Liu, Hanchen Wang, Liam Paull, Bernhard Schölkopf, Adrian Weller

In this paper, we consider the problem of iterative machine teaching, where a teacher provides examples sequentially based on the current iterative learner.

Hybrid Random Features

1 code implementation 8 Oct 2021 Krzysztof Choromanski, Haoxian Chen, Han Lin, Yuanzhe Ma, Arijit Sehanobish, Deepali Jain, Michael S Ryoo, Jake Varley, Andy Zeng, Valerii Likhosherstov, Dmitry Kalashnikov, Vikas Sindhwani, Adrian Weller

We propose a new class of random feature methods for linearizing softmax and Gaussian kernels, called hybrid random features (HRFs), that automatically adapt the quality of kernel estimation to provide the most accurate approximation in the defined regions of interest.

SphereFace Revived: Unifying Hyperspherical Face Recognition

no code implementations 12 Sep 2021 Weiyang Liu, Yandong Wen, Bhiksha Raj, Rita Singh, Adrian Weller

As one of the earliest works in hyperspherical face recognition, SphereFace explicitly proposed to learn face embeddings with large inter-class angular margin.

Face Recognition

SphereFace2: Binary Classification is All You Need for Deep Face Recognition

no code implementations 3 Aug 2021 Yandong Wen, Weiyang Liu, Adrian Weller, Bhiksha Raj, Rita Singh

In this paper, we first identify the discrepancy between training and evaluation in the existing multi-class classification framework and then discuss the potential limitations caused by the "competitive" nature of softmax normalization.

Face Recognition Multi-class Classification

DIVINE: Diverse Influential Training Points for Data Visualization and Model Refinement

1 code implementation 13 Jul 2021 Umang Bhatt, Isabel Chien, Muhammad Bilal Zafar, Adrian Weller

In this work, we take a step towards finding influential training points that also represent the training data well.

Data Visualization Fairness

On the Expressive Power of Self-Attention Matrices

no code implementations 7 Jun 2021 Valerii Likhosherstov, Krzysztof Choromanski, Adrian Weller

Our proof is constructive, enabling us to propose an algorithm for finding adaptive inputs and fixed self-attention parameters in order to approximate a given matrix.

Debiasing a First-order Heuristic for Approximate Bi-level Optimization

1 code implementation 4 Jun 2021 Valerii Likhosherstov, Xingyou Song, Krzysztof Choromanski, Jared Davis, Adrian Weller

Approximate bi-level optimization (ABLO) consists of (outer-level) optimization problems, involving numerical (inner-level) optimization loops.

Do Concept Bottleneck Models Learn as Intended?

no code implementations 10 May 2021 Andrei Margeloiu, Matthew Ashman, Umang Bhatt, Yanzhi Chen, Mateja Jamnik, Adrian Weller

Concept bottleneck models map from raw inputs to concepts, and then from concepts to targets.

CrossWalk: Fairness-enhanced Node Representation Learning

1 code implementation 6 May 2021 Ahmad Khajehnejad, Moein Khajehnejad, Mahmoudreza Babaei, Krishna P. Gummadi, Adrian Weller, Baharan Mirzasoleiman

The potential for machine learning systems to amplify social inequities and unfairness is receiving increasing popular and academic attention.

Fairness Link Prediction +2

Is Disentanglement all you need? Comparing Concept-based & Disentanglement Approaches

1 code implementation 14 Apr 2021 Dmitry Kazhdan, Botty Dimanov, Helena Andres Terre, Mateja Jamnik, Pietro Liò, Adrian Weller

Concept-based explanations have emerged as a popular way of extracting human-interpretable representations from deep discriminative models.

δ-CLUE: Diverse Sets of Explanations for Uncertainty Estimates

no code implementations 13 Apr 2021 Dan Ley, Umang Bhatt, Adrian Weller

To interpret uncertainty estimates from differentiable probabilistic models, recent work has proposed generating Counterfactual Latent Uncertainty Explanations (CLUEs).

Learning with Hyperspherical Uniformity

1 code implementation 2 Mar 2021 Weiyang Liu, Rongmei Lin, Zhen Liu, Li Xiong, Bernhard Schölkopf, Adrian Weller

Owing to their over-parameterized nature, neural networks are a powerful tool for nonlinear function approximation.

L2 Regularization

Sub-Linear Memory: How to Make Performers SLiM

2 code implementations NeurIPS 2021 Valerii Likhosherstov, Krzysztof Choromanski, Jared Davis, Xingyou Song, Adrian Weller

Recent works proposed various linear self-attention mechanisms, scaling only as $O(L)$ for serial computation.

Improving Interpretability in Medical Imaging Diagnosis using Adversarial Training

1 code implementation 2 Dec 2020 Andrei Margeloiu, Nikola Simidjievski, Mateja Jamnik, Adrian Weller

We investigate the influence of adversarial training on the interpretability of convolutional neural networks (CNNs), specifically applied to diagnosing skin cancer.

Ode to an ODE

no code implementations NeurIPS 2020 Krzysztof M. Choromanski, Jared Quincy Davis, Valerii Likhosherstov, Xingyou Song, Jean-Jacques Slotine, Jacob Varley, Honglak Lee, Adrian Weller, Vikas Sindhwani

We present a new paradigm for Neural ODE algorithms, called ODEtoODE, where time-dependent parameters of the main flow evolve according to a matrix flow on the orthogonal group O(d).

Now You See Me (CME): Concept-based Model Extraction

1 code implementation 25 Oct 2020 Dmitry Kazhdan, Botty Dimanov, Mateja Jamnik, Pietro Liò, Adrian Weller

Deep Neural Networks (DNNs) have achieved remarkable performance on a range of tasks.

Model extraction

Rethinking Attention with Performers

11 code implementations ICLR 2021 Krzysztof Choromanski, Valerii Likhosherstov, David Dohan, Xingyou Song, Andreea Gane, Tamas Sarlos, Peter Hawkins, Jared Davis, Afroz Mohiuddin, Lukasz Kaiser, David Belanger, Lucy Colwell, Adrian Weller

We introduce Performers, Transformer architectures which can estimate regular (softmax) full-rank-attention Transformers with provable accuracy, but using only linear (as opposed to quadratic) space and time complexity, without relying on any priors such as sparsity or low-rankness.

Image Generation

Machine Learning Explainability for External Stakeholders

no code implementations 10 Jul 2020 Umang Bhatt, McKane Andrus, Adrian Weller, Alice Xiang

As machine learning is increasingly deployed in high-stakes contexts affecting people's livelihoods, there have been growing calls to open the black box and to make machine learning algorithms more explainable.

Robust Inverse Reinforcement Learning under Transition Dynamics Mismatch

1 code implementation NeurIPS 2021 Luca Viano, Yu-Ting Huang, Parameswaran Kamalaruban, Adrian Weller, Volkan Cevher

We study the inverse reinforcement learning (IRL) problem under a transition dynamics mismatch between the expert and the learner.

An Ode to an ODE

no code implementations NeurIPS 2020 Krzysztof Choromanski, Jared Quincy Davis, Valerii Likhosherstov, Xingyou Song, Jean-Jacques Slotine, Jacob Varley, Honglak Lee, Adrian Weller, Vikas Sindhwani

We present a new paradigm for Neural ODE algorithms, called ODEtoODE, where time-dependent parameters of the main flow evolve according to a matrix flow on the orthogonal group O(d).

Getting a CLUE: A Method for Explaining Uncertainty Estimates

no code implementations ICLR 2021 Javier Antorán, Umang Bhatt, Tameem Adel, Adrian Weller, José Miguel Hernández-Lobato

Both uncertainty estimation and interpretability are important factors for trustworthy machine learning systems.

UFO-BLO: Unbiased First-Order Bilevel Optimization

no code implementations 5 Jun 2020 Valerii Likhosherstov, Xingyou Song, Krzysztof Choromanski, Jared Davis, Adrian Weller

Bilevel optimization (BLO) is a popular approach with many applications including hyperparameter optimization, neural architecture search, adversarial robustness and model-agnostic meta-learning.

Adversarial Robustness bilevel optimization +4

Time Dependence in Non-Autonomous Neural ODEs

no code implementations ICLR Workshop DeepDiffEq 2019 Jared Quincy Davis, Krzysztof Choromanski, Jake Varley, Honglak Lee, Jean-Jacques Slotine, Valerii Likhosherstov, Adrian Weller, Ameesh Makadia, Vikas Sindhwani

Neural Ordinary Differential Equations (ODEs) are elegant reinterpretations of deep networks where continuous time can replace the discrete notion of depth, ODE solvers perform forward propagation, and the adjoint method enables efficient, constant memory backpropagation.

Image Classification Video Prediction

Dimensions of Diversity in Human Perceptions of Algorithmic Fairness

no code implementations 2 May 2020 Nina Grgić-Hlača, Adrian Weller, Elissa M. Redmiles

Additionally, we find that people's beliefs about the fairness of using demographic features such as age, gender and race for making bail decisions about others vary egocentrically: that is, they vary depending on their own age, gender and race respectively.


Evaluating and Aggregating Feature-based Model Explanations

no code implementations 1 May 2020 Umang Bhatt, Adrian Weller, José M. F. Moura

A feature-based model explanation denotes how much each input feature contributes to a model's output for a given data point.

CWY Parametrization: a Solution for Parallelized Optimization of Orthogonal and Stiefel Matrices

no code implementations 18 Apr 2020 Valerii Likhosherstov, Jared Davis, Krzysztof Choromanski, Adrian Weller

We introduce an efficient approach for optimization over orthogonal groups on highly parallel computation units such as GPUs or TPUs.

Machine Translation Translation +1

Orthogonal Over-Parameterized Training

1 code implementation CVPR 2021 Weiyang Liu, Rongmei Lin, Zhen Liu, James M. Rehg, Liam Paull, Li Xiong, Le Song, Adrian Weller

The inductive bias of a neural network is largely determined by the architecture and the training algorithm.

Stochastic Flows and Geometric Optimization on the Orthogonal Group

no code implementations ICML 2020 Krzysztof Choromanski, David Cheikhi, Jared Davis, Valerii Likhosherstov, Achille Nazaret, Achraf Bahamou, Xingyou Song, Mrugank Akarte, Jack Parker-Holder, Jacob Bergquist, Yuan Gao, Aldo Pacchiano, Tamas Sarlos, Adrian Weller, Vikas Sindhwani

We present a new class of stochastic, geometrically-driven optimization algorithms on the orthogonal group $O(d)$ and naturally reductive homogeneous manifolds obtained from the action of the rotation group $SO(d)$.

Metric Learning Stochastic Optimization

DADI: Dynamic Discovery of Fair Information with Adversarial Reinforcement Learning

no code implementations 30 Oct 2019 Michiel A. Bakker, Duy Patrick Tu, Humberto Riverón Valdés, Krishna P. Gummadi, Kush R. Varshney, Adrian Weller, Alex Pentland

We introduce a framework for dynamic adversarial discovery of information (DADI), motivated by a scenario where information (a feature set) is used by third parties with unknown objectives.

Fairness Representation Learning

An Empirical Study on Learning Fairness Metrics for COMPAS Data with Human Supervision

1 code implementation 22 Oct 2019 Hanchen Wang, Nina Grgic-Hlaca, Preethi Lahoti, Krishna P. Gummadi, Adrian Weller

We do not provide a way to directly learn a similarity metric that satisfies individual fairness; instead, we provide an empirical study of how to derive the similarity metric from human supervisors, which future work can use as a tool to understand human supervision.

Fairness Metric Learning

Exploring Properties of the Deep Image Prior

no code implementations NeurIPS Workshop Deep_Invers 2019 Andreas Kattamis, Tameem Adel, Adrian Weller

Finally, we examine the adversarial invariancy of the early DIP outputs, and hypothesize that these outputs may remove non-robust image features.

The Sensitivity of Counterfactual Fairness to Unmeasured Confounding

1 code implementation 1 Jul 2019 Niki Kilbertus, Philip J. Ball, Matt J. Kusner, Adrian Weller, Ricardo Silva

We demonstrate our new sensitivity analysis tools in real-world fairness scenarios to assess the bias arising from confounding.


Leader Stochastic Gradient Descent for Distributed Training of Deep Learning Models: Extension

1 code implementation NeurIPS 2019 Yunfei Teng, Wenbo Gao, Francois Chalus, Anna Choromanska, Donald Goldfarb, Adrian Weller

Finally, we implement an asynchronous version of our algorithm and extend it to the multi-leader setting, where we form groups of workers, each represented by its own local leader (the best performer in a group), and update each worker with a corrective direction comprised of two attractive forces: one to the local, and one to the global leader (the best performer among all workers).

Distributed Optimization

Orthogonal Estimation of Wasserstein Distances

no code implementations 9 Mar 2019 Mark Rowland, Jiri Hron, Yunhao Tang, Krzysztof Choromanski, Tamas Sarlos, Adrian Weller

Wasserstein distances are increasingly used in a wide variety of applications in machine learning.

Self-Guided Belief Propagation -- A Homotopy Continuation Method

no code implementations 4 Dec 2018 Christian Knoll, Adrian Weller, Franz Pernkopf

Belief propagation (BP) is a popular method for performing probabilistic inference on graphical models.

Proceedings of the 2018 ICML Workshop on Human Interpretability in Machine Learning (WHI 2018)

no code implementations 3 Jul 2018 Been Kim, Kush R. Varshney, Adrian Weller

This is the Proceedings of the 2018 ICML Workshop on Human Interpretability in Machine Learning (WHI 2018), which was held in Stockholm, Sweden, July 14, 2018.

A Unified Approach to Quantifying Algorithmic Unfairness: Measuring Individual & Group Unfairness via Inequality Indices

no code implementations 2 Jul 2018 Till Speicher, Hoda Heidari, Nina Grgic-Hlaca, Krishna P. Gummadi, Adish Singla, Adrian Weller, Muhammad Bilal Zafar

Further, our work reveals overlooked tradeoffs between different fairness notions: using our proposed measures, the overall individual-level unfairness of an algorithm can be decomposed into a between-group and a within-group component.

Decision Making Fairness

Discovering Interpretable Representations for Both Deep Generative and Discriminative Models

no code implementations ICML 2018 Tameem Adel, Zoubin Ghahramani, Adrian Weller

We use a generative model which takes as input the representation in an existing (generative or discriminative) model, weakly supervised by limited side information.

Active Learning

Blind Justice: Fairness with Encrypted Sensitive Attributes

1 code implementation ICML 2018 Niki Kilbertus, Adrià Gascón, Matt J. Kusner, Michael Veale, Krishna P. Gummadi, Adrian Weller

Recent work has explored how to train machine learning models which do not discriminate against any subgroup of the population as determined by sensitive attributes such as gender or race.


Structured Evolution with Compact Architectures for Scalable Policy Optimization

no code implementations ICML 2018 Krzysztof Choromanski, Mark Rowland, Vikas Sindhwani, Richard E. Turner, Adrian Weller

We present a new method of blackbox optimization via gradient approximation with the use of structured random orthogonal matrices, providing more accurate estimators than baselines and with provable theoretical guarantees.

OpenAI Gym Text-to-Image Generation

Human Perceptions of Fairness in Algorithmic Decision Making: A Case Study of Criminal Risk Prediction

no code implementations 26 Feb 2018 Nina Grgić-Hlača, Elissa M. Redmiles, Krishna P. Gummadi, Adrian Weller

As algorithms are increasingly used to make important decisions that affect human lives, ranging from social benefit assignment to predicting risk of criminal recidivism, concerns have been raised about the fairness of algorithmic decision making.

Decision Making Fairness

Gauged Mini-Bucket Elimination for Approximate Inference

no code implementations 5 Jan 2018 Sungsoo Ahn, Michael Chertkov, Jinwoo Shin, Adrian Weller

Recently, so-called gauge transformations were used to improve variational lower bounds on $Z$.

Uprooting and Rerooting Higher-Order Graphical Models

no code implementations NeurIPS 2017 Mark Rowland, Adrian Weller

The idea of uprooting and rerooting graphical models was introduced specifically for binary pairwise models by Weller (2016) as a way to transform a model to any of a whole equivalence class of related models, such that inference on any one model yields inference results for all others.

Proceedings of the 2017 ICML Workshop on Human Interpretability in Machine Learning (WHI 2017)

no code implementations 8 Aug 2017 Been Kim, Dmitry M. Malioutov, Kush R. Varshney, Adrian Weller

This is the Proceedings of the 2017 ICML Workshop on Human Interpretability in Machine Learning (WHI 2017), which was held in Sydney, Australia, August 10, 2017.

Challenges for Transparency

no code implementations 29 Jul 2017 Adrian Weller

Transparency is often deemed critical to enable effective real-world deployment of intelligent systems.

Computers and Society

On Fairness, Diversity and Randomness in Algorithmic Decision Making

no code implementations 30 Jun 2017 Nina Grgić-Hlača, Muhammad Bilal Zafar, Krishna P. Gummadi, Adrian Weller

Consider a binary decision making process where a single machine learning classifier replaces a multitude of humans.

Decision Making Fairness

From Parity to Preference-based Notions of Fairness in Classification

1 code implementation NeurIPS 2017 Muhammad Bilal Zafar, Isabel Valera, Manuel Gomez Rodriguez, Krishna P. Gummadi, Adrian Weller

The adoption of automated, data-driven decision making in an ever expanding range of applications has raised concerns about its potential unfairness towards certain social groups.

Decision Making Fairness +1

Lost Relatives of the Gumbel Trick

1 code implementation ICML 2017 Matej Balog, Nilesh Tripuraneni, Zoubin Ghahramani, Adrian Weller

We show how a subfamily of our new methods adapts to this setting, proving new upper and lower bounds on the log partition function and deriving a family of sequential samplers for the Gibbs distribution.

The Unreasonable Effectiveness of Structured Random Orthogonal Embeddings

1 code implementation NeurIPS 2017 Krzysztof Choromanski, Mark Rowland, Adrian Weller

We examine a class of embeddings based on structured random matrices with orthogonal rows which can be applied in many machine learning applications including dimensionality reduction and kernel approximation.

Dimensionality Reduction

Train and Test Tightness of LP Relaxations in Structured Prediction

no code implementations 4 Nov 2015 Ofer Meshi, Mehrdad Mahdavi, Adrian Weller, David Sontag

Structured prediction is used in areas such as computer vision and natural language processing to predict structured outputs such as segmentations or parse trees.

Structured Prediction

Clamping Improves TRW and Mean Field Approximations

no code implementations 1 Oct 2015 Adrian Weller, Justin Domke

We examine the effect of clamping variables for approximate inference in undirected graphical models with pairwise relationships and discrete variables.

Clamping Variables and Approximate Inference

no code implementations NeurIPS 2014 Adrian Weller, Tony Jebara

It was recently proved using graph covers (Ruozzi, 2012) that the Bethe partition function is upper bounded by the true partition function for a binary pairwise model that is attractive.

Approximating the Bethe partition function

no code implementations 30 Dec 2013 Adrian Weller, Tony Jebara

When belief propagation (BP) converges, it does so to a stationary point of the Bethe free energy $F$, and is often strikingly accurate.

On MAP Inference by MWSS on Perfect Graphs

no code implementations 26 Sep 2013 Adrian Weller, Tony S. Jebara

Finding the most likely (MAP) configuration of a Markov random field (MRF) is NP-hard in general.
