Search Results for author: Adrian Weller

Found 124 papers, 54 papers with code

Large Language Models Must Be Taught to Know What They Don't Know

1 code implementation12 Jun 2024 Sanyam Kapoor, Nate Gruver, Manley Roberts, Katherine Collins, Arka Pal, Umang Bhatt, Adrian Weller, Samuel Dooley, Micah Goldblum, Andrew Gordon Wilson

We show that a thousand graded examples are sufficient to outperform baseline methods and that training through the features of a model is necessary for good performance and tractable for large open-source models when using LoRA.
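
The LoRA finetuning mentioned above can be sketched as a frozen weight plus a trainable low-rank update. This is an illustrative NumPy sketch, not the paper's code; the layer sizes, rank, and scaling are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r = 64, 64, 4       # hypothetical layer sizes and LoRA rank

W = rng.normal(size=(d_out, d_in))            # frozen pretrained weight
A = rng.normal(scale=0.01, size=(r, d_in))    # trainable low-rank factor
B = np.zeros((d_out, r))                      # zero init: no change at start

def forward(x, alpha=8.0):
    # effective weight is W + (alpha / r) * B @ A; only A and B are trained
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.normal(size=d_in)
print(np.allclose(forward(x), W @ x))   # True: the adapter starts as a no-op
```

Only `A.size + B.size` parameters are trained, far fewer than `W.size`, which is what makes this tractable for large open-source models.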

Representational Alignment Supports Effective Machine Teaching

no code implementations6 Jun 2024 Ilia Sucholutsky, Katherine M. Collins, Maya Malaviya, Nori Jacoby, Weiyang Liu, Theodore R. Sumers, Michalis Korakakis, Umang Bhatt, Mark Ho, Joshua B. Tenenbaum, Brad Love, Zachary A. Pardos, Adrian Weller, Thomas L. Griffiths

A good teacher should not only be knowledgeable, but should also be able to communicate in a way that the student understands -- to share the student's representation of the world.


Variance-Reducing Couplings for Random Features: Perspectives from Optimal Transport

no code implementations26 May 2024 Isaac Reid, Stratis Markou, Krzysztof Choromanski, Richard E. Turner, Adrian Weller

Random features (RFs) are a popular technique to scale up kernel methods in machine learning, replacing exact kernel evaluations with stochastic Monte Carlo estimates.

Gaussian Processes
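
The stochastic Monte Carlo estimate described above can be sketched with random Fourier features for the Gaussian kernel (an illustrative NumPy sketch; unit bandwidth and the feature count are assumptions, not the paper's construction).

```python
import numpy as np

rng = np.random.default_rng(0)
d, m = 5, 20000            # input dimension, number of random features

def gaussian_kernel(x, y):
    return np.exp(-0.5 * np.sum((x - y) ** 2))

# Random Fourier features: E[phi(x) @ phi(y)] = k(x, y)
W = rng.normal(size=(m, d))               # frequencies ~ N(0, I)
b = rng.uniform(0.0, 2.0 * np.pi, size=m)

def phi(x):
    return np.sqrt(2.0 / m) * np.cos(W @ x + b)

x, y = rng.normal(size=d), rng.normal(size=d)
exact, approx = gaussian_kernel(x, y), phi(x) @ phi(y)
print(exact, approx)    # the Monte Carlo estimate concentrates around exact
```

Coupling (correlating) the rows of `W`, rather than drawing them i.i.d., is the variance-reduction question the paper studies.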

Estimation of Concept Explanations Should be Uncertainty Aware

1 code implementation13 Dec 2023 Vihari Piratla, Juyeon Heo, Katherine M. Collins, Sukriti Singh, Adrian Weller

We believe the improved quality of uncertainty-aware concept explanations make them a strong candidate for more reliable model interpretation.

Parameter-Efficient Orthogonal Finetuning via Butterfly Factorization

1 code implementation10 Nov 2023 Weiyang Liu, Zeju Qiu, Yao Feng, Yuliang Xiu, Yuxuan Xue, Longhui Yu, Haiwen Feng, Zhen Liu, Juyeon Heo, Songyou Peng, Yandong Wen, Michael J. Black, Adrian Weller, Bernhard Schölkopf

We apply this parameterization to OFT, creating a novel parameter-efficient finetuning method, called Orthogonal Butterfly (BOFT).

Amortised Inference in Neural Networks for Small-Scale Probabilistic Meta-Learning

no code implementations24 Oct 2023 Matthew Ashman, Tommy Rochussen, Adrian Weller

The global inducing point variational approximation for BNNs is based on using a set of inducing inputs to construct a series of conditional distributions that accurately approximate the conditionals of the true posterior distribution.

Bayesian Inference Meta-Learning

AI for Mathematics: A Cognitive Science Perspective

no code implementations19 Oct 2023 Cedegao E. Zhang, Katherine M. Collins, Adrian Weller, Joshua B. Tenenbaum

Mathematics is one of the most powerful conceptual systems developed and used by the human species.

Repelling Random Walks

no code implementations7 Oct 2023 Isaac Reid, Eli Berger, Krzysztof Choromanski, Adrian Weller

We present a novel quasi-Monte Carlo mechanism to improve graph-based sampling, coined repelling random walks.

General Graph Random Features

no code implementations7 Oct 2023 Isaac Reid, Krzysztof Choromanski, Eli Berger, Adrian Weller

This includes many of the most popular examples of kernels defined on the nodes of a graph.

Node Clustering

Learning to Receive Help: Intervention-Aware Concept Embedding Models

1 code implementation NeurIPS 2023 Mateo Espinosa Zarlenga, Katherine M. Collins, Krishnamurthy Dvijotham, Adrian Weller, Zohreh Shams, Mateja Jamnik

To address this, we propose Intervention-aware Concept Embedding models (IntCEMs), a novel CBM-based architecture and training paradigm that improves a model's receptiveness to test-time interventions.

Identifying and Mitigating Privacy Risks Stemming from Language Models: A Survey

no code implementations27 Sep 2023 Victoria Smith, Ali Shahin Shamsabadi, Carolyn Ashurst, Adrian Weller

To help researchers and policymakers understand the state of knowledge around privacy attacks and mitigations, including where more work is needed, we present the first technical survey on LM privacy.

MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models

1 code implementation21 Sep 2023 Longhui Yu, Weisen Jiang, Han Shi, Jincheng Yu, Zhengying Liu, Yu Zhang, James T. Kwok, Zhenguo Li, Adrian Weller, Weiyang Liu

Our MetaMath-7B model achieves 66.4% on GSM8K and 19.4% on MATH, exceeding the state-of-the-art models of the same size by 11.5% and 8.7%.

Ranked #54 on Arithmetic Reasoning on GSM8K (using extra training data)

Arithmetic Reasoning GSM8K +4

Selective Concept Models: Permitting Stakeholder Customisation at Test-Time

no code implementations14 Jun 2023 Matthew Barker, Katherine M. Collins, Krishnamurthy Dvijotham, Adrian Weller, Umang Bhatt

Concept-based models perform prediction using a set of concepts that are interpretable to stakeholders.

Controlling Text-to-Image Diffusion by Orthogonal Finetuning

1 code implementation NeurIPS 2023 Zeju Qiu, Weiyang Liu, Haiwen Feng, Yuxuan Xue, Yao Feng, Zhen Liu, Dan Zhang, Adrian Weller, Bernhard Schölkopf

To tackle this challenge, we introduce a principled finetuning method -- Orthogonal Finetuning (OFT), for adapting text-to-image diffusion models to downstream tasks.
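
The orthogonal-finetuning idea can be sketched by multiplying a frozen weight with a learned orthogonal matrix. Here the orthogonal matrix comes from a Cayley transform of a skew-symmetric matrix, one common construction; this is a hypothetical sketch, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8
W = rng.normal(size=(d, d))        # frozen pretrained weight

# Skew-symmetric S, then Cayley transform: R = (I + S)^{-1} (I - S) is orthogonal
Q = rng.normal(scale=0.1, size=(d, d))
S = Q - Q.T
I = np.eye(d)
R = np.linalg.solve(I + S, I - S)

W_finetuned = R @ W                # finetune by rotating W, training only S
# Orthogonality preserves the norms of (and angles between) W's columns
print(np.allclose(np.linalg.norm(W_finetuned, axis=0),
                  np.linalg.norm(W, axis=0)))
```

Preserving these pairwise angles is what lets the finetuned model retain the pretrained model's structure while adapting to the downstream task.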

Learning Personalized Decision Support Policies

no code implementations13 Apr 2023 Umang Bhatt, Valerie Chen, Katherine M. Collins, Parameswaran Kamalaruban, Emma Kallina, Adrian Weller, Ameet Talwalkar

$\texttt{Modiste}$ leverages stochastic contextual bandit techniques to personalize a decision support policy for each decision-maker and supports extensions to the multi-objective setting to account for auxiliary objectives like the cost of support.

Language Modelling Large Language Model +1

Human Uncertainty in Concept-Based AI Systems

no code implementations22 Mar 2023 Katherine M. Collins, Matthew Barker, Mateo Espinosa Zarlenga, Naveen Raman, Umang Bhatt, Mateja Jamnik, Ilia Sucholutsky, Adrian Weller, Krishnamurthy Dvijotham

We study how existing concept-based models deal with uncertain interventions from humans using two novel datasets: UMNIST, a visual dataset with controlled simulated uncertainty based on the MNIST dataset, and CUB-S, a relabeling of the popular CUB concept dataset with rich, densely-annotated soft labels from humans.

Decision Making

Generalizing and Decoupling Neural Collapse via Hyperspherical Uniformity Gap

1 code implementation11 Mar 2023 Weiyang Liu, Longhui Yu, Adrian Weller, Bernhard Schölkopf

We then use hyperspherical uniformity (which characterizes the degree of uniformity on the unit hypersphere) as a unified framework to quantify these two objectives.
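
The degree of uniformity on the unit hypersphere can be illustrated with a simple pairwise Riesz-style energy on normalized weight vectors; the energy function and toy data below are illustrative assumptions, not the paper's exact objective.

```python
import numpy as np

rng = np.random.default_rng(0)

def hyperspherical_energy(W, eps=1e-9):
    # pairwise inverse-distance energy of unit vectors; more uniformly
    # spread directions give lower energy
    W = W / np.linalg.norm(W, axis=1, keepdims=True)
    D = np.linalg.norm(W[:, None, :] - W[None, :, :], axis=-1)
    iu = np.triu_indices(len(W), k=1)
    return np.mean(1.0 / (D[iu] + eps))

spread = rng.normal(size=(32, 8))                   # roughly uniform directions
clustered = 0.05 * rng.normal(size=(32, 8)) + 1.0   # directions bunched together
print(hyperspherical_energy(spread), hyperspherical_energy(clustered))
```

Collapsed (clustered) directions score a much higher energy than spread-out ones, which is the sense in which such an energy quantifies uniformity.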

Use Perturbations when Learning from Explanations

1 code implementation NeurIPS 2023 Juyeon Heo, Vihari Piratla, Matthew Wicker, Adrian Weller

Machine learning from explanations (MLX) is an approach to learning that uses human-provided explanations of relevant or irrelevant features for each input to ensure that model predictions are right for the right reasons.

Scalable Infomin Learning

1 code implementation21 Feb 2023 Yanzhi Chen, Weihao Sun, Yingzhen Li, Adrian Weller

The task of infomin learning aims to learn a representation with high utility while being uninformative about a specified target, with the latter achieved by minimising the mutual information between the representation and the target.

Domain Adaptation Fairness +1

Optimising Human-Machine Collaboration for Efficient High-Precision Information Extraction from Text Documents

no code implementations18 Feb 2023 Bradley Butcher, Miri Zilka, Darren Cook, Jiri Hron, Adrian Weller

We argue for the utility of a human-in-the-loop approach in applications where high precision is required, but purely manual extraction is infeasible.

FAVOR#: Sharp Attention Kernel Approximations via New Classes of Positive Random Features

no code implementations1 Feb 2023 Valerii Likhosherstov, Krzysztof Choromanski, Avinava Dubey, Frederick Liu, Tamas Sarlos, Adrian Weller

The problem of efficient approximation of a linear operator induced by the Gaussian or softmax kernel is often addressed using random features (RFs) which yield an unbiased approximation of the operator's result.

Simplex Random Features

1 code implementation31 Jan 2023 Isaac Reid, Krzysztof Choromanski, Valerii Likhosherstov, Adrian Weller

We present Simplex Random Features (SimRFs), a new random feature (RF) mechanism for unbiased approximation of the softmax and Gaussian kernels by geometrical correlation of random projection vectors.

Towards Robust Metrics for Concept Representation Evaluation

1 code implementation25 Jan 2023 Mateo Espinosa Zarlenga, Pietro Barbiero, Zohreh Shams, Dmitry Kazhdan, Umang Bhatt, Adrian Weller, Mateja Jamnik

In this paper, we show that such metrics are not appropriate for concept learning and propose novel metrics for evaluating the purity of concept representations in both approaches.

Benchmarking Disentanglement

Robust Explanation Constraints for Neural Networks

1 code implementation16 Dec 2022 Matthew Wicker, Juyeon Heo, Luca Costabello, Adrian Weller

Post-hoc explanation methods are used with the intent of providing insights about neural networks and are sometimes said to help engender trust in their outputs.

Towards More Robust Interpretation via Local Gradient Alignment

1 code implementation29 Nov 2022 Sunghwan Joo, Seokhyeon Jeong, Juyeon Heo, Adrian Weller, Taesup Moon

However, the failure to account for the normalization of attributions, which is essential for their visualization, has been an obstacle to understanding and improving the robustness of feature attribution methods.

Computational Efficiency Network Interpretation

Human-in-the-Loop Mixup

1 code implementation2 Nov 2022 Katherine M. Collins, Umang Bhatt, Weiyang Liu, Vihari Piratla, Ilia Sucholutsky, Bradley Love, Adrian Weller

We focus on the synthetic data used in mixup: a powerful regularizer shown to improve model robustness, generalization, and calibration.
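
The mixup regularizer this work builds on can be sketched in a few lines (illustrative; the `alpha` value and one-hot labels are assumptions).

```python
import numpy as np

rng = np.random.default_rng(0)

def mixup(x1, y1, x2, y2, alpha=0.4):
    # draw a mixing coefficient and take convex combinations of inputs and labels
    lam = rng.beta(alpha, alpha)
    return lam * x1 + (1.0 - lam) * x2, lam * y1 + (1.0 - lam) * y2

x1, x2 = rng.normal(size=4), rng.normal(size=4)
y1, y2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])   # one-hot labels
x_mix, y_mix = mixup(x1, y1, x2, y2)
print(y_mix)    # a soft label that still sums to 1
```

The paper's question is whether the automatically mixed label `y_mix` matches what human annotators would actually assign to the mixed input.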

Iterative Teaching by Data Hallucination

1 code implementation31 Oct 2022 Zeju Qiu, Weiyang Liu, Tim Z. Xiao, Zhen Liu, Umang Bhatt, Yucen Luo, Adrian Weller, Bernhard Schölkopf

We consider the problem of iterative machine teaching, where a teacher sequentially provides examples based on the status of a learner under a discrete input space (i.e., a pool of finite samples), which greatly limits the teacher's capability.

Hallucination

Continual Learning by Modeling Intra-Class Variation

1 code implementation11 Oct 2022 Longhui Yu, Tianyang Hu, Lanqing Hong, Zhen Liu, Adrian Weller, Weiyang Liu

It has been observed that neural networks perform poorly when the data or tasks are presented sequentially.

Continual Learning

Structural Causal 3D Reconstruction

no code implementations20 Jul 2022 Weiyang Liu, Zhen Liu, Liam Paull, Adrian Weller, Bernhard Schölkopf

This paper considers the problem of unsupervised 3D object reconstruction from in-the-wild single-view images.

3D Object Reconstruction 3D Reconstruction +2

Eliciting and Learning with Soft Labels from Every Annotator

1 code implementation2 Jul 2022 Katherine M. Collins, Umang Bhatt, Adrian Weller

Our elicitation methodology therefore shows nuanced promise in enabling practitioners to enjoy the benefits of improved model performance and reliability with fewer annotators, and serves as a guide for future dataset curators on the benefits of leveraging richer information, such as categorical uncertainty, from individual annotators.

Measuring Representational Robustness of Neural Networks Through Shared Invariances

1 code implementation23 Jun 2022 Vedant Nanda, Till Speicher, Camila Kolling, John P. Dickerson, Krishna P. Gummadi, Adrian Weller

Our work offers a new view on robustness by using another reference NN to define the set of perturbations a given NN should be invariant to, thus generalizing the reliance on a reference "human NN" to any NN.

Chefs' Random Tables: Non-Trigonometric Random Features

1 code implementation30 May 2022 Valerii Likhosherstov, Krzysztof Choromanski, Avinava Dubey, Frederick Liu, Tamas Sarlos, Adrian Weller

We introduce chefs' random tables (CRTs), a new class of non-trigonometric random features (RFs) to approximate Gaussian and softmax kernels.

Multi-disciplinary fairness considerations in machine learning for clinical trials

no code implementations18 May 2022 Isabel Chien, Nina Deliu, Richard E. Turner, Adrian Weller, Sofia S. Villar, Niki Kilbertus

While interest in the application of machine learning to improve healthcare has grown tremendously in recent years, a number of barriers prevent deployment in medical practice.

BIG-bench Machine Learning Fairness

Perspectives on Incorporating Expert Feedback into Model Updates

no code implementations13 May 2022 Valerie Chen, Umang Bhatt, Hoda Heidari, Adrian Weller, Ameet Talwalkar

A practitioner may receive feedback from an expert at the observation- or domain-level, and convert this feedback into updates to the dataset, loss function, or parameter space.

Synthetic Data -- what, why and how?

no code implementations6 May 2022 James Jordon, Lukasz Szpruch, Florimond Houssiau, Mirko Bottarelli, Giovanni Cherubin, Carsten Maple, Samuel N. Cohen, Adrian Weller

This explainer document aims to provide an overview of the current state of the rapidly expanding work on synthetic data technologies, with a particular focus on privacy.

Robust Learning from Observation with Model Misspecification

1 code implementation12 Feb 2022 Luca Viano, Yu-Ting Huang, Parameswaran Kamalaruban, Craig Innes, Subramanian Ramamoorthy, Adrian Weller

Imitation learning (IL) is a popular paradigm for training policies in robotic systems when specifying the reward function is difficult.

Continuous Control Imitation Learning +1

Approximating Full Conformal Prediction at Scale via Influence Functions

1 code implementation2 Feb 2022 Javier Abad, Umang Bhatt, Adrian Weller, Giovanni Cherubin

We prove that our method is a consistent approximation of full CP, and empirically show that the approximation error becomes smaller as the training set increases; e.g., for $10^{3}$ training points the two methods output p-values that are $<10^{-3}$ apart: a negligible error for any practical application.

Conformal Prediction
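
The flavor of conformal prediction can be illustrated with split conformal prediction, a simpler relative of the full CP approximated in the paper. Everything below, including the toy model, is a hypothetical sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
n_cal, alpha = 99, 0.1

# Calibration data and a stand-in "fitted" model (both hypothetical)
x_cal = rng.uniform(0.0, 10.0, size=n_cal)
y_cal = 2.0 * x_cal + rng.normal(size=n_cal)
predict = lambda x: 2.0 * x

scores = np.abs(y_cal - predict(x_cal))        # nonconformity scores
k = int(np.ceil((n_cal + 1) * (1.0 - alpha)))  # conformal quantile index
q = np.sort(scores)[k - 1]

# Prediction interval with finite-sample coverage >= 1 - alpha
x_new = 5.0
interval = (predict(x_new) - q, predict(x_new) + q)

# Empirical check on fresh data
x_test = rng.uniform(0.0, 10.0, size=1000)
y_test = 2.0 * x_test + rng.normal(size=1000)
covered = np.mean(np.abs(y_test - predict(x_test)) <= q)
print(interval, covered)    # empirical coverage close to 0.9
```

Full CP, by contrast, requires refitting the model once per candidate label, which is what the influence-function approximation makes tractable at scale.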

Diverse, Global and Amortised Counterfactual Explanations for Uncertainty Estimates

no code implementations5 Dec 2021 Dan Ley, Umang Bhatt, Adrian Weller

To interpret uncertainty estimates from differentiable probabilistic models, recent work has proposed generating a single Counterfactual Latent Uncertainty Explanation (CLUE) for a given data point where the model is uncertain, identifying a single, on-manifold change to the input such that the model becomes more certain in its prediction.

counterfactual

Towards Principled Disentanglement for Domain Generalization

1 code implementation CVPR 2022 Hanlin Zhang, Yi-Fan Zhang, Weiyang Liu, Adrian Weller, Bernhard Schölkopf, Eric P. Xing

To tackle this challenge, we first formalize the OOD generalization problem as constrained optimization, called Disentanglement-constrained Domain Generalization (DDG).

Disentanglement Domain Generalization

PolyViT: Co-training Vision Transformers on Images, Videos and Audio

no code implementations25 Nov 2021 Valerii Likhosherstov, Anurag Arnab, Krzysztof Choromanski, Mario Lucic, Yi Tay, Adrian Weller, Mostafa Dehghani

Can we train a single transformer model capable of processing multiple modalities and datasets, whilst sharing almost all of its learnable parameters?

Audio Classification

Iterative Teaching by Label Synthesis

no code implementations NeurIPS 2021 Weiyang Liu, Zhen Liu, Hanchen Wang, Liam Paull, Bernhard Schölkopf, Adrian Weller

In this paper, we consider the problem of iterative machine teaching, where a teacher provides examples sequentially based on the current iterative learner.

Hybrid Random Features

1 code implementation ICLR 2022 Krzysztof Choromanski, Haoxian Chen, Han Lin, Yuanzhe Ma, Arijit Sehanobish, Deepali Jain, Michael S Ryoo, Jake Varley, Andy Zeng, Valerii Likhosherstov, Dmitry Kalashnikov, Vikas Sindhwani, Adrian Weller

We propose a new class of random feature methods for linearizing softmax and Gaussian kernels called hybrid random features (HRFs) that automatically adapt the quality of kernel estimation to provide most accurate approximation in the defined regions of interest.

Benchmarking

SphereFace Revived: Unifying Hyperspherical Face Recognition

1 code implementation12 Sep 2021 Weiyang Liu, Yandong Wen, Bhiksha Raj, Rita Singh, Adrian Weller

As one of the earliest works in hyperspherical face recognition, SphereFace explicitly proposed to learn face embeddings with large inter-class angular margin.

Face Recognition

SphereFace2: Binary Classification is All You Need for Deep Face Recognition

no code implementations ICLR 2022 Yandong Wen, Weiyang Liu, Adrian Weller, Bhiksha Raj, Rita Singh

In this paper, we start by identifying the discrepancy between training and evaluation in the existing multi-class classification framework and then discuss the potential limitations caused by the "competitive" nature of softmax normalization.

Binary Classification Classification +2

From block-Toeplitz matrices to differential equations on graphs: towards a general theory for scalable masked Transformers

1 code implementation16 Jul 2021 Krzysztof Choromanski, Han Lin, Haoxian Chen, Tianyi Zhang, Arijit Sehanobish, Valerii Likhosherstov, Jack Parker-Holder, Tamas Sarlos, Adrian Weller, Thomas Weingarten

In this paper we provide, to the best of our knowledge, the first comprehensive approach for incorporating various masking mechanisms into Transformers architectures in a scalable way.

Graph Attention

DIVINE: Diverse Influential Training Points for Data Visualization and Model Refinement

1 code implementation13 Jul 2021 Umang Bhatt, Isabel Chien, Muhammad Bilal Zafar, Adrian Weller

In this work, we take a step towards finding influential training points that also represent the training data well.

Data Visualization Fairness

On the Expressive Power of Self-Attention Matrices

no code implementations7 Jun 2021 Valerii Likhosherstov, Krzysztof Choromanski, Adrian Weller

Our proof is constructive, enabling us to propose an algorithm for finding adaptive inputs and fixed self-attention parameters in order to approximate a given matrix.

LEMMA

Debiasing a First-order Heuristic for Approximate Bi-level Optimization

1 code implementation4 Jun 2021 Valerii Likhosherstov, Xingyou Song, Krzysztof Choromanski, Jared Davis, Adrian Weller

Approximate bi-level optimization (ABLO) consists of (outer-level) optimization problems, involving numerical (inner-level) optimization loops.

Do Concept Bottleneck Models Learn as Intended?

no code implementations10 May 2021 Andrei Margeloiu, Matthew Ashman, Umang Bhatt, Yanzhi Chen, Mateja Jamnik, Adrian Weller

Concept bottleneck models map from raw inputs to concepts, and then from concepts to targets.

CrossWalk: Fairness-enhanced Node Representation Learning

1 code implementation6 May 2021 Ahmad Khajehnejad, Moein Khajehnejad, Mahmoudreza Babaei, Krishna P. Gummadi, Adrian Weller, Baharan Mirzasoleiman

The potential for machine learning systems to amplify social inequities and unfairness is receiving increasing popular and academic attention.

Fairness Link Prediction +2

Is Disentanglement all you need? Comparing Concept-based & Disentanglement Approaches

1 code implementation14 Apr 2021 Dmitry Kazhdan, Botty Dimanov, Helena Andres Terre, Mateja Jamnik, Pietro Liò, Adrian Weller

Concept-based explanations have emerged as a popular way of extracting human-interpretable representations from deep discriminative models.

Disentanglement

δ-CLUE: Diverse Sets of Explanations for Uncertainty Estimates

no code implementations13 Apr 2021 Dan Ley, Umang Bhatt, Adrian Weller

To interpret uncertainty estimates from differentiable probabilistic models, recent work has proposed generating Counterfactual Latent Uncertainty Explanations (CLUEs).

counterfactual

Learning with Hyperspherical Uniformity

1 code implementation2 Mar 2021 Weiyang Liu, Rongmei Lin, Zhen Liu, Li Xiong, Bernhard Schölkopf, Adrian Weller

Due to their over-parameterized nature, neural networks are a powerful tool for nonlinear function approximation.

Inductive Bias L2 Regularization

Sub-Linear Memory: How to Make Performers SLiM

2 code implementations NeurIPS 2021 Valerii Likhosherstov, Krzysztof Choromanski, Jared Davis, Xingyou Song, Adrian Weller

Recent works proposed various linear self-attention mechanisms, scaling only as $O(L)$ for serial computation.

Improving Interpretability in Medical Imaging Diagnosis using Adversarial Training

1 code implementation2 Dec 2020 Andrei Margeloiu, Nikola Simidjievski, Mateja Jamnik, Adrian Weller

We investigate the influence of adversarial training on the interpretability of convolutional neural networks (CNNs), specifically applied to diagnosing skin cancer.

Ode to an ODE

no code implementations NeurIPS 2020 Krzysztof M. Choromanski, Jared Quincy Davis, Valerii Likhosherstov, Xingyou Song, Jean-Jacques Slotine, Jacob Varley, Honglak Lee, Adrian Weller, Vikas Sindhwani

We present a new paradigm for Neural ODE algorithms, called ODEtoODE, where time-dependent parameters of the main flow evolve according to a matrix flow on the orthogonal group O(d).

Now You See Me (CME): Concept-based Model Extraction

1 code implementation25 Oct 2020 Dmitry Kazhdan, Botty Dimanov, Mateja Jamnik, Pietro Liò, Adrian Weller

Deep Neural Networks (DNNs) have achieved remarkable performance on a range of tasks.

Model extraction

Rethinking Attention with Performers

12 code implementations ICLR 2021 Krzysztof Choromanski, Valerii Likhosherstov, David Dohan, Xingyou Song, Andreea Gane, Tamas Sarlos, Peter Hawkins, Jared Davis, Afroz Mohiuddin, Lukasz Kaiser, David Belanger, Lucy Colwell, Adrian Weller

We introduce Performers, Transformer architectures which can estimate regular (softmax) full-rank-attention Transformers with provable accuracy, but using only linear (as opposed to quadratic) space and time complexity, without relying on any priors such as sparsity or low-rankness.

D4RL Image Generation +2
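
The linear-space attention estimate can be sketched with positive random features for the softmax kernel, in the spirit of the paper's FAVOR+ mechanism. This is a toy NumPy sketch; the dimensions, input scaling, and feature count are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
L, d, m = 6, 4, 512     # sequence length, head dim, number of random features

# Small-magnitude queries/keys keep this toy estimate tight
Q = 0.25 * rng.normal(size=(L, d))
K = 0.25 * rng.normal(size=(L, d))
V = rng.normal(size=(L, d))

# Positive random features with E[phi(q) @ phi(k)] = exp(q . k)
Wf = rng.normal(size=(m, d))

def phi(X):
    return np.exp(X @ Wf.T - 0.5 * np.sum(X**2, axis=1, keepdims=True)) / np.sqrt(m)

Qf, Kf = phi(Q), phi(K)
num = Qf @ (Kf.T @ V)            # (L, m) @ (m, d): linear, not quadratic, in L
den = Qf @ Kf.sum(axis=0)
linear_attn = num / den[:, None]

# Exact softmax attention for comparison (quadratic in L)
A = np.exp(Q @ K.T)
exact_attn = (A / A.sum(axis=1, keepdims=True)) @ V
print(np.max(np.abs(linear_attn - exact_attn)))   # small approximation error
```

Because `Kf.T @ V` is computed once, cost scales linearly in `L` rather than quadratically, without any sparsity or low-rankness prior on the attention matrix itself.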

Machine Learning Explainability for External Stakeholders

no code implementations10 Jul 2020 Umang Bhatt, McKane Andrus, Adrian Weller, Alice Xiang

As machine learning is increasingly deployed in high-stakes contexts affecting people's livelihoods, there have been growing calls to open the black box and to make machine learning algorithms more explainable.

BIG-bench Machine Learning

UFO-BLO: Unbiased First-Order Bilevel Optimization

no code implementations5 Jun 2020 Valerii Likhosherstov, Xingyou Song, Krzysztof Choromanski, Jared Davis, Adrian Weller

Bilevel optimization (BLO) is a popular approach with many applications including hyperparameter optimization, neural architecture search, adversarial robustness and model-agnostic meta-learning.

Adversarial Robustness Bilevel Optimization +4

Time Dependence in Non-Autonomous Neural ODEs

no code implementations ICLR Workshop DeepDiffEq 2019 Jared Quincy Davis, Krzysztof Choromanski, Jake Varley, Honglak Lee, Jean-Jacques Slotine, Valerii Likhosterov, Adrian Weller, Ameesh Makadia, Vikas Sindhwani

Neural Ordinary Differential Equations (ODEs) are elegant reinterpretations of deep networks where continuous time can replace the discrete notion of depth, ODE solvers perform forward propagation, and the adjoint method enables efficient, constant memory backpropagation.

Image Classification Video Prediction

Dimensions of Diversity in Human Perceptions of Algorithmic Fairness

no code implementations2 May 2020 Nina Grgić-Hlača, Gabriel Lima, Adrian Weller, Elissa M. Redmiles

A growing number of oversight boards and regulatory bodies seek to monitor and govern algorithms that make decisions about people's lives.

Decision Making Fairness

Evaluating and Aggregating Feature-based Model Explanations

no code implementations1 May 2020 Umang Bhatt, Adrian Weller, José M. F. Moura

A feature-based model explanation denotes how much each input feature contributes to a model's output for a given data point.

CWY Parametrization: a Solution for Parallelized Optimization of Orthogonal and Stiefel Matrices

no code implementations18 Apr 2020 Valerii Likhosherstov, Jared Davis, Krzysztof Choromanski, Adrian Weller

We introduce an efficient approach for optimization over orthogonal groups on highly parallel computation units such as GPUs or TPUs.

Machine Translation Translation +1

Stochastic Flows and Geometric Optimization on the Orthogonal Group

no code implementations ICML 2020 Krzysztof Choromanski, David Cheikhi, Jared Davis, Valerii Likhosherstov, Achille Nazaret, Achraf Bahamou, Xingyou Song, Mrugank Akarte, Jack Parker-Holder, Jacob Bergquist, Yuan Gao, Aldo Pacchiano, Tamas Sarlos, Adrian Weller, Vikas Sindhwani

We present a new class of stochastic, geometrically-driven optimization algorithms on the orthogonal group $O(d)$ and naturally reductive homogeneous manifolds obtained from the action of the rotation group $SO(d)$.

Metric Learning Stochastic Optimization

DADI: Dynamic Discovery of Fair Information with Adversarial Reinforcement Learning

no code implementations30 Oct 2019 Michiel A. Bakker, Duy Patrick Tu, Humberto Riverón Valdés, Krishna P. Gummadi, Kush R. Varshney, Adrian Weller, Alex Pentland

We introduce a framework for dynamic adversarial discovery of information (DADI), motivated by a scenario where information (a feature set) is used by third parties with unknown objectives.

Fairness reinforcement-learning +2

An Empirical Study on Learning Fairness Metrics for COMPAS Data with Human Supervision

1 code implementation22 Oct 2019 Hanchen Wang, Nina Grgic-Hlaca, Preethi Lahoti, Krishna P. Gummadi, Adrian Weller

We do not provide a way to directly learn a similarity metric satisfying individual fairness; instead, we provide an empirical study of how to derive such a metric from human supervisors, which future work can use as a tool to understand human supervision.

Fairness Metric Learning

Exploring Properties of the Deep Image Prior

no code implementations NeurIPS Workshop Deep_Invers 2019 Andreas Kattamis, Tameem Adel, Adrian Weller

Finally, we examine the adversarial invariancy of the early DIP outputs, and hypothesize that these outputs may remove non-robust image features.

The Sensitivity of Counterfactual Fairness to Unmeasured Confounding

1 code implementation1 Jul 2019 Niki Kilbertus, Philip J. Ball, Matt J. Kusner, Adrian Weller, Ricardo Silva

We demonstrate our new sensitivity analysis tools in real-world fairness scenarios to assess the bias arising from confounding.

counterfactual Fairness

Leader Stochastic Gradient Descent for Distributed Training of Deep Learning Models: Extension

1 code implementation NeurIPS 2019 Yunfei Teng, Wenbo Gao, Francois Chalus, Anna Choromanska, Donald Goldfarb, Adrian Weller

Finally, we implement an asynchronous version of our algorithm and extend it to the multi-leader setting, where we form groups of workers, each represented by its own local leader (the best performer in a group), and update each worker with a corrective direction comprised of two attractive forces: one to the local, and one to the global leader (the best performer among all workers).

Distributed Optimization

Self-Guided Belief Propagation -- A Homotopy Continuation Method

no code implementations4 Dec 2018 Christian Knoll, Adrian Weller, Franz Pernkopf

Belief propagation (BP) is a popular method for performing probabilistic inference on graphical models.
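
Sum-product BP of the kind referenced above can be sketched on a 3-node binary chain, a tree where BP marginals are exact (the potentials below are illustrative).

```python
import numpy as np

rng = np.random.default_rng(0)

# Chain MRF x1 - x2 - x3 with binary variables and pairwise potentials
psi12 = rng.uniform(0.5, 2.0, size=(2, 2))
psi23 = rng.uniform(0.5, 2.0, size=(2, 2))

m1to2 = psi12.sum(axis=0)    # message x1 -> x2: sum_x1 psi12[x1, x2]
m3to2 = psi23.sum(axis=1)    # message x3 -> x2: sum_x3 psi23[x2, x3]
belief = m1to2 * m3to2
bp_marginal = belief / belief.sum()

# Brute force over all 8 joint configurations for comparison
p = np.zeros(2)
for x1 in range(2):
    for x2 in range(2):
        for x3 in range(2):
            p[x2] += psi12[x1, x2] * psi23[x2, x3]
brute = p / p.sum()
print(bp_marginal, brute)    # identical on a tree
```

On loopy graphs BP is only approximate, which is the regime the homotopy-continuation scheme targets.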

Proceedings of the 2018 ICML Workshop on Human Interpretability in Machine Learning (WHI 2018)

no code implementations3 Jul 2018 Been Kim, Kush R. Varshney, Adrian Weller

This is the Proceedings of the 2018 ICML Workshop on Human Interpretability in Machine Learning (WHI 2018), which was held in Stockholm, Sweden, July 14, 2018.

BIG-bench Machine Learning

A Unified Approach to Quantifying Algorithmic Unfairness: Measuring Individual & Group Unfairness via Inequality Indices

no code implementations2 Jul 2018 Till Speicher, Hoda Heidari, Nina Grgic-Hlaca, Krishna P. Gummadi, Adish Singla, Adrian Weller, Muhammad Bilal Zafar

Further, our work reveals overlooked tradeoffs between different fairness notions: using our proposed measures, the overall individual-level unfairness of an algorithm can be decomposed into a between-group and a within-group component.

Decision Making Fairness

Discovering Interpretable Representations for Both Deep Generative and Discriminative Models

no code implementations ICML 2018 Tameem Adel, Zoubin Ghahramani, Adrian Weller

We use a generative model which takes as input the representation in an existing (generative or discriminative) model, weakly supervised by limited side information.

Active Learning

Blind Justice: Fairness with Encrypted Sensitive Attributes

1 code implementation ICML 2018 Niki Kilbertus, Adrià Gascón, Matt J. Kusner, Michael Veale, Krishna P. Gummadi, Adrian Weller

Recent work has explored how to train machine learning models which do not discriminate against any subgroup of the population as determined by sensitive attributes such as gender or race.

Fairness

Structured Evolution with Compact Architectures for Scalable Policy Optimization

no code implementations ICML 2018 Krzysztof Choromanski, Mark Rowland, Vikas Sindhwani, Richard E. Turner, Adrian Weller

We present a new method of blackbox optimization via gradient approximation with the use of structured random orthogonal matrices, providing more accurate estimators than baselines and with provable theoretical guarantees.

OpenAI Gym Text-to-Image Generation
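
The structured, antithetic ES gradient estimator can be sketched as follows. The objective, step size, and dimensions are hypothetical; orthogonal perturbation directions come from a QR decomposition, echoing the structured random matrices the paper studies.

```python
import numpy as np

rng = np.random.default_rng(0)
d, sigma, n = 8, 0.1, 4    # parameter dim, smoothing, number of direction pairs

def f(theta):              # hypothetical blackbox objective (no gradients used)
    return -np.sum((theta - 1.0) ** 2)

theta = np.zeros(d)

# Orthogonal perturbation directions via QR, evaluated in antithetic pairs
G = rng.normal(size=(d, d))
E = np.linalg.qr(G)[0][:n] * np.sqrt(d)

grad = np.zeros(d)
for e in E:
    grad += (f(theta + sigma * e) - f(theta - sigma * e)) / (2.0 * sigma) * e
grad /= n

theta_new = theta + 0.1 * grad     # one gradient-ascent step on the estimate
print(f(theta), f(theta_new))      # the step improves the objective
```

Orthogonality of the directions is the source of the variance reduction over i.i.d. Gaussian perturbations.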

Human Perceptions of Fairness in Algorithmic Decision Making: A Case Study of Criminal Risk Prediction

no code implementations26 Feb 2018 Nina Grgić-Hlača, Elissa M. Redmiles, Krishna P. Gummadi, Adrian Weller

As algorithms are increasingly used to make important decisions that affect human lives, ranging from social benefit assignment to predicting risk of criminal recidivism, concerns have been raised about the fairness of algorithmic decision making.

Decision Making Fairness

Gauged Mini-Bucket Elimination for Approximate Inference

no code implementations5 Jan 2018 Sungsoo Ahn, Michael Chertkov, Jinwoo Shin, Adrian Weller

Recently, so-called gauge transformations were used to improve variational lower bounds on $Z$.

Uprooting and Rerooting Higher-Order Graphical Models

no code implementations NeurIPS 2017 Mark Rowland, Adrian Weller

The idea of uprooting and rerooting graphical models was introduced specifically for binary pairwise models by Weller (2016) as a way to transform a model to any of a whole equivalence class of related models, such that inference on any one model yields inference results for all others.

Proceedings of the 2017 ICML Workshop on Human Interpretability in Machine Learning (WHI 2017)

no code implementations8 Aug 2017 Been Kim, Dmitry M. Malioutov, Kush R. Varshney, Adrian Weller

This is the Proceedings of the 2017 ICML Workshop on Human Interpretability in Machine Learning (WHI 2017), which was held in Sydney, Australia, August 10, 2017.

BIG-bench Machine Learning

Challenges for Transparency

no code implementations29 Jul 2017 Adrian Weller

Transparency is often deemed critical to enable effective real-world deployment of intelligent systems.

Computers and Society

From Parity to Preference-based Notions of Fairness in Classification

1 code implementation NeurIPS 2017 Muhammad Bilal Zafar, Isabel Valera, Manuel Gomez Rodriguez, Krishna P. Gummadi, Adrian Weller

The adoption of automated, data-driven decision making in an ever expanding range of applications has raised concerns about its potential unfairness towards certain social groups.

Classification Decision Making +2

On Fairness, Diversity and Randomness in Algorithmic Decision Making

no code implementations30 Jun 2017 Nina Grgić-Hlača, Muhammad Bilal Zafar, Krishna P. Gummadi, Adrian Weller

Consider a binary decision making process where a single machine learning classifier replaces a multitude of humans.

Decision Making Fairness

Lost Relatives of the Gumbel Trick

1 code implementation ICML 2017 Matej Balog, Nilesh Tripuraneni, Zoubin Ghahramani, Adrian Weller

We show how a subfamily of our new methods adapts to this setting, proving new upper and lower bounds on the log partition function and deriving a family of sequential samplers for the Gibbs distribution.

The Unreasonable Effectiveness of Structured Random Orthogonal Embeddings

2 code implementations NeurIPS 2017 Krzysztof Choromanski, Mark Rowland, Adrian Weller

We examine a class of embeddings based on structured random matrices with orthogonal rows which can be applied in many machine learning applications including dimensionality reduction and kernel approximation.

BIG-bench Machine Learning Dimensionality Reduction

Train and Test Tightness of LP Relaxations in Structured Prediction

no code implementations4 Nov 2015 Ofer Meshi, Mehrdad Mahdavi, Adrian Weller, David Sontag

Structured prediction is used in areas such as computer vision and natural language processing to predict structured outputs such as segmentations or parse trees.

Structured Prediction

Clamping Improves TRW and Mean Field Approximations

no code implementations1 Oct 2015 Adrian Weller, Justin Domke

We examine the effect of clamping variables for approximate inference in undirected graphical models with pairwise relationships and discrete variables.

Clamping Variables and Approximate Inference

no code implementations NeurIPS 2014 Adrian Weller, Tony Jebara

It was recently proved using graph covers (Ruozzi, 2012) that the Bethe partition function is upper bounded by the true partition function for a binary pairwise model that is attractive.

Approximating the Bethe partition function

no code implementations30 Dec 2013 Adrian Weller, Tony Jebara

When belief propagation (BP) converges, it does so to a stationary point of the Bethe free energy $F$, and is often strikingly accurate.

On MAP Inference by MWSS on Perfect Graphs

no code implementations26 Sep 2013 Adrian Weller, Tony S. Jebara

Finding the most likely (MAP) configuration of a Markov random field (MRF) is NP-hard in general.
