Search Results for author: Michael I. Jordan

Found 129 papers, 27 papers with code

ElegantRL-Podracer: Scalable and Elastic Library for Cloud-Native Deep Reinforcement Learning

1 code implementation • 11 Dec 2021 • Xiao-Yang Liu, Zechu Li, Zhuoran Yang, Jiahao Zheng, Zhaoran Wang, Anwar Walid, Jian Guo, Michael I. Jordan

In this paper, we present a scalable and elastic library ElegantRL-podracer for cloud-native deep reinforcement learning, which efficiently supports millions of GPU cores to carry out massively parallel training at multiple levels.

reinforcement-learning • Reinforcement Learning (RL) +1

Distribution-Free, Risk-Controlling Prediction Sets

3 code implementations • 7 Jan 2021 • Stephen Bates, Anastasios Angelopoulos, Lihua Lei, Jitendra Malik, Michael I. Jordan

While improving prediction accuracy has been the focus of machine learning in recent years, this alone does not suffice for reliable decision-making.

BIG-bench Machine Learning • Classification +9

Ranking and Tuning Pre-trained Models: A New Paradigm for Exploiting Model Hubs

1 code implementation • 20 Oct 2021 • Kaichao You, Yong Liu, Ziyang Zhang, Jianmin Wang, Michael I. Jordan, Mingsheng Long

The best-ranked PTM can be fine-tuned and deployed if we have no preference for the model's architecture; alternatively, the target PTM can be tuned by the top $K$ ranked PTMs via a Bayesian procedure that we propose.

Prediction-Powered Inference

2 code implementations • 23 Jan 2023 • Anastasios N. Angelopoulos, Stephen Bates, Clara Fannjiang, Michael I. Jordan, Tijana Zrnic

Prediction-powered inference is a framework for performing valid statistical inference when an experimental dataset is supplemented with predictions from a machine-learning system.

Astronomy • regression +1
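
To make the framework concrete, here is a minimal numpy sketch of the mean-estimation case (our own illustration, not the authors' released code; all names are ours): predictions on a large unlabeled set are debiased by a small labeled "rectifier" sample, yielding a valid confidence interval.

```python
# Minimal sketch of prediction-powered inference for a population mean.
import numpy as np
from scipy.stats import norm

def ppi_mean_ci(y_labeled, yhat_labeled, yhat_unlabeled, alpha=0.1):
    n, N = len(y_labeled), len(yhat_unlabeled)
    rectifier = y_labeled - yhat_labeled              # prediction bias on labeled data
    theta = yhat_unlabeled.mean() + rectifier.mean()  # debiased point estimate
    se = np.sqrt(yhat_unlabeled.var(ddof=1) / N + rectifier.var(ddof=1) / n)
    z = norm.ppf(1 - alpha / 2)
    return theta - z * se, theta + z * se             # (1 - alpha) confidence interval
```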

NumS: Scalable Array Programming for the Cloud

1 code implementation • 28 Jun 2022 • Melih Elibol, Vinamra Benara, Samyu Yagati, Lianmin Zheng, Alvin Cheung, Michael I. Jordan, Ion Stoica

LSHS is a local search method which optimizes operator placement by minimizing maximum memory and network load on any given node within a distributed system.

regression • Scheduling

Fine-Tuning Language Models with Advantage-Induced Policy Alignment

1 code implementation • 4 Jun 2023 • Banghua Zhu, Hiteshi Sharma, Felipe Vieira Frujeri, Shi Dong, Chenguang Zhu, Michael I. Jordan, Jiantao Jiao

Reinforcement learning from human feedback (RLHF) has emerged as a reliable approach to aligning large language models (LLMs) to human preferences.

Learn then Test: Calibrating Predictive Algorithms to Achieve Risk Control

1 code implementation • 3 Oct 2021 • Anastasios N. Angelopoulos, Stephen Bates, Emmanuel J. Candès, Michael I. Jordan, Lihua Lei

We introduce a framework for calibrating machine learning models so that their predictions satisfy explicit, finite-sample statistical guarantees.

BIG-bench Machine Learning • Instance Segmentation +3
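
As a toy illustration of the calibration recipe (deliberately simplified: a plain Hoeffding p-value and a Bonferroni correction stand in for the sharper p-values and multiple-testing procedures developed in the paper; names are ours):

```python
# Toy Learn-then-Test-style calibration over a grid of thresholds.
import numpy as np

def calibrate(losses_by_lambda, alpha=0.1, delta=0.1):
    """losses_by_lambda: (num_lambdas, n) array of losses in [0, 1] on calibration data."""
    num_lambdas, n = losses_by_lambda.shape
    risks = losses_by_lambda.mean(axis=1)
    # One-sided Hoeffding p-value for H0: risk(lambda) > alpha.
    pvals = np.exp(-2 * n * np.clip(alpha - risks, 0.0, None) ** 2)
    # Every lambda surviving Bonferroni controls risk with probability >= 1 - delta.
    return np.where(pvals <= delta / num_lambdas)[0]
```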

Byzantine-Robust Federated Learning with Optimal Statistical Rates and Privacy Guarantees

2 code implementations • 24 May 2022 • Banghua Zhu, Lun Wang, Qi Pang, Shuai Wang, Jiantao Jiao, Dawn Song, Michael I. Jordan

In contrast to prior work, our proposed protocols improve the dimension dependence and achieve a tight statistical rate in terms of all the parameters for strongly convex losses.

Federated Learning

Desiderata for Representation Learning: A Causal Perspective

1 code implementation • 8 Sep 2021 • Yixin Wang, Michael I. Jordan

Representation learning constructs low-dimensional representations to summarize essential features of high-dimensional data.

counterfactual • Disentanglement

Conformal prediction for the design problem

1 code implementation • 8 Feb 2022 • Clara Fannjiang, Stephen Bates, Anastasios N. Angelopoulos, Jennifer Listgarten, Michael I. Jordan

This is challenging because of a characteristic type of distribution shift between the training and test data in the design setting -- one in which the training and test data are statistically dependent, as the latter is chosen based on the former.

Conformal Prediction

Private Prediction Sets

1 code implementation • 11 Feb 2021 • Anastasios N. Angelopoulos, Stephen Bates, Tijana Zrnic, Michael I. Jordan

Our method follows the general approach of split conformal prediction; we use holdout data to calibrate the size of the prediction sets but preserve privacy by using a privatized quantile subroutine.

Conformal Prediction • Decision Making +1
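
For reference, the non-private split-conformal skeleton that the method starts from can be sketched as follows (our sketch; the paper's contribution is to replace this exact quantile with a differentially private one):

```python
# Split-conformal calibration: a finite-sample-corrected quantile of holdout scores.
import numpy as np

def conformal_quantile(cal_scores, alpha=0.1):
    n = len(cal_scores)
    k = int(np.ceil((n + 1) * (1 - alpha)))   # corrected rank for coverage >= 1 - alpha
    return np.sort(cal_scores)[min(k, n) - 1]

# Prediction set for a new input x:
#   {y : score(x, y) <= conformal_quantile(cal_scores)}
```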

On Optimal Caching and Model Multiplexing for Large Model Inference

1 code implementation • 3 Jun 2023 • Banghua Zhu, Ying Sheng, Lianmin Zheng, Clark Barrett, Michael I. Jordan, Jiantao Jiao

Theoretically, we provide an optimal algorithm for jointly optimizing both approaches to reduce the inference cost in both offline and online tabular settings.

Latent Dirichlet Allocation

2 code implementations • 1 Jan 2003 • David M. Blei, Andrew Y. Ng, Michael I. Jordan

Each topic is, in turn, modeled as an infinite mixture over an underlying set of topic probabilities.

Collaborative Filtering • Text Categorization +2
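
A minimal usage sketch with scikit-learn's independent implementation of the model (not the paper's original code; the toy corpus is ours):

```python
# Fit a 2-topic LDA model on a toy corpus and inspect per-document topic mixtures.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

docs = ["the cat sat on the mat", "cats and dogs are pets",
        "stock markets rose sharply", "investors traded stocks all day"]
X = CountVectorizer(stop_words="english").fit_transform(docs)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)
print(lda.transform(X))  # each row: the document's mixture over the 2 topics
```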

Federated Conformal Predictors for Distributed Uncertainty Quantification

1 code implementation • 27 May 2023 • Charles Lu, Yaodong Yu, Sai Praneeth Karimireddy, Michael I. Jordan, Ramesh Raskar

Conformal prediction is emerging as a popular paradigm for providing rigorous uncertainty quantification in machine learning since it can be easily applied as a post-processing step to already trained models.

Conformal Prediction • Federated Learning +1

On-Demand Sampling: Learning Optimally from Multiple Distributions

1 code implementation • 22 Oct 2022 • Nika Haghtalab, Michael I. Jordan, Eric Zhao

This improves upon the best known sample complexity bounds for fair federated learning by Mohri et al. and collaborative learning by Nguyen and Zakynthinou by multiplicative factors of $n$ and $\log(n)/\epsilon^3$, respectively.

Fairness • Federated Learning +1

Class-Conditional Conformal Prediction with Many Classes

1 code implementation • NeurIPS 2023 • Tiffany Ding, Anastasios N. Angelopoulos, Stephen Bates, Michael I. Jordan, Ryan J. Tibshirani

Standard conformal prediction methods provide a marginal coverage guarantee, which means that for a random test point, the conformal prediction set contains the true label with a user-specified probability.

Conformal Prediction
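
The naive class-conditional baseline, which calibrates one quantile per class and which the paper refines for classes with scarce calibration data, can be sketched as follows (names are ours):

```python
# One conformal quantile per class; classes with no calibration data get +inf.
import numpy as np

def classwise_quantiles(cal_scores, cal_labels, num_classes, alpha=0.1):
    qhats = np.full(num_classes, np.inf)
    for c in range(num_classes):
        s = np.sort(cal_scores[cal_labels == c])
        if len(s) > 0:
            k = int(np.ceil((len(s) + 1) * (1 - alpha)))
            qhats[c] = s[min(k, len(s)) - 1]
    return qhats  # include class c in the prediction set iff score(x, c) <= qhats[c]
```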

Representation Matters: Assessing the Importance of Subgroup Allocations in Training Data

1 code implementation • 5 Mar 2021 • Esther Rolf, Theodora Worledge, Benjamin Recht, Michael I. Jordan

Collecting more diverse and representative training data is often touted as a remedy for the disparate performance of machine learning predictors across subpopulations.

TCT: Convexifying Federated Learning using Bootstrapped Neural Tangent Kernels

1 code implementation • 13 Jul 2022 • Yaodong Yu, Alexander Wei, Sai Praneeth Karimireddy, Yi Ma, Michael I. Jordan

Leveraging this observation, we propose a Train-Convexify-Train (TCT) procedure to sidestep this issue: first, learn features using off-the-shelf methods (e.g., FedAvg); then, optimize a convexified problem obtained from the network's empirical neural tangent kernel approximation.

Federated Learning

Bayesian Nonparametric Inference of Switching Linear Dynamical Systems

1 code implementation • 19 Mar 2010 • Emily B. Fox, Erik B. Sudderth, Michael I. Jordan, Alan S. Willsky

Many complex dynamical phenomena can be effectively modeled by a system that switches among a set of conditionally linear dynamical modes.

Solving Constrained Variational Inequalities via a First-order Interior Point-based Method

1 code implementation • 21 Jun 2022 • Tong Yang, Michael I. Jordan, Tatjana Chavdarova

We provide convergence guarantees for ACVI in two general classes of problems: (i) when the operator is $\xi$-monotone, and (ii) when it is monotone, some constraints are active and the game is not purely rotational.

Optimal Data Selection: An Online Distributed View

1 code implementation • 25 Jan 2022 • Mariel Werner, Anastasios Angelopoulos, Stephen Bates, Michael I. Jordan

The blessing of ubiquitous data also comes with a curse: the communication, storage, and labeling of massive, mostly redundant datasets.

Active Learning

A Primal-dual Approach for Solving Variational Inequalities with General-form Constraints

1 code implementation • 27 Oct 2022 • Tatjana Chavdarova, Matteo Pagliardini, Tong Yang, Michael I. Jordan

We prove its convergence and show that the gap function of the last iterate of this inexact-ACVI method decreases at a rate of $\mathcal{O}(\frac{1}{\sqrt{K}})$ when the operator is $L$-Lipschitz and monotone, provided that the errors decrease at appropriate rates.

A Statistical Analysis of Polyak-Ruppert Averaged Q-learning

1 code implementation • 29 Dec 2021 • Xiang Li, Wenhao Yang, Jiadong Liang, Zhihua Zhang, Michael I. Jordan

We study Q-learning with Polyak-Ruppert averaging in a discounted Markov decision process in synchronous and tabular settings.

Q-Learning
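
Schematically, the estimator under study looks like the following (our illustration of the synchronous setting with a constant step size; the paper analyzes general step-size schedules):

```python
# Synchronous tabular Q-learning with a running Polyak-Ruppert average.
import numpy as np

def averaged_q_learning(sample_next_state, R, gamma, T, eta=0.1):
    """sample_next_state(s, a) draws s' ~ P(.|s, a); R is an (S, A) reward table."""
    S, A = R.shape
    Q, Q_bar = np.zeros((S, A)), np.zeros((S, A))
    for t in range(1, T + 1):
        target = np.array([[R[s, a] + gamma * Q[sample_next_state(s, a)].max()
                            for a in range(A)] for s in range(S)])
        Q = (1 - eta) * Q + eta * target  # synchronous update of every (s, a) pair
        Q_bar += (Q - Q_bar) / t          # running average of the iterates
    return Q_bar
```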

AutoEval Done Right: Using Synthetic Data for Model Evaluation

1 code implementation • 9 Mar 2024 • Pierre Boyeau, Anastasios N. Angelopoulos, Nir Yosef, Jitendra Malik, Michael I. Jordan

The evaluation of machine learning models using human-labeled validation data can be expensive and time-consuming.

Posterior Distribution for the Number of Clusters in Dirichlet Process Mixture Models

no code implementations • 23 May 2019 • Chiao-Yu Yang, Eric Xia, Nhat Ho, Michael I. Jordan

In this work, we provide a rigorous study for the posterior distribution of the number of clusters in DPMM under different prior distributions on the parameters and constraints on the distributions of the data.

Clustering

Learning Strategies in Decentralized Matching Markets under Uncertain Preferences

no code implementations • 29 Oct 2020 • Xiaowu Dai, Michael I. Jordan

We study the problem of decision-making in the setting of a scarcity of shared resources when the preferences of agents are unknown a priori and must be learned from data.

Decision Making • Fairness

Efficient Methods for Structured Nonconvex-Nonconcave Min-Max Optimization

no code implementations • 31 Oct 2020 • Jelena Diakonikolas, Constantinos Daskalakis, Michael I. Jordan

The use of min-max optimization in adversarial training of deep neural network classifiers and training of generative adversarial networks has motivated the study of nonconvex-nonconcave optimization objectives, which frequently arise in these applications.

On Function Approximation in Reinforcement Learning: Optimism in the Face of Large State Spaces

no code implementations • 9 Nov 2020 • Zhuoran Yang, Chi Jin, Zhaoran Wang, Mengdi Wang, Michael I. Jordan

The classical theory of reinforcement learning (RL) has focused on tabular and linear representations of value functions.

Reinforcement Learning (RL)

Optimal Mean Estimation without a Variance

no code implementations • 24 Nov 2020 • Yeshwanth Cherapanamjeri, Nilesh Tripuraneni, Peter L. Bartlett, Michael I. Jordan

Concretely, given a sample $\mathbf{X} = \{X_i\}_{i = 1}^n$ from a distribution $\mathcal{D}$ over $\mathbb{R}^d$ with mean $\mu$ which satisfies the following weak-moment assumption for some ${\alpha \in [0, 1]}$: \begin{equation*} \forall \|v\| = 1: \mathbb{E}_{X \thicksim \mathcal{D}}[\lvert \langle X - \mu, v\rangle \rvert^{1 + \alpha}] \leq 1, \end{equation*} and given a target failure probability, $\delta$, our goal is to design an estimator which attains the smallest possible confidence interval as a function of $n, d,\delta$.

Bandit Learning in Decentralized Matching Markets

no code implementations • 14 Dec 2020 • Lydia T. Liu, Feng Ruan, Horia Mania, Michael I. Jordan

We study two-sided matching markets in which one side of the market (the players) does not have a priori knowledge about its preferences for the other side (the arms) and is required to learn its preferences from experience.

Stochastic Approximation for Online Tensorial Independent Component Analysis

no code implementations • 28 Dec 2020 • Chris Junchi Li, Michael I. Jordan

For estimating one component, we provide a dynamics-based analysis to prove that our online tensorial ICA algorithm with a specific choice of stepsize achieves a sharp finite-sample error bound.

Dimensionality Reduction

A Variational Inequality Approach to Bayesian Regression Games

no code implementations • 24 Mar 2021 • Wenshuo Guo, Michael I. Jordan, Tianyi Lin

Bayesian regression games are a special class of two-player general-sum Bayesian games in which the learner is partially informed about the adversary's objective through a Bayesian prior.

regression • Stochastic Optimization

On the Stability of Nonlinear Receding Horizon Control: A Geometric Perspective

no code implementations • 27 Mar 2021 • Tyler Westenbroek, Max Simchowitz, Michael I. Jordan, S. Shankar Sastry

Crucially, this guarantee requires that state costs applied to the planning problems are in a certain sense 'compatible' with the global geometry of the system, and a simple counter-example demonstrates the necessity of this condition.

Multi-Source Causal Inference Using Control Variates

no code implementations • 30 Mar 2021 • Wenshuo Guo, Serena Wang, Peng Ding, Yixin Wang, Michael I. Jordan

Across simulations and two case studies with real data, we show that this control variate can significantly reduce the variance of the ATE estimate.

Causal Inference • Epidemiology +2

Fast Distributionally Robust Learning with Variance Reduced Min-Max Optimization

no code implementations • 27 Apr 2021 • Yaodong Yu, Tianyi Lin, Eric Mazumdar, Michael I. Jordan

Distributionally robust supervised learning (DRSL) is emerging as a key paradigm for building reliable machine learning systems for real-world applications -- reflecting the need for classifiers and predictive models that are robust to the distribution shifts that arise from phenomena such as selection bias or nonstationarity.

BIG-bench Machine Learning • Selection bias

Parallelizing Contextual Bandits

no code implementations • 21 May 2021 • Jeffrey Chan, Aldo Pacchiano, Nilesh Tripuraneni, Yun S. Song, Peter Bartlett, Michael I. Jordan

Standard approaches to decision-making under uncertainty focus on sequential exploration of the space of decisions.

Decision Making • Decision Making Under Uncertainty +1

PAC Best Arm Identification Under a Deadline

no code implementations • 6 Jun 2021 • Brijen Thananjeyan, Kirthevasan Kandasamy, Ion Stoica, Michael I. Jordan, Ken Goldberg, Joseph E. Gonzalez

In this work, the decision-maker is given a deadline of $T$ rounds, where, on each round, it can adaptively choose which arms to pull and how many times to pull them; this distinguishes the number of decisions made (i.e., time or number of rounds) from the number of samples acquired (cost).

Test-time Collective Prediction

no code implementations • NeurIPS 2021 • Celestine Mendler-Dünner, Wenshuo Guo, Stephen Bates, Michael I. Jordan

An increasingly common setting in machine learning involves multiple parties, each with their own data, who want to jointly make predictions on future test points.

Who Leads and Who Follows in Strategic Classification?

no code implementations • NeurIPS 2021 • Tijana Zrnic, Eric Mazumdar, S. Shankar Sastry, Michael I. Jordan

In particular, by generalizing the standard model to allow both players to learn over time, we show that a decision-maker that makes updates faster than the agents can reverse the order of play, meaning that the agents lead and the decision-maker follows.

Classification

The Stereotyping Problem in Collaboratively Filtered Recommender Systems

no code implementations • 23 Jun 2021 • Wenshuo Guo, Karl Krauth, Michael I. Jordan, Nikhil Garg

First, we introduce a notion of joint accessibility, which measures the extent to which a set of items can jointly be accessed by users.

Collaborative Filtering • Recommendation Systems

Instance-optimality in optimal value estimation: Adaptivity via variance-reduced Q-learning

no code implementations • 28 Jun 2021 • Koulik Khamaru, Eric Xia, Martin J. Wainwright, Michael I. Jordan

Various algorithms in reinforcement learning exhibit dramatic variability in their convergence rates and ultimate accuracy as a function of the problem structure.

Q-Learning

On component interactions in two-stage recommender systems

no code implementations • 28 Jun 2021 • Jiri Hron, Karl Krauth, Michael I. Jordan, Niki Kilbertus

Thanks to their scalability, two-stage recommenders are used by many of today's largest online platforms, including YouTube, LinkedIn, and Pinterest.

Recommendation Systems • Vocal Bursts Valence Prediction

Variational Refinement for Importance Sampling Using the Forward Kullback-Leibler Divergence

no code implementations • 30 Jun 2021 • Ghassen Jerfel, Serena Wang, Clara Fannjiang, Katherine A. Heller, Yian Ma, Michael I. Jordan

We thus propose a novel combination of optimization and sampling techniques for approximate Bayesian inference by constructing an IS proposal distribution through the minimization of a forward KL (FKL) divergence.

Bayesian Inference • Variational Inference

On the Convergence of Stochastic Extragradient for Bilinear Games using Restarted Iteration Averaging

no code implementations • 30 Jun 2021 • Chris Junchi Li, Yaodong Yu, Nicolas Loizou, Gauthier Gidel, Yi Ma, Nicolas Le Roux, Michael I. Jordan

We study the stochastic bilinear minimax optimization problem, presenting an analysis of the same-sample Stochastic ExtraGradient (SEG) method with constant step size, and presenting variations of the method that yield favorable convergence.
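
A deterministic toy version of the extragradient update on a bilinear game min_x max_y x^T A y (our illustration; the paper analyzes the stochastic same-sample variant together with iterate averaging and restarting):

```python
# Extragradient on a bilinear game: extrapolate, then update at the extrapolated point.
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))
x, y = rng.standard_normal(5), rng.standard_normal(5)
eta = 0.5 / np.linalg.norm(A, 2)                  # step size choice is our assumption
for _ in range(500):
    xh, yh = x - eta * A @ y, y + eta * A.T @ x   # extrapolation (look-ahead) step
    x, y = x - eta * A @ yh, y + eta * A.T @ xh   # update using look-ahead gradients
print(np.linalg.norm(x), np.linalg.norm(y))       # both decay toward the equilibrium at 0
```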

Evaluating Sensitivity to the Stick-Breaking Prior in Bayesian Nonparametrics

no code implementations • 8 Jul 2021 • Ryan Giordano, Runjing Liu, Michael I. Jordan, Tamara Broderick

Bayesian models based on the Dirichlet process and other stick-breaking priors have been proposed as core ingredients for clustering, topic modeling, and other unsupervised learning tasks.

Robust Learning of Optimal Auctions

no code implementations • NeurIPS 2021 • Wenshuo Guo, Michael I. Jordan, Manolis Zampetakis

The proposed algorithms operate beyond the setting of bounded distributions that have been studied in prior works, and are guaranteed to obtain a fraction $1-O(\alpha)$ of the optimal revenue under the true distribution when the distributions are MHR.

Data Sharing Markets

no code implementations • 19 Jul 2021 • Mohammad Rasouli, Michael I. Jordan

We model bilateral sharing as a network formation game and show the existence of strongly stable outcome under the top agents property by allowing limited complementarity.

On Constraints in First-Order Optimization: A View from Non-Smooth Dynamical Systems

no code implementations • 17 Jul 2021 • Michael Muehlebach, Michael I. Jordan

We introduce a class of first-order methods for smooth constrained optimization that are based on an analogy to non-smooth dynamical systems.

Learning Equilibria in Matching Markets from Bandit Feedback

no code implementations • NeurIPS 2021 • Meena Jagadeesan, Alexander Wei, Yixin Wang, Michael I. Jordan, Jacob Steinhardt

Large-scale, two-sided matching platforms must find market outcomes that align with user preferences while simultaneously learning these preferences from data.

SOUL: An Energy-Efficient Unsupervised Online Learning Seizure Detection Classifier

no code implementations • 1 Oct 2021 • Adelson Chua, Michael I. Jordan, Rikky Muller

Implantable devices that record neural activity and detect seizures have been adopted to issue warnings or trigger neurostimulation to suppress epileptic seizures.

EEG • Seizure Detection +1

On the Self-Penalization Phenomenon in Feature Selection

no code implementations • 12 Oct 2021 • Michael I. Jordan, Keli Liu, Feng Ruan

We describe an implicit sparsity-inducing mechanism based on minimization over a family of kernels: \begin{equation*} \min_{\beta, f}~\widehat{\mathbb{E}}[L(Y, f(\beta^{1/q} \odot X))] + \lambda_n \|f\|_{\mathcal{H}_q}^2~~\text{subject to}~~\beta \ge 0, \end{equation*} where $L$ is the loss, $\odot$ is coordinate-wise multiplication and $\mathcal{H}_q$ is the reproducing kernel Hilbert space based on the kernel $k_q(x, x') = h(\|x-x'\|_q^q)$, where $\|\cdot\|_q$ is the $\ell_q$ norm.

feature selection

Cluster-and-Conquer: A Framework For Time-Series Forecasting

no code implementations • 26 Oct 2021 • Reese Pathak, Rajat Sen, Nikhil Rao, N. Benjamin Erichson, Michael I. Jordan, Inderjit S. Dhillon

Our framework -- which we refer to as "cluster-and-conquer" -- is highly general, allowing for any time-series forecasting and clustering method to be used in each step.

Time Series • Time Series Forecasting

Behavior-Guided Reinforcement Learning

no code implementations • 25 Sep 2019 • Aldo Pacchiano, Jack Parker-Holder, Yunhao Tang, Anna Choromanska, Krzysztof Choromanski, Michael I. Jordan

We introduce a new approach for comparing reinforcement learning policies, using Wasserstein distances (WDs) in a newly defined latent behavioral space.

reinforcement-learning • Reinforcement Learning (RL)

Meta-Analysis of Randomized Experiments with Applications to Heavy-Tailed Response Data

no code implementations • 14 Dec 2021 • Nilesh Tripuraneni, Dhruv Madeka, Dean Foster, Dominique Perrault-Joncas, Michael I. Jordan

The key insight of our procedure is that the noisy (but unbiased) difference-of-means estimate can be used as a ground-truth "label" on a portion of the RCT, to test the performance of an estimator trained on the other portion.

Nonconvex Stochastic Scaled-Gradient Descent and Generalized Eigenvector Problems

no code implementations • 29 Dec 2021 • Chris Junchi Li, Michael I. Jordan

Motivated by the problem of online canonical correlation analysis, we propose the Stochastic Scaled-Gradient Descent (SSGD) algorithm for minimizing the expectation of a stochastic function over a generic Riemannian manifold.

Can Reinforcement Learning Find Stackelberg-Nash Equilibria in General-Sum Markov Games with Myopic Followers?

no code implementations • 27 Dec 2021 • Han Zhong, Zhuoran Yang, Zhaoran Wang, Michael I. Jordan

We develop sample-efficient reinforcement learning (RL) algorithms for solving for an SNE in both online and offline settings.

Reinforcement Learning (RL)

Last-Iterate Convergence of Saddle-Point Optimizers via High-Resolution Differential Equations

no code implementations • 27 Dec 2021 • Tatjana Chavdarova, Michael I. Jordan, Manolis Zampetakis

However, the convergence properties of these methods are qualitatively different, even on simple bilinear games.

Wasserstein Flow Meets Replicator Dynamics: A Mean-Field Analysis of Representation Learning in Actor-Critic

no code implementations • NeurIPS 2021 • Yufeng Zhang, Siyu Chen, Zhuoran Yang, Michael I. Jordan, Zhaoran Wang

Specifically, we consider a version of AC where the actor and critic are represented by overparameterized two-layer neural networks and are updated with two-timescale learning rates.

Representation Learning

Optimal variance-reduced stochastic approximation in Banach spaces

no code implementations • 21 Jan 2022 • Wenlong Mou, Koulik Khamaru, Martin J. Wainwright, Peter L. Bartlett, Michael I. Jordan

We study the problem of estimating the fixed point of a contractive operator defined on a separable Banach space.

Q-Learning

Reinforcement Learning with Heterogeneous Data: Estimation and Inference

no code implementations • 31 Jan 2022 • Elynn Y. Chen, Rui Song, Michael I. Jordan

Reinforcement Learning (RL) has the promise of providing data-driven support for decision-making in a wide range of problems in healthcare, education, business, and other domains.

Decision Making • reinforcement-learning +1

Robust Estimation for Nonparametric Families via Generative Adversarial Networks

no code implementations • 2 Feb 2022 • Banghua Zhu, Jiantao Jiao, Michael I. Jordan

Prior work focuses on the problem of robust mean and covariance estimation when the true distribution lies in the family of Gaussian distributions or elliptical distributions, and analyzes depth or scoring rule based GAN losses for the problem.

Transferred Q-learning

no code implementations • 9 Feb 2022 • Elynn Y. Chen, Michael I. Jordan, Sai Li

We consider $Q$-learning with knowledge transfer, using samples from a target reinforcement learning (RL) task as well as source samples from different but related RL tasks.

Offline RL • Q-Learning +2

Improving Generalization via Uncertainty Driven Perturbations

no code implementations • 11 Feb 2022 • Matteo Pagliardini, Gilberto Manunza, Martin Jaggi, Michael I. Jordan, Tatjana Chavdarova

We show that UDP is guaranteed to achieve the maximum margin decision boundary on linear models and that it notably increases it on challenging simulated datasets.

Sequential Information Design: Markov Persuasion Process and Its Efficient Reinforcement Learning

no code implementations • 22 Feb 2022 • Jibang Wu, Zixuan Zhang, Zhe Feng, Zhaoran Wang, Zhuoran Yang, Michael I. Jordan, Haifeng Xu

This paper proposes a novel model of sequential information design, namely the Markov persuasion processes (MPPs), where a sender, with informational advantage, seeks to persuade a stream of myopic receivers to take actions that maximize the sender's cumulative utilities in a finite horizon Markovian environment with varying prior and utility functions.

reinforcement-learning • Reinforcement Learning (RL)

No-Regret Learning in Partially-Informed Auctions

no code implementations • 22 Feb 2022 • Wenshuo Guo, Michael I. Jordan, Ellen Vitercik

We formalize this problem as an online learning task where the goal is to have low regret with respect to a myopic oracle that has perfect knowledge of the distribution over items and the seller's masking function.

Partial Identification with Noisy Covariates: A Robust Optimization Approach

no code implementations • 22 Feb 2022 • Wenshuo Guo, Mingzhang Yin, Yixin Wang, Michael I. Jordan

Directly adjusting for these imperfect measurements of the covariates can lead to biased causal estimates.

Causal Inference

Off-Policy Evaluation with Policy-Dependent Optimization Response

no code implementations • 25 Feb 2022 • Wenshuo Guo, Michael I. Jordan, Angela Zhou

Under this framework, a decision-maker's utility depends on the policy-dependent optimization, which introduces a fundamental challenge of optimization bias even for the case of policy evaluation.

Causal Inference • Decision Making +1

Geometric Methods for Sampling, Optimisation, Inference and Adaptive Agents

no code implementations • 20 Mar 2022 • Alessandro Barp, Lancelot Da Costa, Guilherme França, Karl Friston, Mark Girolami, Michael I. Jordan, Grigorios A. Pavliotis

In this chapter, we identify fundamental geometric structures that underlie the problems of sampling, optimisation, inference and adaptive decision-making.

counterfactual • Decision Making

First-Order Algorithms for Nonlinear Generalized Nash Equilibrium Problems

no code implementations • 7 Apr 2022 • Michael I. Jordan, Tianyi Lin, Manolis Zampetakis

We consider the problem of computing an equilibrium in a class of nonlinear generalized Nash equilibrium problems (NGNEPs) in which the strategy sets for each player are defined by equality and inequality constraints that may depend on the choices of rival players.

Principal-Agent Hypothesis Testing

no code implementations • 13 May 2022 • Stephen Bates, Michael I. Jordan, Michael Sklar, Jake A. Soloff

The efficacy of the drug is not known to the regulator, so the pharmaceutical company must run a costly trial to prove efficacy to the regulator.

Online Nonsubmodular Minimization with Delayed Costs: From Full Information to Bandit Feedback

no code implementations • 15 May 2022 • Tianyi Lin, Aldo Pacchiano, Yaodong Yu, Michael I. Jordan

Motivated by applications to online learning in sparse estimation and Bayesian optimization, we consider the problem of online unconstrained nonsubmodular minimization with delayed costs in both full information and bandit feedback settings.

Bayesian Optimization

Robust Calibration with Multi-domain Temperature Scaling

no code implementations • 6 Jun 2022 • Yaodong Yu, Stephen Bates, Yi Ma, Michael I. Jordan

Uncertainty quantification is essential for the reliable deployment of machine learning models to high-stakes application domains.

Uncertainty Quantification

First-Order Algorithms for Min-Max Optimization in Geodesic Metric Spaces

no code implementations • 4 Jun 2022 • Michael I. Jordan, Tianyi Lin, Emmanouil-Vasileios Vlatakis-Gkaragkounis

From optimal transport to robust dimensionality reduction, a plethora of machine learning applications can be cast into the min-max optimization problems over Riemannian manifolds.

Dimensionality Reduction

Optimal Extragradient-Based Bilinearly-Coupled Saddle-Point Optimization

no code implementations • 17 Jun 2022 • Simon S. Du, Gauthier Gidel, Michael I. Jordan, Chris Junchi Li

We consider the smooth convex-concave bilinearly-coupled saddle-point problem, $\min_{\mathbf{x}}\max_{\mathbf{y}}~F(\mathbf{x}) + H(\mathbf{x},\mathbf{y}) - G(\mathbf{y})$, where one has access to stochastic first-order oracles for $F$, $G$ as well as the bilinear coupling function $H$.

Modeling Content Creator Incentives on Algorithm-Curated Platforms

no code implementations • 27 Jun 2022 • Jiri Hron, Karl Krauth, Michael I. Jordan, Niki Kilbertus, Sarah Dean

To this end, we propose tools for numerically finding equilibria in exposure games, and illustrate results of an audit on the MovieLens and LastFM datasets.

Breaking Feedback Loops in Recommender Systems with Causal Inference

no code implementations • 4 Jul 2022 • Karl Krauth, Yixin Wang, Michael I. Jordan

Our main observation is that a recommender system does not suffer from feedback loops if it reasons about causal quantities, namely the intervention distributions of recommendations on user ratings.

Causal Inference • Recommendation Systems

Recommendation Systems with Distribution-Free Reliability Guarantees

no code implementations • 4 Jul 2022 • Anastasios N. Angelopoulos, Karl Krauth, Stephen Bates, Yixin Wang, Michael I. Jordan

Building from a pre-trained ranking model, we show how to return a set of items that is rigorously guaranteed to contain mostly good items.

Learning-To-Rank • Recommendation Systems

Mechanisms that Incentivize Data Sharing in Federated Learning

no code implementations • 10 Jul 2022 • Sai Praneeth Karimireddy, Wenshuo Guo, Michael I. Jordan

Federated learning is typically considered a beneficial technology which allows multiple agents to collaborate with each other, improve the accuracy of their models, and solve problems which are otherwise too data-intensive / expensive to be solved individually.

Federated Learning

Continuous-time Analysis for Variational Inequalities: An Overview and Desiderata

no code implementations • 14 Jul 2022 • Tatjana Chavdarova, Ya-Ping Hsieh, Michael I. Jordan

Algorithms that solve zero-sum games, multi-objective agent objectives, or, more generally, variational inequality (VI) problems are notoriously unstable on general problems.

Learning Two-Player Mixture Markov Games: Kernel Function Approximation and Correlated Equilibrium

no code implementations • 10 Aug 2022 • Chris Junchi Li, Dongruo Zhou, Quanquan Gu, Michael I. Jordan

We consider learning Nash equilibria in two-player zero-sum Markov Games with nonlinear function approximation, where the action-value function is approximated by a function in a Reproducing Kernel Hilbert Space (RKHS).

Valid Inference after Causal Discovery

no code implementations • 11 Aug 2022 • Paula Gradu, Tijana Zrnic, Yixin Wang, Michael I. Jordan

Causal discovery and causal effect estimation are two fundamental tasks in causal inference.

Causal Discovery • Causal Inference +1

Data-Driven Influence Functions for Optimization-Based Causal Inference

no code implementations • 29 Aug 2022 • Michael I. Jordan, Yixin Wang, Angela Zhou

We study a constructive algorithm that approximates Gateaux derivatives for statistical functionals by finite differencing, with a focus on functionals that arise in causal inference.

Causal Inference

Competition, Alignment, and Equilibria in Digital Marketplaces

no code implementations • 30 Aug 2022 • Meena Jagadeesan, Michael I. Jordan, Nika Haghtalab

Nonetheless, the data sharing assumptions impact what mechanism drives misalignment and also affect the specific form of misalignment (e.g., the quality of the best-case and worst-case market outcomes).

Gradient-Free Methods for Deterministic and Stochastic Nonsmooth Nonconvex Optimization

no code implementations • 12 Sep 2022 • Tianyi Lin, Zeyu Zheng, Michael I. Jordan

Nonsmooth nonconvex optimization problems broadly emerge in machine learning and business decision making, whereas two core challenges impede the development of efficient solution methods with finite-time convergence guarantee: the lack of computationally tractable optimality criterion and the lack of computationally powerful oracles.

Decision Making

A General Framework for Sample-Efficient Function Approximation in Reinforcement Learning

no code implementations • 30 Sep 2022 • Zixiang Chen, Chris Junchi Li, Angela Yuan, Quanquan Gu, Michael I. Jordan

With the increasing need for handling large state and action spaces, general function approximation has become a key technique in reinforcement learning (RL).

reinforcement-learning • Reinforcement Learning (RL)

QuTE: decentralized multiple testing on sensor networks with false discovery rate control

no code implementations • 9 Oct 2022 • Aaditya Ramdas, Jianbo Chen, Martin J. Wainwright, Michael I. Jordan

We consider the setting where distinct agents reside on the nodes of an undirected graph, and each agent possesses p-values corresponding to one or more hypotheses local to its node.
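
The local building block is the Benjamini-Hochberg procedure, which each agent applies to the p-values gathered from its graph neighborhood; a minimal sketch of BH itself (not the full QuTE communication protocol):

```python
# Benjamini-Hochberg: reject all p-values up to the largest one under the BH line.
import numpy as np

def benjamini_hochberg(pvals, q=0.1):
    p_sorted = np.sort(pvals)
    n = len(pvals)
    passing = np.where(p_sorted <= q * np.arange(1, n + 1) / n)[0]
    if len(passing) == 0:
        return np.zeros(n, dtype=bool)      # no rejections
    return pvals <= p_sorted[passing[-1]]   # FDR controlled at level q under independence
```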

A Reinforcement Learning Approach in Multi-Phase Second-Price Auction Design

no code implementations • 19 Oct 2022 • Rui Ai, Boxiang Lyu, Zhaoran Wang, Zhuoran Yang, Michael I. Jordan

First, from the seller's perspective, we need to efficiently explore the environment in the presence of potentially nontruthful bidders who aim to manipulate the seller's policy.

reinforcement-learning • Reinforcement Learning (RL)

Explicit Second-Order Min-Max Optimization Methods with Optimal Convergence Guarantee

no code implementations • 23 Oct 2022 • Tianyi Lin, Panayotis Mertikopoulos, Michael I. Jordan

We propose and analyze exact and inexact regularized Newton-type methods for finding a global saddle point of convex-concave unconstrained min-max optimization problems.

Second-order methods

Nesterov Meets Optimism: Rate-Optimal Separable Minimax Optimization

no code implementations • 31 Oct 2022 • Chris Junchi Li, Angela Yuan, Gauthier Gidel, Quanquan Gu, Michael I. Jordan

AG-OG is the first single-call algorithm with optimal convergence rates in both deterministic and stochastic settings for bilinearly coupled minimax optimization problems.

The Sample Complexity of Online Contract Design

no code implementations • 10 Nov 2022 • Banghua Zhu, Stephen Bates, Zhuoran Yang, Yixin Wang, Jiantao Jiao, Michael I. Jordan

This result shows that exponential-in-$m$ samples are sufficient and necessary to learn a near-optimal contract, resolving an open problem on the hardness of online contract design.

Incentive-Aware Recommender Systems in Two-Sided Markets

no code implementations • 23 Nov 2022 • Xiaowu Dai, Yuan Qi, Michael I. Jordan

Online platforms in the Internet Economy commonly incorporate recommender systems that recommend arms (e.g., products) to agents (e.g., users).

Fairness • Recommendation Systems +1

Principled Reinforcement Learning with Human Feedback from Pairwise or $K$-wise Comparisons

no code implementations • 26 Jan 2023 • Banghua Zhu, Jiantao Jiao, Michael I. Jordan

Our analysis shows that when the true reward function is linear, the widely used maximum likelihood estimator (MLE) converges under both the Bradley-Terry-Luce (BTL) model and the Plackett-Luce (PL) model.

reinforcement-learning • Reinforcement Learning (RL)
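
When the reward is linear in known features, the MLE in question reduces to logistic regression on feature differences; a minimal sketch under that assumption (function and variable names are ours):

```python
# Bradley-Terry-Luce MLE with a linear reward r(x) = theta^T phi(x).
import numpy as np
from sklearn.linear_model import LogisticRegression

def btl_mle(phi_winner, phi_loser):
    diffs = phi_winner - phi_loser                 # one row per pairwise comparison
    X = np.vstack([diffs, -diffs])                 # symmetrize so both labels appear
    y = np.r_[np.ones(len(diffs)), np.zeros(len(diffs))]
    clf = LogisticRegression(C=1e6, fit_intercept=False).fit(X, y)  # ~unpenalized MLE
    return clf.coef_.ravel()                       # estimate of theta
```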

Online Learning in Stackelberg Games with an Omniscient Follower

no code implementations • 27 Jan 2023 • Geng Zhao, Banghua Zhu, Jiantao Jiao, Michael I. Jordan

We analyze the sample complexity of regret minimization in this repeated Stackelberg game.

Accelerated First-Order Optimization under Nonlinear Constraints

no code implementations • 1 Feb 2023 • Michael Muehlebach, Michael I. Jordan

We exploit analogies between first-order algorithms for constrained optimization and non-smooth dynamical systems to design a new class of accelerated first-order algorithms for constrained optimization.

Online Learning in a Creator Economy

no code implementations • 19 May 2023 • Banghua Zhu, Sai Praneeth Karimireddy, Jiantao Jiao, Michael I. Jordan

In this paper, we initiate the study of online learning in the creator economy by modeling the creator economy as a three-party game between the users, platform, and content creators, with the platform interacting with the content creator under a principal-agent model through contracts to encourage better content.

Recommendation Systems

Operationalizing Counterfactual Metrics: Incentives, Ranking, and Information Asymmetry

no code implementations • 24 May 2023 • Serena Wang, Stephen Bates, P. M. Aronow, Michael I. Jordan

From the social sciences to machine learning, it has been well documented that metrics to be optimized are not always aligned with social welfare.

Causal Inference • counterfactual

Evaluating and Incentivizing Diverse Data Contributions in Collaborative Learning

no code implementations • 8 Jun 2023 • Baihe Huang, Sai Praneeth Karimireddy, Michael I. Jordan

This creates a tension between the principal (the FL platform designer) who cares about global performance and the agents (the data collectors) who care about local performance.

Federated Learning

Incentivizing High-Quality Content in Online Recommender Systems

no code implementations • 13 Jun 2023 • Xinyan Hu, Meena Jagadeesan, Michael I. Jordan, Jacob Steinhardt

For content recommender systems such as TikTok and YouTube, the platform's decision algorithm shapes the incentives of content producers, including how much effort the content producers invest in the quality of their content.

Recommendation Systems

Curvature-Independent Last-Iterate Convergence for Games on Riemannian Manifolds

no code implementations • 29 Jun 2023 • Yang Cai, Michael I. Jordan, Tianyi Lin, Argyris Oikonomou, Emmanouil-Vasileios Vlatakis-Gkaragkounis

Numerous applications in machine learning and data analytics can be formulated as equilibrium computation over Riemannian manifolds.

Accelerating Inexact HyperGradient Descent for Bilevel Optimization

no code implementations • 30 Jun 2023 • Haikuo Yang, Luo Luo, Chris Junchi Li, Michael I. Jordan

We present a method for solving general nonconvex-strongly-convex bilevel optimization problems.

Bilevel Optimization

Incentive-Theoretic Bayesian Inference for Collaborative Science

no code implementations • 7 Jul 2023 • Stephen Bates, Michael I. Jordan, Michael Sklar, Jake A. Soloff

We show how the principal can conduct statistical inference that leverages the information that is revealed by an agent's strategic behavior -- their choice to run a trial or not.

Bayesian Inference

Scaff-PD: Communication Efficient Fair and Robust Federated Learning

no code implementations • 25 Jul 2023 • Yaodong Yu, Sai Praneeth Karimireddy, Yi Ma, Michael I. Jordan

We present Scaff-PD, a fast and communication-efficient algorithm for distributionally robust federated learning.

Fairness • Federated Learning

Delegating Data Collection in Decentralized Machine Learning

no code implementations • 4 Sep 2023 • Nivasini Ananthakrishnan, Stephen Bates, Michael I. Jordan, Nika Haghtalab

To address the lack of a priori knowledge regarding the optimal performance, we give a convex program that can adaptively and efficiently compute the optimal contract.

A Gentle Introduction to Gradient-Based Optimization and Variational Inequalities for Machine Learning

no code implementations • 9 Sep 2023 • Neha S. Wadia, Yatin Dandi, Michael I. Jordan

The rapid progress in machine learning in recent years has been based on a highly productive connection to gradient-based optimization.

Decision Making

Conformal Decision Theory: Safe Autonomous Decisions from Imperfect Predictions

no code implementations • 9 Oct 2023 • Jordan Lekeufack, Anastasios N. Angelopoulos, Andrea Bajcsy, Michael I. Jordan, Jitendra Malik

We introduce Conformal Decision Theory, a framework for producing safe autonomous decisions despite imperfect machine learning predictions.

Conformal Prediction • Motion Planning

A Specialized Semismooth Newton Method for Kernel-Based Optimal Transport

no code implementations • 21 Oct 2023 • Tianyi Lin, Marco Cuturi, Michael I. Jordan

Kernel-based optimal transport (OT) estimators offer an alternative, functional estimation procedure to address OT problems from samples.

Adaptive, Doubly Optimal No-Regret Learning in Strongly Monotone and Exp-Concave Games with Gradient Feedback

no code implementations • 21 Oct 2023 • Michael I. Jordan, Tianyi Lin, Zhengyuan Zhou

Online gradient descent (OGD) is well known to be doubly optimal under strong convexity or monotonicity assumptions: (1) in the single-agent setting, it achieves an optimal regret of $\Theta(\log T)$ for strongly convex cost functions; and (2) in the multi-agent setting of strongly monotone games, with each agent employing OGD, we obtain last-iterate convergence of the joint action to a unique Nash equilibrium at an optimal rate of $\Theta(\frac{1}{T})$.

A Quadratic Speedup in Finding Nash Equilibria of Quantum Zero-Sum Games

no code implementations • 17 Nov 2023 • Francisca Vasconcelos, Emmanouil-Vasileios Vlatakis-Gkaragkounis, Panayotis Mertikopoulos, Georgios Piliouras, Michael I. Jordan

In 2008, Jain and Watrous proposed the first classical algorithm for computing equilibria in quantum zero-sum games using the Matrix Multiplicative Weight Updates (MMWU) method to achieve a convergence rate of $\mathcal{O}(d/\epsilon^2)$ iterations to $\epsilon$-Nash equilibria in the $4^d$-dimensional spectraplex.

Improved Bayes Risk Can Yield Reduced Social Welfare Under Competition

1 code implementation • NeurIPS 2023 • Meena Jagadeesan, Michael I. Jordan, Jacob Steinhardt, Nika Haghtalab

As the scale of machine learning models increases, trends such as scaling laws anticipate consistent downstream improvements in predictive accuracy.

Towards Optimal Statistical Watermarking

no code implementations • 13 Dec 2023 • Baihe Huang, Hanlin Zhu, Banghua Zhu, Kannan Ramchandran, Michael I. Jordan, Jason D. Lee, Jiantao Jiao

Key to our formulation is a coupling of the output tokens and the rejection region, realized by pseudo-random generators in practice, that allows non-trivial trade-offs between the Type I error and Type II error.

Iterative Data Smoothing: Mitigating Reward Overfitting and Overoptimization in RLHF

no code implementations • 29 Jan 2024 • Banghua Zhu, Michael I. Jordan, Jiantao Jiao

Reinforcement Learning from Human Feedback (RLHF) is a pivotal technique that aligns language models closely with human-centric values.

The Limits of Price Discrimination Under Privacy Constraints

no code implementations • 13 Feb 2024 • Alireza Fallah, Michael I. Jordan, Ali Makhdoumi, Azarakhsh Malekian

We consider a privacy mechanism that provides a degree of protection by probabilistically masking each market segment, and we establish that the resultant set of all consumer-producer utilities forms a convex polygon, characterized explicitly as a linear mapping of a certain high-dimensional convex polytope into $\mathbb{R}^2$.

On Three-Layer Data Markets

no code implementations • 15 Feb 2024 • Alireza Fallah, Michael I. Jordan, Ali Makhdoumi, Azarakhsh Malekian

We study a three-layer data market comprising users (data owners), platforms, and a data buyer.

Incentivized Learning in Principal-Agent Bandit Games

no code implementations • 6 Mar 2024 • Antoine Scheid, Daniil Tiapkin, Etienne Boursier, Aymeric Capitaine, El Mahdi El Mhamdi, Eric Moulines, Michael I. Jordan, Alain Durmus

This work considers a repeated principal-agent bandit game, where the principal can only interact with her environment through the agent.
