Search Results for author: Paul Mineiro

Found 21 papers, 4 papers with code

Bellman-consistent Pessimism for Offline Reinforcement Learning

no code implementations • 13 Jun 2021 • Tengyang Xie, Ching-An Cheng, Nan Jiang, Paul Mineiro, Alekh Agarwal

The use of pessimism, when reasoning about datasets lacking exhaustive exploration, has recently gained prominence in offline reinforcement learning.

ChaCha for Online AutoML

1 code implementation • 9 Jun 2021 • Qingyun Wu, Chi Wang, John Langford, Paul Mineiro, Marco Rossi

We propose the ChaCha (Champion-Challengers) algorithm for making an online choice of hyperparameters in online learning settings.

AutoML
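
As a rough illustration of the champion/challengers idea only (not the ChaCha algorithm itself, which adds principled challenger scheduling under a compute budget), the sketch below runs a champion and several challenger hyperparameter configurations on the same stream and promotes a challenger whose progressive validation loss pulls clearly ahead. The learner, hyperparameter grid, and promotion margin are illustrative choices, not taken from the paper.

```python
# Minimal champion/challengers sketch for online hyperparameter selection.
# NOT ChaCha itself: no budgeted scheduling, just promote-on-better-loss.
import numpy as np

class OnlineLearner:
    """Online logistic regression; `lr` is the hyperparameter being tuned."""
    def __init__(self, dim, lr):
        self.w, self.lr, self.loss_sum, self.n = np.zeros(dim), lr, 0.0, 0

    def update(self, x, y):
        p = 1.0 / (1.0 + np.exp(-x @ self.w))
        eps = 1e-12
        # Progressive validation: score the example before learning from it.
        self.loss_sum += -(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))
        self.n += 1
        self.w -= self.lr * (p - y) * x

    def avg_loss(self):
        return self.loss_sum / max(self.n, 1)

def run_stream(stream, lrs, promote_margin=0.01):
    dim = len(stream[0][0])
    learners = {lr: OnlineLearner(dim, lr) for lr in lrs}
    champion = lrs[0]
    for x, y in stream:
        for model in learners.values():
            model.update(x, y)
        best = min(learners, key=lambda c: learners[c].avg_loss())
        if learners[best].avg_loss() < learners[champion].avg_loss() - promote_margin:
            champion = best   # challenger becomes the new champion
    return champion

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 5))
y = (X @ np.array([1.0, -2.0, 0.5, 0.0, 1.5]) > 0).astype(float)
print("selected learning rate:", run_stream(list(zip(X, y)), lrs=[0.001, 0.01, 0.1, 1.0]))
```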

Interaction-Grounded Learning

no code implementations • 9 Jun 2021 • Tengyang Xie, John Langford, Paul Mineiro, Ida Momennejad

We propose Interaction-Grounded Learning for this novel setting, in which a learner's goal is to interact with the environment with no grounding or explicit reward to optimize its policies.

Off-policy Confidence Sequences

no code implementations • 18 Feb 2021 • Nikos Karampatziakis, Paul Mineiro, Aaditya Ramdas

We develop confidence bounds that hold uniformly over time for off-policy evaluation in the contextual bandit setting.
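
The paper's confidence sequences are far tighter than anything shown here; the sketch below only demonstrates the time-uniformity requirement, using a crude union bound over rounds applied to Hoeffding intervals for importance-weighted rewards. The bounded-importance-weight assumption (w <= w_max) is mine.

```python
# Crude anytime (time-uniform) bound for off-policy value in a contextual
# bandit: union-bound Hoeffding over rounds with alpha_t = 6*alpha/(pi^2 t^2),
# so sum_t alpha_t <= alpha and the intervals hold simultaneously for all t.
import math

def anytime_ips_interval(ips_terms, w_max, alpha=0.05):
    """ips_terms[t] = w_t * r_t with 0 <= w_t <= w_max and 0 <= r_t <= 1.
    Yields (t, lower, upper), valid simultaneously over all t."""
    total = 0.0
    for t, z in enumerate(ips_terms, start=1):
        total += z
        mean = total / t
        alpha_t = 6.0 * alpha / (math.pi ** 2 * t ** 2)
        radius = w_max * math.sqrt(math.log(2.0 / alpha_t) / (2.0 * t))
        yield t, max(mean - radius, 0.0), mean + radius

# Example: logged importance-weighted rewards from some behavior policy.
terms = [0.8, 0.0, 1.6, 0.4, 0.9, 0.0, 1.2]
for t, lo, hi in anytime_ips_interval(terms, w_max=2.0):
    print(t, round(lo, 3), round(hi, 3))
```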

Empirical Likelihood for Contextual Bandits

1 code implementation • NeurIPS 2020 • Nikos Karampatziakis, John Langford, Paul Mineiro

We propose an estimator and confidence interval for computing the value of a policy from off-policy data in the contextual bandit setting.

Multi-Armed Bandits
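
For orientation only, here is the standard inverse-propensity-scoring (IPS) baseline that this line of work improves on: an IPS point estimate of the target policy's value from logged data, with a naive normal-approximation interval. The paper's empirical-likelihood estimator and interval are not shown.

```python
# IPS baseline: estimate a target policy's value from logged bandit data.
import numpy as np
from scipy.stats import norm

def ips_value(rewards, logged_probs, target_probs, alpha=0.05):
    """Each record: observed reward, behavior-policy probability of the logged
    action, and target-policy probability of that same action."""
    w = np.asarray(target_probs) / np.asarray(logged_probs)   # importance weights
    z = w * np.asarray(rewards)
    est = z.mean()
    halfwidth = norm.ppf(1 - alpha / 2) * z.std(ddof=1) / np.sqrt(len(z))
    return est, (est - halfwidth, est + halfwidth)

# Example with three logged rounds.
print(ips_value(rewards=[1.0, 0.0, 1.0],
                logged_probs=[0.5, 0.25, 0.5],
                target_probs=[1.0, 0.0, 1.0]))
```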

Lessons from Contextual Bandit Learning in a Customer Support Bot

no code implementations • 6 May 2019 • Nikos Karampatziakis, Sebastian Kochman, Jade Huang, Paul Mineiro, Kathy Osborne, Weizhu Chen

In this work, we describe practical lessons we have learned from successfully using contextual bandits (CBs) to improve key business metrics of the Microsoft Virtual Agent for customer support.

Information Retrieval • Multi-Armed Bandits +1

Contextual Memory Trees

no code implementations • 17 Jul 2018 • Wen Sun, Alina Beygelzimer, Hal Daumé III, John Langford, Paul Mineiro

We design and study a Contextual Memory Tree (CMT), a learning memory controller that inserts new memories into an experience store of unbounded size.

Classification • General Classification +2

Logarithmic Time One-Against-Some

no code implementations • ICML 2017 • Hal Daumé III, Nikos Karampatziakis, John Langford, Paul Mineiro

Compared to previous approaches, we obtain substantially better statistical performance for two reasons: first, we prove a tighter and more complete boosting theorem, and second, we translate the results more directly into an algorithm.

Classification • General Classification

Active Information Acquisition

no code implementations • 5 Feb 2016 • He He, Paul Mineiro, Nikos Karampatziakis

We propose a general framework for sequential and dynamic acquisition of useful information in order to solve a particular task.

General Reinforcement Learning • Sentiment Analysis

A Hierarchical Spectral Method for Extreme Classification

no code implementations • 10 Nov 2015 • Paul Mineiro, Nikos Karampatziakis

Extreme classification problems are multiclass and multilabel classification problems where the number of outputs is so large that straightforward strategies are neither statistically nor computationally viable.

Classification • General Classification

Fast Label Embeddings for Extremely Large Output Spaces

no code implementations • 30 Mar 2015 • Paul Mineiro, Nikos Karampatziakis

Many modern multiclass and multilabel problems are characterized by increasingly large output spaces.

Learning Reductions that Really Work

no code implementations • 9 Feb 2015 • Alina Beygelzimer, Hal Daumé III, John Langford, Paul Mineiro

We provide a summary of the mathematical and computational techniques that have enabled learning reductions to effectively address a wide class of problems, and show that this approach to solving machine learning problems can be broadly useful.

Scalable Multilabel Prediction via Randomized Methods

1 code implementation • 9 Feb 2015 • Nikos Karampatziakis, Paul Mineiro

In this work we show that a generic regularized nonlinearity mapping independent predictions to joint predictions is sufficient to achieve state-of-the-art performance on a variety of benchmark problems.

General Classification
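
One possible reading of "a regularized nonlinearity mapping independent predictions to joint predictions", sketched under assumptions of my own: independent per-label ridge scores, a fixed random ReLU feature map of those stacked scores, and a second ridge regression onto the full label matrix. The paper's actual construction may differ.

```python
# Stage 1: independent per-label predictions. Stage 2: a regularized
# nonlinearity mapping the stacked independent scores to joint predictions.
import numpy as np
from sklearn.datasets import make_multilabel_classification
from sklearn.linear_model import Ridge

X, Y = make_multilabel_classification(n_samples=2000, n_features=40,
                                       n_classes=10, random_state=0)
Xtr, Xte, Ytr, Yte = X[:1500], X[1500:], Y[:1500], Y[1500:]

# Stage 1: each label fit independently by multi-output ridge regression.
indep = Ridge(alpha=1.0).fit(Xtr, Ytr)

# Stage 2: random ReLU features of the independent score vector, then ridge.
rng = np.random.default_rng(0)
R = rng.normal(size=(Y.shape[1], 200))
phi = lambda S: np.maximum(S @ R, 0.0)
joint = Ridge(alpha=1.0).fit(phi(indep.predict(Xtr)), Ytr)

pred = joint.predict(phi(indep.predict(Xte))) > 0.5
print("exact-match accuracy:", (pred == Yte).all(axis=1).mean())
```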

Fast Label Embeddings via Randomized Linear Algebra

no code implementations • 19 Dec 2014 • Paul Mineiro, Nikos Karampatziakis

Many modern multiclass and multilabel problems are characterized by increasingly large output spaces.

A Randomized Algorithm for CCA

no code implementations • 13 Nov 2014 • Paul Mineiro, Nikos Karampatziakis

We present RandomizedCCA, a randomized algorithm for computing canonical correlation analysis, suitable for large datasets stored either out of core or on a distributed file system.
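
For context, a small in-memory sketch of what a CCA solver computes: whiten each view's covariance and take a truncated SVD of the whitened cross-covariance (the randomized piece here is scikit-learn's randomized_svd). The paper's RandomizedCCA targets out-of-core and distributed data, which this sketch does not attempt.

```python
# In-memory CCA via whitened cross-covariance plus a randomized truncated SVD.
import numpy as np
from sklearn.utils.extmath import randomized_svd

def simple_cca(X, Y, k, reg=1e-6):
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    Cxx = X.T @ X / n + reg * np.eye(X.shape[1])
    Cyy = Y.T @ Y / n + reg * np.eye(Y.shape[1])
    Cxy = X.T @ Y / n

    def inv_sqrt(C):
        # Inverse square root of a symmetric positive definite matrix.
        vals, vecs = np.linalg.eigh(C)
        return vecs @ np.diag(1.0 / np.sqrt(vals)) @ vecs.T

    Wx, Wy = inv_sqrt(Cxx), inv_sqrt(Cyy)
    U, s, Vt = randomized_svd(Wx @ Cxy @ Wy, n_components=k, random_state=0)
    return Wx @ U, Wy @ Vt.T, s   # projections and canonical correlations

rng = np.random.default_rng(0)
Z = rng.normal(size=(5000, 3))                      # shared latent signal
X = np.hstack([Z, rng.normal(size=(5000, 7))]) @ rng.normal(size=(10, 10))
Y = np.hstack([Z, rng.normal(size=(5000, 5))]) @ rng.normal(size=(8, 8))
A, B, corrs = simple_cca(X, Y, k=3)
print("top canonical correlations:", np.round(corrs, 3))
```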

Normalized Online Learning

no code implementations • 9 Aug 2014 • Stephane Ross, Paul Mineiro, John Langford

We introduce online learning algorithms which are independent of feature scales, proving regret bounds that depend on the ratio of scales present in the data rather than on the absolute scale.
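
A simplified, hedged rendition of the scale-invariance idea, assuming squared loss and dense features: track the largest magnitude seen per feature, rescale a weight when a larger magnitude for its feature arrives, and normalize each coordinate's step by that scale. The paper's exact NAG update (as implemented in Vowpal Wabbit) and its regret analysis are not reproduced here.

```python
# Scale-invariant online learning, simplified: per-feature scale tracking.
import numpy as np

class ScaleInvariantSGD:
    def __init__(self, dim, eta=0.3):
        self.w = np.zeros(dim)
        self.s = np.zeros(dim)        # largest |x_i| observed per feature
        self.eta = eta

    def learn(self, x, y):
        # Adjust weights to newly observed, larger feature scales.
        grew = np.abs(x) > self.s
        self.w[grew] *= (self.s[grew] / np.abs(x[grew])) ** 2
        self.s[grew] = np.abs(x[grew])
        pred = self.w @ x
        grad = (pred - y) * x         # squared-loss gradient
        nz = self.s > 0
        self.w[nz] -= self.eta * grad[nz] / self.s[nz] ** 2
        return pred

rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0, 0.5])
scales = np.array([1.0, 1000.0, 0.001])   # wildly different feature scales
model = ScaleInvariantSGD(dim=3)
for _ in range(5000):
    x = rng.normal(size=3) * scales
    model.learn(x, true_w @ (x / scales))  # target depends on the unscaled signal
print("learned weights, rescaled back:", np.round(model.w * scales, 2))
```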

Combining Structured and Unstructured Randomness in Large Scale PCA

no code implementations • 23 Oct 2013 • Nikos Karampatziakis, Paul Mineiro

Principal Component Analysis (PCA) is a ubiquitous tool with many applications in machine learning including feature construction, subspace embedding, and outlier detection.

Outlier Detection
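
For context, the standard randomized range-finder for PCA with a purely unstructured Gaussian test matrix is sketched below; the paper's contribution, combining structured randomness (fast transforms) with this kind of unstructured randomness, is not shown.

```python
# Randomized PCA via the Gaussian range-finder with one power iteration.
import numpy as np

def randomized_pca(A, k, oversample=10, power_iters=1, seed=0):
    rng = np.random.default_rng(seed)
    A = A - A.mean(axis=0)                       # center the data
    Omega = rng.normal(size=(A.shape[1], k + oversample))
    Y = A @ Omega                                # sample the range of A
    for _ in range(power_iters):                 # sharpen the spectrum
        Y = A @ (A.T @ Y)
    Q, _ = np.linalg.qr(Y)
    B = Q.T @ A                                  # small projected matrix
    _, s, Vt = np.linalg.svd(B, full_matrices=False)
    return Vt[:k], (s[:k] ** 2) / (A.shape[0] - 1)   # components, variances

rng = np.random.default_rng(1)
X = rng.normal(size=(5000, 50)) @ rng.normal(size=(50, 50)) * 0.1 \
    + np.outer(rng.normal(size=5000), rng.normal(size=50))
components, variances = randomized_pca(X, k=5)
print("top explained variances:", np.round(variances, 2))
```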

Discriminative Features via Generalized Eigenvectors

no code implementations • 7 Oct 2013 • Nikos Karampatziakis, Paul Mineiro

Representing examples in a way that is compatible with the underlying classifier can greatly enhance the performance of a learning system.

General Classification
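
A sketch of one natural reading of the title and abstract: for a pair of classes, solve the generalized eigenproblem between their class-conditional second-moment matrices and use squared projections onto the top generalized eigenvectors as features. The choice of class pairs, the number of eigenvectors, and the squaring nonlinearity are assumptions of this sketch, not details taken from the paper.

```python
# Discriminative feature construction from generalized eigenvectors of
# class-conditional second-moment matrices.
import numpy as np
from scipy.linalg import eigh

def gev_features(X, y, class_a, class_b, k=2, reg=1e-3):
    Xa, Xb = X[y == class_a], X[y == class_b]
    Ca = Xa.T @ Xa / len(Xa) + reg * np.eye(X.shape[1])
    Cb = Xb.T @ Xb / len(Xb) + reg * np.eye(X.shape[1])
    # Generalized eigenvectors of (Ca, Cb): large eigenvalues mark directions
    # where class a has much more energy than class b.
    vals, vecs = eigh(Ca, Cb)
    V = vecs[:, np.argsort(vals)[::-1][:k]]
    return (X @ V) ** 2        # squared projections as nonnegative features

rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=2000)
X = rng.normal(size=(2000, 10))
X[y == 1, 0] *= 3.0            # class 1 has extra variance along feature 0
F = gev_features(X, y, class_a=1, class_b=0, k=1)
print("mean squared projection per class:", F[y == 0].mean(), F[y == 1].mean())
```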

Loss-Proportional Subsampling for Subsequent ERM

no code implementations • 7 Jun 2013 • Paul Mineiro, Nikos Karampatziakis

We propose a sampling scheme suitable for reducing a data set prior to selecting a hypothesis with minimum empirical risk.
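
A hedged sketch of the sampling scheme as described: keep each example with probability roughly proportional to its loss under a cheap pilot hypothesis, mixed with a floor so probabilities stay positive, and carry importance weights 1/p into the subsequent ERM so the subsampled empirical risk stays unbiased. The floor and mixing constants below are illustrative choices, not the paper's exact scheme.

```python
# Loss-proportional Bernoulli subsampling with Horvitz-Thompson weights.
import numpy as np

def loss_proportional_subsample(losses, keep_fraction=0.1, floor=0.05, seed=0):
    rng = np.random.default_rng(seed)
    losses = np.asarray(losses, dtype=float)
    p = losses / losses.sum() * len(losses) * keep_fraction   # proportional part
    p = np.clip((1 - floor) * p + floor * keep_fraction, 0.0, 1.0)
    kept = rng.random(len(losses)) < p
    weights = 1.0 / p[kept]            # importance weights for the ERM that follows
    return np.flatnonzero(kept), weights

# Example: examples the pilot hypothesis gets badly wrong are kept far more often.
pilot_losses = np.array([0.01, 0.02, 0.01, 2.5, 0.03, 1.8, 0.02, 0.01])
idx, w = loss_proportional_subsample(pilot_losses, keep_fraction=0.25)
print("kept indices:", idx, "importance weights:", np.round(w, 1))
```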

Normalized Online Learning

1 code implementation • 28 May 2013 • Stephane Ross, Paul Mineiro, John Langford

We introduce online learning algorithms which are independent of feature scales, proving regret bounds that depend on the ratio of scales present in the data rather than on the absolute scale.
