Search Results for author: Stephane Ross

Found 8 papers, 2 papers with code

Normalized Online Learning

no code implementations • 9 Aug 2014 • Stephane Ross, Paul Mineiro, John Langford

We introduce online learning algorithms which are independent of feature scales, proving regret bounds dependent on the ratio of scales existent in the data rather than the absolute scale.

Reinforcement and Imitation Learning via Interactive No-Regret Learning

no code implementations • 23 Jun 2014 • Stephane Ross, J. Andrew Bagnell

Recent work has demonstrated that problems where a learner's predictions influence the input distribution it is tested on (particularly imitation learning and structured prediction) can be naturally addressed by an interactive approach and analyzed using no-regret online learning.

Imitation Learning, Reinforcement Learning, +2 more

A Credit Assignment Compiler for Joint Prediction

no code implementations • NeurIPS 2016 • Kai-Wei Chang, He He, Hal Daumé III, John Langford, Stephane Ross

Many machine learning applications involve jointly predicting multiple mutually dependent output variables.

Normalized Online Learning

1 code implementation • 28 May 2013 • Stephane Ross, Paul Mineiro, John Langford

We introduce online learning algorithms which are independent of feature scales, proving regret bounds dependent on the ratio of scales existent in the data rather than the absolute scale.

Learning Policies for Contextual Submodular Prediction

no code implementations • 11 May 2013 • Stephane Ross, Jiaji Zhou, Yisong Yue, Debadeepta Dey, J. Andrew Bagnell

Many prediction domains, such as ad placement, recommendation, trajectory prediction, and document summarization, require predicting a set or list of options.

Document Summarization, News Recommendation, +1 more

A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning

3 code implementations • 2 Nov 2010 • Stephane Ross, Geoffrey J. Gordon, J. Andrew Bagnell

Sequential prediction problems such as imitation learning, where future observations depend on previous predictions (actions), violate the common i.i.d. assumptions made in statistical learning.

Imitation Learning, Structured Prediction

Theoretical Analysis of Heuristic Search Methods for Online POMDPs

no code implementations • NeurIPS 2007 • Stephane Ross, Joelle Pineau, Brahim Chaib-Draa

The algorithm uses search heuristics based on an error analysis of lookahead search to guide the online search towards reachable beliefs with the most potential to reduce error.
