Search Results for author: Georgios Theocharous

Found 20 papers, 7 papers with code

Distributional Off-Policy Evaluation for Slate Recommendations

1 code implementation 27 Aug 2023 Shreyas Chaudhari, David Arbour, Georgios Theocharous, Nikos Vlassis

Prior work has developed estimators that leverage the structure in slates to estimate the expected off-policy performance, but the estimation of the entire performance distribution remains elusive.

Fairness, Off-policy evaluation

Coagent Networks: Generalized and Scaled

no code implementations 16 May 2023 James E. Kostas, Scott M. Jordan, Yash Chandak, Georgios Theocharous, Dhawal Gupta, Martha White, Bruno Castro da Silva, Philip S. Thomas

However, the coagent framework is not just an alternative to BDL; the two approaches can be blended: BDL can be combined with coagent learning rules to create architectures with the advantages of both approaches.

MuJoCo, Reinforcement Learning (RL)

Explaining RL Decisions with Trajectories

2 code implementations 6 May 2023 Shripad Vilasrao Deshmukh, Arpan Dasgupta, Balaji Krishnamurthy, Nan Jiang, Chirag Agarwal, Georgios Theocharous, Jayakumar Subramanian

To do so, we encode trajectories in offline training data individually as well as collectively (encoding a set of trajectories).

Attribute, continuous-control, +5

Personalized Detection of Cognitive Biases in Actions of Users from Their Logs: Anchoring and Recency Biases

no code implementations 30 Jun 2022 Atanu R Sinha, Navita Goyal, Sunny Dhamnani, Tanay Asija, Raja K Dubey, M V Kaarthik Raja, Georgios Theocharous

The recognition of cognitive bias in computer science is largely in the domain of information retrieval, and bias is identified at an aggregate level with the help of annotated data.

Bias Detection, Ethics, +3

Smoothed Online Combinatorial Optimization Using Imperfect Predictions

no code implementations 23 Apr 2022 Kai Wang, Zhao Song, Georgios Theocharous, Sridhar Mahadevan

Smoothed online combinatorial optimization considers a learner who repeatedly chooses a combinatorial decision to minimize an unknown changing cost function with a penalty on switching decisions in consecutive rounds.

Combinatorial Optimization

Off-Policy Evaluation in Embedded Spaces

no code implementations 5 Mar 2022 Jaron J. R. Lee, David Arbour, Georgios Theocharous

Second, many recommendation systems are not probabilistic and so having access to logging and target policy densities may not be feasible.

Density Ratio Estimation, Off-policy evaluation, +1

Constraint Sampling Reinforcement Learning: Incorporating Expertise For Faster Learning

1 code implementation 30 Dec 2021 Tong Mu, Georgios Theocharous, David Arbour, Emma Brunskill

Online reinforcement learning (RL) algorithms are often difficult to deploy in complex human-facing applications as they may learn slowly and have poor early performance.

reinforcement-learning, Reinforcement Learning, +1

Edge-Compatible Reinforcement Learning for Recommendations

no code implementations 10 Dec 2021 James E. Kostas, Philip S. Thomas, Georgios Theocharous

In this work, we build on asynchronous coagent policy gradient algorithms (Kostas et al., 2020) to propose a principled solution to this problem.

Edge-computing, Recommendation Systems, +3

Multiscale Manifold Warping

no code implementations 19 Sep 2021 Sridhar Mahadevan, Anup Rao, Georgios Theocharous, Jennifer Healey

Many real-world applications require aligning two temporal sequences, including bioinformatics, handwriting recognition, activity recognition, and human-robot coordination.

Activity Recognition, Dynamic Time Warping, +2

Reinforcement Learning for Strategic Recommendations

no code implementations 15 Sep 2020 Georgios Theocharous, Yash Chandak, Philip S. Thomas, Frits de Nijs

Strategic recommendations (SR) refer to the problem where an intelligent agent observes the sequential behaviors and activities of users and decides when and how to interact with them to optimize some long-term objectives, both for the user and the business.

reinforcement-learning, Reinforcement Learning, +1

Optimizing for the Future in Non-Stationary MDPs

1 code implementation ICML 2020 Yash Chandak, Georgios Theocharous, Shiv Shankar, Martha White, Sridhar Mahadevan, Philip S. Thomas

Most reinforcement learning methods are based upon the key assumption that the transition dynamics and reward functions are fixed, that is, the underlying Markov decision process is stationary.

Lifelong Learning with a Changing Action Set

1 code implementation 5 Jun 2019 Yash Chandak, Georgios Theocharous, Chris Nota, Philip S. Thomas

have been well-studied in the lifelong learning literature, the setting where the action set changes remains unaddressed.

Decision Making, Sequential Decision Making

Reinforcement Learning When All Actions are Not Always Available

1 code implementation 5 Jun 2019 Yash Chandak, Georgios Theocharous, Blossom Metevier, Philip S. Thomas

The Markov decision process (MDP) formulation used to model many real-world sequential decision making problems does not efficiently capture the setting where the set of available decisions (actions) at each time step is stochastic.

Decision Making, reinforcement-learning, +3

Learning Action Representations for Reinforcement Learning

no code implementations 1 Feb 2019 Yash Chandak, Georgios Theocharous, James Kostas, Scott Jordan, Philip S. Thomas

Most model-free reinforcement learning methods leverage state representations (embeddings) for generalization, but either ignore structure in the space of actions or assume the structure is provided a priori.

reinforcement-learning, Reinforcement Learning, +1

Scalar Posterior Sampling with Applications

no code implementations NeurIPS 2018 Georgios Theocharous, Zheng Wen, Yasin Abbasi, Nikos Vlassis

Our algorithm termed deterministic schedule PSRL (DS-PSRL) is efficient in terms of time, sample, and space complexity.

Posterior Sampling for Large Scale Reinforcement Learning

no code implementations 21 Nov 2017 Georgios Theocharous, Zheng Wen, Yasin Abbasi-Yadkori, Nikos Vlassis

Our algorithm termed deterministic schedule PSRL (DS-PSRL) is efficient in terms of time, sample, and space complexity.

reinforcement-learning, Reinforcement Learning, +1

Personalized Advertisement Recommendation: A Ranking Approach to Address the Ubiquitous Click Sparsity Problem

no code implementations 6 Mar 2016 Sougata Chaudhuri, Georgios Theocharous, Mohammad Ghavamzadeh

We study the problem of personalized advertisement recommendation (PAR), which consists of a user visiting a system (website) and the system displaying one of $K$ ads to the user.

Graphical Model Sketch

no code implementations 9 Feb 2016 Branislav Kveton, Hung Bui, Mohammad Ghavamzadeh, Georgios Theocharous, S. Muthukrishnan, Siqi Sun

Graphical models are a popular approach to modeling structured data but they are unsuitable for high-cardinality variables.


Policy Evaluation Using the Ω-Return

no code implementations NeurIPS 2015 Philip S. Thomas, Scott Niekum, Georgios Theocharous, George Konidaris

The benefit of the Ω-return is that it accounts for the correlation of different length returns.
