Search Results for author: Martin Mladenov

Found 21 papers, 3 papers with code

Demystifying Embedding Spaces using Large Language Models

no code implementations6 Oct 2023 Guy Tennenholtz, Yinlam Chow, Chih-Wei Hsu, Jihwan Jeong, Lior Shani, Azamat Tulepbergenov, Deepak Ramachandran, Martin Mladenov, Craig Boutilier

Embeddings have become a pivotal means to represent complex, multi-faceted information about entities, concepts, and relationships in a condensed and useful format.

Dimensionality Reduction Recommendation Systems

Modeling Recommender Ecosystems: Research Challenges at the Intersection of Mechanism Design, Reinforcement Learning and Generative Models

no code implementations8 Sep 2023 Craig Boutilier, Martin Mladenov, Guy Tennenholtz

Modern recommender systems lie at the heart of complex ecosystems that couple the behavior of users, content providers, advertisers, and other actors.

Recommendation Systems

Content Prompting: Modeling Content Provider Dynamics to Improve User Welfare in Recommender Ecosystems

no code implementations2 Sep 2023 Siddharth Prasad, Martin Mladenov, Craig Boutilier

A prompting policy is a sequence of such prompts that is responsive to the dynamics of a provider's beliefs, skills and incentives.

Recommendation Systems

Ranking with Popularity Bias: User Welfare under Self-Amplification Dynamics

no code implementations24 May 2023 Guy Tennenholtz, Martin Mladenov, Nadav Merlis, Robert L. Axtell, Craig Boutilier

We highlight the importance of exploration, not to eliminate popularity bias, but to mitigate its negative impact on welfare.

Reinforcement Learning with History-Dependent Dynamic Contexts

no code implementations4 Feb 2023 Guy Tennenholtz, Nadav Merlis, Lior Shani, Martin Mladenov, Craig Boutilier

We introduce Dynamic Contextual Markov Decision Processes (DCMDPs), a novel reinforcement learning framework for history-dependent environments that generalizes the contextual MDP framework to handle non-Markov environments, where contexts change over time.

reinforcement-learning Reinforcement Learning (RL)
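In sketch form, a contextual MDP assigns each context a fixed MDP, while a DCMDP additionally lets the context itself evolve with the interaction history. One plausible way to write this down (an illustrative formalization, not necessarily the paper's exact definition):

```latex
% Contextual MDP: a context c fixes an MDP (S, A, P_c, r_c).
% History-dependent generalization: the context transitions as a
% function of the history h_t, making the process non-Markov in s alone.
h_t = (c_0, s_0, a_0, \dots, c_t, s_t, a_t), \qquad
c_{t+1} \sim T(\,\cdot \mid h_t\,), \qquad
s_{t+1} \sim P_{c_{t+1}}(\,\cdot \mid s_t, a_t\,)
```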

pyRDDLGym: From RDDL to Gym Environments

2 code implementations11 Nov 2022 Ayal Taitler, Michael Gimelfarb, Jihwan Jeong, Sriram Gopalakrishnan, Martin Mladenov, Xiaotian Liu, Scott Sanner

We present pyRDDLGym, a Python framework for the auto-generation of OpenAI Gym environments from RDDL declarative descriptions.

OpenAI Gym
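Since the generated environments follow the standard OpenAI Gym interface, any of them can be driven by the usual reset/step loop. A minimal sketch of that loop, using a hand-written stand-in environment rather than a real RDDL domain (the class `ToyGymEnv` and its dynamics are purely illustrative, not part of pyRDDLGym's API):

```python
import random

class ToyGymEnv:
    """Stand-in for a pyRDDLGym-generated environment; mimics the Gym API."""
    def __init__(self, horizon=5):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return {"position": 0.0}            # observation (cf. RDDL state fluents)

    def step(self, action):
        self.t += 1
        obs = {"position": float(self.t) * action}
        reward = -abs(obs["position"])      # stand-in for the RDDL reward
        done = self.t >= self.horizon       # episode ends at the horizon
        return obs, reward, done, {}

# The standard Gym interaction loop that any generated environment supports
env = ToyGymEnv()
obs, total_reward, done = env.reset(), 0.0, False
while not done:
    action = random.choice([-1, 1])         # a real agent would act here
    obs, reward, done, info = env.step(action)
    total_reward += reward
```

The point of the framework is that the `reset`/`step` pair is produced automatically from the RDDL description, so existing Gym-compatible agents can be applied unchanged.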

Towards Content Provider Aware Recommender Systems: A Simulation Study on the Interplay between User and Provider Utilities

no code implementations6 May 2021 Ruohan Zhan, Konstantina Christakopoulou, Ya Le, Jayden Ooi, Martin Mladenov, Alex Beutel, Craig Boutilier, Ed H. Chi, Minmin Chen

We then build a REINFORCE recommender agent, coined EcoAgent, to optimize a joint objective of user utility and the counterfactual utility lift of the provider associated with the recommended content. Under some mild assumptions, we show this objective to be equivalent to maximizing overall user utility and the utilities of all providers on the platform.

counterfactual Recommendation Systems

RecSim NG: Toward Principled Uncertainty Modeling for Recommender Ecosystems

1 code implementation14 Mar 2021 Martin Mladenov, Chih-Wei Hsu, Vihan Jain, Eugene Ie, Christopher Colby, Nicolas Mayoraz, Hubert Pham, Dustin Tran, Ivan Vendrov, Craig Boutilier

The development of recommender systems that optimize multi-turn interaction with users, and that model the interactions of different agents (e.g., users, content providers, vendors) in the recommender ecosystem, has drawn increasing attention in recent years.

counterfactual Probabilistic Programming +1

Differentiable Meta-Learning of Bandit Policies

no code implementations NeurIPS 2020 Craig Boutilier, Chih-Wei Hsu, Branislav Kveton, Martin Mladenov, Csaba Szepesvari, Manzil Zaheer

Exploration policies in Bayesian bandits maximize the average reward over problem instances drawn from some distribution P. In this work, we learn such policies for an unknown distribution P using samples from P. Our approach is a form of meta-learning and exploits properties of P without making strong assumptions about its form.

Meta-Learning
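The idea can be illustrated in miniature: sample bandit instances from a prior standing in for P, and run gradient ascent on the average reward of a softmax action policy. This toy sketch uses an assumed Beta prior and a non-adaptive policy for simplicity; it is not the paper's implementation:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def meta_train_policy(n_steps=2000, lr=1.0, seed=0):
    """Gradient ascent on average reward over bandit instances sampled
    from a stand-in prior (arm 0 ~ Beta(8, 2), arm 1 ~ Beta(2, 8))."""
    rng = np.random.default_rng(seed)
    theta = np.zeros(2)                     # softmax policy parameters
    for _ in range(n_steps):
        # sample a problem instance (arm means) from the prior P
        means = np.array([rng.beta(8, 2), rng.beta(2, 8)])
        pi = softmax(theta)
        # exact gradient of this instance's expected reward, pi . means
        grad = pi * (means - pi @ means)
        theta += lr * grad
    return softmax(theta)

policy = meta_train_policy()
```

Because arm 0 has the higher mean under the prior, the learned policy concentrates on it: the procedure exploits properties of P purely through samples, without assuming its form.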

Optimizing Long-term Social Welfare in Recommender Systems: A Constrained Matching Approach

no code implementations ICML 2020 Martin Mladenov, Elliot Creager, Omer Ben-Porat, Kevin Swersky, Richard Zemel, Craig Boutilier

We develop several scalable techniques to solve the matching problem, and also draw connections to various notions of user regret and fairness, arguing that these outcomes are fairer in a utilitarian sense.

Fairness Recommendation Systems
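The matching view can be made concrete in miniature: given a user–provider utility matrix, a one-to-one matching maximizing total welfare can be found by exhaustive search on tiny instances (the paper's scalable techniques replace this brute force). The utility values below are hypothetical:

```python
from itertools import permutations

# Hypothetical utilities: utility[u][p] = value user u derives from provider p
utility = [
    [3.0, 1.0, 0.0],
    [2.0, 2.0, 1.0],
    [0.0, 1.0, 3.0],
]

def max_welfare_matching(utility):
    """Exhaustively search one-to-one user->provider assignments
    for the one with maximum total (utilitarian) welfare."""
    n = len(utility)
    best_value, best_match = float("-inf"), None
    for perm in permutations(range(n)):
        value = sum(utility[u][perm[u]] for u in range(n))
        if value > best_value:
            best_value, best_match = value, perm
    return best_match, best_value

match, welfare = max_welfare_matching(utility)
```

Here the welfare-optimal matching assigns each user their column-wise best provider where possible, trading off user 1's tie to keep total welfare at its maximum.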

Meta-Learning Bandit Policies by Gradient Ascent

no code implementations9 Jun 2020 Branislav Kveton, Martin Mladenov, Chih-Wei Hsu, Manzil Zaheer, Csaba Szepesvari, Craig Boutilier

Most bandit policies are designed either to minimize regret in any problem instance, making very few assumptions about the underlying environment, or to minimize it in a Bayesian sense, assuming a prior distribution over environment parameters.

Meta-Learning Multi-Armed Bandits

Differentiable Bandit Exploration

no code implementations NeurIPS 2020 Craig Boutilier, Chih-Wei Hsu, Branislav Kveton, Martin Mladenov, Csaba Szepesvari, Manzil Zaheer

In this work, we learn such policies for an unknown distribution $\mathcal{P}$ using samples from $\mathcal{P}$.

Meta-Learning

RecSim: A Configurable Simulation Platform for Recommender Systems

1 code implementation11 Sep 2019 Eugene Ie, Chih-Wei Hsu, Martin Mladenov, Vihan Jain, Sanmit Narvekar, Jing Wang, Rui Wu, Craig Boutilier

We propose RecSim, a configurable platform for authoring simulation environments for recommender systems (RSs) that naturally supports sequential interaction with users.

Recommendation Systems reinforcement-learning +1

Advantage Amplification in Slowly Evolving Latent-State Environments

no code implementations29 May 2019 Martin Mladenov, Ofer Meshi, Jayden Ooi, Dale Schuurmans, Craig Boutilier

Latent-state environments with long horizons, such as those faced by recommender systems, pose significant challenges for reinforcement learning (RL).

Recommendation Systems reinforcement-learning +1

Empirical Bayes Regret Minimization

no code implementations4 Apr 2019 Chih-Wei Hsu, Branislav Kveton, Ofer Meshi, Martin Mladenov, Csaba Szepesvari

In this work, we pioneer the idea of algorithm design by minimizing the empirical Bayes regret, the average regret over problem instances sampled from a known distribution.
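A toy version of the idea: tune a free parameter of a bandit algorithm (here the UCB exploration coefficient) by minimizing average regret over problem instances sampled from a known distribution. The instance distribution, horizon, and candidate coefficients below are illustrative choices, not the paper's setup:

```python
import numpy as np

def run_ucb(means, horizon, c, rng):
    """UCB with exploration coefficient c on a Bernoulli bandit;
    returns the (pseudo-)regret of the run."""
    k = len(means)
    counts = np.ones(k)                         # pull each arm once to start
    totals = rng.binomial(1, means).astype(float)
    for t in range(k, horizon):
        ucb = totals / counts + c * np.sqrt(np.log(t + 1) / counts)
        a = int(np.argmax(ucb))
        totals[a] += rng.binomial(1, means[a])
        counts[a] += 1
    return horizon * means.max() - counts @ means

def tune_by_empirical_bayes_regret(candidates, n_instances=100,
                                   horizon=300, seed=0):
    """Pick the coefficient with lowest average regret over instances
    sampled from a known distribution (here: uniform arm means)."""
    rng = np.random.default_rng(seed)
    instances = rng.uniform(0.0, 1.0, size=(n_instances, 2))
    avg_regret = [
        np.mean([run_ucb(m, horizon, c, rng) for m in instances])
        for c in candidates
    ]
    return candidates[int(np.argmin(avg_regret))]

best_c = tune_by_empirical_bayes_regret([0.1, 1.0, 10.0])
```

An over-aggressive coefficient explores far too much at this horizon, so empirical Bayes tuning rejects it in favor of a smaller one.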

Lifted Convex Quadratic Programming

no code implementations14 Jun 2016 Martin Mladenov, Leonard Kleinhans, Kristian Kersting

Symmetry is the essential element of lifted inference, which has recently demonstrated the possibility of performing very efficient inference in highly connected, but symmetric, probabilistic models.

The Symbolic Interior Point Method

no code implementations26 May 2016 Martin Mladenov, Vaishak Belle, Kristian Kersting

A recent trend in probabilistic inference emphasizes the codification of models in a formal syntax, with suitable high-level features such as individuals, relations, and connectives, enabling descriptive clarity, succinctness and circumventing the need for the modeler to engineer a custom solver.

Decision Making Descriptive

Relational Linear Programs

no code implementations12 Oct 2014 Kristian Kersting, Martin Mladenov, Pavel Tokmakov

A relational linear program (RLP) is a declarative LP template defining the objective and the constraints through the logical concepts of objects, relations, and quantified variables.

Dimension Reduction via Colour Refinement

no code implementations22 Jul 2013 Martin Grohe, Kristian Kersting, Martin Mladenov, Erkal Selman

We demonstrate empirically that colour refinement can indeed greatly reduce the cost of solving linear programs.

Dimensionality Reduction Isomorphism Testing +1
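Colour refinement groups vertices that are structurally indistinguishable, which is what allows variables and constraints of a symmetric LP to be aggregated. A minimal sketch of the refinement step itself (not the LP reduction), on an adjacency-list graph:

```python
def colour_refinement(adjacency):
    """Iterated colour refinement (1-dimensional Weisfeiler-Leman):
    refine vertex colours by the multiset of neighbour colours
    until the partition stabilizes."""
    colours = {v: 0 for v in adjacency}
    while True:
        # Signature = own colour plus sorted multiset of neighbour colours
        signatures = {
            v: (colours[v], tuple(sorted(colours[u] for u in adjacency[v])))
            for v in adjacency
        }
        # Canonically renumber the distinct signatures
        palette = {s: i for i, s in enumerate(sorted(set(signatures.values())))}
        new_colours = {v: palette[signatures[v]] for v in adjacency}
        # Classes only ever split, so an unchanged count means stability
        if len(set(new_colours.values())) == len(set(colours.values())):
            return new_colours
        colours = new_colours

# Path graph 0-1-2-3: the two endpoints are indistinguishable, as are
# the two middle vertices, so refinement yields exactly two classes
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
classes = colour_refinement(path)
```

Replacing each colour class by a single aggregated variable is what yields the smaller, equivalent LP in the symmetric case.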
