Search Results for author: Martin Mladenov

Found 21 papers, 3 papers with code

Demystifying Embedding Spaces using Large Language Models

no code implementations6 Oct 2023 Guy Tennenholtz, Yinlam Chow, Chih-Wei Hsu, Jihwan Jeong, Lior Shani, Azamat Tulepbergenov, Deepak Ramachandran, Martin Mladenov, Craig Boutilier

Embeddings have become a pivotal means to represent complex, multi-faceted information about entities, concepts, and relationships in a condensed and useful format.

Dimensionality Reduction Recommendation Systems

Modeling Recommender Ecosystems: Research Challenges at the Intersection of Mechanism Design, Reinforcement Learning and Generative Models

no code implementations8 Sep 2023 Craig Boutilier, Martin Mladenov, Guy Tennenholtz

Modern recommender systems lie at the heart of complex ecosystems that couple the behavior of users, content providers, advertisers, and other actors.

Recommendation Systems

Content Prompting: Modeling Content Provider Dynamics to Improve User Welfare in Recommender Ecosystems

no code implementations2 Sep 2023 Siddharth Prasad, Martin Mladenov, Craig Boutilier

A prompting policy is a sequence of such prompts that is responsive to the dynamics of a provider's beliefs, skills and incentives.

Recommendation Systems

Ranking with Popularity Bias: User Welfare under Self-Amplification Dynamics

no code implementations24 May 2023 Guy Tennenholtz, Martin Mladenov, Nadav Merlis, Robert L. Axtell, Craig Boutilier

We highlight the importance of exploration, not to eliminate popularity bias, but to mitigate its negative impact on welfare.

Reinforcement Learning with History-Dependent Dynamic Contexts

no code implementations4 Feb 2023 Guy Tennenholtz, Nadav Merlis, Lior Shani, Martin Mladenov, Craig Boutilier

We introduce Dynamic Contextual Markov Decision Processes (DCMDPs), a novel reinforcement learning framework for history-dependent environments that generalizes the contextual MDP framework to handle non-Markov environments, where contexts change over time.

reinforcement-learning Reinforcement Learning (RL)
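In sketch form, a contextual MDP assigns each context a fixed MDP, while a DCMDP additionally lets the context itself evolve with the interaction history. One plausible way to write this down (an illustrative formalization, not necessarily the paper's exact definition):

```latex
% Contextual MDP: a context c fixes an MDP (S, A, P_c, r_c).
% History-dependent generalization: the context transitions as a
% function of the history h_t, making the process non-Markov in s alone.
h_t = (c_0, s_0, a_0, \dots, c_t, s_t, a_t), \qquad
c_{t+1} \sim T(\,\cdot \mid h_t\,), \qquad
s_{t+1} \sim P_{c_{t+1}}(\,\cdot \mid s_t, a_t\,)
```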

pyRDDLGym: From RDDL to Gym Environments

2 code implementations11 Nov 2022 Ayal Taitler, Michael Gimelfarb, Jihwan Jeong, Sriram Gopalakrishnan, Martin Mladenov, Xiaotian Liu, Scott Sanner

We present pyRDDLGym, a Python framework for the auto-generation of OpenAI Gym environments from RDDL declarative descriptions.

OpenAI Gym
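Since the generated environments follow the standard OpenAI Gym interface, any of them can be driven by the usual reset/step loop. A minimal sketch of that loop, using a hand-written stand-in environment rather than a real RDDL domain (the class `ToyGymEnv` and its dynamics are purely illustrative, not part of pyRDDLGym's API):

```python
import random

class ToyGymEnv:
    """Stand-in for a pyRDDLGym-generated environment; mimics the Gym API."""
    def __init__(self, horizon=5):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return {"position": 0.0}            # observation (cf. RDDL state fluents)

    def step(self, action):
        self.t += 1
        obs = {"position": float(self.t) * action}
        reward = -abs(obs["position"])      # stand-in for the RDDL reward
        done = self.t >= self.horizon       # episode ends at the horizon
        return obs, reward, done, {}

# The standard Gym interaction loop that any generated environment supports
env = ToyGymEnv()
obs, total_reward, done = env.reset(), 0.0, False
while not done:
    action = random.choice([-1, 1])         # a real agent would act here
    obs, reward, done, info = env.step(action)
    total_reward += reward
```

The point of the framework is that the `reset`/`step` pair is produced automatically from the RDDL description, so existing Gym-compatible agents can be applied unchanged.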

Towards Content Provider Aware Recommender Systems: A Simulation Study on the Interplay between User and Provider Utilities

no code implementations6 May 2021 Ruohan Zhan, Konstantina Christakopoulou, Ya Le, Jayden Ooi, Martin Mladenov, Alex Beutel, Craig Boutilier, Ed H. Chi, Minmin Chen

We then build a REINFORCE recommender agent, coined EcoAgent, to optimize a joint objective of user utility and the counterfactual utility lift of the provider associated with the recommended content. Under some mild assumptions, we show this objective to be equivalent to maximizing overall user utility and the utilities of all providers on the platform.

counterfactual Recommendation Systems

RecSim NG: Toward Principled Uncertainty Modeling for Recommender Ecosystems

1 code implementation14 Mar 2021 Martin Mladenov, Chih-Wei Hsu, Vihan Jain, Eugene Ie, Christopher Colby, Nicolas Mayoraz, Hubert Pham, Dustin Tran, Ivan Vendrov, Craig Boutilier

The development of recommender systems that optimize multi-turn interaction with users, and that model the interactions of different agents (e.g., users, content providers, vendors) in the recommender ecosystem, has drawn increasing attention in recent years.

counterfactual Probabilistic Programming +1

Differentiable Meta-Learning of Bandit Policies

no code implementations NeurIPS 2020 Craig Boutilier, Chih-Wei Hsu, Branislav Kveton, Martin Mladenov, Csaba Szepesvari, Manzil Zaheer

Exploration policies in Bayesian bandits maximize the average reward over problem instances drawn from some distribution P. In this work, we learn such policies for an unknown distribution P using samples from P. Our approach is a form of meta-learning and exploits properties of P without making strong assumptions about its form.

Meta-Learning
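The idea can be illustrated in miniature: sample bandit instances from a prior standing in for P, and run gradient ascent on the average reward of a softmax action policy. This toy sketch uses an assumed Beta prior and a non-adaptive policy for simplicity; it is not the paper's implementation:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def meta_train_policy(n_steps=2000, lr=1.0, seed=0):
    """Gradient ascent on average reward over bandit instances sampled
    from a stand-in prior (arm 0 ~ Beta(8, 2), arm 1 ~ Beta(2, 8))."""
    rng = np.random.default_rng(seed)
    theta = np.zeros(2)                     # softmax policy parameters
    for _ in range(n_steps):
        # sample a problem instance (arm means) from the prior P
        means = np.array([rng.beta(8, 2), rng.beta(2, 8)])
        pi = softmax(theta)
        # exact gradient of this instance's expected reward, pi . means
        grad = pi * (means - pi @ means)
        theta += lr * grad
    return softmax(theta)

policy = meta_train_policy()
```

Because arm 0 has the higher mean under the prior, the learned policy concentrates on it: the procedure exploits properties of P purely through samples, without assuming its form.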

Optimizing Long-term Social Welfare in Recommender Systems: A Constrained Matching Approach

no code implementations ICML 2020 Martin Mladenov, Elliot Creager, Omer Ben-Porat, Kevin Swersky, Richard Zemel, Craig Boutilier

We develop several scalable techniques to solve the matching problem, and also draw connections to various notions of user regret and fairness, arguing that these outcomes are fairer in a utilitarian sense.

Fairness Recommendation Systems
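The matching view can be made concrete in miniature: given a user–provider utility matrix, a one-to-one matching maximizing total welfare can be found by exhaustive search on tiny instances (the paper's scalable techniques replace this brute force). The utility values below are hypothetical:

```python
from itertools import permutations

# Hypothetical utilities: utility[u][p] = value user u derives from provider p
utility = [
    [3.0, 1.0, 0.0],
    [2.0, 2.0, 1.0],
    [0.0, 1.0, 3.0],
]

def max_welfare_matching(utility):
    """Exhaustively search one-to-one user->provider assignments
    for the one with maximum total (utilitarian) welfare."""
    n = len(utility)
    best_value, best_match = float("-inf"), None
    for perm in permutations(range(n)):
        value = sum(utility[u][perm[u]] for u in range(n))
        if value > best_value:
            best_value, best_match = value, perm
    return best_match, best_value

match, welfare = max_welfare_matching(utility)
```

Here the welfare-optimal matching assigns each user their column-wise best provider where possible, trading off user 1's tie to keep total welfare at its maximum.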

Meta-Learning Bandit Policies by Gradient Ascent

no code implementations9 Jun 2020 Branislav Kveton, Martin Mladenov, Chih-Wei Hsu, Manzil Zaheer, Csaba Szepesvari, Craig Boutilier

Most bandit policies are designed either to minimize regret in any problem instance, making very few assumptions about the underlying environment, or to minimize it in a Bayesian sense, assuming a prior distribution over environment parameters.

Meta-Learning Multi-Armed Bandits

Differentiable Bandit Exploration

no code implementations NeurIPS 2020 Craig Boutilier, Chih-Wei Hsu, Branislav Kveton, Martin Mladenov, Csaba Szepesvari, Manzil Zaheer

In this work, we learn such policies for an unknown distribution $\mathcal{P}$ using samples from $\mathcal{P}$.

Meta-Learning

RecSim: A Configurable Simulation Platform for Recommender Systems

1 code implementation11 Sep 2019 Eugene Ie, Chih-Wei Hsu, Martin Mladenov, Vihan Jain, Sanmit Narvekar, Jing Wang, Rui Wu, Craig Boutilier

We propose RecSim, a configurable platform for authoring simulation environments for recommender systems (RSs) that naturally supports sequential interaction with users.

Recommendation Systems reinforcement-learning +1

Advantage Amplification in Slowly Evolving Latent-State Environments

no code implementations29 May 2019 Martin Mladenov, Ofer Meshi, Jayden Ooi, Dale Schuurmans, Craig Boutilier

Latent-state environments with long horizons, such as those faced by recommender systems, pose significant challenges for reinforcement learning (RL).

Recommendation Systems reinforcement-learning +1

Empirical Bayes Regret Minimization

no code implementations4 Apr 2019 Chih-Wei Hsu, Branislav Kveton, Ofer Meshi, Martin Mladenov, Csaba Szepesvari

In this work, we pioneer the idea of algorithm design by minimizing the empirical Bayes regret, the average regret over problem instances sampled from a known distribution.
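A toy version of the idea: tune a free parameter of a bandit algorithm (here the UCB exploration coefficient) by minimizing average regret over problem instances sampled from a known distribution. The instance distribution, horizon, and candidate coefficients below are illustrative choices, not the paper's setup:

```python
import numpy as np

def run_ucb(means, horizon, c, rng):
    """UCB with exploration coefficient c on a Bernoulli bandit;
    returns the (pseudo-)regret of the run."""
    k = len(means)
    counts = np.ones(k)                         # pull each arm once to start
    totals = rng.binomial(1, means).astype(float)
    for t in range(k, horizon):
        ucb = totals / counts + c * np.sqrt(np.log(t + 1) / counts)
        a = int(np.argmax(ucb))
        totals[a] += rng.binomial(1, means[a])
        counts[a] += 1
    return horizon * means.max() - counts @ means

def tune_by_empirical_bayes_regret(candidates, n_instances=100,
                                   horizon=300, seed=0):
    """Pick the coefficient with lowest average regret over instances
    sampled from a known distribution (here: uniform arm means)."""
    rng = np.random.default_rng(seed)
    instances = rng.uniform(0.0, 1.0, size=(n_instances, 2))
    avg_regret = [
        np.mean([run_ucb(m, horizon, c, rng) for m in instances])
        for c in candidates
    ]
    return candidates[int(np.argmin(avg_regret))]

best_c = tune_by_empirical_bayes_regret([0.1, 1.0, 10.0])
```

An over-aggressive coefficient explores far too much at this horizon, so empirical Bayes tuning rejects it in favor of a smaller one.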

Lifted Convex Quadratic Programming

no code implementations14 Jun 2016 Martin Mladenov, Leonard Kleinhans, Kristian Kersting

Symmetry is the essential element of lifted inference, which has recently demonstrated the possibility of performing very efficient inference in highly connected, but symmetric, probabilistic models.

The Symbolic Interior Point Method

no code implementations26 May 2016 Martin Mladenov, Vaishak Belle, Kristian Kersting

A recent trend in probabilistic inference emphasizes the codification of models in a formal syntax, with suitable high-level features such as individuals, relations, and connectives, enabling descriptive clarity, succinctness and circumventing the need for the modeler to engineer a custom solver.

Decision Making Descriptive

Relational Linear Programs

no code implementations12 Oct 2014 Kristian Kersting, Martin Mladenov, Pavel Tokmakov

A relational linear program (RLP) is a declarative LP template defining the objective and the constraints through the logical concepts of objects, relations, and quantified variables.

Dimension Reduction via Colour Refinement

no code implementations22 Jul 2013 Martin Grohe, Kristian Kersting, Martin Mladenov, Erkal Selman

We demonstrate empirically that colour refinement can indeed greatly reduce the cost of solving linear programs.

Dimensionality Reduction Isomorphism Testing +1
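Colour refinement groups vertices that are structurally indistinguishable, which is what allows variables and constraints of a symmetric LP to be aggregated. A minimal sketch of the refinement step itself (not the LP reduction), on an adjacency-list graph:

```python
def colour_refinement(adjacency):
    """Iterated colour refinement (1-dimensional Weisfeiler-Leman):
    refine vertex colours by the multiset of neighbour colours
    until the partition stabilizes."""
    colours = {v: 0 for v in adjacency}
    while True:
        # Signature = own colour plus sorted multiset of neighbour colours
        signatures = {
            v: (colours[v], tuple(sorted(colours[u] for u in adjacency[v])))
            for v in adjacency
        }
        # Canonically renumber the distinct signatures
        palette = {s: i for i, s in enumerate(sorted(set(signatures.values())))}
        new_colours = {v: palette[signatures[v]] for v in adjacency}
        # Classes only ever split, so an unchanged count means stability
        if len(set(new_colours.values())) == len(set(colours.values())):
            return new_colours
        colours = new_colours

# Path graph 0-1-2-3: the two endpoints are indistinguishable, as are
# the two middle vertices, so refinement yields exactly two classes
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
classes = colour_refinement(path)
```

Replacing each colour class by a single aggregated variable is what yields the smaller, equivalent LP in the symmetric case.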
