Search Results for author: Dimitri Bertsekas

Found 14 papers, 1 paper with code

Most Likely Sequence Generation for $n$-Grams, Transformers, HMMs, and Markov Chains, by Using Rollout Algorithms

no code implementations • 19 Mar 2024 • Yuchao Li, Dimitri Bertsekas

We consider methods for computing word sequences that are highly likely, based on these probabilities.

Approximate Multiagent Reinforcement Learning for On-Demand Urban Mobility Problem on a Large Map (extended version)

no code implementations • 2 Nov 2023 • Daniel Garces, Sushmita Bhattacharya, Dimitri Bertsekas, Stephanie Gil

We provide two main theoretical results: 1) characterize the number of taxis $m$ that is sufficient for IA to be stable; 2) derive a necessary condition on $m$ to maintain stability for IA as time goes to infinity.

Rollout Algorithms and Approximate Dynamic Programming for Bayesian Optimization and Sequential Estimation

no code implementations • 15 Dec 2022 • Dimitri Bertsekas

We provide a unifying approximate dynamic programming framework that applies to a broad variety of problems involving sequential estimation.

Bayesian Optimization

Multiagent Reinforcement Learning for Autonomous Routing and Pickup Problem with Adaptation to Variable Demand

no code implementations • 28 Nov 2022 • Daniel Garces, Sushmita Bhattacharya, Stephanie Gil, Dimitri Bertsekas

We propose a mechanism for switching the originally trained offline approximation when the current demand is outside the original validity region.

Autonomous Vehicles

Reinforcement Learning Methods for Wordle: A POMDP/Adaptive Control Approach

no code implementations • 15 Nov 2022 • Siddhant Bhambri, Amrita Bhattacharjee, Dimitri Bertsekas

In this paper we address the solution of the popular Wordle puzzle, using new reinforcement learning methods, which apply more generally to adaptive control of dynamic systems and to classes of Partially Observable Markov Decision Process (POMDP) problems.

reinforcement-learning, Reinforcement Learning (RL)

New Auction Algorithms for Path Planning, Network Transport, and Reinforcement Learning

no code implementations • 19 Jul 2022 • Dimitri Bertsekas

We consider some classical optimization problems in path planning and network transport, and we introduce new auction-based algorithms for their optimal and suboptimal solution.

reinforcement-learning, Reinforcement Learning (RL)

Lessons from AlphaZero for Optimal, Model Predictive, and Adaptive Control

no code implementations • 20 Aug 2021 • Dimitri Bertsekas

In this paper we aim to provide analysis and insights (often based on visualization), which explain the beneficial effects of on-line decision making on top of off-line training.

Bayesian Optimization, Decision Making

On-Line Policy Iteration for Infinite Horizon Dynamic Programming

no code implementations • 1 Jun 2021 • Dimitri Bertsekas

This allows the continuous updating/improvement of the current policy, thus resulting in a form of on-line PI that incorporates the improved controls into the current policy as new states and controls are generated.

Multiagent Value Iteration Algorithms in Dynamic Programming and Reinforcement Learning

no code implementations • 4 May 2020 • Dimitri Bertsekas

In this paper, this result is extended to value iteration and optimistic versions of policy iteration, as well as to more general DP problems where the Bellman operator is a contraction mapping, such as stochastic shortest path problems with all policies being proper.

reinforcement-learning, Reinforcement Learning (RL)

Constrained Multiagent Rollout and Multidimensional Assignment with the Auction Algorithm

no code implementations • 18 Feb 2020 • Dimitri Bertsekas

Under suitable assumptions, we show that if the base heuristic produces a feasible solution, the rollout algorithm has a cost improvement property: it produces a feasible solution whose cost is no worse than the base heuristic's cost.

Combinatorial Optimization

Reinforcement Learning for POMDP: Partitioned Rollout and Policy Iteration with Application to Autonomous Sequential Repair Problems

no code implementations • 11 Feb 2020 • Sushmita Bhattacharya, Sahil Badyal, Thomas Wheeler, Stephanie Gil, Dimitri Bertsekas

In this paper we consider infinite horizon discounted dynamic programming problems with finite state and control spaces, and partial state observations.

Biased Aggregation, Rollout, and Enhanced Policy Improvement for Reinforcement Learning

no code implementations • 6 Oct 2019 • Dimitri Bertsekas

When $V$ is equal to the cost function $J_{\mu}$ of some known policy $\mu$ and there is only one aggregate state, our scheme is equivalent to the rollout algorithm based on $\mu$ (i.e., the result of a single policy improvement starting with the policy $\mu$).

reinforcement-learning, Reinforcement Learning (RL)

Multiagent Rollout Algorithms and Reinforcement Learning

1 code implementation • 30 Sep 2019 • Dimitri Bertsekas

The amount of local computation required at every stage by each agent is independent of the number of agents, while the amount of total computation (over all agents) grows linearly with the number of agents.

Computational Efficiency, reinforcement-learning
