no code implementations • 19 Mar 2024 • Yuchao Li, Dimitri Bertsekas
We consider methods for computing word sequences that are highly likely, based on these probabilities.
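As a minimal illustration of the one-step lookahead (rollout) idea for sequence generation (the toy bigram probabilities below are illustrative, not from the paper), each candidate next word can be scored by the log-probability of completing the sequence with a greedy base heuristic:

```python
import math

# Toy bigram transition probabilities (illustrative, not from the paper).
P = {
    "the": {"cat": 0.6, "dog": 0.4},
    "cat": {"sat": 0.7, "ran": 0.3},
    "dog": {"sat": 0.2, "ran": 0.8},
    "sat": {"down": 1.0},
    "ran": {"away": 1.0},
    "down": {}, "away": {},
}

def greedy_completion(word, steps):
    """Base heuristic: repeatedly pick the most probable next word."""
    logp, seq = 0.0, [word]
    for _ in range(steps):
        nxt = P.get(seq[-1], {})
        if not nxt:
            break
        w, p = max(nxt.items(), key=lambda kv: kv[1])
        logp += math.log(p)
        seq.append(w)
    return seq, logp

def rollout_step(word, steps):
    """One-step lookahead: score each candidate by its greedy completion."""
    best = None
    for w, p in P.get(word, {}).items():
        _, tail_logp = greedy_completion(w, steps - 1)
        score = math.log(p) + tail_logp
        if best is None or score > best[1]:
            best = (w, score)
    return best

print(rollout_step("the", 3))
```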
no code implementations • 2 Nov 2023 • Daniel Garces, Sushmita Bhattacharya, Dimitri Bertsekas, Stephanie Gil
We provide two main theoretical results: 1) we characterize the number of taxis $m$ that is sufficient for IA to be stable; and 2) we derive a necessary condition on $m$ for IA to remain stable as time goes to infinity.
no code implementations • 15 Dec 2022 • Dimitri Bertsekas
We provide a unifying approximate dynamic programming framework that applies to a broad variety of problems involving sequential estimation.
no code implementations • 28 Nov 2022 • Daniel Garces, Sushmita Bhattacharya, Stephanie Gil, Dimitri Bertsekas
We propose a mechanism for switching away from the originally trained offline approximation when the current demand falls outside its original validity region.
no code implementations • 15 Nov 2022 • Siddhant Bhambri, Amrita Bhattacharjee, Dimitri Bertsekas
In this paper we address the solution of the popular Wordle puzzle using new reinforcement learning methods that apply more generally to adaptive control of dynamic systems and to classes of Partially Observable Markov Decision Process (POMDP) problems.
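For a flavor of the belief-state viewpoint, here is a generic one-step lookahead heuristic over a toy word list (an illustrative sketch, not the paper's method): each guess is scored by how much it is expected to shrink the set of answers consistent with the feedback received.

```python
from collections import Counter

# Tiny illustrative word list (not the real Wordle dictionary).
WORDS = ["crane", "slate", "grape", "plate", "brace"]

def feedback(guess, answer):
    """Wordle-style feedback: 2 = right letter/right spot, 1 = right letter
    elsewhere, 0 = absent (simplified handling of repeated letters)."""
    return tuple(2 if g == a else (1 if g in answer else 0)
                 for g, a in zip(guess, answer))

def expected_remaining(guess, candidates):
    """Expected number of candidates still consistent after observing the feedback."""
    buckets = Counter(feedback(guess, ans) for ans in candidates)
    n = len(candidates)
    return sum(c * c for c in buckets.values()) / n

# One-step lookahead over guesses: pick the guess that, in expectation,
# shrinks the set of words consistent with the observations the most.
candidates = list(WORDS)
best_guess = min(WORDS, key=lambda g: expected_remaining(g, candidates))
print(best_guess)
```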
no code implementations • 19 Jul 2022 • Dimitri Bertsekas
We consider some classical optimization problems in path planning and network transport, and we introduce new auction-based algorithms for their optimal and suboptimal solution.
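For background, here is a minimal sketch of the classical auction mechanism for the assignment problem, on which such auction-based algorithms build (the benefit matrix and the bidding increment eps are illustrative; the paper's path planning and network transport variants are not reproduced here):

```python
def auction_assignment(values, eps=0.01):
    """Naive auction algorithm for the assignment problem (maximization).

    values[i][j] is the benefit of assigning person i to object j.
    Returns a person -> object assignment.  Sketch only: no epsilon
    scaling, dense benefit matrix, all assignments assumed feasible.
    """
    n = len(values)
    prices = [0.0] * n                 # one price per object
    owner = [None] * n                 # object -> person
    assigned = [None] * n              # person -> object
    unassigned = list(range(n))
    while unassigned:
        i = unassigned.pop()
        # Net value of each object at current prices.
        net = [values[i][j] - prices[j] for j in range(n)]
        best = max(range(n), key=lambda j: net[j])
        second = max(net[j] for j in range(n) if j != best) if n > 1 else net[best]
        # Bid: raise the price by the bidding increment plus eps.
        prices[best] += net[best] - second + eps
        if owner[best] is not None:    # evict the previous owner
            assigned[owner[best]] = None
            unassigned.append(owner[best])
        owner[best] = i
        assigned[i] = best
    return assigned

print(auction_assignment([[10, 5], [3, 9]]))   # -> [0, 1]
```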
no code implementations • 20 Aug 2021 • Dimitri Bertsekas
In this paper we aim to provide analysis and insights (often based on visualization) that explain the beneficial effects of on-line decision making on top of off-line training.
no code implementations • 1 Jun 2021 • Dimitri Bertsekas
This allows the current policy to be continuously updated and improved, resulting in a form of on-line PI that incorporates the improved controls into the current policy as new states and controls are generated.
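A minimal sketch of this on-line policy improvement loop, under simplifying assumptions (a small deterministic discounted problem with known costs; the model and the numbers are illustrative, not from the paper):

```python
# States 0..2, controls 0 or 1; stage cost g[x][u] and deterministic successor f[x][u].
g = [[1.0, 4.0], [2.0, 0.5], [0.0, 3.0]]
f = [[1, 2], [2, 0], [0, 1]]
alpha = 0.9

def policy_cost(policy, x, horizon=200):
    """Approximate J_mu(x) by rolling the policy forward for a long horizon."""
    total, disc = 0.0, 1.0
    for _ in range(horizon):
        u = policy[x]
        total += disc * g[x][u]
        disc *= alpha
        x = f[x][u]
    return total

policy = [0, 0, 0]                      # initial policy
x = 0
for _ in range(10):                     # on-line operation along a trajectory
    # One-step lookahead (rollout) using the cost of the *current* policy.
    u = min((0, 1), key=lambda u: g[x][u] + alpha * policy_cost(policy, f[x][u]))
    policy[x] = u                       # incorporate the improved control immediately
    x = f[x][u]
print(policy)
```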
no code implementations • 9 Nov 2020 • Sushmita Bhattacharya, Siva Kailas, Sahil Badyal, Stephanie Gil, Dimitri Bertsekas
Our methods specifically address the computational challenges of partially observable multiagent problems.
no code implementations • 4 May 2020 • Dimitri Bertsekas
In this paper, this result is extended to value iteration and to optimistic versions of policy iteration, as well as to more general DP problems in which the Bellman operator is a contraction mapping, such as stochastic shortest path problems where all policies are proper.
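For context, the standard sup-norm contraction argument that extensions of this kind rely on (generic notation, not specific to the paper):

```latex
% Generic background: if the Bellman operator T is a sup-norm contraction
% with modulus alpha in (0,1),
\| T J - T J' \|_\infty \le \alpha \, \| J - J' \|_\infty ,
% then value iteration J_{k+1} = T J_k converges to the unique fixed point
% J^* = T J^* from any starting J_0, since
\| J_k - J^* \|_\infty = \| T J_{k-1} - T J^* \|_\infty
  \le \alpha \, \| J_{k-1} - J^* \|_\infty
  \le \cdots \le \alpha^k \, \| J_0 - J^* \|_\infty .
```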
no code implementations • 18 Feb 2020 • Dimitri Bertsekas
Under suitable assumptions, we show that if the base heuristic produces a feasible solution, the rollout algorithm has a cost improvement property: it produces a feasible solution whose cost is no worse than the base heuristic's cost.
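A minimal sketch of the cost improvement property on a toy deterministic problem (the stage costs are illustrative, and the base heuristic is a simple fixed rule assumed to satisfy the required conditions):

```python
# Toy deterministic problem (illustrative): a sequence of N binary decisions;
# the cost of decision u at stage k is cost[k][u].
cost = [[3.0, 1.0], [0.0, 5.0], [2.0, 0.1]]

def base_heuristic(k):
    """Base heuristic: from stage k onward, always pick control 0 (feasible by construction)."""
    return sum(cost[j][0] for j in range(k, len(cost)))

def rollout():
    """One-step lookahead with the base heuristic used to complete the solution."""
    total = 0.0
    for k in range(len(cost)):
        # Q-factor of each control: its stage cost plus the base heuristic's completion cost.
        u = min((0, 1), key=lambda u: cost[k][u] + base_heuristic(k + 1))
        total += cost[k][u]
    return total

print("base heuristic cost:", base_heuristic(0))   # 5.0
print("rollout cost:       ", rollout())           # no worse, here strictly better (1.1)
```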
no code implementations • 11 Feb 2020 • Sushmita Bhattacharya, Sahil Badyal, Thomas Wheeler, Stephanie Gil, Dimitri Bertsekas
In this paper we consider infinite horizon discounted dynamic programming problems with finite state and control spaces, and partial state observations.
no code implementations • 6 Oct 2019 • Dimitri Bertsekas
When $V$ is equal to the cost function $J_{\mu}$ of some known policy $\mu$ and there is only one aggregate state, our scheme is equivalent to the rollout algorithm based on $\mu$ (i.e., the result of a single policy improvement starting with the policy $\mu$).
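In generic notation (discounted problem with dynamics $f$, stage cost $g$, and discount factor $\alpha$; this is the standard relation the remark refers to, not a formula taken from the paper):

```latex
% The rollout policy based on mu is the result of a single policy improvement:
\tilde{\mu}(x) \in \arg\min_{u \in U(x)}
  E\bigl\{ g(x,u,w) + \alpha \, J_{\mu}\bigl( f(x,u,w) \bigr) \bigr\},
% so one-step lookahead with V = J_mu (and a single aggregate state)
% reproduces the rollout algorithm based on mu.
```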
1 code implementation • 30 Sep 2019 • Dimitri Bertsekas
The amount of local computation required at every stage by each agent is independent of the number of agents, while the amount of total computation (over all agents) grows linearly with the number of agents.
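A minimal sketch contrasting all-at-once and agent-by-agent control selection (the Q-factor below is an illustrative stand-in for a rollout evaluation with a base policy; the complexity pattern is the point):

```python
import itertools

m = 4                                        # number of agents
controls = [0, 1]                            # per-agent control set
target = [1, 0, 1, 1]                        # illustrative "best" joint control

def q_factor(u):
    """Illustrative Q-factor of a joint control (lower is better)."""
    return sum((ui - ti) ** 2 for ui, ti in zip(u, target))

# All-at-once minimization examines |controls|**m joint controls (exponential in m).
best_joint = min(itertools.product(controls, repeat=m), key=q_factor)

# Agent-by-agent minimization: agent i optimizes only its own control, with the
# preceding agents fixed at their already-chosen values and the remaining agents
# at a default (base-policy) value.  Each agent examines |controls| alternatives
# regardless of m, so the total work grows linearly with m.
chosen = [0] * m                             # default / base-policy controls
for i in range(m):
    chosen[i] = min(controls,
                    key=lambda u: q_factor(chosen[:i] + [u] + chosen[i + 1:]))
print(best_joint, tuple(chosen))
```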