Search Results for author: Alfredo Garcia

Found 19 papers, 4 papers with code

Getting More Juice Out of the SFT Data: Reward Learning from Human Demonstration Improves SFT for LLM Alignment

no code implementations 28 May 2024 Jiaxiang Li, Siliang Zeng, Hoi-To Wai, Chenliang Li, Alfredo Garcia, Mingyi Hong

Moreover, we identify a connection between the proposed IRL-based approach and a certain recently proposed self-play approach, and show that self-play is a special case of modeling a reward-learning agent.

reinforcement-learning Reinforcement Learning (RL)

Global Convergence of Decentralized Retraction-Free Optimization on the Stiefel Manifold

no code implementations 19 May 2024 Youbang Sun, Shixiang Chen, Alfredo Garcia, Shahin Shahrampour

Many classical and modern machine learning algorithms require solving optimization tasks under orthogonal constraints.

Regularized Q-Learning with Linear Function Approximation

no code implementations 26 Jan 2024 Jiachen Xi, Alfredo Garcia, Petar Momcilovic

Several successful reinforcement learning algorithms make use of regularization to promote multi-modal policies that exhibit enhanced exploration and robustness.

Q-Learning

Resolving uncertainty on the fly: Modeling adaptive driving behavior as active inference

no code implementations 10 Nov 2023 Johan Engström, Ran Wei, Anthony McDonald, Alfredo Garcia, Matt O'Kelly, Leif Johnson

Understanding adaptive human driving behavior, in particular how drivers manage uncertainty, is of key importance for developing simulated human driver models that can be used in the evaluation and development of autonomous vehicles.

Autonomous Vehicles

A Unified View on Solving Objective Mismatch in Model-Based Reinforcement Learning

1 code implementation 10 Oct 2023 Ran Wei, Nathan Lambert, Anthony McDonald, Alfredo Garcia, Roberto Calandra

Model-based Reinforcement Learning (MBRL) aims to make agents more sample-efficient, adaptive, and explainable by learning an explicit model of the environment.

Model-based Reinforcement Learning

An active inference model of car following: Advantages and applications

no code implementations 27 Mar 2023 Ran Wei, Anthony D. McDonald, Alfredo Garcia, Gustav Markkula, Johan Engstrom, Matthew O'Kelly

We assessed the proposed model, the Active Inference Driving Agent (AIDA), through a benchmark analysis against the rule-based Intelligent Driver Model and two neural network Behavior Cloning models.

Decision Making

When Demonstrations Meet Generative World Models: A Maximum Likelihood Framework for Offline Inverse Reinforcement Learning

1 code implementation NeurIPS 2023 Siliang Zeng, Chenliang Li, Alfredo Garcia, Mingyi Hong

Offline inverse reinforcement learning (Offline IRL) aims to recover the structure of rewards and environment dynamics that underlie observed actions in a fixed, finite set of demonstrations from an expert agent.

Autonomous Driving, Continuous Control (+2 more)

Structural Estimation of Markov Decision Processes in High-Dimensional State Space with Finite-Time Guarantees

no code implementations 4 Oct 2022 Siliang Zeng, Mingyi Hong, Alfredo Garcia

Other approaches in the inverse reinforcement learning (IRL) literature emphasize policy estimation at the expense of reduced reward estimation accuracy.

Imitation Learning

Maximum-Likelihood Inverse Reinforcement Learning with Finite-Time Guarantees

no code implementations 4 Oct 2022 Siliang Zeng, Chenliang Li, Alfredo Garcia, Mingyi Hong

To reduce the computational burden of a nested loop, novel methods such as SQIL [1] and IQ-Learn [2] emphasize policy estimation at the expense of reward estimation accuracy.

counterfactual, Imitation Learning (+2 more)

Learning to Coordinate in Multi-Agent Systems: A Coordinated Actor-Critic Algorithm and Finite-Time Guarantees

no code implementations 11 Oct 2021 Siliang Zeng, Tianyi Chen, Alfredo Garcia, Mingyi Hong

The flexibility in our design allows the proposed MARL-CAC algorithm to be used in a fully decentralized setting, where the agents can only communicate with their neighbors, as well as a federated setting, where the agents occasionally communicate with a server while optimizing their (partially personalized) local models.

Multi-agent Reinforcement Learning

Decentralized Riemannian Gradient Descent on the Stiefel Manifold

1 code implementation 14 Feb 2021 Shixiang Chen, Alfredo Garcia, Mingyi Hong, Shahin Shahrampour

The global function is represented as a finite sum of smooth local functions, where each local function is associated with one agent and agents communicate with each other over an undirected connected graph.

Distributed Optimization

On the Local Linear Rate of Consensus on the Stiefel Manifold

no code implementations 22 Jan 2021 Shixiang Chen, Alfredo Garcia, Mingyi Hong, Shahin Shahrampour

We study the convergence properties of the Riemannian gradient method for solving the consensus problem (for an undirected connected graph) over the Stiefel manifold.

Structural Estimation of Partially Observable Markov Decision Processes

no code implementations 2 Aug 2020 Yanling Chang, Alfredo Garcia, Zhide Wang, Lu Sun

In this context, replacement decisions must be made under partial/imperfect information on the true state (i.e., the condition of the equipment).

Distributed Networked Learning with Correlated Data

no code implementations 28 Oct 2019 Lingzhou Hong, Alfredo Garcia, Ceyhun Eksin

We consider a distributed estimation method in a setting with heterogeneous streams of correlated data distributed across nodes in a network.

Federated Learning

Swarming for Faster Convergence in Stochastic Optimization

no code implementations 11 Jun 2018 Shi Pu, Alfredo Garcia

We study a distributed framework for stochastic optimization which is inspired by models of collective motion found in nature (e.g., swarming) with mild communication requirements.

Stochastic Optimization

Zeroth Order Nonconvex Multi-Agent Optimization over Networks

no code implementations 27 Oct 2017 Davood Hajinezhad, Mingyi Hong, Alfredo Garcia

In this paper, we consider distributed optimization problems over a multi-agent network, where each agent can only partially evaluate the objective function and is allowed to exchange messages with its immediate neighbors.

Distributed Optimization
