no code implementations • 26 Feb 2024 • Han Wang, Zhe Fu, Jonathan Lee, Hossein Nick Zinat Matin, Arwa Alanqary, Daniel Urieli, Sharon Hornstein, Abdul Rahman Kreidieh, Raphael Chekroun, William Barbour, William A. Richardson, Dan Work, Benedetto Piccoli, Benjamin Seibold, Jonathan Sprinkle, Alexandre M. Bayen, Maria Laura Delle Monache
The Speed Planner comprises two modules: a traffic state estimation (TSE) enhancement module and a target speed design module.
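For illustration, a minimal sketch of such a two-stage pipeline, assuming the first stage smooths a noisy speed estimate and the second maps it to a bounded speed command; the function names and the smoothing/clipping rules are illustrative assumptions, not the paper's actual method:

```python
# Hypothetical sketch of a two-module speed planner: a TSE enhancement
# stage followed by a target speed design stage. All names and rules here
# are illustrative assumptions, not the paper's method.
import numpy as np

def enhance_tse(raw_speed: np.ndarray) -> np.ndarray:
    """Denoise the raw speed estimate, e.g. with a short moving average."""
    kernel = np.ones(5) / 5.0
    return np.convolve(raw_speed, kernel, mode="same")

def design_target_speed(enhanced: np.ndarray, v_max: float = 30.0) -> np.ndarray:
    """Map the enhanced estimate to a bounded target speed profile (m/s)."""
    return np.clip(enhanced, 0.0, v_max)

target = design_target_speed(enhance_tse(np.random.rand(100) * 35.0))
```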
no code implementations • 8 Feb 2024 • Raphael Chekroun, Han Wang, Jonathan Lee, Marin Toromanoff, Sascha Hornauer, Fabien Moutarde, Maria Laura Delle Monache
Accurate real-time traffic state forecasting plays a pivotal role in traffic control research.
no code implementations • 30 Aug 2023 • Neha Sengupta, Sunil Kumar Sahu, Bokang Jia, Satheesh Katipomu, Haonan Li, Fajri Koto, William Marshall, Gurpreet Gosal, Cynthia Liu, Zhiming Chen, Osama Mohammed Afzal, Samta Kamboj, Onkar Pandit, Rahul Pal, Lalit Pradhan, Zain Muhammad Mujahid, Massa Baali, Xudong Han, Sondos Mahmoud Bsharat, Alham Fikri Aji, Zhiqiang Shen, Zhengzhong Liu, Natalia Vassilieva, Joel Hestness, Andy Hock, Andrew Feldman, Jonathan Lee, Andrew Jackson, Hector Xuguang Ren, Preslav Nakov, Timothy Baldwin, Eric Xing
We release two open versions of the model -- the foundation Jais model, and an instruction-tuned Jais-chat variant -- with the aim of promoting research on Arabic LLMs.
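The released checkpoints are usable through standard Hugging Face tooling; a minimal loading sketch is below (the repository ID is an assumption and should be checked against the official release):

```python
# Sketch of loading the instruction-tuned Jais-chat variant with the
# Hugging Face transformers library. The repo ID below is an assumption;
# consult the official release for the exact identifier.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "inception-mbzuai/jais-13b-chat"  # assumed repo ID
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

inputs = tokenizer("What is the capital of the UAE?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```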
no code implementations • journal 2022 • Ting-Jen Chang, Samantha Lee, Jonathan Lee, Chi-Jie Lu
This study proposes an interval-valued time series forecasting scheme that combines probability distribution features of interval-valued data with machine learning algorithms to improve electric power generation forecasting.
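A minimal sketch of the general idea, assuming the distribution features are simple summary statistics of each interval and using a standard scikit-learn regressor; the feature set and model choice are illustrative, not the study's:

```python
# Illustrative sketch: turn interval-valued observations [low, high] into
# distribution-style features (midpoint, half-range) and fit a standard
# regressor to forecast the next interval's midpoint. The features and
# model are assumptions, not the scheme proposed in the study.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def interval_features(lows, highs, window=3):
    mids = (lows + highs) / 2.0
    half = (highs - lows) / 2.0
    X, y = [], []
    for t in range(window, len(mids)):
        X.append(np.concatenate([mids[t - window:t], half[t - window:t]]))
        y.append(mids[t])
    return np.array(X), np.array(y)

lows = np.random.rand(100)
highs = lows + np.random.rand(100)
X, y = interval_features(lows, highs)
model = RandomForestRegressor(n_estimators=100).fit(X[:-10], y[:-10])
pred = model.predict(X[-10:])
```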
no code implementations • 8 Nov 2021 • Aldo Pacchiano, Aadirupa Saha, Jonathan Lee
We consider the problem of preference-based reinforcement learning (PbRL), where, unlike traditional reinforcement learning, the agent receives feedback only as a 1-bit (0/1) preference over a pair of trajectories rather than absolute rewards for them.
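A common way to model such binary feedback is a Bradley-Terry-style comparison of trajectory returns; the sketch below shows that standard model for illustration (the paper's exact feedback assumptions may differ):

```python
# Standard Bradley-Terry-style preference feedback over a trajectory pair:
# the probability of preferring trajectory A over B grows with the gap in
# their (unobserved) cumulative rewards. Shown for illustration; the
# paper's exact feedback model may differ.
import numpy as np

def preference_feedback(returns_a: float, returns_b: float, rng) -> int:
    """Return 1 if trajectory A is preferred over B, else 0."""
    p_a = 1.0 / (1.0 + np.exp(-(returns_a - returns_b)))
    return int(rng.random() < p_a)

rng = np.random.default_rng(0)
bit = preference_feedback(returns_a=3.2, returns_b=2.7, rng=rng)
```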
no code implementations • NeurIPS 2021 • Andrea Zanette, Kefan Dong, Jonathan Lee, Emma Brunskill
In the stochastic linear contextual bandit setting, there exist several minimax procedures for exploration with policies that are reactive to the data being acquired.
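For concreteness, a minimal sketch of one such reactive exploration procedure, LinUCB — a standard baseline in this setting, not necessarily the procedure analyzed in the paper:

```python
# Minimal LinUCB sketch for the stochastic linear contextual bandit: the
# policy is reactive because the confidence ellipsoid (via A) is updated
# with every observed reward. Standard algorithm, for illustration only.
import numpy as np

d, alpha = 5, 1.0
A = np.eye(d)       # regularized design matrix
b = np.zeros(d)     # reward-weighted feature sum

def choose_arm(contexts: np.ndarray) -> int:
    """contexts: (n_arms, d) feature matrix for the current round."""
    A_inv = np.linalg.inv(A)
    theta = A_inv @ b
    ucb = contexts @ theta + alpha * np.sqrt(
        np.einsum("ij,jk,ik->i", contexts, A_inv, contexts))
    return int(np.argmax(ucb))

def update(x: np.ndarray, reward: float) -> None:
    global A, b
    A += np.outer(x, x)
    b += reward * x
```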
no code implementations • NeurIPS 2021 • Aldo Pacchiano, Jonathan Lee, Peter Bartlett, Ofir Nachum
Since its introduction a decade ago, relative entropy policy search (REPS) has demonstrated successful policy learning on a number of simulated and real-world robotic domains, not to mention providing algorithmic components used by many recently proposed reinforcement learning (RL) algorithms.
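At its core, REPS solves a KL-constrained policy improvement step whose solution exponentially reweights samples by their advantage; a minimal sketch of that reweighting, with a fixed temperature rather than the full dual optimization:

```python
# Core REPS update: maximize expected advantage subject to a KL constraint
# against the sampling distribution, which yields sample weights
# proportional to exp(advantage / eta). Here eta is fixed for simplicity;
# the full algorithm sets it by solving the dual problem.
import numpy as np

def reps_weights(advantages: np.ndarray, eta: float = 1.0) -> np.ndarray:
    shifted = advantages - advantages.max()   # numerical stability
    w = np.exp(shifted / eta)
    return w / w.sum()

# The new policy is then fit by weighted maximum likelihood on the samples,
# e.g. a weighted regression of actions on states.
```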
no code implementations • 22 Sep 2020 • Kushagra Rastogi, Jonathan Lee, Fabrice Harel-Canada, Aditya Joglekar
This work extends the analysis of the theoretical results presented in the paper "Is Q-Learning Provably Efficient?"
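For reference, the algorithm analyzed in that paper is tabular Q-learning with an optimistic UCB bonus; a sketch of its update (constants are illustrative):

```python
# Sketch of the Q-learning-with-UCB update from "Is Q-Learning Provably
# Efficient?" (Jin et al., 2018): a Hoeffding-style bonus ~ sqrt(H^3*iota/t)
# is added to the target, with learning rate alpha_t = (H+1)/(H+t).
# Q is assumed to be an (H+1, n_states, n_actions) numpy array.
import numpy as np

H, c, iota = 10, 1.0, 4.0   # horizon, bonus constant, log factor

def ucb_q_update(Q, h, s, a, r, v_next, t):
    """t: visit count of (s, a) at step h; v_next: clipped next-step value."""
    alpha = (H + 1) / (H + t)
    bonus = c * np.sqrt(H**3 * iota / t)
    Q[h, s, a] = (1 - alpha) * Q[h, s, a] + alpha * (r + v_next + bonus)
    return Q
```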
no code implementations • 3 Dec 2019 • Jonathan Lee, Ching-An Cheng, Ken Goldberg, Byron Boots
We prove that there is a fundamental equivalence between achieving sublinear dynamic regret in continuous online learning (COL) and solving certain equilibrium problems (EPs), and we present a reduction from dynamic regret to both static regret and the convergence rate of the associated EP.
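For reference, the two regret notions being related are the standard ones below; the paper's notation may differ:

```latex
% Static regret compares to the best fixed decision in hindsight:
\mathrm{Regret}_{s}(T) \;=\; \sum_{t=1}^{T} f_t(x_t) \;-\; \min_{x}\sum_{t=1}^{T} f_t(x)
% Dynamic regret compares to the per-round minimizer x_t^\star = \arg\min_x f_t(x):
\mathrm{Regret}_{d}(T) \;=\; \sum_{t=1}^{T} f_t(x_t) \;-\; \sum_{t=1}^{T} f_t(x_t^\star)
```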
no code implementations • 8 Jul 2019 • Ashwin Balakrishna, Brijen Thananjeyan, Jonathan Lee, Felix Li, Arsh Zahed, Joseph E. Gonzalez, Ken Goldberg
Existing on-policy imitation learning algorithms, such as DAgger, assume access to a fixed supervisor.
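A minimal DAgger loop, for reference — the standard formulation, in which the supervisor is fixed, which is exactly the assumption this paper relaxes; `rollout`, `supervisor`, and `fit` are hypothetical helpers:

```python
# Minimal DAgger sketch: roll out the learner's current policy, query the
# (fixed) supervisor for corrective labels on the visited states, aggregate
# the data, and retrain. Helper names are hypothetical.
def dagger(env, supervisor, fit, rollout, n_iters=10):
    dataset = []                       # aggregated (state, expert_action) pairs
    policy = supervisor                # iteration 0: behave like the expert
    for _ in range(n_iters):
        states = rollout(env, policy)  # states visited by the current learner
        dataset += [(s, supervisor(s)) for s in states]
        policy = fit(dataset)          # supervised learning on the aggregate
    return policy
```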
no code implementations • 19 Feb 2019 • Ching-An Cheng, Jonathan Lee, Ken Goldberg, Byron Boots
Furthermore, we show, for continuous online learning (COL), a reduction from dynamic regret to both static regret and convergence in the associated equilibrium problem (EP), allowing us to analyze the dynamic regret of many existing algorithms.
2 code implementations • 27 Mar 2017 • Michael Laskey, Jonathan Lee, Roy Fox, Anca Dragan, Ken Goldberg
One approach to Imitation Learning is Behavior Cloning, in which a robot observes a supervisor and infers a control policy.
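In its simplest form, Behavior Cloning reduces to supervised learning on the supervisor's state-action pairs; a minimal sketch with scikit-learn, where the model choice is illustrative:

```python
# Minimal Behavior Cloning sketch: fit a supervised model that maps
# observed states to the supervisor's actions. Model choice is illustrative.
import numpy as np
from sklearn.neural_network import MLPRegressor

states = np.random.rand(500, 4)    # demonstration states
actions = np.random.rand(500, 2)   # supervisor's actions in those states

policy = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=500)
policy.fit(states, actions)
predicted_action = policy.predict(states[:1])
```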
no code implementations • 4 Oct 2016 • Michael Laskey, Caleb Chuck, Jonathan Lee, Jeffrey Mahler, Sanjay Krishnan, Kevin Jamieson, Anca Dragan, Ken Goldberg
Although policies learned with robot-centric (RC) sampling can be superior to those learned with human-centric (HC) sampling for standard learning models such as linear SVMs, the two may be comparable for highly expressive learning models such as deep networks and hyper-parametric decision trees, which have little model error.