no code implementations • NeurIPS 2017 • Yi Ouyang, Mukul Gagrani, Ashutosh Nayyar, Rahul Jain
This regret bound matches the best available bound for weakly communicating MDPs.
no code implementations • 24 May 2019 • Yi Ouyang, Bin Guo, Xing Tang, Xiuqiang He, Jian Xiong, Zhiwen Yu
In fact, users' behaviors in different domains regarding the same items are usually relevant.
no code implementations • 17 Oct 2019 • Huidong Gao, Yi Ouyang, Masayoshi Tomizuka
In this paper, we propose a combined prediction model and an online learning framework for planar push prediction.
1 code implementation • 19 Nov 2019 • Yi Ouyang, Richard Y. Zhang, Javad Lavaei, Pravin Varaiya
The offset optimization problem seeks to coordinate and synchronize the timing of traffic signals throughout a network in order to enhance traffic flow and reduce stops and delays.
Optimization and Control • Systems and Control
no code implementations • 9 Dec 2019 • Aaron Havens, Yi Ouyang, Prabhat Nagarajan, Yasuhiro Fujita
The latent representation is learned exclusively from multi-step reward prediction which we show to be the only necessary information for successful planning.
Model-based Reinforcement Learning • reinforcement-learning
no code implementations • 27 Jan 2020 • Seyed Mohammad Asghari, Yi Ouyang, Ashutosh Nayyar
This allows the agents to achieve a regret within $O(\sqrt{T})$ of the regret of the auxiliary single-agent problem.
no code implementations • 20 Feb 2020 • Chao-Han Huck Yang, Jun Qi, Pin-Yu Chen, Yi Ouyang, I-Te Danny Hung, Chin-Hui Lee, Xiaoli Ma
Recent deep neural network based techniques, especially those capable of system-level self-adaptation such as deep reinforcement learning (DRL), have been shown to offer many advantages in optimizing robot learning systems (e.g., autonomous navigation and continuous robot arm control).
no code implementations • 9 Nov 2020 • Mukul Gagrani, Sagar Sudhakara, Aditya Mahajan, Ashutosh Nayyar, Yi Ouyang
We consider optimal control of an unknown multi-agent linear quadratic (LQ) system where the dynamics and the cost are coupled across the agents through the mean-field (i.e., empirical mean) of the states and controls.
no code implementations • 24 Nov 2020 • Daisuke Nishiyama, Mario Ynocente Castro, Shirou Maruyama, Shinya Shiroshita, Karim Hamzaoui, Yi Ouyang, Guy Rosman, Jonathan DeCastro, Kuan-Hui Lee, Adrien Gaidon
Automated vehicles require exhaustive testing in simulation to detect as many safety-critical failures as possible before deployment on public roads.
no code implementations • 1 Jan 2021 • Shin-ichi Maeda, Hayato Watahiki, Yi Ouyang, Shintarou Okada, Masanori Koyama
In this study, we consider a situation in which the agent has access to a generative model that provides a next-state sample for any given state-action pair, and propose a model to solve a CMDP problem by decomposing the CMDP into a pair of MDPs: a reconnaissance MDP (R-MDP) and a planning MDP (P-MDP).
1 code implementation • 18 Feb 2021 • Chao-Han Huck Yang, I-Te Danny Hung, Yi Ouyang, Pin-Yu Chen
Deep reinforcement learning (DRL) has demonstrated impressive performance in various gaming simulators and real-world applications.
no code implementations • 18 Aug 2021 • Sagar Sudhakara, Aditya Mahajan, Ashutosh Nayyar, Yi Ouyang
We consider the problem of controlling an unknown linear quadratic Gaussian (LQG) system consisting of multiple subsystems connected over a network.
no code implementations • 19 Aug 2021 • Mukul Gagrani, Sagar Sudhakara, Aditya Mahajan, Ashutosh Nayyar, Yi Ouyang
The regret bound of the algorithm was derived under a technical assumption on the induced norm of the closed loop system.
no code implementations • 15 Aug 2022 • Gedi Liu, Yifeng Jiang, Yi Ouyang, Keyang Zhong, Yang Wang
Time series underwent the transition from statistics to deep learning, as did many other machine learning fields.
1 code implementation • AAAI 2023 • Sheng Xiang, Mingzhi Zhu, Dawei Cheng, Enxia Li, Ruihui Zhao, Yi Ouyang, Ling Chen, Yefeng Zheng
Then we pass messages among the nodes through a Gated Temporal Attention Network (GTAN) to learn the transaction representation.
Ranked #1 on Node Classification on Amazon-Fraud
1 code implementation • 19 Dec 2023 • Yi Cheng, Wenge Liu, Jian Wang, Chak Tou Leong, Yi Ouyang, Wenjie Li, Xian Wu, Yefeng Zheng
In recent years, there has been a growing interest in exploring dialogues with more complex goals, such as negotiation, persuasion, and emotional support, which go beyond traditional service-focused dialogue systems.
no code implementations • 13 Feb 2024 • Berk Bozkurt, Aditya Mahajan, Ashutosh Nayyar, Yi Ouyang
How well does an optimal policy $\hat{\pi}^{\star}$ of the approximate model perform when used in the original model $\mathcal{M}$?
no code implementations • 15 Mar 2024 • Rui Zhang, Dawei Cheng, Xin Liu, Jie Yang, Yi Ouyang, Xian Wu, Yefeng Zheng
We find that in graph anomaly detection, the homophily distribution differences between different classes are significantly greater than those in homophilic and heterophilic graphs.