1 code implementation • 5 Feb 2024 • Shengyi Huang, Quentin Gallouédec, Florian Felten, Antonin Raffin, Rousslan Fernand Julien Dossa, Yanxiao Zhao, Ryan Sullivan, Viktor Makoviychuk, Denys Makoviichuk, Mohamad H. Danesh, Cyril Roumégous, Jiayi Weng, Chufan Chen, Md Masudur Rahman, João G. M. Araújo, Guorui Quan, Daniel Tan, Timo Klein, Rujikorn Charakorn, Mark Towers, Yann Berthelot, Kinal Mehta, Dipam Chakraborty, Arjun KG, Valentin Charraut, Chang Ye, Zichen Liu, Lucas N. Alegre, Alexander Nikulin, Xiao Hu, Tianlin Liu, Jongwook Choi, Brent Yi
As a result, it is usually necessary to reproduce the experiments from scratch, which can be time-consuming and error-prone.
no code implementations • 29 Aug 2023 • Md Masudur Rahman, Yexiang Xue
An additional goal of the generator is to perturb the observation so as to maximize the probability that the agent takes a different action.
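The perturbation objective described above can be sketched in a few lines: maximizing the chance that the agent takes a *different* action is equivalent to minimizing the log-probability of its current action. The sketch below assumes a linear softmax policy (chosen purely so the gradient has a closed form) and a hypothetical step size `epsilon`; it is not the paper's generator.

```python
import numpy as np

def softmax(z):
    z = z - z.max()  # stabilize before exponentiating
    e = np.exp(z)
    return e / e.sum()

def perturb_observation(W, obs, epsilon=0.1):
    # Assumed linear softmax policy: action logits are W @ obs.
    logits = W @ obs
    p = softmax(logits)
    a = int(np.argmax(logits))  # agent's current greedy action
    # For linear logits, d/d_obs log p_a = W[a] - sum_k p_k * W[k].
    grad = W[a] - p @ W
    # Step against the gradient to lower the probability of action a,
    # i.e. to raise the probability of the agent choosing differently.
    return obs - epsilon * np.sign(grad)
```

A signed-gradient step is used only for simplicity; any perturbation that descends the current action's log-probability serves the same illustrative purpose.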
no code implementations • 27 Apr 2023 • Md Masudur Rahman, Yexiang Xue
Data augmentation can provide a performance boost to RL agents by mitigating the effect of overfitting.
no code implementations • 2 Feb 2023 • Md Masudur Rahman, Yexiang Xue
Our approach is to estimate the value function from prior computations, such as from the Q-network learned in DQN or the value function trained for different but related environments.
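One way to read the value-reuse idea above: a Q-function learned earlier (e.g. by DQN) already implies a state-value estimate, since V(s) = max_a Q(s, a) under a greedy policy, and that estimate can seed the critic for a related task instead of starting from zero. The array shapes and the warm-start framing below are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def value_from_q(q_table):
    """Derive a state-value estimate from previously computed Q-values.

    Under a greedy policy, V(s) = max_a Q(s, a), so the prior Q-function
    can be collapsed into an initial value estimate for a related task.
    """
    return q_table.max(axis=-1)

# Hypothetical prior Q-values: 2 states x 2 actions.
prior_q = np.array([[1.0, 3.0],
                    [0.5, 0.2]])
v_init = value_from_q(prior_q)  # -> [3.0, 0.5]
```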
1 code implementation • 14 Dec 2022 • Md Masudur Rahman, Yexiang Xue
We observed that in many settings, RPO increases the policy entropy early in training and then maintains a certain level of entropy throughout the training period.
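The entropy-maintenance behavior noted above is commonly associated with perturbing the mean of the Gaussian action distribution with uniform noise during the policy update, which keeps the distribution from collapsing. The snippet below is a minimal sketch of that mechanism under assumed names (`rpo_mean`, `alpha`); it is not a full RPO implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def rpo_mean(mean, alpha=0.5):
    # Add uniform noise U(-alpha, alpha) to the Gaussian policy's mean
    # at update time; the randomized center keeps the action
    # distribution spread out, sustaining entropy during training.
    return mean + rng.uniform(-alpha, alpha, size=np.shape(mean))
```

In this reading, `alpha` trades off exploration (entropy) against fidelity to the learned mean.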
1 code implementation • 13 Oct 2022 • Md Masudur Rahman, Yexiang Xue
Unlike existing methods, which apply data augmentation to the inputs used to learn the value and policy functions, our method uses data augmentation to compute a bootstrap advantage estimate.
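One plausible reading of the idea above, sketched under stated assumptions rather than as the paper's exact estimator: combine the value of each observation with the value of its augmented counterpart, and use the combined value inside a one-step advantage A_t = r_t + gamma * V(s_{t+1}) - V(s_t).

```python
import numpy as np

def bootstrap_advantage(v, v_aug, rewards, gamma=0.99):
    # v:       value estimates for the original observations, shape (T,)
    # v_aug:   value estimates for their augmented counterparts
    # rewards: per-step rewards, shape (T,)
    # Average the two value estimates (an assumption, not the paper's
    # exact combination rule), then form one-step advantages.
    v_bar = 0.5 * (v + v_aug)
    return rewards[:-1] + gamma * v_bar[1:] - v_bar[:-1]
```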
1 code implementation • 15 Jul 2022 • Md Masudur Rahman, Yexiang Xue
Deep Reinforcement Learning (RL) agents often overfit the training environment, leading to poor generalization performance.
no code implementations • 29 Sep 2021 • Md Masudur Rahman, Yexiang Xue
An additional goal of the generator is to perturb the observation so as to maximize the probability that the agent takes a different action.
1 code implementation • 3 Mar 2019 • Naveen Madapana, Md Masudur Rahman, Natalia Sanchez-Tamayo, Mythra V. Balakuntala, Glebys Gonzalez, Jyothsna Padmakumar Bindu, L. N. Vishnunandan Venkatesh, Xingguang Zhang, Juan Barragan Noguera, Thomas Low, Richard Voyles, Yexiang Xue, Juan Wachs
It comprises a set of surgical robotic skills collected during a surgical training task using three robotic platforms: the Taurus II robot, the simulated Taurus II robot, and the YuMi robot.
no code implementations • 8 Aug 2018 • Md Masudur Rahman, Saikat Chakraborty, Gail Kaiser, Baishakhi Ray
In particular, we analyze two previously proposed tools, for project recommendation and bug localization, that leverage diverse software artifacts, and observe that an informed choice of similarity measure indeed improves the performance of these existing SE tools.
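To illustrate why the choice of similarity measure matters for such tasks (this is a generic illustration with made-up token lists, not the paper's tooling): even a simple set-based measure like Jaccard similarity can rank artifact pairs quite differently from, say, a frequency-weighted cosine measure.

```python
def jaccard(a, b):
    # Set-based similarity: shared tokens over distinct tokens.
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)

# Hypothetical token streams from two software artifacts.
doc1 = "open file read buffer".split()
doc2 = "read file close buffer".split()
print(jaccard(doc1, doc2))  # 3 shared tokens out of 5 distinct -> 0.6
```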