1 code implementation • 28 Mar 2024 • Pingcheng Dong, Yonghao Tan, Dong Zhang, Tianwei Ni, Xuejiao Liu, Yu Liu, Peng Luo, Luhong Liang, Shih-Yang Liu, Xijie Huang, Huaiyu Zhu, Yun Pan, Fengwei An, Kwang-Ting Cheng
Non-linear functions are prevalent in Transformers and their lightweight variants, incurring substantial and frequently underestimated hardware costs.
no code implementations • 7 Feb 2024 • Michel Ma, Tianwei Ni, Clement Gehring, Pierluca D'Oro, Pierre-Luc Bacon
We integrate such AWMs into a policy gradient framework that underscores the relationship between network architectures and the policy gradient updates they inherently represent.
1 code implementation • 17 Jan 2024 • Tianwei Ni, Benjamin Eysenbach, Erfan Seyedsalehi, Michel Ma, Clement Gehring, Aditya Mahajan, Pierre-Luc Bacon
These findings culminate in a set of preliminary guidelines for RL practitioners.
2 code implementations • NeurIPS 2023 • Tianwei Ni, Michel Ma, Benjamin Eysenbach, Pierre-Luc Bacon
The Transformer architecture has been very successful to solve problems that involve long-term dependencies, including in the RL domain.
1 code implementation • 17 Dec 2021 • Tianwei Ni, Kiana Ehsani, Luca Weihs, Jordi Salvador
In this paper, we study the problem of training agents to complete the task of visual mobile manipulation in the ManipulaTHOR environment while avoiding unnecessary collision (disturbance) with objects.
2 code implementations • 11 Oct 2021 • Tianwei Ni, Benjamin Eysenbach, Ruslan Salakhutdinov
However, prior work has found that such recurrent model-free RL methods tend to perform worse than more specialized algorithms that are designed for specific types of POMDPs.
no code implementations • 7 Mar 2021 • Tianwei Ni, Huao Li, Siddharth Agrawal, Suhas Raja, Fan Jia, Yikang Gui, Dana Hughes, Michael Lewis, Katia Sycara
Previous human-human team research have shown complementary policies in TSF game and diversity in human players' skill, which encourages us to relax the assumptions on human policy.
1 code implementation • 9 Nov 2020 • Tianwei Ni, Harshit Sikchi, YuFei Wang, Tejus Gupta, Lisa Lee, Benjamin Eysenbach
Our method outperforms adversarial imitation learning methods in terms of sample efficiency and the required number of expert trajectories on IRL benchmarks.
1 code implementation • 3 Jul 2020 • Yufei Wang, Tianwei Ni
Our method is built upon the Soft Actor-Critic (SAC) algorithm, which uses an "entropy temperature" that balances the original task reward and the policy entropy, and hence controls the trade-off between exploitation and exploration.
2 code implementations • CVPR 2019 • Tianwei Ni, Lingxi Xie, Huangjie Zheng, Elliot K. Fishman, Alan L. Yuille
The key observation is that, although the object is a 3D volume, what we really need in segmentation is to find its boundary which is a 2D surface.
no code implementations • 28 Nov 2018 • Huangjie Zheng, Lingxi Xie, Tianwei Ni, Ya zhang, Yan-Feng Wang, Qi Tian, Elliot K. Fishman, Alan L. Yuille
However, in medical image analysis, fusing prediction from two phases is often difficult, because (i) there is a domain gap between two phases, and (ii) the semantic labels are not pixel-wise corresponded even for images scanned from the same patient.