Search Results for author: Tianwei Ni

Found 11 papers, 8 papers with code

Do Transformer World Models Give Better Policy Gradients?

no code implementations • 7 Feb 2024 • Michel Ma, Tianwei Ni, Clement Gehring, Pierluca D'Oro, Pierre-Luc Bacon

We integrate such AWMs into a policy gradient framework that underscores the relationship between network architectures and the policy gradient updates they inherently represent.

Navigate
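
The excerpt above mentions integrating world models into a policy gradient framework. As a rough illustration of one way a policy gradient can flow through a learned, differentiable world model (a sketch only, not the paper's AWM formulation), consider the following; the tiny MLP world model, dimensions, and rollout horizon are illustrative assumptions.

```python
# Minimal sketch: computing a policy gradient by backpropagating an imagined
# return through a learned, differentiable world model. The MLP world model,
# the dimensions, and the horizon are illustrative assumptions, not the
# paper's AWM setup.
import torch
import torch.nn as nn

obs_dim, act_dim, horizon = 8, 2, 5

# Predicts (next_obs, reward) from (obs, action).
world_model = nn.Sequential(
    nn.Linear(obs_dim + act_dim, 64), nn.Tanh(), nn.Linear(64, obs_dim + 1))
policy = nn.Sequential(nn.Linear(obs_dim, 64), nn.Tanh(), nn.Linear(64, act_dim))

obs = torch.zeros(1, obs_dim)            # imagined rollout from a dummy start state
total_return = torch.zeros(1)
for _ in range(horizon):
    action = torch.tanh(policy(obs))                     # differentiable action
    pred = world_model(torch.cat([obs, action], dim=-1))
    obs, reward = pred[:, :obs_dim], pred[:, obs_dim:]
    total_return = total_return + reward.squeeze(-1)

loss = -total_return.mean()              # ascend the predicted return
loss.backward()                          # gradients reach the policy via the model
print(policy[0].weight.grad.norm())
```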

When Do Transformers Shine in RL? Decoupling Memory from Credit Assignment

2 code implementations • NeurIPS 2023 • Tianwei Ni, Michel Ma, Benjamin Eysenbach, Pierre-Luc Bacon

The Transformer architecture has been very successful at solving problems that involve long-term dependencies, including in the RL domain.

Reinforcement Learning (RL)

Towards Disturbance-Free Visual Mobile Manipulation

1 code implementation • 17 Dec 2021 • Tianwei Ni, Kiana Ehsani, Luca Weihs, Jordi Salvador

In this paper, we study the problem of training agents to complete the task of visual mobile manipulation in the ManipulaTHOR environment while avoiding unnecessary collision (disturbance) with objects.

Collision Avoidance • Knowledge Distillation +1
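
As a purely hypothetical illustration of the disturbance-avoidance objective described above, the snippet below shapes a task reward with a penalty for displaced non-target objects; the threshold and penalty weight are made-up values and do not reflect the paper's actual formulation.

```python
# Hypothetical illustration of a disturbance-penalized reward for mobile
# manipulation: task progress minus a penalty for non-target objects the agent
# moved. The threshold and penalty weight are made-up values, not the paper's
# actual objective.
def disturbance_penalized_reward(task_reward: float,
                                 object_displacements: list,
                                 threshold: float = 0.01,
                                 penalty_weight: float = 1.0) -> float:
    """Subtract a penalty for every object displaced beyond `threshold` meters."""
    num_disturbed = sum(1 for d in object_displacements if d > threshold)
    return task_reward - penalty_weight * num_disturbed

# Example: the agent earned 1.0 task reward but displaced two objects.
print(disturbance_penalized_reward(1.0, [0.0, 0.25, 0.003, 0.12]))  # -> -1.0
```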

Recurrent Model-Free RL Can Be a Strong Baseline for Many POMDPs

2 code implementations • 11 Oct 2021 • Tianwei Ni, Benjamin Eysenbach, Ruslan Salakhutdinov

However, prior work has found that such recurrent model-free RL methods tend to perform worse than more specialized algorithms that are designed for specific types of POMDPs.
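
For context on what "recurrent model-free RL" refers to in the excerpt above, here is a minimal sketch of a recurrent policy that summarizes the observation-action history under partial observability; the GRU, dimensions, and inputs are illustrative assumptions, not the paper's exact architecture or training setup.

```python
# Minimal sketch of a recurrent (model-free) policy for POMDPs: an RNN hidden
# state summarizes the history of observations and previous actions. The GRU
# choice and the dimensions are illustrative assumptions only.
import torch
import torch.nn as nn

class RecurrentPolicy(nn.Module):
    def __init__(self, obs_dim: int, act_dim: int, hidden_dim: int = 128):
        super().__init__()
        self.rnn = nn.GRU(obs_dim + act_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, act_dim)

    def forward(self, obs_seq, prev_act_seq, hidden=None):
        # obs_seq: (batch, time, obs_dim); prev_act_seq: (batch, time, act_dim)
        x = torch.cat([obs_seq, prev_act_seq], dim=-1)
        summary, hidden = self.rnn(x, hidden)    # per-step history summary
        return self.head(summary), hidden        # action outputs + carry-over state

policy = RecurrentPolicy(obs_dim=4, act_dim=2)
actions, h = policy(torch.randn(1, 10, 4), torch.zeros(1, 10, 2))
print(actions.shape)  # torch.Size([1, 10, 2])
```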

Adaptive Agent Architecture for Real-time Human-Agent Teaming

no code implementations • 7 Mar 2021 • Tianwei Ni, Huao Li, Siddharth Agrawal, Suhas Raja, Fan Jia, Yikang Gui, Dana Hughes, Michael Lewis, Katia Sycara

Previous human-human team research has shown complementary policies in the TSF game and diversity in human players' skills, which encourages us to relax the assumptions on the human policy.

Space Fortress

f-IRL: Inverse Reinforcement Learning via State Marginal Matching

1 code implementation • 9 Nov 2020 • Tianwei Ni, Harshit Sikchi, Yufei Wang, Tejus Gupta, Lisa Lee, Benjamin Eysenbach

Our method outperforms adversarial imitation learning methods in terms of sample efficiency and the required number of expert trajectories on IRL benchmarks.

Imitation Learning • reinforcement-learning +1

Meta-SAC: Auto-tune the Entropy Temperature of Soft Actor-Critic via Metagradient

1 code implementation • 3 Jul 2020 • Yufei Wang, Tianwei Ni

Our method is built upon the Soft Actor-Critic (SAC) algorithm, which uses an "entropy temperature" that balances the original task reward and the policy entropy, and hence controls the trade-off between exploitation and exploration.

Benchmarking
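
The excerpt above explains the role of SAC's entropy temperature. The sketch below shows how the temperature enters the actor objective and the common target-entropy auto-tuning rule; Meta-SAC instead tunes the temperature with a metagradient on a meta objective, which is not reproduced here, and all tensors are dummy placeholders.

```python
# Sketch of how SAC's entropy temperature (alpha) trades off task reward and
# policy entropy. Shown: the actor objective and the common target-entropy
# auto-tuning rule. Meta-SAC instead tunes alpha with a metagradient on a meta
# objective, which is NOT reproduced here; all tensors are dummy placeholders.
import torch

log_alpha = torch.zeros(1, requires_grad=True)   # learn log(alpha) to keep alpha > 0
target_entropy = -2.0                            # common heuristic: -|action dim|

q_value = torch.randn(32)                        # placeholder critic values Q(s, a)
log_prob = torch.randn(32)                       # placeholder log pi(a|s) of sampled actions

alpha = log_alpha.exp()
# Actor objective: maximize Q + alpha * entropy  <=>  minimize alpha*log_prob - Q.
actor_loss = (alpha.detach() * log_prob - q_value).mean()

# Temperature update: push the policy's entropy toward `target_entropy`.
alpha_loss = -(log_alpha * (log_prob + target_entropy).detach()).mean()
alpha_loss.backward()
print(log_alpha.grad)      # alpha adapts the exploration/exploitation trade-off
```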

Elastic Boundary Projection for 3D Medical Image Segmentation

2 code implementations • CVPR 2019 • Tianwei Ni, Lingxi Xie, Huangjie Zheng, Elliot K. Fishman, Alan L. Yuille

The key observation is that, although the object is a 3D volume, what we really need in segmentation is to find its boundary, which is a 2D surface.

3D Medical Imaging Segmentation • Image Segmentation +3
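
To make the "boundary is a 2D surface" observation concrete, the snippet below counts surface versus interior voxels of a synthetic 3D ball; this is only an erosion-based illustration of the observation, not the paper's Elastic Boundary Projection procedure.

```python
# Illustration of the key observation only: the boundary of a 3D object is a
# thin 2D surface with far fewer voxels than the full volume. This is NOT the
# paper's Elastic Boundary Projection; it is a simple erosion-based boundary
# extraction on a synthetic ball.
import numpy as np
from scipy.ndimage import binary_erosion

# Synthetic 3D object: a ball of radius 20 inside a 64^3 grid.
coords = np.indices((64, 64, 64)) - 32
volume = (coords ** 2).sum(axis=0) <= 20 ** 2

boundary = volume & ~binary_erosion(volume)      # voxels on the object's surface

print("volume voxels:  ", int(volume.sum()))     # scales like r^3
print("boundary voxels:", int(boundary.sum()))   # scales like r^2, much smaller
```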

Phase Collaborative Network for Two-Phase Medical Image Segmentation

no code implementations • 28 Nov 2018 • Huangjie Zheng, Lingxi Xie, Tianwei Ni, Ya Zhang, Yan-Feng Wang, Qi Tian, Elliot K. Fishman, Alan L. Yuille

However, in medical image analysis, fusing predictions from the two phases is often difficult, because (i) there is a domain gap between the two phases, and (ii) the semantic labels do not correspond pixel-wise, even for images scanned from the same patient.

Image Segmentation • Medical Image Segmentation +3
