no code implementations • 18 Mar 2024 • Yizheng Wang, Xiang Li, Ziming Yan, Yuqing Du, Jinshuai Bai, Bokai Liu, Timon Rabczuk, Yinghua Liu
Homogenization is an essential tool for studying multiscale physical phenomena.
no code implementations • 7 Mar 2024 • Alex Havrilla, Yuqing Du, Sharath Chandra Raparthy, Christoforos Nalmpantis, Jane Dwivedi-Yu, Maksym Zhuravinskyi, Eric Hambro, Sainbayar Sukhbaatar, Roberta Raileanu
Surprisingly, we find the sample complexity of Expert Iteration is similar to that of PPO, requiring at most on the order of $10^6$ samples to converge from a pretrained checkpoint.
no code implementations • 31 Jul 2023 • Jessy Lin, Yuqing Du, Olivia Watkins, Danijar Hafner, Pieter Abbeel, Dan Klein, Anca Dragan
To interact with humans in the world, agents need to understand the diverse types of language that people use, relate them to the visual world, and act based on them.
2 code implementations • 25 May 2023 • Ying Fan, Olivia Watkins, Yuqing Du, Hao Liu, MoonKyung Ryu, Craig Boutilier, Pieter Abbeel, Mohammad Ghavamzadeh, Kangwook Lee, Kimin Lee
We focus on diffusion models, defining the fine-tuning task as an RL problem, and updating the pre-trained text-to-image diffusion models using policy gradient to maximize the feedback-trained reward.
no code implementations • 13 Mar 2023 • Yuqing Du, Ksenia Konyushkova, Misha Denil, Akhil Raju, Jessica Landon, Felix Hill, Nando de Freitas, Serkan Cabi
Detecting successful behaviour is crucial for training intelligent agents.
no code implementations • 23 Feb 2023 • Kimin Lee, Hao Liu, MoonKyung Ryu, Olivia Watkins, Yuqing Du, Craig Boutilier, Pieter Abbeel, Mohammad Ghavamzadeh, Shixiang Shane Gu
Our results demonstrate the potential for learning from human feedback to significantly improve text-to-image models.
1 code implementation • 13 Feb 2023 • Yuqing Du, Olivia Watkins, Zihan Wang, Cédric Colas, Trevor Darrell, Pieter Abbeel, Abhishek Gupta, Jacob Andreas
Reinforcement learning algorithms typically struggle in the absence of a dense, well-shaped reward function.
no code implementations • 22 Feb 2022 • Yuqing Du, Pieter Abbeel, Aditya Grover
Training such agents efficiently requires automatic generation of a goal curriculum.
no code implementations • 15 Feb 2022 • Yuqing Du, Daniel Ho, Alexander A. Alemi, Eric Jang, Mohi Khansari
In this work we investigate and demonstrate benefits of a Bayesian approach to imitation learning from multiple sensor inputs, as applied to the task of opening office doors with a mobile manipulator.
no code implementations • 3 Feb 2022 • Mohi Khansari, Daniel Ho, Yuqing Du, Armando Fuentes, Matthew Bennice, Nicolas Sievers, Sean Kirmani, Yunfei Bai, Eric Jang
To the best of our knowledge, this is the first work to tackle latched door opening with a purely end-to-end learning approach, where the tasks of navigation and manipulation are jointly modeled by a single neural network.
1 code implementation • ICLR 2022 • Yuqing Du, Pieter Abbeel, Aditya Grover
We are interested in training general-purpose reinforcement learning agents that can solve a wide variety of goals.
1 code implementation • 15 Apr 2021 • Yuqing Du, Olivia Watkins, Trevor Darrell, Pieter Abbeel, Deepak Pathak
Policies trained in simulation often fail when transferred to the real world due to the "reality gap", where the simulator is unable to accurately capture the dynamics and visual properties of the real world.
no code implementations • 24 Feb 2021 • Qunsong Zeng, Yuqing Du, Kaibin Huang
To derive guidelines for deploying the resulting wirelessly powered FEEL (WP-FEEL) system, this work derives the tradeoff between model convergence and the power-source settings in two scenarios: 1) the transmission power and density of power beacons (dedicated charging stations), if they are deployed, or otherwise 2) the transmission power of a server (access point).
1 code implementation • 4 Aug 2020 • Eugene Vinitsky, Yuqing Du, Kanaad Parvate, Kathy Jang, Pieter Abbeel, Alexandre Bayen
Reinforcement Learning (RL) is an effective tool for controller design but can struggle with issues of robustness, failing catastrophically when the underlying system dynamics are perturbed.
Out-of-Distribution Generalization • Reinforcement Learning
no code implementations • 14 Jul 2020 • Qunsong Zeng, Yuqing Du, Kaibin Huang, Kin K. Leung
Among others, the framework of federated edge learning (FEEL) is popular for its data-privacy preservation.
Information Theory • Signal Processing
1 code implementation • NeurIPS 2020 • Yuqing Du, Stas Tiomkin, Emre Kiciman, Daniel Polani, Pieter Abbeel, Anca Dragan
One difficulty in using artificial agents for human-assistive applications lies in the challenge of accurately assisting with a person's goal(s).
no code implementations • 16 Jan 2020 • Guangxu Zhu, Yuqing Du, Deniz Gunduz, Kaibin Huang
We provide a comprehensive analysis of the effects of wireless channel hostilities (channel noise, fading, and channel estimation errors) on the convergence rate of the proposed FEEL scheme.
Information Theory • Distributed, Parallel, and Cluster Computing • Networking and Internet Architecture • Signal Processing
no code implementations • 3 Dec 2019 • Qiao Lan, Zezhong Zhang, Yuqing Du, Zhenyi Lin, Kaibin Huang
The main theme in the area is to design new communication techniques and protocols for efficient implementation of different distributed learning frameworks (i.e., federated learning) in wireless networks.
Information Theory • Signal Processing
no code implementations • 9 Oct 2019 • Yuqing Du, Sheng Yang, Kaibin Huang
First, the framework features a practical hierarchical architecture for decomposing the stochastic gradient into its norm and normalized block gradients, and efficiently quantizes them using a uniform quantizer and a low-dimensional codebook on a Grassmann manifold, respectively.
no code implementations • 13 Jul 2019 • Qunsong Zeng, Yuqing Du, Kin K. Leung, Kaibin Huang
To reduce devices' energy consumption, we propose energy-efficient strategies for bandwidth allocation and scheduling.
no code implementations • 2 Sep 2018 • Guangxu Zhu, Dongzhu Liu, Yuqing Du, Changsheng You, Jun Zhang, Kaibin Huang
Accordingly, a new research area called edge learning has emerged, which crosses and revolutionizes two disciplines: wireless communication and machine learning.