Search Results for author: Yuqing Du

Found 21 papers, 6 papers with code

Teaching Large Language Models to Reason with Reinforcement Learning

no code implementations • 7 Mar 2024 • Alex Havrilla, Yuqing Du, Sharath Chandra Raparthy, Christoforos Nalmpantis, Jane Dwivedi-Yu, Maksym Zhuravinskyi, Eric Hambro, Sainbayar Sukhbaatar, Roberta Raileanu

Surprisingly, we find the sample complexity of Expert Iteration is similar to that of PPO, requiring at most on the order of $10^6$ samples to converge from a pretrained checkpoint.

reinforcement-learning

Learning to Model the World with Language

no code implementations • 31 Jul 2023 • Jessy Lin, Yuqing Du, Olivia Watkins, Danijar Hafner, Pieter Abbeel, Dan Klein, Anca Dragan

To interact with humans in the world, agents need to understand the diverse types of language that people use, relate them to the visual world, and act based on them.

Future prediction • General Knowledge

DPOK: Reinforcement Learning for Fine-tuning Text-to-Image Diffusion Models

2 code implementations • 25 May 2023 • Ying Fan, Olivia Watkins, Yuqing Du, Hao Liu, MoonKyung Ryu, Craig Boutilier, Pieter Abbeel, Mohammad Ghavamzadeh, Kangwook Lee, Kimin Lee

We focus on diffusion models, defining the fine-tuning task as an RL problem, and updating the pre-trained text-to-image diffusion models using policy gradient to maximize the feedback-trained reward.

reinforcement-learning • Reinforcement Learning (RL)

Aligning Text-to-Image Models using Human Feedback

no code implementations • 23 Feb 2023 • Kimin Lee, Hao Liu, MoonKyung Ryu, Olivia Watkins, Yuqing Du, Craig Boutilier, Pieter Abbeel, Mohammad Ghavamzadeh, Shixiang Shane Gu

Our results demonstrate the potential for learning from human feedback to significantly improve text-to-image models.

Image Generation

It Takes Four to Tango: Multiagent Selfplay for Automatic Curriculum Generation

no code implementations • 22 Feb 2022 • Yuqing Du, Pieter Abbeel, Aditya Grover

Training such agents efficiently requires automatic generation of a goal curriculum.

Bayesian Imitation Learning for End-to-End Mobile Manipulation

no code implementations • 15 Feb 2022 • Yuqing Du, Daniel Ho, Alexander A. Alemi, Eric Jang, Mohi Khansari

In this work we investigate and demonstrate benefits of a Bayesian approach to imitation learning from multiple sensor inputs, as applied to the task of opening office doors with a mobile manipulator.

Imitation Learning

Practical Imitation Learning in the Real World via Task Consistency Loss

no code implementations • 3 Feb 2022 • Mohi Khansari, Daniel Ho, Yuqing Du, Armando Fuentes, Matthew Bennice, Nicolas Sievers, Sean Kirmani, Yunfei Bai, Eric Jang

To the best of our knowledge, this is the first work to tackle latched door opening from a purely end-to-end learning approach, where the tasks of navigation and manipulation are jointly modeled by a single neural network.

Domain Adaptation • Imitation Learning

It Takes Four to Tango: Multiagent Self Play for Automatic Curriculum Generation

1 code implementation • ICLR 2022 • Yuqing Du, Pieter Abbeel, Aditya Grover

We are interested in training general-purpose reinforcement learning agents that can solve a wide variety of goals.

Auto-Tuned Sim-to-Real Transfer

1 code implementation • 15 Apr 2021 • Yuqing Du, Olivia Watkins, Trevor Darrell, Pieter Abbeel, Deepak Pathak

Policies trained in simulation often fail when transferred to the real world due to the 'reality gap' where the simulator is unable to accurately capture the dynamics and visual properties of the real world.

Wirelessly Powered Federated Edge Learning: Optimal Tradeoffs Between Convergence and Power Transfer

no code implementations • 24 Feb 2021 • Qunsong Zeng, Yuqing Du, Kaibin Huang

To derive guidelines on deploying the resultant wirelessly powered FEEL (WP-FEEL) system, this work aims at the derivation of the tradeoff between the model convergence and the settings of power sources in two scenarios: 1) the transmission power and density of power-beacons (dedicated charging stations) if they are deployed, or otherwise 2) the transmission power of a server (access-point).

Robust Reinforcement Learning using Adversarial Populations

1 code implementation • 4 Aug 2020 • Eugene Vinitsky, Yuqing Du, Kanaad Parvate, Kathy Jang, Pieter Abbeel, Alexandre Bayen

Reinforcement Learning (RL) is an effective tool for controller design but can struggle with issues of robustness, failing catastrophically when the underlying system dynamics are perturbed.

Out-of-Distribution Generalization • reinforcement-learning

Energy-Efficient Resource Management for Federated Edge Learning with CPU-GPU Heterogeneous Computing

no code implementations • 14 Jul 2020 • Qunsong Zeng, Yuqing Du, Kaibin Huang, Kin K. Leung

Among others, the framework of federated edge learning (FEEL) is popular for its data-privacy preservation.

Information Theory • Signal Processing

AvE: Assistance via Empowerment

1 code implementation • NeurIPS 2020 • Yuqing Du, Stas Tiomkin, Emre Kiciman, Daniel Polani, Pieter Abbeel, Anca Dragan

One difficulty in using artificial agents for human-assistive applications lies in the challenge of accurately assisting with a person's goal(s).

One-Bit Over-the-Air Aggregation for Communication-Efficient Federated Edge Learning: Design and Convergence Analysis

no code implementations • 16 Jan 2020 • Guangxu Zhu, Yuqing Du, Deniz Gunduz, Kaibin Huang

We provide a comprehensive analysis of the effects of wireless channel hostilities (channel noise, fading, and channel estimation errors) on the convergence rate of the proposed FEEL scheme.

Information Theory • Distributed, Parallel, and Cluster Computing • Networking and Internet Architecture • Signal Processing

An Introduction to Communication Efficient Edge Machine Learning

no code implementations • 3 Dec 2019 • Qiao Lan, Zezhong Zhang, Yuqing Du, Zhenyi Lin, Kaibin Huang

The main theme in the area is to design new communication techniques and protocols for efficient implementation of different distributed learning frameworks (i.e., federated learning) in wireless networks.

Information Theory • Signal Processing

High-Dimensional Stochastic Gradient Quantization for Communication-Efficient Edge Learning

no code implementations • 9 Oct 2019 • Yuqing Du, Sheng Yang, Kaibin Huang

First, the framework features a practical hierarchical architecture for decomposing the stochastic gradient into its norm and normalized block gradients, and efficiently quantizes them using a uniform quantizer and a low-dimensional codebook on a Grassmann manifold, respectively.

Federated Learning • Quantization

Energy-Efficient Radio Resource Allocation for Federated Edge Learning

no code implementations • 13 Jul 2019 • Qunsong Zeng, Yuqing Du, Kin K. Leung, Kaibin Huang

To reduce devices' energy consumption, we propose energy-efficient strategies for bandwidth allocation and scheduling.

Management • Scheduling

Towards an Intelligent Edge: Wireless Communication Meets Machine Learning

no code implementations • 2 Sep 2018 • Guangxu Zhu, Dongzhu Liu, Yuqing Du, Changsheng You, Jun Zhang, Kaibin Huang

Accordingly, a new research area, called edge learning, emerges, which crosses and revolutionizes two disciplines: wireless communication and machine learning.

BIG-bench Machine Learning • Edge-computing
