Search Results for author: Qinhong Zhou

Found 6 papers, 6 papers with code

HAZARD Challenge: Embodied Decision Making in Dynamically Changing Environments

1 code implementation • 23 Jan 2024 • Qinhong Zhou, Sunli Chen, Yisong Wang, Haozhe Xu, Weihua Du, Hongxin Zhang, Yilun Du, Joshua B. Tenenbaum, Chuang Gan

Recent advances in high-fidelity virtual environments serve as one of the major driving forces for building intelligent embodied agents to perceive, reason and interact with the physical world.

Common Sense Reasoning • Decision Making • +1

SALMON: Self-Alignment with Instructable Reward Models

1 code implementation • 9 Oct 2023 • Zhiqing Sun, Yikang Shen, Hongxin Zhang, Qinhong Zhou, Zhenfang Chen, David Cox, Yiming Yang, Chuang Gan

Supervised Fine-Tuning (SFT) on response demonstrations combined with Reinforcement Learning from Human Feedback (RLHF) constitutes a powerful paradigm for aligning LLM-based AI agents.

In-Context Learning • Language Modelling

Building Cooperative Embodied Agents Modularly with Large Language Models

1 code implementation • 5 Jul 2023 • Hongxin Zhang, Weihua Du, Jiaming Shan, Qinhong Zhou, Yilun Du, Joshua B. Tenenbaum, Tianmin Shu, Chuang Gan

In this work, we address challenging multi-agent cooperation problems with decentralized control, raw sensory observations, costly communication, and multi-objective tasks instantiated in various embodied environments.

Text Generation

Bridging the Gap between Decision and Logits in Decision-based Knowledge Distillation for Pre-trained Language Models

1 code implementation • 15 Jun 2023 • Qinhong Zhou, Zonghan Yang, Peng Li, Yang Liu

By combining the theoretical and empirical estimations of the decision distributions together, the estimation of logits can be successfully reduced to a simple root-finding problem.

Data Augmentation • Knowledge Distillation • +2

Principle-Driven Self-Alignment of Language Models from Scratch with Minimal Human Supervision

1 code implementation • NeurIPS 2023 • Zhiqing Sun, Yikang Shen, Qinhong Zhou, Hongxin Zhang, Zhenfang Chen, David Cox, Yiming Yang, Chuang Gan

Recent AI-assistant agents, such as ChatGPT, predominantly rely on supervised fine-tuning (SFT) with human annotations and reinforcement learning from human feedback (RLHF) to align the output of large language models (LLMs) with human intentions, ensuring they are helpful, ethical, and reliable.

In-Context Learning • Language Modelling

See, Think, Confirm: Interactive Prompting Between Vision and Language Models for Knowledge-based Visual Reasoning

1 code implementation • 12 Jan 2023 • Zhenfang Chen, Qinhong Zhou, Yikang Shen, Yining Hong, Hao Zhang, Chuang Gan

The see stage scans the image and grounds the visual concept candidates with a visual perception model.

Few-Shot Learning • Image Captioning • +4
