Search Results for author: Shangmin Guo

Found 18 papers, 9 papers with code

Language Model Evolution: An Iterated Learning Perspective

2 code implementations • 4 Apr 2024 • Yi Ren, Shangmin Guo, Linlu Qiu, Bailin Wang, Danica J. Sutherland

With the widespread adoption of Large Language Models (LLMs), the prevalence of iterative interactions among these models is anticipated to increase.

Language Modelling

Direct Language Model Alignment from Online AI Feedback

no code implementations • 7 Feb 2024 • Shangmin Guo, Biao Zhang, Tianlin Liu, Tianqi Liu, Misha Khalman, Felipe Llinares, Alexandre Rame, Thomas Mesnard, Yao Zhao, Bilal Piot, Johan Ferret, Mathieu Blondel

Moreover, responses in these datasets are often sampled from a language model distinct from the one being aligned, and since the model evolves over training, the alignment phase is inevitably off-policy.

Language Modelling
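
The off-policy issue described above motivates sampling preference pairs from the current policy and labelling them with an AI annotator at each step. Below is a minimal, hypothetical sketch of such a loop, not the paper's actual setup: the DPO loss is the standard formulation, while the log-probabilities and the annotator's verdict are stand-in values.

    import torch
    import torch.nn.functional as F

    def dpo_loss(pi_logp_w, pi_logp_l, ref_logp_w, ref_logp_l, beta=0.1):
        # Margin between the policy/reference log-ratios of the preferred (w)
        # and dispreferred (l) responses; maximising it favours w on-policy.
        margin = beta * ((pi_logp_w - ref_logp_w) - (pi_logp_l - ref_logp_l))
        return -F.logsigmoid(margin)

    # One online step with stand-in numbers: y1, y2 would be sampled from the
    # *current* policy, and an LLM annotator would pick the preferred one.
    pi_logp = torch.tensor([-12.0, -15.0], requires_grad=True)  # log pi(y1|x), log pi(y2|x)
    ref_logp = torch.tensor([-13.0, -13.5])                     # frozen reference model
    w, l = (0, 1)  # hypothetical annotator verdict: y1 preferred over y2
    loss = dpo_loss(pi_logp[w], pi_logp[l], ref_logp[w], ref_logp[l])
    loss.backward()  # gradient updates the current policy

Because the pair is regenerated from the evolving policy at every step, the training data never goes stale the way a fixed offline preference dataset does.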

ICED: Zero-Shot Transfer in Reinforcement Learning via In-Context Environment Design

no code implementations • 5 Feb 2024 • Samuel Garcin, James Doran, Shangmin Guo, Christopher G. Lucas, Stefano V. Albrecht

ICED generates levels using a variational autoencoder trained over an initial set of level parameters, reducing distributional shift, and achieves significant improvements in zero-shot generalisation (ZSG) over adaptive level sampling strategies and unsupervised environment design (UED) methods.

Reinforcement Learning (RL)
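
The one-sentence summary above compresses the mechanism, so here is a minimal sketch of the underlying pattern: fit a variational autoencoder to an initial set of level parameters, then propose new training levels by decoding latents drawn near the prior. The dimensions, architecture, and KL weight are illustrative assumptions, not the paper's configuration.

    import torch
    import torch.nn as nn

    class LevelVAE(nn.Module):
        def __init__(self, level_dim=32, latent_dim=8):
            super().__init__()
            self.encoder = nn.Sequential(nn.Linear(level_dim, 64), nn.ReLU())
            self.mu = nn.Linear(64, latent_dim)
            self.logvar = nn.Linear(64, latent_dim)
            self.decoder = nn.Sequential(
                nn.Linear(latent_dim, 64), nn.ReLU(), nn.Linear(64, level_dim))

        def forward(self, x):
            h = self.encoder(x)
            mu, logvar = self.mu(h), self.logvar(h)
            z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterisation
            return self.decoder(z), mu, logvar

    vae = LevelVAE()
    levels = torch.randn(128, 32)  # stand-in for the initial level parameters
    recon, mu, logvar = vae(levels)
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=1).mean()
    loss = nn.functional.mse_loss(recon, levels) + 1e-3 * kl  # ELBO-style objective
    loss.backward()

    # Candidate levels decoded from latents sampled near the prior stay close
    # to the original level distribution, which limits distributional shift.
    new_levels = vae.decoder(torch.randn(16, 8))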

Sample Relationship from Learning Dynamics Matters for Generalisation

no code implementations • 16 Jan 2024 • Shangmin Guo, Yi Ren, Stefano V. Albrecht, Kenny Smith

Although much research has been done on proposing new models or loss functions to improve the generalisation of artificial neural networks (ANNs), less attention has been directed to the impact of the training data on generalisation.

How the level sampling process impacts zero-shot generalisation in deep reinforcement learning

no code implementations • 5 Oct 2023 • Samuel Garcin, James Doran, Shangmin Guo, Christopher G. Lucas, Stefano V. Albrecht

A key limitation preventing the wider adoption of autonomous agents trained via deep reinforcement learning (RL) is their limited ability to generalise to new environments, even when these share similar characteristics with environments encountered during training.

Reinforcement Learning (RL)

How to prepare your task head for finetuning

no code implementations • 11 Feb 2023 • Yi Ren, Shangmin Guo, Wonho Bae, Danica J. Sutherland

We identify a significant trend in the effect of changes in this initial energy on the resulting features after fine-tuning.

Smoothing Matters: Momentum Transformer for Domain Adaptive Semantic Segmentation

1 code implementation • 15 Mar 2022 • Runfa Chen, Yu Rong, Shangmin Guo, Jiaqi Han, Fuchun Sun, Tingyang Xu, Wenbing Huang

Following the great success of Vision Transformer variants (ViTs) in computer vision, ViTs have also demonstrated great potential in domain adaptive semantic segmentation.

Pseudo Label • Segmentation • +2

Better Supervisory Signals by Observing Learning Paths

1 code implementation • ICLR 2022 • Yi Ren, Shangmin Guo, Danica J. Sutherland

Observing the learning path not only provides a new perspective for understanding knowledge distillation, overfitting, and learning dynamics, but also reveals that the supervisory signal of a teacher network can be very unstable near the best points in training on real tasks.

Knowledge Distillation
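
For context, the supervisory signal in question is the teacher's soft output distribution used in standard knowledge distillation; the paper's observation is that this target can fluctuate sharply near the teacher's best checkpoints. A minimal version of that distillation loss follows, with the temperature value as an illustrative choice rather than anything the paper prescribes.

    import torch
    import torch.nn.functional as F

    def kd_loss(student_logits, teacher_logits, T=4.0):
        # Student matches the teacher's temperature-softened distribution;
        # the T*T factor keeps gradient magnitudes comparable across temperatures.
        p_teacher = F.softmax(teacher_logits / T, dim=-1)
        log_p_student = F.log_softmax(student_logits / T, dim=-1)
        return -(p_teacher * log_p_student).sum(dim=-1).mean() * T * T

    # Toy usage: random logits standing in for real model outputs.
    loss = kd_loss(torch.randn(8, 10, requires_grad=True), torch.randn(8, 10))
    loss.backward()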

Expressivity of Emergent Languages is a Trade-off between Contextual Complexity and Unpredictability

no code implementations • ICLR 2022 • Shangmin Guo, Yi Ren, Kory Wallace Mathewson, Simon Kirby, Stefano V Albrecht, Kenny Smith

Researchers are using deep learning models to explore the emergence of language in various language games, where simulated agents interact and develop an emergent language to solve a task.

Expressivity of Emergent Language is a Trade-off between Contextual Complexity and Unpredictability

1 code implementation • 7 Jun 2021 • Shangmin Guo, Yi Ren, Kory Mathewson, Simon Kirby, Stefano V. Albrecht, Kenny Smith

Researchers are using deep learning models to explore the emergence of language in various language games, where agents interact and develop an emergent language to solve tasks.

Inductive Bias and Language Expressivity in Emergent Communication

1 code implementation • 4 Dec 2020 • Shangmin Guo, Yi Ren, Agnieszka Słowik, Kory Mathewson

Referential games and reconstruction games are the most common game types for studying emergent languages.

Inductive Bias

Compositional Languages Emerge in a Neural Iterated Learning Model

1 code implementation • ICLR 2020 • Yi Ren, Shangmin Guo, Matthieu Labeau, Shay B. Cohen, Simon Kirby

The principle of compositionality, which enables natural language to represent complex concepts via a structured combination of simpler ones, allows us to convey an open-ended set of messages using a limited vocabulary.

Emergence of Numeric Concepts in Multi-Agent Autonomous Communication

1 code implementation • 4 Nov 2019 • Shangmin Guo

Although their encoding method is not compositional in the way natural languages are from a human perspective, the emergent languages can generalise to unseen inputs and, more importantly, are easier for models to learn.

Grounded language learning

IJCNLP-2017 Task 5: Multi-choice Question Answering in Examinations

no code implementations • IJCNLP 2017 • Shangmin Guo, Kang Liu, Shizhu He, Cao Liu, Jun Zhao, Zhuoyu Wei

The IJCNLP-2017 Multi-choice Question Answering (MCQA) task aims at exploring the performance of current Question Answering (QA) techniques on real-world complex questions collected from Chinese Senior High School Entrance Examination papers and the CK12 website.

Question Answering

Which is the Effective Way for Gaokao: Information Retrieval or Neural Networks?

1 code implementation • EACL 2017 • Shangmin Guo, Xiangrong Zeng, Shizhu He, Kang Liu, Jun Zhao

As one of the most important examinations in China, the Gaokao is designed to be difficult enough to distinguish excellent high school students.

Information Retrieval • Multiple-choice • +4
