Search Results for author: Hao Lang

Found 9 papers, 4 papers with code

Fine-Tuning Language Models with Reward Learning on Policy

1 code implementation • 28 Mar 2024 • Hao Lang, Fei Huang, Yongbin Li

RLHF contains three steps, i.e., human preference collecting, reward learning, and policy optimization, which are usually performed serially.

Multi-View Learning

Scaling Data Diversity for Fine-Tuning Language Models in Human Alignment

1 code implementation • 17 Mar 2024 • Feifan Song, Bowen Yu, Hao Lang, Haiyang Yu, Fei Huang, Houfeng Wang, Yongbin Li

Additionally, the concept of diversity can be more complex for prompts than for responses, which are typically quantified by single digits.

Data Augmentation, Diversity

Long-Tailed Question Answering in an Open World

1 code implementation • 11 May 2023 • Yi Dai, Hao Lang, Yinhe Zheng, Fei Huang, Yongbin Li

A retrieve-then-rerank framework is further introduced to select in-context examples, which guide the LM to generate text that expresses knowledge for QA tasks.

Knowledge Distillation, Language Modelling +1

Domain Incremental Lifelong Learning in an Open World

1 code implementation • 11 May 2023 • Yi Dai, Hao Lang, Yinhe Zheng, Bowen Yu, Fei Huang, Yongbin Li

Specifically, we dedicate task-level prompts to capturing task-specific knowledge, which retains high lifelong learning (LL) performance, and we maintain instance-level prompts to learn knowledge shared across input samples, which improves the model's generalization performance.

Language Modelling

Out-of-Domain Intent Detection Considering Multi-Turn Dialogue Contexts

no code implementations • 5 May 2023 • Hao Lang, Yinhe Zheng, Binyuan Hui, Fei Huang, Yongbin Li

Out-of-Domain (OOD) intent detection is vital for practical dialogue systems, and it usually requires considering multi-turn dialogue contexts.

Intent Detection

A Survey on Out-of-Distribution Detection in NLP

no code implementations • 5 May 2023 • Hao Lang, Yinhe Zheng, Yixuan Li, Jian Sun, Fei Huang, Yongbin Li

Out-of-distribution (OOD) detection is essential for the reliable and safe deployment of machine learning systems in the real world.

Out-of-Distribution Detection, Out of Distribution (OOD) Detection +1

Automated Curriculum Learning for Turn-level Spoken Language Understanding with Weak Supervision

no code implementations • 10 Jun 2019 • Hao Lang, Wen Wang

The RBSMA algorithm improves test set accuracy by a relative 7.8% compared to standard beam search.

Spoken Language Understanding
