Search Results for author: Yijia Shao

Found 12 papers, 9 with code

Assisting in Writing Wikipedia-like Articles From Scratch with Large Language Models

1 code implementation • 22 Feb 2024 • Yijia Shao, Yucheng Jiang, Theodore A. Kanell, Peter Xu, Omar Khattab, Monica S. Lam

We study how to apply large language models to write grounded and organized long-form articles from scratch, with comparable breadth and depth to Wikipedia pages.

Retrieval

Class Incremental Learning via Likelihood Ratio Based Task Prediction

2 code implementations • 26 Sep 2023 • Haowei Lin, Yijia Shao, Weinan Qian, Ningxin Pan, Yiduo Guo, Bing Liu

An emerging theory-guided approach (called TIL+OOD) trains a task-specific model for each task within a network shared across all tasks, using a task-incremental learning (TIL) method to deal with catastrophic forgetting.
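To make the idea concrete, below is a minimal, hypothetical sketch of likelihood-ratio-based task prediction: each task's feature distribution is modeled by a Gaussian and compared against an assumed task-agnostic background density, and a test sample is routed to the task with the highest ratio. The Gaussian models and the background density are illustrative assumptions, not the paper's actual estimator.

```python
# Minimal sketch (not the paper's exact method): pick the task whose
# feature model best explains a test sample, using a likelihood ratio
# against a task-agnostic "background" density.
import numpy as np
from scipy.stats import multivariate_normal

def task_likelihood_ratio(x, task_gaussians, background_gaussian):
    """Return log p(x | task t) - log p(x | background) for every task."""
    bg = background_gaussian.logpdf(x)
    return np.array([g.logpdf(x) - bg for g in task_gaussians])

# Toy usage: 2-D features, two tasks with different means (all assumed).
rng = np.random.default_rng(0)
task_gaussians = [
    multivariate_normal(mean=[0.0, 0.0], cov=np.eye(2)),
    multivariate_normal(mean=[3.0, 3.0], cov=np.eye(2)),
]
background_gaussian = multivariate_normal(mean=[1.5, 1.5], cov=4 * np.eye(2))

x = rng.normal(loc=[3.0, 3.0], scale=1.0)            # sample drawn near task 1
scores = task_likelihood_ratio(x, task_gaussians, background_gaussian)
predicted_task = int(np.argmax(scores))               # expected: 1
```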

Class Incremental Learning • Incremental Learning

Class-Incremental Learning based on Label Generation

1 code implementation • 22 Jun 2023 • Yijia Shao, Yiduo Guo, Dongyan Zhao, Bing Liu

Despite the great success of pre-trained language models, it is still a challenge to use these models for continual learning, especially for the class-incremental learning (CIL) setting due to catastrophic forgetting (CF).

Class Incremental Learning • Incremental Learning

ACCENT: An Automatic Event Commonsense Evaluation Metric for Open-Domain Dialogue Systems

1 code implementation • 12 May 2023 • Sarik Ghazarian, Yijia Shao, Rujun Han, Aram Galstyan, Nanyun Peng

We take the first step by focusing on event commonsense that considers events and their relations, and is crucial in both dialogues and general commonsense reasoning.

Adapting a Language Model While Preserving its General Knowledge

2 code implementations • 21 Jan 2023 • Zixuan Ke, Yijia Shao, Haowei Lin, Hu Xu, Lei Shu, Bing Liu

This paper shows that the existing methods are suboptimal and proposes a novel method that performs a more informed adaptation of the knowledge in the LM by (1) soft-masking the attention heads based on their importance, to best preserve the general knowledge in the LM, and (2) contrasting the representations of the general knowledge and the full (general plus domain) knowledge, to learn an integrated representation with both general and domain-specific knowledge.
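A rough sketch of what (1) could look like in practice, under the assumption that soft-masking is realized by scaling gradient updates to each attention head by one minus its importance; the importance scores and the hook-based mechanism here are illustrative, not the paper's implementation.

```python
# Hypothetical sketch of soft-masking: down-weight gradient updates to
# attention heads in proportion to their (pre-computed) importance, so
# heads important for general knowledge change less during adaptation.
import torch

def attach_soft_mask(param, head_importance, num_heads):
    """Scale per-head gradients of a (hidden, hidden) projection weight."""
    hidden = param.shape[0]
    head_dim = hidden // num_heads
    # 1.0 = fully preserved head -> gradient scaled toward zero
    keep = (1.0 - head_importance).repeat_interleave(head_dim)  # (hidden,)

    def hook(grad):
        return grad * keep.view(-1, 1)   # rows grouped by head
    param.register_hook(hook)

# Toy usage: a stand-in for a query projection with 12 heads (assumed sizes).
w_q = torch.nn.Parameter(torch.randn(768, 768))
importance = torch.rand(12)              # assumed importance scores in [0, 1]
attach_soft_mask(w_q, importance, num_heads=12)

loss = (w_q.sum()) ** 2
loss.backward()                          # gradients are now soft-masked
```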

Continual Learning • General Knowledge • +1

LUNA: Language Understanding with Number Augmentations on Transformers via Number Plugins and Pre-training

1 code implementation • 6 Dec 2022 • Hongwei Han, Jialiang Xu, Mengyu Zhou, Yijia Shao, Shi Han, Dongmei Zhang

But current approaches to rich-number tasks with transformer-based language models abandon or lose some of the numeracy information - e.g., breaking numbers into sub-word tokens - which leads to many number-related errors.
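A quick illustration of the numeracy loss the paper targets: a standard subword tokenizer splits a number into vocabulary fragments rather than keeping it as a single numeric token. The model name below is just an example; any BPE/WordPiece tokenizer shows the same effect, and the exact fragments depend on the vocabulary.

```python
# Illustration only: subword tokenization breaks a number into pieces,
# discarding its magnitude as a single quantity.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
print(tok.tokenize("Revenue grew to 1234567.89 dollars"))
# The number comes back as several sub-word fragments, not one numeric token.
```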

FormLM: Recommending Creation Ideas for Online Forms by Modelling Semantic and Structural Information

no code implementations • 10 Nov 2022 • Yijia Shao, Mengyu Zhou, Yifan Zhong, Tao Wu, Hongwei Han, Shi Han, Gideon Huang, Dongmei Zhang

To assist form designers, in this work we present FormLM to model online forms (by enhancing a pre-trained language model with form structural information) and recommend form creation ideas (including question/option recommendations and block type suggestions).

Language Modelling

Continual Training of Language Models for Few-Shot Learning

3 code implementations • 11 Oct 2022 • Zixuan Ke, Haowei Lin, Yijia Shao, Hu Xu, Lei Shu, Bing Liu

Recent work on applying large language models (LMs) achieves impressive performance in many NLP applications.

Continual Learning • Continual Pretraining • +2

Efficient Out-of-Distribution Detection via CVAE Data Generation

no code implementations • 29 Sep 2021 • Mengyu Wang, Yijia Shao, Haowei Lin, Wenpeng Hu, Bing Liu

Recently, contrastive loss with data augmentation and pseudo class creation has been shown to produce markedly better results for out-of-distribution (OOD) detection than previous methods.
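For context, a hedged sketch of the "pseudo class creation" idea as it is commonly done in contrastive OOD detection, e.g., treating each rotation of an image as its own pseudo class. This illustrates the style of prior methods the paper compares against, not its CVAE-based data generation.

```python
# Hedged sketch: create pseudo classes by rotating each image and shifting
# its label, a common trick in contrastive OOD detection (illustrative only).
import torch

def make_pseudo_classes(images, labels, num_classes):
    """Rotate each image by 0/90/180/270 degrees; label = class + rotation block."""
    rotated, pseudo_labels = [], []
    for k in range(4):                                   # 4 rotations
        rotated.append(torch.rot90(images, k, dims=(2, 3)))
        pseudo_labels.append(labels + k * num_classes)   # shift label block
    return torch.cat(rotated), torch.cat(pseudo_labels)

# Toy usage: batch of 8 CIFAR-like images, 10 original classes -> 40 pseudo classes.
x = torch.randn(8, 3, 32, 32)
y = torch.randint(0, 10, (8,))
x_aug, y_aug = make_pseudo_classes(x, y, num_classes=10)  # shapes (32, 3, 32, 32), (32,)
```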

Data Augmentation • Out-of-Distribution Detection • +1
