Search Results for author: Hangbo Bao

Found 19 papers, 12 papers with code

Pseudo-Masked Language Models for Unified Language Model Pre-Training

1 code implementation • ICML 2020 • Hangbo Bao, Li Dong, Furu Wei, Wenhui Wang, Nan Yang, Xiaodong Liu, Yu Wang, Jianfeng Gao, Songhao Piao, Ming Zhou, Hsiao-Wuen Hon

We propose to pre-train a unified language model for both autoencoding and partially autoregressive language modeling tasks using a novel training procedure, referred to as a pseudo-masked language model (PMLM).

Language Modelling • Natural Language Understanding +1
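
A very rough Python sketch of the pseudo-masking idea (my own simplification, not the authors' code): masked positions keep a [M] token for the autoencoding objective, while appended [P] pseudo-mask tokens reuse the same position ids so a partially autoregressive objective can be computed in the same forward pass. The special-token ids are hypothetical, and the blockwise attention mask that makes the factorization work is omitted.

    # Hypothetical special-token ids; a real tokenizer would provide these.
    MASK_ID, PSEUDO_ID = 103, 104

    def build_pmlm_inputs(token_ids, masked_positions):
        """Return (input_ids, position_ids, ae_labels, ar_labels) for one sequence."""
        input_ids, position_ids, ae_labels = [], [], []
        for pos, tok in enumerate(token_ids):
            if pos in masked_positions:
                input_ids.append(MASK_ID)      # autoencoding: predict tok from [M]
                ae_labels.append(tok)
            else:
                input_ids.append(tok)
                ae_labels.append(-100)         # ignored by the loss
            position_ids.append(pos)

        ar_labels = [-100] * len(input_ids)
        for pos in sorted(masked_positions):   # append [P] tokens, factorized left to right
            input_ids.append(PSEUDO_ID)
            position_ids.append(pos)           # [P] shares the original position id
            ar_labels.append(token_ids[pos])   # partially autoregressive target
        ae_labels += [-100] * len(masked_positions)
        return input_ids, position_ids, ae_labels, ar_labels

    ids, pos, ae_y, ar_y = build_pmlm_inputs([7, 8, 9, 10, 11], masked_positions={1, 3})
    print(ids, pos, ae_y, ar_y, sep="\n")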

A Unified View of Masked Image Modeling

1 code implementation • 19 Oct 2022 • Zhiliang Peng, Li Dong, Hangbo Bao, Qixiang Ye, Furu Wei

Masked image modeling has demonstrated great potential to eliminate the label-hungry problem of training large-scale vision Transformers, achieving impressive performance on various downstream tasks.

Image Classification • Segmentation +1

BEiT v2: Masked Image Modeling with Vector-Quantized Visual Tokenizers

2 code implementations • 12 Aug 2022 • Zhiliang Peng, Li Dong, Hangbo Bao, Qixiang Ye, Furu Wei

The large-size BEiT v2 obtains 87.3% top-1 accuracy for ImageNet-1K (224 size) fine-tuning, and 56.7% mIoU on ADE20K for semantic segmentation.

Knowledge Distillation • Representation Learning +2

VL-BEiT: Generative Vision-Language Pretraining

no code implementations • 2 Jun 2022 • Hangbo Bao, Wenhui Wang, Li Dong, Furu Wei

Our minimalist solution conducts masked prediction on both monomodal and multimodal data with a shared Transformer.

Image Classification • Language Modelling +7

THE-X: Privacy-Preserving Transformer Inference with Homomorphic Encryption

no code implementations • Findings (ACL) 2022 • Tianyu Chen, Hangbo Bao, Shaohan Huang, Li Dong, Binxing Jiao, Daxin Jiang, Haoyi Zhou, JianXin Li, Furu Wei

As more and more pre-trained language models adopt on-cloud deployment, privacy concerns grow quickly, mainly due to the exposure of plain-text user data (e.g., search history, medical records, bank accounts).

Privacy Preserving

Corrupted Image Modeling for Self-Supervised Visual Pre-Training

no code implementations • 7 Feb 2022 • Yuxin Fang, Li Dong, Hangbo Bao, Xinggang Wang, Furu Wei

Given this corrupted image, an enhancer network learns to either recover all the original image pixels, or predict whether each visual token is replaced by a generator sample or not.

Image Classification • Semantic Segmentation
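
A minimal PyTorch sketch (assumptions mine, not the released code) of the replaced-token-detection target the enhancer can be trained on: each visual-token position is labelled by whether the generator's sample differs from the clean image's token. The vocabulary size and shapes below are illustrative.

    import torch

    def replaced_token_labels(original_tokens, generator_logits):
        """original_tokens: (B, N) visual-token ids of the clean image.
        generator_logits: (B, N, V) generator predictions."""
        sampled = torch.distributions.Categorical(logits=generator_logits).sample()  # (B, N)
        labels = (sampled != original_tokens).long()   # 1 = replaced, 0 = original
        return sampled, labels

    B, N, V = 2, 196, 8192                      # 14x14 visual tokens, BEiT-style vocabulary
    orig = torch.randint(0, V, (B, N))
    gen_logits = torch.randn(B, N, V)
    corrupted_tokens, labels = replaced_token_labels(orig, gen_logits)
    # The enhancer then consumes the image built from `corrupted_tokens` and is trained
    # with binary cross-entropy against `labels`, or with pixel regression instead.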

VLMo: Unified Vision-Language Pre-Training with Mixture-of-Modality-Experts

2 code implementations • 3 Nov 2021 • Hangbo Bao, Wenhui Wang, Li Dong, Qiang Liu, Owais Khan Mohammed, Kriti Aggarwal, Subhojit Som, Furu Wei

We present a unified Vision-Language pretrained Model (VLMo) that jointly learns a dual encoder and a fusion encoder with a modular Transformer network.

Image Retrieval • Retrieval +3
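
A rough PyTorch sketch of the mixture-of-modality-experts idea (my own reading, not the released VLMo code): self-attention parameters are shared across modalities, while the feed-forward expert is switched depending on whether the block sees vision, language, or fused vision-language tokens. Dimensions and expert names are illustrative.

    import torch
    import torch.nn as nn

    class MoMEBlock(nn.Module):
        def __init__(self, dim=768, heads=12):
            super().__init__()
            self.norm1 = nn.LayerNorm(dim)
            self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)  # shared across modalities
            self.norm2 = nn.LayerNorm(dim)
            self.experts = nn.ModuleDict({                                   # modality-specific FFNs
                m: nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
                for m in ("vision", "language", "vl")
            })

        def forward(self, x, modality):
            h = self.norm1(x)
            x = x + self.attn(h, h, h, need_weights=False)[0]
            x = x + self.experts[modality](self.norm2(x))
            return x

    block = MoMEBlock()
    image_tokens = torch.randn(2, 197, 768)
    print(block(image_tokens, "vision").shape)   # the same block is reused with the
                                                 # "language" or "vl" expert for text / pairs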

s2s-ft: Fine-Tuning Pretrained Transformer Encoders for Sequence-to-Sequence Learning

1 code implementation • 26 Oct 2021 • Hangbo Bao, Li Dong, Wenhui Wang, Nan Yang, Furu Wei

Pretrained bidirectional Transformers, such as BERT, have achieved significant improvements in a wide variety of language understanding tasks, but it is not straightforward to apply them directly to natural language generation.

Abstractive Text Summarization • Question Generation +2
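
A small sketch (not the s2s-ft repository itself) of the attention mask that lets a bidirectional encoder be fine-tuned for generation: source positions attend to the whole source, while target positions attend to the source plus earlier target positions only.

    import torch

    def s2s_attention_mask(src_len, tgt_len):
        total = src_len + tgt_len
        mask = torch.zeros(total, total, dtype=torch.bool)          # True = may attend
        mask[:, :src_len] = True                                     # everyone sees the source
        causal = torch.tril(torch.ones(tgt_len, tgt_len)).bool()
        mask[src_len:, src_len:] = causal                            # causal within the target
        return mask

    print(s2s_attention_mask(3, 4).int())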

Learning to Sample Replacements for ELECTRA Pre-Training

no code implementations • Findings (ACL) 2021 • Yaru Hao, Li Dong, Hangbo Bao, Ke Xu, Furu Wei

Moreover, we propose to use a focal loss for the generator in order to relieve oversampling of correct tokens as replacements.

Language Modelling • Masked Language Modeling
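
A hedged PyTorch sketch of that idea: the generator's masked-language-modeling cross-entropy is weighted by a focal term (1 - p)^gamma, so tokens the generator already predicts confidently contribute less to its loss. The gamma value and shapes are illustrative, not the paper's exact setting.

    import torch
    import torch.nn.functional as F

    def focal_mlm_loss(logits, labels, gamma=2.0, ignore_index=-100):
        """logits: (N, V) generator predictions at masked positions; labels: (N,)."""
        keep = labels != ignore_index
        logits, labels = logits[keep], labels[keep]
        log_p = F.log_softmax(logits, dim=-1)
        log_p_true = log_p.gather(-1, labels.unsqueeze(-1)).squeeze(-1)   # log p(correct token)
        focal_weight = (1.0 - log_p_true.exp()) ** gamma                  # down-weight easy tokens
        return -(focal_weight * log_p_true).mean()

    loss = focal_mlm_loss(torch.randn(8, 30522), torch.randint(0, 30522, (8,)))
    print(loss.item())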

Attention Temperature Matters in Abstractive Summarization Distillation

1 code implementation • ACL 2022 • Shengqiang Zhang, Xingxing Zhang, Hangbo Bao, Furu Wei

In this paper, we find simply manipulating attention temperatures in Transformers can make pseudo labels easier to learn for student models.

Abstractive Text Summarization
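
A minimal sketch of the manipulation being described: scaled dot-product attention with an extra temperature tau, which can smooth (tau > 1) or sharpen (tau < 1) the attention distributions when the teacher produces pseudo-labels. The value of tau below is illustrative.

    import torch
    import torch.nn.functional as F

    def attention_with_temperature(q, k, v, tau=2.0):
        d = q.size(-1)
        scores = q @ k.transpose(-2, -1) / (d ** 0.5 * tau)   # temperature-scaled logits
        return F.softmax(scores, dim=-1) @ v

    q = k = v = torch.randn(1, 4, 5, 64)    # (batch, heads, seq, head_dim)
    out = attention_with_temperature(q, k, v, tau=2.0)
    print(out.shape)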

MiniLMv2: Multi-Head Self-Attention Relation Distillation for Compressing Pretrained Transformers

2 code implementations • Findings (ACL) 2021 • Wenhui Wang, Hangbo Bao, Shaohan Huang, Li Dong, Furu Wei

We generalize deep self-attention distillation in MiniLM (Wang et al., 2020) by only using self-attention relation distillation for task-agnostic compression of pretrained Transformers.

Relation • XLM-R
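
A simplified sketch of self-attention relation distillation (assumptions mine, not the released MiniLM code): for queries, keys, and values separately, pairwise relations softmax(X X^T / sqrt(d)) are computed for teacher and student and matched with KL divergence, so the two models need not share hidden sizes. Layer choice, head count, and shapes are illustrative.

    import torch
    import torch.nn.functional as F

    def relation(x, num_relation_heads=12):
        """x: (batch, seq, hidden) -> per-head relation distributions (batch, heads, seq, seq)."""
        b, s, h = x.shape
        d = h // num_relation_heads
        x = x.view(b, s, num_relation_heads, d).transpose(1, 2)
        return F.softmax(x @ x.transpose(-2, -1) / d ** 0.5, dim=-1)

    def relation_distill_loss(teacher_qkv, student_qkv):
        loss = 0.0
        for t, s in zip(teacher_qkv, student_qkv):     # (Q, K, V) of one chosen layer each
            loss = loss + F.kl_div(relation(s).log(), relation(t), reduction="batchmean")
        return loss / len(teacher_qkv)

    teacher = [torch.randn(2, 16, 768) for _ in range(3)]   # teacher and student hidden
    student = [torch.randn(2, 16, 384) for _ in range(3)]   # sizes may differ
    print(relation_distill_loss(teacher, student).item())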

UniLMv2: Pseudo-Masked Language Models for Unified Language Model Pre-Training

3 code implementations • 28 Feb 2020 • Hangbo Bao, Li Dong, Furu Wei, Wenhui Wang, Nan Yang, Xiaodong Liu, Yu Wang, Songhao Piao, Jianfeng Gao, Ming Zhou, Hsiao-Wuen Hon

We propose to pre-train a unified language model for both autoencoding and partially autoregressive language modeling tasks using a novel training procedure, referred to as a pseudo-masked language model (PMLM).

Ranked #4 on Question Generation on SQuAD1.1 (using extra training data)

Abstractive Text Summarization • Language Modelling +3

MiniLM: Deep Self-Attention Distillation for Task-Agnostic Compression of Pre-Trained Transformers

1 code implementation • NeurIPS 2020 • Wenhui Wang, Furu Wei, Li Dong, Hangbo Bao, Nan Yang, Ming Zhou

The small model (student) is trained by deeply mimicking the self-attention module of the large model (teacher), a component that plays a vital role in Transformer networks.

Zero-shot Text Search
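
A condensed sketch of the distillation signal described above (my own reading, not the released code): the student mimics the teacher's last-layer self-attention distributions and its value relations softmax(V V^T / sqrt(d_v)) via KL divergence. Matching head counts are assumed here; shapes are illustrative.

    import torch
    import torch.nn.functional as F

    def kl(teacher_probs, student_probs, eps=1e-8):
        return F.kl_div((student_probs + eps).log(), teacher_probs, reduction="batchmean")

    def value_relation(v):
        """v: (batch, heads, seq, d_v) -> (batch, heads, seq, seq) relation distributions."""
        return F.softmax(v @ v.transpose(-2, -1) / v.size(-1) ** 0.5, dim=-1)

    # Last-layer attention distributions and value vectors of teacher and student;
    # hidden sizes may differ, the number of heads is kept the same.
    t_attn = torch.softmax(torch.randn(2, 12, 16, 16), -1)
    s_attn = torch.softmax(torch.randn(2, 12, 16, 16), -1)
    t_val, s_val = torch.randn(2, 12, 16, 64), torch.randn(2, 12, 16, 32)

    loss = kl(t_attn, s_attn) + kl(value_relation(t_val), value_relation(s_val))
    print(loss.item())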

Neural Melody Composition from Lyrics

no code implementations • 12 Sep 2018 • Hangbo Bao, Shaohan Huang, Furu Wei, Lei Cui, Yu Wu, Chuanqi Tan, Songhao Piao, Ming Zhou

In this paper, we study a novel task: learning to compose music from natural language.

Neural Question Generation from Text: A Preliminary Study

6 code implementations • 6 Apr 2017 • Qingyu Zhou, Nan Yang, Furu Wei, Chuanqi Tan, Hangbo Bao, Ming Zhou

Automatic question generation aims to generate questions from a text passage where the generated questions can be answered by certain sub-spans of the given passage.

Position • Question Generation +2
