Search Results for author: Shaohan Huang

Found 40 papers, 25 papers with code

Pseudo-Label Guided Unsupervised Domain Adaptation of Contextual Embeddings

no code implementations EACL (AdaptNLP) 2021 Tianyu Chen, Shaohan Huang, Furu Wei, JianXin Li

In unsupervised domain adaptation, we aim to train a model that works well on a target domain when provided with labeled source samples and unlabeled target samples.

Language Modelling Masked Language Modeling +3

UPRISE: Universal Prompt Retrieval for Improving Zero-Shot Evaluation

1 code implementation15 Mar 2023 Daixuan Cheng, Shaohan Huang, Junyu Bi, Yuefeng Zhan, Jianfeng Liu, Yujing Wang, Hao Sun, Furu Wei, Denvy Deng, Qi Zhang

Large Language Models (LLMs) are popular for their impressive abilities, but the need for model-specific fine-tuning or task-specific prompt engineering can hinder their generalization.

Prompt Engineering Retrieval

GanLM: Encoder-Decoder Pre-training with an Auxiliary Discriminator

no code implementations20 Dec 2022 Jian Yang, Shuming Ma, Li Dong, Shaohan Huang, Haoyang Huang, Yuwei Yin, Dongdong Zhang, Liqun Yang, Zhoujun Li, Furu Wei

Inspired by the idea of Generative Adversarial Networks (GANs), we propose a GAN-style model for encoder-decoder pre-training by introducing an auxiliary discriminator, unifying the ability of language understanding and generation in a single model.

Denoising Text Generation

Beyond English-Centric Bitexts for Better Multilingual Language Representation Learning

no code implementations26 Oct 2022 Barun Patra, Saksham Singhal, Shaohan Huang, Zewen Chi, Li Dong, Furu Wei, Vishrav Chaudhary, Xia Song

In this paper, we elaborate upon recipes for building multilingual representation models that are not only competitive with existing state-of-the-art models but are also more parameter efficient, thereby promoting better adoption in resource-constrained scenarios and practical applications.

Representation Learning

MoEC: Mixture of Expert Clusters

no code implementations19 Jul 2022 Yuan Xie, Shaohan Huang, Tianyu Chen, Furu Wei

Sparsely Mixture of Experts (MoE) has received great interest due to its promising scaling capability with affordable computational overhead.

Machine Translation Natural Language Understanding

Language Models are General-Purpose Interfaces

1 code implementation13 Jun 2022 Yaru Hao, Haoyu Song, Li Dong, Shaohan Huang, Zewen Chi, Wenhui Wang, Shuming Ma, Furu Wei

Experimental results across various language-only and vision-language benchmarks show that our model outperforms or is competitive with specialized models on finetuning, zero-shot generalization, and few-shot learning.

Causal Language Modeling Few-Shot Learning +4

THE-X: Privacy-Preserving Transformer Inference with Homomorphic Encryption

no code implementations Findings (ACL) 2022 Tianyu Chen, Hangbo Bao, Shaohan Huang, Li Dong, Binxing Jiao, Daxin Jiang, Haoyi Zhou, JianXin Li, Furu Wei

As more and more pre-trained language models adopt on-cloud deployment, the privacy issues grow quickly, mainly for the exposure of plain-text user data (e. g., search history, medical record, bank account).

Privacy Preserving

Task-Specific Expert Pruning for Sparse Mixture-of-Experts

no code implementations1 Jun 2022 Tianyu Chen, Shaohan Huang, Yuan Xie, Binxing Jiao, Daxin Jiang, Haoyi Zhou, JianXin Li, Furu Wei

The sparse Mixture-of-Experts (MoE) model is powerful for large-scale pre-training and has achieved promising results due to its model capacity.

DeepNet: Scaling Transformers to 1,000 Layers

6 code implementations1 Mar 2022 Hongyu Wang, Shuming Ma, Li Dong, Shaohan Huang, Dongdong Zhang, Furu Wei

In this paper, we propose a simple yet effective method to stabilize extremely deep Transformers.


Kformer: Knowledge Injection in Transformer Feed-Forward Layers

1 code implementation15 Jan 2022 Yunzhi Yao, Shaohan Huang, Li Dong, Furu Wei, Huajun Chen, Ningyu Zhang

In this work, we propose a simple model, Kformer, which takes advantage of the knowledge stored in PTMs and external knowledge via knowledge injection in Transformer FFN layers.

Language Modelling Question Answering

Improving Non-autoregressive Generation with Mixup Training

1 code implementation21 Oct 2021 Ting Jiang, Shaohan Huang, Zihan Zhang, Deqing Wang, Fuzhen Zhuang, Furu Wei, Haizhen Huang, Liangjie Zhang, Qi Zhang

While pre-trained language models have achieved great success on various natural language understanding tasks, how to effectively leverage them into non-autoregressive generation tasks remains a challenge.

Natural Language Understanding Paraphrase Generation +2

Allocating Large Vocabulary Capacity for Cross-lingual Language Model Pre-training

1 code implementation EMNLP 2021 Bo Zheng, Li Dong, Shaohan Huang, Saksham Singhal, Wanxiang Che, Ting Liu, Xia Song, Furu Wei

We find that many languages are under-represented in recent cross-lingual language models due to the limited vocabulary capacity.

Language Modelling

DeltaLM: Encoder-Decoder Pre-training for Language Generation and Translation by Augmenting Pretrained Multilingual Encoders

1 code implementation25 Jun 2021 Shuming Ma, Li Dong, Shaohan Huang, Dongdong Zhang, Alexandre Muzio, Saksham Singhal, Hany Hassan Awadalla, Xia Song, Furu Wei

While pretrained encoders have achieved success in various natural language understanding (NLU) tasks, there is a gap between these pretrained encoders and natural language generation (NLG).

Abstractive Text Summarization Machine Translation +5

MiniLMv2: Multi-Head Self-Attention Relation Distillation for Compressing Pretrained Transformers

2 code implementations Findings (ACL) 2021 Wenhui Wang, Hangbo Bao, Shaohan Huang, Li Dong, Furu Wei

We generalize deep self-attention distillation in MiniLM (Wang et al., 2020) by only using self-attention relation distillation for task-agnostic compression of pretrained Transformers.


Unsupervised Fine-tuning for Text Clustering

no code implementations COLING 2020 Shaohan Huang, Furu Wei, Lei Cui, Xingxing Zhang, Ming Zhou

Fine-tuning with pre-trained language models (e. g. BERT) has achieved great success in many language understanding tasks in supervised settings (e. g. text classification).

text-classification Text Classification +1

Generating Commonsense Explanation by Extracting Bridge Concepts from Reasoning Paths

no code implementations Asian Chapter of the Association for Computational Linguistics 2020 Haozhe Ji, Pei Ke, Shaohan Huang, Furu Wei, Minlie Huang

Commonsense explanation generation aims to empower the machine's sense-making capability by generating plausible explanations to statements against commonsense.

Explanation Generation

Language Generation with Multi-Hop Reasoning on Commonsense Knowledge Graph

1 code implementation EMNLP 2020 Haozhe Ji, Pei Ke, Shaohan Huang, Furu Wei, Xiaoyan Zhu, Minlie Huang

Despite the success of generative pre-trained language models on a series of text generation tasks, they still suffer in cases where reasoning over underlying commonsense knowledge is required during generation.

Text Generation

DocBank: A Benchmark Dataset for Document Layout Analysis

2 code implementations COLING 2020 Minghao Li, Yiheng Xu, Lei Cui, Shaohan Huang, Furu Wei, Zhoujun Li, Ming Zhou

DocBank is constructed using a simple yet effective way with weak supervision from the \LaTeX{} documents available on the arXiv. com.

Document Layout Analysis

TableBank: Table Benchmark for Image-based Table Detection and Recognition

1 code implementation LREC 2020 Minghao Li, Lei Cui, Shaohan Huang, Furu Wei, Ming Zhou, Zhoujun Li

We present TableBank, a new image-based table detection and recognition dataset built with novel weak supervision from Word and Latex documents on the internet.

Table Detection

LayoutLM: Pre-training of Text and Layout for Document Image Understanding

13 code implementations31 Dec 2019 Yiheng Xu, Minghao Li, Lei Cui, Shaohan Huang, Furu Wei, Ming Zhou

In this paper, we propose the \textbf{LayoutLM} to jointly model interactions between text and layout information across scanned document images, which is beneficial for a great number of real-world document image understanding tasks such as information extraction from scanned documents.

Document AI Document Image Classification +2

TableBank: A Benchmark Dataset for Table Detection and Recognition

2 code implementations LREC 2020 Minghao Li, Lei Cui, Shaohan Huang, Furu Wei, Ming Zhou, Zhoujun Li

We present TableBank, a new image-based table detection and recognition dataset built with novel weak supervision from Word and Latex documents on the internet.

Table Detection

Text Morphing

no code implementations30 Sep 2018 Shaohan Huang, Yu Wu, Furu Wei, Ming Zhou

In this paper, we introduce a novel natural language generation task, termed as text morphing, which targets at generating the intermediate sentences that are fluency and smooth with the two input sentences.

Text Generation

Neural Melody Composition from Lyrics

no code implementations12 Sep 2018 Hangbo Bao, Shaohan Huang, Furu Wei, Lei Cui, Yu Wu, Chuanqi Tan, Songhao Piao, Ming Zhou

In this paper, we study a novel task that learns to compose music from natural language.

Dictionary-Guided Editing Networks for Paraphrase Generation

no code implementations21 Jun 2018 Shaohan Huang, Yu Wu, Furu Wei, Ming Zhou

An intuitive way for a human to write paraphrase sentences is to replace words or phrases in the original sentence with their corresponding synonyms and make necessary changes to ensure the new sentences are fluent and grammatically correct.

Paraphrase Generation

Response Generation by Context-aware Prototype Editing

3 code implementations19 Jun 2018 Yu Wu, Furu Wei, Shaohan Huang, Yunli Wang, Zhoujun Li, Ming Zhou

Open domain response generation has achieved remarkable progress in recent years, but sometimes yields short and uninformative responses.

Informativeness Response Generation +1

Learning to Generate Product Reviews from Attributes

no code implementations EACL 2017 Li Dong, Shaohan Huang, Furu Wei, Mirella Lapata, Ming Zhou, Ke Xu

This paper presents an attention-enhanced attribute-to-sequence model to generate product reviews for given attribute information, such as user, product, and rating.

Review Generation Sentiment Analysis +1

Cannot find the paper you are looking for? You can Submit a new open access paper.