Search Results for author: Zewen Chi

Found 26 papers, 17 papers with code

XLM-E: Cross-lingual Language Model Pre-training via ELECTRA

3 code implementations • ACL 2022 • Zewen Chi, Shaohan Huang, Li Dong, Shuming Ma, Bo Zheng, Saksham Singhal, Payal Bajaj, Xia Song, Xian-Ling Mao, Heyan Huang, Furu Wei

In this paper, we introduce ELECTRA-style tasks to cross-lingual language model pre-training.

Ranked #1 on Zero-Shot Cross-Lingual Transfer on XTREME

Language Modelling Translation +1

18,327

On the Representation Collapse of Sparse Mixture of Experts

2 code implementations • 20 Apr 2022 • Zewen Chi, Li Dong, Shaohan Huang, Damai Dai, Shuming Ma, Barun Patra, Saksham Singhal, Payal Bajaj, Xia Song, Xian-Ling Mao, Heyan Huang, Furu Wei

We also present a comprehensive analysis on the representation and routing behaviors of our models.

Clustering Language Modelling

18,327

Language Models are General-Purpose Interfaces

1 code implementation • 13 Jun 2022 • Yaru Hao, Haoyu Song, Li Dong, Shaohan Huang, Zewen Chi, Wenhui Wang, Shuming Ma, Furu Wei

Experimental results across various language-only and vision-language benchmarks show that our model outperforms or is competitive with specialized models on finetuning, zero-shot generalization, and few-shot learning.

Ranked #2 on Image Captioning on nocaps val

Causal Language Modeling Few-Shot Learning +6

18,327

Language Is Not All You Need: Aligning Perception with Language Models

1 code implementation • NeurIPS 2023 • Shaohan Huang, Li Dong, Wenhui Wang, Yaru Hao, Saksham Singhal, Shuming Ma, Tengchao Lv, Lei Cui, Owais Khan Mohammed, Barun Patra, Qiang Liu, Kriti Aggarwal, Zewen Chi, Johan Bjorck, Vishrav Chaudhary, Subhojit Som, Xia Song, Furu Wei

A big convergence of language, multimodal perception, action, and world modeling is a key step toward artificial general intelligence.

Image Captioning Language Modelling +4

18,327

InfoXLM: An Information-Theoretic Framework for Cross-Lingual Language Model Pre-Training

4 code implementations • NAACL 2021 • Zewen Chi, Li Dong, Furu Wei, Nan Yang, Saksham Singhal, Wenhui Wang, Xia Song, Xian-Ling Mao, He-Yan Huang, Ming Zhou

In this work, we present an information-theoretic framework that formulates cross-lingual language model pre-training as maximizing mutual information between multilingual-multi-granularity texts.

Ranked #16 on Zero-Shot Cross-Lingual Transfer on XTREME

Contrastive Learning Cross-Lingual Transfer +2

18,326

Optimizing Prompts for Text-to-Image Generation

2 code implementations • NeurIPS 2023 • Yaru Hao, Zewen Chi, Li Dong, Furu Wei

Instead of laborious human engineering, we propose prompt adaptation, a general framework that automatically adapts original user input to model-preferred prompts.

Language Modelling Prompt Engineering +2

3,180

TorchScale: Transformers at Scale

1 code implementation • 23 Nov 2022 • Shuming Ma, Hongyu Wang, Shaohan Huang, Wenhui Wang, Zewen Chi, Li Dong, Alon Benhaim, Barun Patra, Vishrav Chaudhary, Xia Song, Furu Wei

Large Transformers have achieved state-of-the-art performance across many tasks.

Language Modelling Machine Translation +1

2,918

Complicated Table Structure Recognition

1 code implementation • 13 Aug 2019 • Zewen Chi, He-Yan Huang, Heng-Da Xu, Houjin Yu, Wanxuan Yin, Xian-Ling Mao

It also attracts lots of attention to recognize the table structures in PDF files.

327

Cross-Lingual Natural Language Generation via Pre-Training

1 code implementation • 23 Sep 2019 • Zewen Chi, Li Dong, Furu Wei, Wenhui Wang, Xian-Ling Mao, He-Yan Huang

In this work we focus on transferring supervision signals of natural language generation (NLG) tasks between multiple languages.

Abstractive Text Summarization Machine Translation +5

128

mT6: Multilingual Pretrained Text-to-Text Transformer with Translation Pairs

1 code implementation • EMNLP 2021 • Zewen Chi, Li Dong, Shuming Ma, Shaohan Huang, Xian-Ling Mao, Heyan Huang, Furu Wei

Multilingual T5 (mT5) pretrains a sequence-to-sequence model on massive monolingual texts, which has shown promising results on many cross-lingual tasks.

Abstractive Text Summarization Machine Translation +7

128

Consistency Regularization for Cross-Lingual Fine-Tuning

1 code implementation • ACL 2021 • Bo Zheng, Li Dong, Shaohan Huang, Wenhui Wang, Zewen Chi, Saksham Singhal, Wanxiang Che, Ting Liu, Xia Song, Furu Wei

Fine-tuning pre-trained cross-lingual language models can transfer task-specific supervision from one language to the others.

Machine Translation Question Answering +3

Improving Pretrained Cross-Lingual Language Models via Self-Labeled Word Alignment

1 code implementation • ACL 2021 • Zewen Chi, Li Dong, Bo Zheng, Shaohan Huang, Xian-Ling Mao, Heyan Huang, Furu Wei

The cross-lingual language models are typically pretrained with masked language modeling on multilingual text or parallel sentences.

Denoising Language Modelling +4

A Robust and Domain-Adaptive Approach for Low-Resource Named Entity Recognition

1 code implementation • 2 Jan 2021 • Houjin Yu, Xian-Ling Mao, Zewen Chi, Wei Wei, Heyan Huang

Recently, it has attracted much attention to build reliable named entity recognition (NER) systems using limited annotated data.

Ranked #3 on Named Entity Recognition (NER) on SciERC (using extra training data)

Low Resource Named Entity Recognition named-entity-recognition +2

Unifying Cross-lingual Summarization and Machine Translation with Compression Rate

1 code implementation • 15 Oct 2021 • Yu Bai, Heyan Huang, Kai Fan, Yang Gao, Yiming Zhu, Jiaao Zhan, Zewen Chi, Boxing Chen

Through introducing compression rate, the information ratio between the source and the target text, we regard the MT task as a special CLS task with a compression rate of 100%.

Data Augmentation Machine Translation +1

Cross-Lingual Phrase Retrieval

1 code implementation • ACL 2022 • Heqi Zheng, Xiao Zhang, Zewen Chi, Heyan Huang, Tan Yan, Tian Lan, Wei Wei, Xian-Ling Mao

In this paper, we propose XPR, a cross-lingual phrase retriever that extracts phrase representations from unlabeled example sentences.

Retrieval Sentence

Unsupervised Question Answering via Answer Diversifying

1 code implementation • COLING 2022 • Yuxiang Nie, Heyan Huang, Zewen Chi, Xian-Ling Mao

Previous works usually make use of heuristic rules as well as pre-trained models to construct data and train QA models.

Data Augmentation Denoising +4

ET5: A Novel End-to-end Framework for Conversational Machine Reading Comprehension

1 code implementation • COLING 2022 • Xiao Zhang, Heyan Huang, Zewen Chi, Xian-Ling Mao

Conversational machine reading comprehension (CMRC) aims to assist computers to understand a natural language text and thereafter engage in a multi-turn conversation to answer questions related to the text.

Decision Making Machine Reading Comprehension

Zewen at SemEval-2018 Task 1: An Ensemble Model for Affect Prediction in Tweets

no code implementations • SEMEVAL 2018 • Zewen Chi, He-Yan Huang, Jiangui Chen, Hao Wu, Ran Wei

This paper presents a method for Affect in Tweets, which is the task to automatically determine the intensity of emotions and intensity of sentiment of tweets.

Sentence Classification Sentiment Analysis

Can Monolingual Pretrained Models Help Cross-Lingual Classification?

no code implementations • Asian Chapter of the Association for Computational Linguistics 2020 • Zewen Chi, Li Dong, Furu Wei, Xian-Ling Mao, He-Yan Huang

Multilingual pretrained language models (such as multilingual BERT) have achieved impressive results for cross-lingual transfer.

Classification Cross-Lingual Transfer +1

Generating Informative Dialogue Responses with Keywords-Guided Networks

no code implementations • 3 Jul 2020 • Heng-Da Xu, Xian-Ling Mao, Zewen Chi, Jing-Jing Zhu, Fanshu Sun, He-Yan Huang

Specifically, KW-Seq2Seq first uses a keywords decoder to predict some topic keywords, and then generates the final response under the guidance of them.

XLM-T: Scaling up Multilingual Machine Translation with Pretrained Cross-lingual Transformer Encoders

no code implementations • 31 Dec 2020 • Shuming Ma, Jian Yang, Haoyang Huang, Zewen Chi, Li Dong, Dongdong Zhang, Hany Hassan Awadalla, Alexandre Muzio, Akiko Eriguchi, Saksham Singhal, Xia Song, Arul Menezes, Furu Wei

Multilingual machine translation enables a single model to translate between different languages.

Language Modelling Machine Translation +2

Cross-Lingual Language Model Meta-Pretraining

no code implementations • 23 Sep 2021 • Zewen Chi, Heyan Huang, Luyang Liu, Yu Bai, Xian-Ling Mao

The success of pretrained cross-lingual language models relies on two essential abilities, i.e., generalization ability for learning downstream tasks in a source language, and cross-lingual transferability for transferring the task knowledge to other languages.

Cross-Lingual Transfer Language Modelling

Beyond English-Centric Bitexts for Better Multilingual Language Representation Learning

no code implementations • 26 Oct 2022 • Barun Patra, Saksham Singhal, Shaohan Huang, Zewen Chi, Li Dong, Furu Wei, Vishrav Chaudhary, Xia Song

In this paper, we elaborate upon recipes for building multilingual representation models that are not only competitive with existing state-of-the-art models but are also more parameter efficient, thereby promoting better adoption in resource-constrained scenarios and practical applications.

Representation Learning

Bridging The Gap: Entailment Fused-T5 for Open-retrieval Conversational Machine Reading Comprehension

no code implementations • 19 Dec 2022 • Xiao Zhang, Heyan Huang, Zewen Chi, Xian-Ling Mao

Open-retrieval conversational machine reading comprehension (OCMRC) simulates real-life conversational interaction scenes.

Decision Making Machine Reading Comprehension +3

Measuring Cross-Lingual Transferability of Multilingual Transformers on Sentence Classification

no code implementations • 15 May 2023 • Zewen Chi, Heyan Huang, Xian-Ling Mao

Recent studies have exhibited remarkable capabilities of pre-trained multilingual Transformers, especially cross-lingual transferability.

Cross-Lingual Transfer Sentence +1

ProtLLM: An Interleaved Protein-Language LLM with Protein-as-Word Pre-Training

no code implementations • 28 Feb 2024 • Le Zhuo, Zewen Chi, Minghao Xu, Heyan Huang, Heqi Zheng, Conghui He, Xian-Ling Mao, Wentao Zhang

We propose ProtLLM, a versatile cross-modal large language model (LLM) for both protein-centric and protein-language tasks.

In-Context Learning Language Modelling +1
