1 code implementation • 27 Feb 2023 • Shaohan Huang, Li Dong, Wenhui Wang, Yaru Hao, Saksham Singhal, Shuming Ma, Tengchao Lv, Lei Cui, Owais Khan Mohammed, Barun Patra, Qiang Liu, Kriti Aggarwal, Zewen Chi, Johan Bjorck, Vishrav Chaudhary, Subhojit Som, Xia Song, Furu Wei
A big convergence of language, multimodal perception, action, and world modeling is a key step toward artificial general intelligence.
Ranked #2 on Image Captioning on Flickr30k Captions test (CIDEr metric)
no code implementations • 19 Dec 2022 • Xiao Zhang, Heyan Huang, Zewen Chi, Xian-Ling Mao
Open-retrieval conversational machine reading comprehension (OCMRC) simulates real-life conversational interaction scenarios.
1 code implementation • 19 Dec 2022 • Yaru Hao, Zewen Chi, Li Dong, Furu Wei
Instead of laborious human engineering, we propose prompt adaptation, a general framework that automatically adapts original user input to model-preferred prompts.
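As a rough illustration of the interface such a framework exposes (the function name and modifier list below are hypothetical; the actual system learns the rewriting with a language model rather than fixed rules):

```python
# Minimal sketch of prompt adaptation: rewrite a plain user prompt into a
# model-preferred form. All names here are illustrative, not the paper's API.
def adapt_prompt(user_prompt, modifiers):
    """Append style modifiers that image generators tend to respond well to."""
    return user_prompt.rstrip(".") + ", " + ", ".join(modifiers)

print(adapt_prompt("a cat sitting on a chair",
                   ["highly detailed", "digital painting"]))
# → a cat sitting on a chair, highly detailed, digital painting
```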
1 code implementation • 23 Nov 2022 • Shuming Ma, Hongyu Wang, Shaohan Huang, Wenhui Wang, Zewen Chi, Li Dong, Alon Benhaim, Barun Patra, Vishrav Chaudhary, Xia Song, Furu Wei
Large Transformers have achieved state-of-the-art performance across many tasks.
no code implementations • 26 Oct 2022 • Barun Patra, Saksham Singhal, Shaohan Huang, Zewen Chi, Li Dong, Furu Wei, Vishrav Chaudhary, Xia Song
In this paper, we elaborate upon recipes for building multilingual representation models that are not only competitive with existing state-of-the-art models but are also more parameter efficient, thereby promoting better adoption in resource-constrained scenarios and practical applications.
1 code implementation • COLING 2022 • Xiao Zhang, Heyan Huang, Zewen Chi, Xian-Ling Mao
Conversational machine reading comprehension (CMRC) aims to assist computers in understanding a natural language text and thereafter engaging in a multi-turn conversation to answer questions related to the text.
1 code implementation • COLING 2022 • Yuxiang Nie, Heyan Huang, Zewen Chi, Xian-Ling Mao
Previous works usually make use of heuristic rules as well as pre-trained models to construct data and train QA models.
1 code implementation • 13 Jun 2022 • Yaru Hao, Haoyu Song, Li Dong, Shaohan Huang, Zewen Chi, Wenhui Wang, Shuming Ma, Furu Wei
Experimental results across various language-only and vision-language benchmarks show that our model outperforms or is competitive with specialized models on finetuning, zero-shot generalization, and few-shot learning.
Ranked #2 on Image Captioning on nocaps val
2 code implementations • 20 Apr 2022 • Zewen Chi, Li Dong, Shaohan Huang, Damai Dai, Shuming Ma, Barun Patra, Saksham Singhal, Payal Bajaj, Xia Song, Xian-Ling Mao, Heyan Huang, Furu Wei
We also present a comprehensive analysis on the representation and routing behaviors of our models.
1 code implementation • ACL 2022 • Heqi Zheng, Xiao Zhang, Zewen Chi, Heyan Huang, Tan Yan, Tian Lan, Wei Wei, Xian-Ling Mao
In this paper, we propose XPR, a cross-lingual phrase retriever that extracts phrase representations from unlabeled example sentences.
1 code implementation • 15 Oct 2021 • Yu Bai, Heyan Huang, Kai Fan, Yang Gao, Yiming Zhu, Jiaao Zhan, Zewen Chi, Boxing Chen
By introducing the compression rate, i.e., the information ratio between the source and the target text, we regard the MT task as a special CLS task with a compression rate of 100%.
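The compression-rate idea can be illustrated with a toy calculation; the token-length ratio below is a simplified proxy for the information ratio described above, and the example sentences are illustrative:

```python
def compression_rate(source_tokens, target_tokens):
    """Ratio of target length to source length, a simple proxy for the
    information ratio between source and target text."""
    return len(target_tokens) / len(source_tokens)

src = "the quick brown fox jumps over the lazy dog".split()
# Machine translation preserves roughly all content: rate close to 100%.
mt_out = "der schnelle braune Fuchs springt ueber den faulen Hund".split()
# Cross-lingual summarization compresses: rate well below 100%.
cls_out = "Fuchs springt ueber Hund".split()
print(compression_rate(src, mt_out))   # → 1.0
print(compression_rate(src, cls_out))  # → ≈ 0.44
```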
no code implementations • 23 Sep 2021 • Zewen Chi, Heyan Huang, Luyang Liu, Yu Bai, Xian-Ling Mao
The success of pretrained cross-lingual language models relies on two essential abilities, i.e., generalization ability for learning downstream tasks in a source language, and cross-lingual transferability for transferring the task knowledge to other languages.
2 code implementations • ACL 2022 • Zewen Chi, Shaohan Huang, Li Dong, Shuming Ma, Bo Zheng, Saksham Singhal, Payal Bajaj, Xia Song, Xian-Ling Mao, Heyan Huang, Furu Wei
In this paper, we introduce ELECTRA-style tasks to cross-lingual language model pre-training.
Ranked #1 on Zero-Shot Cross-Lingual Transfer on XTREME
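ELECTRA-style pre-training replaces masked language modeling with replaced token detection: some input tokens are swapped for alternatives, and the model predicts, per token, whether each one was replaced. A minimal sketch of the corruption-and-labeling step (the paper uses a small generator model to propose replacements; random vocabulary sampling stands in for it here, and all names are illustrative):

```python
import random

def replaced_token_detection(tokens, vocab, replace_prob=0.15, seed=0):
    """Corrupt a fraction of tokens and emit binary labels:
    1 = position was replaced, 0 = original token kept."""
    rng = random.Random(seed)
    corrupted, labels = [], []
    for tok in tokens:
        if rng.random() < replace_prob:
            corrupted.append(rng.choice(vocab))  # stand-in for a generator MLM
            labels.append(1)
        else:
            corrupted.append(tok)
            labels.append(0)
    return corrupted, labels
```

A discriminator is then trained to recover the labels from the corrupted sequence, which supervises every position instead of only the masked ones.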
1 code implementation • ACL 2021 • Bo Zheng, Li Dong, Shaohan Huang, Wenhui Wang, Zewen Chi, Saksham Singhal, Wanxiang Che, Ting Liu, Xia Song, Furu Wei
Fine-tuning pre-trained cross-lingual language models can transfer task-specific supervision from one language to the others.
1 code implementation • ACL 2021 • Zewen Chi, Li Dong, Bo Zheng, Shaohan Huang, Xian-Ling Mao, Heyan Huang, Furu Wei
The cross-lingual language models are typically pretrained with masked language modeling on multilingual text or parallel sentences.
1 code implementation • EMNLP 2021 • Zewen Chi, Li Dong, Shuming Ma, Shaohan Huang, Xian-Ling Mao, Heyan Huang, Furu Wei
Multilingual T5 (mT5) pretrains a sequence-to-sequence model on massive monolingual texts, which has shown promising results on many cross-lingual tasks.
1 code implementation • 2 Jan 2021 • Houjin Yu, Xian-Ling Mao, Zewen Chi, Wei Wei, Heyan Huang
Building reliable named entity recognition (NER) systems with limited annotated data has recently attracted much attention.
Ranked #3 on Named Entity Recognition (NER) on SciERC (using extra training data)
no code implementations • 31 Dec 2020 • Shuming Ma, Jian Yang, Haoyang Huang, Zewen Chi, Li Dong, Dongdong Zhang, Hany Hassan Awadalla, Alexandre Muzio, Akiko Eriguchi, Saksham Singhal, Xia Song, Arul Menezes, Furu Wei
Multilingual machine translation enables a single model to translate between different languages.
4 code implementations • NAACL 2021 • Zewen Chi, Li Dong, Furu Wei, Nan Yang, Saksham Singhal, Wenhui Wang, Xia Song, Xian-Ling Mao, He-Yan Huang, Ming Zhou
In this work, we present an information-theoretic framework that formulates cross-lingual language model pre-training as maximizing mutual information between multilingual-multi-granularity texts.
Ranked #16 on Zero-Shot Cross-Lingual Transfer on XTREME
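Such mutual-information objectives are commonly operationalized with a contrastive InfoNCE loss, which lower-bounds the mutual information between paired texts (e.g., a sentence and its translation). A minimal sketch of the per-example loss, assuming precomputed similarity scores; the values and temperature below are illustrative, not the paper's setup:

```python
import math

def info_nce(sim_pos, sims_all, temperature=0.07):
    """InfoNCE loss: -log( exp(s_pos/t) / sum_j exp(s_j/t) ).
    sims_all contains the positive pair's similarity plus negatives."""
    logits = [s / temperature for s in sims_all]
    m = max(logits)  # subtract the max for numerical stability
    log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
    return -(sim_pos / temperature - log_denom)

# Loss is small when the positive pair scores far above the negatives.
print(info_nce(0.9, [0.9, 0.1, 0.05]))
```

Minimizing this loss pulls aligned multilingual texts together in representation space while pushing apart mismatched pairs.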
no code implementations • 3 Jul 2020 • Heng-Da Xu, Xian-Ling Mao, Zewen Chi, Jing-Jing Zhu, Fanshu Sun, He-Yan Huang
Specifically, KW-Seq2Seq first uses a keywords decoder to predict some topic keywords, and then generates the final response under their guidance.
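The two-stage pipeline can be sketched as follows; the toy stand-in decoders below are illustrative, while in KW-Seq2Seq both stages are learned sequence-to-sequence components:

```python
def respond(context, keyword_decoder, response_decoder):
    # Stage 1: predict topic keywords from the dialogue context.
    keywords = keyword_decoder(context)
    # Stage 2: generate the final response conditioned on context + keywords.
    return response_decoder(context, keywords)

# Toy stand-ins for the two decoders (real ones are neural seq2seq models).
kw = lambda ctx: [w for w in ctx.split() if len(w) > 5]
gen = lambda ctx, kws: "Let's talk about " + ", ".join(kws) + "."
print(respond("planning a weekend hiking trip", kw, gen))
# → Let's talk about planning, weekend, hiking.
```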
no code implementations • AACL 2020 • Zewen Chi, Li Dong, Furu Wei, Xian-Ling Mao, He-Yan Huang
Multilingual pretrained language models (such as multilingual BERT) have achieved impressive results for cross-lingual transfer.
1 code implementation • 23 Sep 2019 • Zewen Chi, Li Dong, Furu Wei, Wenhui Wang, Xian-Ling Mao, He-Yan Huang
In this work we focus on transferring supervision signals of natural language generation (NLG) tasks between multiple languages.
1 code implementation • 13 Aug 2019 • Zewen Chi, He-Yan Huang, Heng-Da Xu, Houjin Yu, Wanxuan Yin, Xian-Ling Mao
Recognizing table structures in PDF files has also attracted considerable attention.
no code implementations • SEMEVAL 2018 • Zewen Chi, He-Yan Huang, Jiangui Chen, Hao Wu, Ran Wei
This paper presents a method for the Affect in Tweets task, which is to automatically determine the intensity of emotions and the intensity of sentiment in tweets.