no code implementations • EMNLP (FEVER) 2021 • Yang Liu, Chenguang Zhu, Michael Zeng
Fact verification is a challenging task of identifying the truthfulness of given claims based on the retrieval of relevant evidence texts.
no code implementations • 20 Mar 2023 • Haibin Yu, Yuxuan Hu, Yao Qian, Ma Jin, Linquan Liu, Shujie Liu, Yu Shi, Yanmin Qian, Edward Lin, Michael Zeng
Code-switching speech refers to speech that mixes two or more languages within a single utterance.
Automatic Speech Recognition (ASR)
1 code implementation • 20 Mar 2023 • Zhengyuan Yang, Linjie Li, JianFeng Wang, Kevin Lin, Ehsan Azarnasab, Faisal Ahmed, Zicheng Liu, Ce Liu, Michael Zeng, Lijuan Wang
We propose MM-REACT, a system paradigm that integrates ChatGPT with a pool of vision experts to achieve multimodal reasoning and action.
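At its core, MM-REACT is a dispatch loop: the LLM is prompted so that when a visual question exceeds its own ability, it emits a textual action naming a vision expert to invoke, and the expert's output is fed back as an observation. Below is a minimal sketch of that loop; `llm` and the expert functions are hypothetical stand-ins (the actual system wires ChatGPT to real vision services such as captioning and OCR):

```python
# Sketch of an MM-REACT-style dispatch loop (illustrative, not the paper's code).
from typing import Callable, Dict

def llm(prompt: str) -> str:
    # Hypothetical stand-in for a chat LLM call (the paper uses ChatGPT).
    # This stub defers to the caption expert once, then answers.
    if "Observation:" not in prompt:
        return "ACTION: image_caption"
    return "The image shows a dog resting on a couch."

EXPERTS: Dict[str, Callable[[str], str]] = {
    "image_caption": lambda image_path: "a dog on a couch",  # stub expert
    "ocr": lambda image_path: "SALE 50% OFF",                # stub expert
}

def mm_react(question: str, image_path: str, max_turns: int = 5) -> str:
    history = f"Question: {question}\nImage: {image_path}\n"
    for _ in range(max_turns):
        reply = llm(history + "Answer directly, or request a tool "
                              "with 'ACTION: <expert_name>'.")
        if reply.startswith("ACTION:"):
            name = reply.split(":", 1)[1].strip()
            observation = EXPERTS[name](image_path)   # invoke vision expert
            history += f"{reply}\nObservation: {observation}\n"
        else:
            return reply  # the LLM produced a final answer
    return "No answer within the turn budget."

print(mm_react("What animal is in the picture?", "photo.jpg"))
```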
1 code implementation • 15 Mar 2023 • Chenda Li, Yao Qian, Zhuo Chen, Dongmei Wang, Takuya Yoshioka, Shujie Liu, Yanmin Qian, Michael Zeng
Automatic target sound extraction (TSE) is a machine learning approach to mimic the human auditory perception capability of attending to a sound source of interest from a mixture of sources.
2 code implementations • 5 Dec 2022 • Zineng Tang, ZiYi Yang, Guoxin Wang, Yuwei Fang, Yang Liu, Chenguang Zhu, Michael Zeng, Cha Zhang, Mohit Bansal
UDOP leverages the spatial correlation between textual content and document image to model image, text, and layout modalities with one uniform representation.
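One way to picture such a uniform representation: each text token is paired with its bounding box on the page, a layout embedding is added to the token embedding, and image patches join the same sequence. A minimal sketch under those assumptions (dimensions and wiring are illustrative, not UDOP's exact implementation):

```python
# Sketch of a unified text + layout + image sequence in the spirit of UDOP.
import torch
import torch.nn as nn

class UnifiedDocEmbed(nn.Module):
    def __init__(self, vocab=30000, hidden=768):
        super().__init__()
        self.tok = nn.Embedding(vocab, hidden)
        self.layout = nn.Linear(4, hidden)  # (x0, y0, x1, y1) normalized box

    def forward(self, token_ids, boxes, patch_embeds):
        text = self.tok(token_ids) + self.layout(boxes)  # fuse text + layout
        return torch.cat([text, patch_embeds], dim=1)    # one joint sequence

ids = torch.randint(30000, (2, 10))
boxes = torch.rand(2, 10, 4)
patches = torch.randn(2, 49, 768)             # pretend image patch embeddings
seq = UnifiedDocEmbed()(ids, boxes, patches)  # (2, 59, 768)
```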
no code implementations • 23 Nov 2022 • Zhengyuan Yang, JianFeng Wang, Zhe Gan, Linjie Li, Kevin Lin, Chenfei Wu, Nan Duan, Zicheng Liu, Ce Liu, Michael Zeng, Lijuan Wang
Human evaluation on PaintSkill shows that ReCo is +19.28% and +17.21% more accurate in generating images with correct object count and spatial relationship than the T2I model.
1 code implementation • 17 Nov 2022 • Yulong Chen, Yang Liu, Ruochen Xu, ZiYi Yang, Chenguang Zhu, Michael Zeng, Yue Zhang
The diverse demands of different summarization tasks and their high annotation costs are driving a need for few-shot summarization.
1 code implementation • 9 Nov 2022 • Yusen Zhang, Yang Liu, ZiYi Yang, Yuwei Fang, Yulong Chen, Dragomir Radev, Chenguang Zhu, Michael Zeng, Rui Zhang
We propose two simple and effective parameter-efficient approaches for the new task of mixed controllable summarization based on hard prompt tuning and soft prefix tuning.
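The soft prefix side of this is straightforward to sketch: a small set of trainable vectors is prepended to the input embeddings while the pretrained backbone stays frozen. The shapes and wiring below are illustrative assumptions, not the paper's exact implementation:

```python
# Minimal sketch of soft prefix tuning for controllable summarization.
import torch
import torch.nn as nn

class SoftPrefix(nn.Module):
    def __init__(self, prefix_len: int, hidden: int):
        super().__init__()
        # Only these vectors are trained; the backbone is frozen.
        self.prefix = nn.Parameter(torch.randn(prefix_len, hidden) * 0.02)

    def forward(self, input_embeds: torch.Tensor) -> torch.Tensor:
        # input_embeds: (batch, seq, hidden)
        batch = input_embeds.size(0)
        prefix = self.prefix.unsqueeze(0).expand(batch, -1, -1)
        return torch.cat([prefix, input_embeds], dim=1)

embeds = torch.randn(4, 16, 768)        # pretend backbone input embeddings
augmented = SoftPrefix(8, 768)(embeds)  # (4, 24, 768), fed to the frozen model
```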
1 code implementation • 12 Oct 2022 • Zhuosheng Zhang, Shuohang Wang, Yichong Xu, Yuwei Fang, Wenhao Yu, Yang Liu, Hai Zhao, Chenguang Zhu, Michael Zeng
Leveraging task-aware annotated data as supervised signals to assist with self-supervised learning on large-scale unlabeled data has become a new trend in pre-training language models.
1 code implementation • 21 Sep 2022 • Wenhao Yu, Dan Iter, Shuohang Wang, Yichong Xu, Mingxuan Ju, Soumya Sanyal, Chenguang Zhu, Michael Zeng, Meng Jiang
We call our method generate-then-read (GenRead), which first prompts a large language model to generate contextual documents based on a given question, and then reads the generated documents to produce the final answer.
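A minimal sketch of the two-step pipeline, with a hypothetical `llm` function standing in for the large language model call:

```python
# Sketch of generate-then-read (GenRead): step 1 asks an LLM to write
# contextual documents for the question; step 2 reads them to answer.

def llm(prompt: str) -> str:
    raise NotImplementedError  # plug in a real LLM API call here

def genread(question: str, num_docs: int = 3) -> str:
    # Step 1: generate contextual documents instead of retrieving them.
    docs = [
        llm(f"Generate a background document to answer the question:\n{question}")
        for _ in range(num_docs)
    ]
    # Step 2: read the generated documents and produce the final answer.
    context = "\n\n".join(docs)
    return llm(f"Refer to the passages below and answer the question.\n"
               f"Passages:\n{context}\n\nQuestion: {question}\nAnswer:")
```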
no code implementations • 21 Aug 2022 • Pengcheng He, Baolin Peng, Liyang Lu, Song Wang, Jie Mei, Yang Liu, Ruochen Xu, Hany Hassan Awadalla, Yu Shi, Chenguang Zhu, Wayne Xiong, Michael Zeng, Jianfeng Gao, Xuedong Huang
Z-Code++ creates new state of the art on 9 out of 13 text summarization tasks across 5 languages.
no code implementations • 3 Jun 2022 • Yujia Xie, Luowei Zhou, Xiyang Dai, Lu Yuan, Nguyen Bach, Ce Liu, Michael Zeng
Thanks to the strong zero-shot capability of foundation models, we start by constructing a rich semantic representation of the image (e.g., image tags, object attributes / locations, captions) as a structured textual prompt, called visual clues, using a vision foundation model.
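A sketch of how such visual clues might be serialized into a prompt; the field names and layout are illustrative assumptions, not the paper's exact format:

```python
# Sketch of assembling "visual clues" (tags, object boxes, a caption) into a
# structured textual prompt that a language model can condition on.
def build_visual_clues(tags, objects, caption):
    obj_lines = "\n".join(
        f"- {name} at (x={x}, y={y}, w={w}, h={h})"
        for name, (x, y, w, h) in objects
    )
    return (f"Image tags: {', '.join(tags)}\n"
            f"Objects:\n{obj_lines}\n"
            f"Caption: {caption}\n"
            f"Describe the image in detail:")

prompt = build_visual_clues(
    tags=["dog", "couch", "indoor"],
    objects=[("dog", (40, 60, 120, 90)), ("couch", (0, 50, 300, 150))],
    caption="a dog resting on a couch",
)
print(prompt)
```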
1 code implementation • 18 May 2022 • Reid Pryzant, ZiYi Yang, Yichong Xu, Chenguang Zhu, Michael Zeng
Semi-supervised learning has shown promise in allowing NLP models to generalize from small amounts of labeled data.
no code implementations • 3 May 2022 • ZiYi Yang, Yuwei Fang, Chenguang Zhu, Reid Pryzant, Dongdong Chen, Yu Shi, Yichong Xu, Yao Qian, Mei Gao, Yi-Ling Chen, Liyang Lu, Yujia Xie, Robert Gmyr, Noel Codella, Naoyuki Kanda, Bin Xiao, Lu Yuan, Takuya Yoshioka, Michael Zeng, Xuedong Huang
Human intelligence is multimodal; we integrate visual, linguistic, and acoustic signals to maintain a holistic worldview.
no code implementations • 13 Apr 2022 • Chenguang Zhu, Michael Zeng
Recent developments in large-scale pre-trained language models (PLMs) have significantly improved model capabilities across various NLP tasks, in terms of both performance after task-specific fine-tuning and zero-shot / few-shot learning.
1 code implementation • ACL 2022 • Shuohang Wang, Yichong Xu, Yuwei Fang, Yang Liu, Siqi Sun, Ruochen Xu, Chenguang Zhu, Michael Zeng
Surprisingly, we found that REtrieving from the traINing datA (REINA) alone can lead to significant gains on multiple NLG and NLU tasks.
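The recipe is simple: for each input, retrieve the most similar training examples and concatenate them (with their labels) to the input before feeding the model. A minimal sketch, assuming BM25 retrieval via the `rank_bm25` package (the paper's exact retrieval setup may differ):

```python
# Sketch of REINA-style retrieval from the training data itself.
from rank_bm25 import BM25Okapi

train_inputs = ["the movie was wonderful", "a dull and tedious film"]
train_labels = ["positive", "negative"]

bm25 = BM25Okapi([x.split() for x in train_inputs])

def augment(query: str, k: int = 2) -> str:
    scores = bm25.get_scores(query.split())
    top = sorted(range(len(scores)), key=lambda i: -scores[i])[:k]
    retrieved = " ".join(f"{train_inputs[i]} => {train_labels[i]}" for i in top)
    return f"{query} [SEP] {retrieved}"  # the model consumes the augmented input

print(augment("what a wonderful film"))
```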
no code implementations • 10 Feb 2022 • Yulong Chen, Yang Liu, Li Dong, Shuohang Wang, Chenguang Zhu, Michael Zeng, Yue Zhang
However, for prompt learning, there are still two salient gaps between NLP tasks and pretraining.
1 code implementation • 29 Jan 2022 • Ming Zhong, Yang Liu, Suyu Ge, Yuning Mao, Yizhu Jiao, Xingxing Zhang, Yichong Xu, Chenguang Zhu, Michael Zeng, Jiawei Han
In this paper, we propose the first unsupervised multi-granularity summarization framework, GranuSum.
1 code implementation • CVPR 2022 • Manling Li, Ruochen Xu, Shuohang Wang, Luowei Zhou, Xudong Lin, Chenguang Zhu, Michael Zeng, Heng Ji, Shih-Fu Chang
Vision-language (V+L) pretraining models have achieved great success in supporting multimedia applications by understanding the alignments between images and text.
no code implementations • 10 Dec 2021 • Kenichi Kumatani, Dimitrios Dimitriadis, Yashesh Gaur, Robert Gmyr, Sefik Emre Eskimez, Jinyu Li, Michael Zeng
For untranscribed speech data, the hypothesis from an ASR system must be used as a label.
Automatic Speech Recognition (ASR)
1 code implementation • 8 Dec 2021 • Yixin Nie, Linjie Li, Zhe Gan, Shuohang Wang, Chenguang Zhu, Michael Zeng, Zicheng Liu, Mohit Bansal, Lijuan Wang
Based on this, we ask an even bolder question: can we have an all-MLP architecture for VL modeling, where both VL fusion and the vision encoder are replaced with MLPs?
2 code implementations • 6 Dec 2021 • Yichong Xu, Chenguang Zhu, Shuohang Wang, Siqi Sun, Hao Cheng, Xiaodong Liu, Jianfeng Gao, Pengcheng He, Michael Zeng, Xuedong Huang
In particular, we focus on the task of Commonsense Reasoning, demonstrating that the proposed external attention mechanism can augment existing transformer models and significantly improve the model's reasoning capabilities.
Ranked #2 on Common Sense Reasoning on CommonsenseQA (using extra training data)
1 code implementation • 22 Nov 2021 • Lu Yuan, Dongdong Chen, Yi-Ling Chen, Noel Codella, Xiyang Dai, Jianfeng Gao, Houdong Hu, Xuedong Huang, Boxin Li, Chunyuan Li, Ce Liu, Mengchen Liu, Zicheng Liu, Yumao Lu, Yu Shi, Lijuan Wang, JianFeng Wang, Bin Xiao, Zhen Xiao, Jianwei Yang, Michael Zeng, Luowei Zhou, Pengchuan Zhang
Computer vision foundation models, which are trained on diverse, large-scale datasets and can be adapted to a wide range of downstream tasks, are critical for this mission to solve real-world computer vision applications.
Ranked #1 on Action Recognition In Videos on Kinetics-600
1 code implementation • CVPR 2022 • Zi-Yi Dou, Yichong Xu, Zhe Gan, JianFeng Wang, Shuohang Wang, Lijuan Wang, Chenguang Zhu, Pengchuan Zhang, Lu Yuan, Nanyun Peng, Zicheng Liu, Michael Zeng
Vision-and-language (VL) pre-training has proven to be highly effective on various VL downstream tasks.
Ranked #15 on Cross-Modal Retrieval on COCO 2014
4 code implementations • 26 Oct 2021 • Sanyuan Chen, Chengyi Wang, Zhengyang Chen, Yu Wu, Shujie Liu, Zhuo Chen, Jinyu Li, Naoyuki Kanda, Takuya Yoshioka, Xiong Xiao, Long Zhou, Shuo Ren, Yanmin Qian, Yao Qian, Jian Wu, Michael Zeng, Xiangzhan Yu, Furu Wei
Self-supervised learning (SSL) achieves great success in speech recognition, while limited exploration has been attempted for other speech processing tasks.
no code implementations • Findings (ACL) 2022 • Yuwei Fang, Shuohang Wang, Yichong Xu, Ruochen Xu, Siqi Sun, Chenguang Zhu, Michael Zeng
Then we utilize a diverse set of 4 English knowledge sources to provide more comprehensive coverage of knowledge in different formats.
no code implementations • Findings (ACL) 2022 • Yang Liu, Chenguang Zhu, Michael Zeng
In this paper, we introduce a new way of digesting news content: the task of segmenting a news article into multiple sections and generating a corresponding summary for each section.
1 code implementation • Findings (ACL) 2022 • Wenhao Yu, Chenguang Zhu, Yuwei Fang, Donghan Yu, Shuohang Wang, Yichong Xu, Michael Zeng, Meng Jiang
In addition to training with the masked language modeling objective, we propose two novel self-supervised pre-training tasks on word- and sentence-level alignment between the input text sequence and rare word definitions to enhance language modeling representations with dictionary knowledge.
no code implementations • ACL 2022 • Donghan Yu, Chenguang Zhu, Yuwei Fang, Wenhao Yu, Shuohang Wang, Yichong Xu, Xiang Ren, Yiming Yang, Michael Zeng
The recently proposed Fusion-in-Decoder (FiD), which is built on top of the pretrained generative model T5, achieves state-of-the-art performance in the reading module.
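FiD's key move is to encode each (question, passage) pair independently and let a single decoder cross-attend over all of the concatenated encoder states. A minimal sketch with generic encoder/decoder stubs standing in for the pretrained T5 backbone:

```python
# Sketch of the Fusion-in-Decoder (FiD) pattern with stock PyTorch modules.
import torch
import torch.nn as nn

hidden = 64
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(hidden, 4, batch_first=True), 2)
decoder = nn.TransformerDecoder(
    nn.TransformerDecoderLayer(hidden, 4, batch_first=True), 2)

def fid_forward(passage_embeds, target_embeds):
    # passage_embeds: (num_passages, seq, hidden), one row per passage.
    encoded = [encoder(p.unsqueeze(0)) for p in passage_embeds]  # encode separately
    memory = torch.cat(encoded, dim=1)     # fuse: (1, num_passages*seq, hidden)
    return decoder(target_embeds, memory)  # decoder reads all passages at once

out = fid_forward(torch.randn(3, 20, hidden), torch.randn(1, 10, hidden))
```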
1 code implementation • 6 Sep 2021 • Ming Zhong, Yang Liu, Yichong Xu, Chenguang Zhu, Michael Zeng
For a dialogue, it corrupts a window of text with dialogue-inspired noise, and guides the model to reconstruct this window based on the content of the remaining conversation.
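A minimal sketch of this window-based corruption, with two illustrative noise choices (speaker masking and turn shuffling; the paper uses several such dialogue-inspired perturbations):

```python
# Sketch of window-based denoising for dialogue pre-training: corrupt a
# window of consecutive turns and ask the model to reconstruct the original
# window from the rest of the conversation.
import random

def corrupt_window(turns, window=3, mask="<mask>"):
    start = random.randrange(0, max(1, len(turns) - window))
    target = turns[start:start + window]
    noisy = [(mask, text) for _, text in target]   # mask speaker names
    random.shuffle(noisy)                          # shuffle turn order
    corrupted = turns[:start] + noisy + turns[start + window:]
    return corrupted, target  # model input, reconstruction target

dialogue = [("Alice", "Hi!"), ("Bob", "Hello."), ("Alice", "Ready?"),
            ("Bob", "Yes."), ("Alice", "Great, let's start.")]
print(corrupt_window(dialogue))
```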
no code implementations • 1 Sep 2021 • Ruochen Xu, Yuwei Fang, Chenguang Zhu, Michael Zeng
It is often observed in knowledge-centric tasks (e.g., commonsense question answering, relation classification) that integrating external knowledge such as entity representations into language models can provide useful information to boost performance.
1 code implementation • Findings (EMNLP) 2021 • Shuohang Wang, Yang Liu, Yichong Xu, Chenguang Zhu, Michael Zeng
Data annotation is a time-consuming and labor-intensive process for many NLP tasks.
no code implementations • 25 Jul 2021 • Linhao Zhang, Yu Shi, Linjun Shou, Ming Gong, Houfeng Wang, Michael Zeng
In this paper, we attempt to bridge these two lines of research and propose a joint and domain adaptive approach to SLU.
1 code implementation • Findings (ACL) 2021 • Han Wang, Yang Liu, Chenguang Zhu, Linjun Shou, Ming Gong, Yichong Xu, Michael Zeng
Commonsense generation is a challenging task of generating a plausible sentence describing an everyday scenario using provided concepts.
1 code implementation • NAACL 2021 • Chenguang Zhu, Yang Liu, Jie Mei, Michael Zeng
We introduce MediaSum, a large-scale media interview dataset consisting of 463.6K transcripts with abstractive summaries.
no code implementations • 22 Feb 2021 • Junwei Liao, Yu Shi, Ming Gong, Linjun Shou, Sefik Eskimez, Liyang Lu, Hong Qu, Michael Zeng
Many downstream tasks and human readers rely on the output of the ASR system; therefore, errors introduced by the speaker and ASR system alike will be propagated to the next task in the pipeline.
Automatic Speech Recognition (ASR)
no code implementations • 12 Feb 2021 • Junwei Liao, Yu Shi, Ming Gong, Linjun Shou, Hong Qu, Michael Zeng
However, the performance of using multiple encoders and decoders on zero-shot translation still lags behind universal NMT.
no code implementations • 11 Feb 2021 • Yao Qian, Ximo Bian, Yu Shi, Naoyuki Kanda, Leo Shen, Zhen Xiao, Michael Zeng
End-to-end (E2E) spoken language understanding (SLU) can infer semantics directly from speech signal without cascading an automatic speech recognizer (ASR) with a natural language understanding (NLU) module.
Ranked #2 on Spoken Language Understanding on Fluent Speech Commands (using extra training data)
3 code implementations • 19 Jan 2021 • Chengyi Wang, Yu Wu, Yao Qian, Kenichi Kumatani, Shujie Liu, Furu Wei, Michael Zeng, Xuedong Huang
In this paper, we propose a unified pre-training approach called UniSpeech to learn speech representations with both unlabeled and labeled data, in which supervised phonetic CTC learning and phonetically-aware contrastive self-supervised learning are conducted in a multi-task learning manner.
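The multi-task objective can be pictured as a weighted sum of a supervised CTC loss on labeled data and an InfoNCE-style contrastive loss on unlabeled data. A sketch of just the loss combination, using stock PyTorch ops; the real model's encoders and quantizers are omitted and the shapes are illustrative:

```python
# Sketch of a UniSpeech-style multi-task loss: supervised phonetic CTC
# plus contrastive self-supervised learning, mixed with a weight alpha.
import torch
import torch.nn.functional as F

def unispeech_loss(log_probs, targets, in_lens, tgt_lens,
                   anchors, positives, alpha=0.5, temp=0.1):
    # Supervised branch: CTC over phonetic targets (labeled speech).
    ctc = F.ctc_loss(log_probs, targets, in_lens, tgt_lens)
    # Self-supervised branch: each anchor frame should match its own
    # quantized positive among all candidates (InfoNCE).
    logits = anchors @ positives.t() / temp  # (N, N) similarity matrix
    labels = torch.arange(anchors.size(0))
    contrastive = F.cross_entropy(logits, labels)
    return alpha * ctc + (1 - alpha) * contrastive

loss = unispeech_loss(
    torch.randn(50, 2, 32).log_softmax(-1),      # (T, batch, classes)
    torch.randint(1, 32, (2, 10)),               # phonetic targets
    torch.full((2,), 50), torch.full((2,), 10),  # input / target lengths
    torch.randn(16, 64), torch.randn(16, 64),    # anchor / positive frames
)
```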
2 code implementations • Findings (ACL) 2021 • Yichong Xu, Chenguang Zhu, Ruochen Xu, Yang Liu, Michael Zeng, Xuedong Huang
However, although a KG contains rich structural information, it lacks the context to provide a more precise understanding of the concepts.
Ranked #3 on Common Sense Reasoning on CommonsenseQA (using extra training data)
no code implementations • 21 Oct 2020 • Xie Chen, Sarangarajan Parthasarathy, William Gale, Shuangyu Chang, Michael Zeng
The context information is captured by the hidden states of LSTM-LMs across utterances and can be used to guide the first-pass search effectively.
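The mechanism is simply not resetting the recurrent state at utterance boundaries: the final (h, c) of one utterance seeds the next. A sketch under that assumption (model sizes and the toy inputs are illustrative):

```python
# Sketch of carrying LSTM-LM hidden state across utterances in a session.
import torch
import torch.nn as nn

vocab, hidden = 1000, 256
embed = nn.Embedding(vocab, hidden)
lstm = nn.LSTM(hidden, hidden, batch_first=True)

state = None  # persists across utterances instead of being reset
for utterance in [torch.randint(vocab, (1, 12)), torch.randint(vocab, (1, 8))]:
    output, state = lstm(embed(utterance), state)  # reuse the previous state
    # `output` would feed the softmax layer that scores ASR hypotheses
```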
1 code implementation • Asian Chapter of the Association for Computational Linguistics 2020 • Ruochen Xu, Chenguang Zhu, Yu Shi, Michael Zeng, Xuedong Huang
Cross-lingual Summarization (CLS) aims at producing a summary in the target language for an article in the source language.
1 code implementation • NAACL 2021 • Yu-An Chung, Chenguang Zhu, Michael Zeng
Besides conducting a self-supervised masked language modeling task on the two individual modules using unpaired speech and text, SPLAT aligns representations from the two modules in a shared latent space using a small amount of paired speech and text.
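One simple form such an alignment can take: project both modules' representations into a shared latent space and pull paired speech/text examples together there. A minimal sketch with an L2 alignment loss; the projections and the exact objective are illustrative assumptions:

```python
# Sketch of SPLAT-style cross-modal alignment in a shared latent space.
import torch
import torch.nn as nn

class SharedSpaceAligner(nn.Module):
    def __init__(self, speech_dim=512, text_dim=768, shared_dim=256):
        super().__init__()
        self.speech_proj = nn.Linear(speech_dim, shared_dim)
        self.text_proj = nn.Linear(text_dim, shared_dim)

    def forward(self, speech_repr, text_repr):
        s = self.speech_proj(speech_repr)  # (batch, shared_dim)
        t = self.text_proj(text_repr)      # (batch, shared_dim)
        return ((s - t) ** 2).sum(dim=-1).mean()  # pull paired examples together

loss = SharedSpaceAligner()(torch.randn(8, 512), torch.randn(8, 768))
```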
no code implementations • 2 Oct 2020 • Donghan Yu, Chenguang Zhu, Yiming Yang, Michael Zeng
Knowledge graphs (KGs) contain rich information about world knowledge, entities and relations.
no code implementations • 27 Jun 2020 • Beliz Gunel, Chenguang Zhu, Michael Zeng, Xuedong Huang
In this work, we propose a novel architecture that extends Transformer encoder-decoder architecture in order to improve on these shortcomings.
no code implementations • 3 Jun 2020 • Yumo Xu, Chenguang Zhu, Baolin Peng, Michael Zeng
Dialog policy determines the next-step actions for agents and hence is central to a dialogue system.
no code implementations • 29 Apr 2020 • Baolin Peng, Chenguang Zhu, Michael Zeng, Jianfeng Gao
The training of spoken language understanding (SLU) models often faces the problem of data scarcity.
no code implementations • 9 Apr 2020 • Junwei Liao, Sefik Emre Eskimez, Liyang Lu, Yu Shi, Ming Gong, Linjun Shou, Hong Qu, Michael Zeng
In this work, we propose a novel NLP task called ASR post-processing for readability (APR) that aims to transform the noisy ASR output into a readable text for humans and downstream tasks while maintaining the semantic meaning of the speaker.
Automatic Speech Recognition (ASR)
3 code implementations • Findings of the Association for Computational Linguistics 2020 • Chenguang Zhu, Ruochen Xu, Michael Zeng, Xuedong Huang
With the abundance of automatic meeting transcripts, meeting summarization is of great interest to both participants and other parties.
no code implementations • NAACL 2021 • Chenguang Zhu, William Hinthorn, Ruochen Xu, Qingkai Zeng, Michael Zeng, Xuedong Huang, Meng Jiang
Automatic abstractive summaries are found to often distort or fabricate facts in the article.
1 code implementation • Findings of the Association for Computational Linguistics 2020 • Baolin Peng, Chenguang Zhu, Chunyuan Li, Xiujun Li, Jinchao Li, Michael Zeng, Jianfeng Gao
It is pre-trained on a large set of annotated NLG corpus to acquire the controllable generation ability, and fine-tuned with only a few domain-specific labels to adapt to new domains.
Ranked #4 on Data-to-Text Generation on MULTIWOZ 2.1
no code implementations • Findings of the Association for Computational Linguistics 2020 • Ziyi Yang, Chenguang Zhu, Robert Gmyr, Michael Zeng, Xuedong Huang, Eric Darve
Text summarization aims to extract essential information from a piece of text and transform the text into a concise version.
no code implementations • 25 Dec 2019 • Chenguang Zhu, Ziyi Yang, Robert Gmyr, Michael Zeng, Xuedong Huang
A typical journalistic convention in news articles is to deliver the most salient information in the beginning, also known as the lead bias.
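Lead bias makes pretraining data nearly free to construct: the first few sentences of an article serve as a pseudo-summary target and the remainder as the source. A sketch under that assumption (the 3-sentence split and the naive sentence splitter are illustrative choices):

```python
# Sketch of lead-bias pseudo-pair construction for summarization pretraining.
def make_pseudo_pair(article: str, lead_n: int = 3):
    sentences = [s.strip() + "." for s in article.split(".") if s.strip()]
    if len(sentences) <= lead_n:
        return None  # too short to form a useful pair
    summary = " ".join(sentences[:lead_n])  # lead sentences = pseudo target
    source = " ".join(sentences[lead_n:])   # article body = pseudo input
    return source, summary

pair = make_pseudo_pair(
    "First key fact. Second key fact. Third key fact. Body detail. More detail.")
print(pair)
```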
no code implementations • IJCNLP 2019 • Chenguang Zhu, Michael Zeng, Xuedong Huang
In this paper, we propose a novel multi-task learning framework, NLG-LM, for natural language generation.
no code implementations • WS 2019 • Chenguang Zhu, Michael Zeng, Xuedong Huang
In this paper, we put forward a slot-independent neural model (SIM) to track dialogue states while keeping the model complexity invariant to the number of dialogue slots.
no code implementations • 25 Sep 2019 • Chenguang Zhu, ZiYi Yang, Robert Gmyr, Michael Zeng, Xuedong Huang
For example, the pretrained model without finetuning outperforms pointer-generator network on CNN/DailyMail dataset.
no code implementations • 3 May 2019 • Takuya Yoshioka, Zhuo Chen, Dimitrios Dimitriadis, William Hinthorn, Xuedong Huang, Andreas Stolcke, Michael Zeng
The speaker-attributed WER (SAWER) is 26.7%.
3 code implementations • 10 Dec 2018 • Chenguang Zhu, Michael Zeng, Xuedong Huang
Conversational question answering (CQA) is a novel QA task that requires understanding of dialogue context.
Ranked #3 on Question Answering on CoQA (Overall metric)