1 code implementation • Findings (EMNLP) 2021 • Yusen Zhang, Ansong Ni, Tao Yu, Rui Zhang, Chenguang Zhu, Budhaditya Deb, Asli Celikyilmaz, Ahmed Hassan Awadallah, Dragomir Radev
Dialogue summarization helps readers capture salient information from long conversations in meetings, interviews, and TV series.
no code implementations • ACL 2022 • Chenguang Zhu, Yichong Xu, Xiang Ren, Bill Lin, Meng Jiang, Wenhao Yu
Knowledge in natural language processing (NLP) has been a rising trend especially after the advent of large scale pre-trained models.
no code implementations • EMNLP (FEVER) 2021 • Yang Liu, Chenguang Zhu, Michael Zeng
Fact verification is a challenging task of identifying the truthfulness of given claims based on the retrieval of relevant evidence texts.
no code implementations • 2 Dec 2024 • Bolin Lai, Felix Juefei-Xu, Miao Liu, Xiaoliang Dai, Nikhil Mehta, Chenguang Zhu, Zeyi Huang, James M. Rehg, Sangmin Lee, Ning Zhang, Tong Xiao
We also introduce a relation regularization method to further disentangle image transformation features from irrelevant contents in exemplar images.
no code implementations • 25 Nov 2024 • Yue Yu, Zhengxing Chen, Aston Zhang, Liang Tan, Chenguang Zhu, Richard Yuanzhe Pang, Yundi Qian, Xuewei Wang, Suchin Gururangan, Chao Zhang, Melanie Kambadur, Dhruv Mahajan, Rui Hou
Reward modeling is crucial for aligning large language models (LLMs) with human preferences, especially in reinforcement learning from human feedback (RLHF).
1 code implementation • 21 Oct 2024 • Yun He, Di Jin, Chaoqi Wang, Chloe Bi, Karishma Mandyam, Hejia Zhang, Chen Zhu, Ning li, Tengyu Xu, Hongjiang Lv, Shruti Bhosale, Chenguang Zhu, Karthik Abinav Sankararaman, Eryk Helenowski, Melanie Kambadur, Aditya Tayade, Hao Ma, Han Fang, Sinong Wang
To address this gap, we introduce Multi-IF, a new benchmark designed to assess LLMs' proficiency in following multi-turn and multilingual instructions.
1 code implementation • 30 Sep 2024 • Ming Zhong, Aston Zhang, Xuewei Wang, Rui Hou, Wenhan Xiong, Chenguang Zhu, Zhengxing Chen, Liang Tan, Chloe Bi, Mike Lewis, Sravya Popuri, Sharan Narang, Melanie Kambadur, Dhruv Mahajan, Sergey Edunov, Jiawei Han, Laurens van der Maaten
The development and evaluation of Large Language Models (LLMs) have largely focused on individual capabilities.
no code implementations • 5 Sep 2024 • Chenguang Zhu, Shan Gao, Huafeng Chen, Guangqian Guo, Chaowei Wang, Yaoxing Wang, Chen Shu Lei, Quanjiang Fan
To solve this problem, we propose a dual-branch image fusion network called Tmamba.
1 code implementation • 16 Aug 2024 • Yulong Chen, Yang Liu, Jianhao Yan, Xuefeng Bai, Ming Zhong, Yinghao Yang, ZiYi Yang, Chenguang Zhu, Yue Zhang
We then build a benchmark, SC-G4, consisting of 1,835 instances generated by GPT-4 using these patterns, with human-annotated gold responses.
no code implementations • 1 Jul 2024 • Sathish Reddy Indurthi, Wenxuan Zhou, Shamil Chollampatt, Ravi Agrawal, Kaiqiang Song, Lingxiao Zhao, Chenguang Zhu
Advancements in Large Language Models (LLMs) have significantly enhanced instruction-following capabilities.
1 code implementation • 17 Jun 2024 • Wenxuan Zhou, Ravi Agrawal, Shujian Zhang, Sathish Reddy Indurthi, Sanqiang Zhao, Kaiqiang Song, Silei Xu, Chenguang Zhu
This method not only addresses the distributional gap problem but also enhances the optimization process without incurring additional costs.
no code implementations • CVPR 2024 • Zineng Tang, ZiYi Yang, Mahmoud Khademi, Yang Liu, Chenguang Zhu, Mohit Bansal
We present CoDi-2, a Multimodal Large Language Model (MLLM) for learning in-context interleaved multimodal representations.
no code implementations • 30 Nov 2023 • Zineng Tang, ZiYi Yang, Mahmoud Khademi, Yang Liu, Chenguang Zhu, Mohit Bansal
We present CoDi-2, a versatile and interactive Multimodal Large Language Model (MLLM) that can follow complex multimodal interleaved instructions, conduct in-context learning (ICL), reason, chat, edit, etc., in an any-to-any input-output modality paradigm.
1 code implementation • 19 Oct 2023 • Siru Ouyang, Shuohang Wang, Yang Liu, Ming Zhong, Yizhu Jiao, Dan Iter, Reid Pryzant, Chenguang Zhu, Heng Ji, Jiawei Han
Recent progress in Large Language Models (LLMs) has produced models that exhibit remarkable performance across a variety of NLP tasks.
no code implementations • 19 Oct 2023 • Zhihan Zhang, Shuohang Wang, Wenhao Yu, Yichong Xu, Dan Iter, Qingkai Zeng, Yang Liu, Chenguang Zhu, Meng Jiang
Large language models (LLMs) can perform a wide range of tasks by following natural language instructions, without the necessity of task-specific fine-tuning.
no code implementations • 4 Oct 2023 • Tanmay Gautam, Reid Pryzant, ZiYi Yang, Chenguang Zhu, Somayeh Sojoudi
SCQ works like a differentiable convex optimization (DCO) layer: in the forward pass, we solve for the optimal convex combination of codebook vectors that quantize the inputs.
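As a rough illustration of the convex-combination idea (a minimal sketch, not the authors' DCO layer), the code below quantizes an input as the simplex-constrained least-squares combination of codebook vectors, solved with projected gradient steps; the random codebook, step count, and learning rate are illustrative assumptions.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of a vector onto the probability simplex."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u + (1.0 - css) / (np.arange(len(v)) + 1) > 0)[0][-1]
    theta = (1.0 - css[rho]) / (rho + 1)
    return np.maximum(v + theta, 0.0)

def soft_convex_quantize(x, codebook, steps=200, lr=0.05):
    """Weights w >= 0 with sum(w) = 1 minimizing ||codebook.T @ w - x||^2."""
    k = codebook.shape[0]
    w = np.full(k, 1.0 / k)                     # start at the simplex center
    for _ in range(steps):
        grad = codebook @ (codebook.T @ w - x)  # gradient of the quadratic objective
        w = project_to_simplex(w - lr * grad)   # projected gradient step
    return w, codebook.T @ w                    # weights and the quantized vector

codebook = np.random.randn(16, 8)               # 16 toy codewords of dimension 8
x = np.random.randn(8)
w, x_q = soft_convex_quantize(x, codebook)
print(w.sum(), np.linalg.norm(x - x_q))
```

In a trained DCO layer the solver itself would be differentiated through; this toy solver only shows the forward computation.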
1 code implementation • NeurIPS 2023 • Liliang Ren, Yang Liu, Shuohang Wang, Yichong Xu, Chenguang Zhu, ChengXiang Zhai
To validate the effectiveness of SMA on sequence modeling, we design a novel neural architecture, SeqBoat, which employs SMA to sparsely activate a Gated Attention Unit (GAU) based on the state representations learned from an SSM.
1 code implementation • 4 Jun 2023 • Banghua Zhu, Hiteshi Sharma, Felipe Vieira Frujeri, Shi Dong, Chenguang Zhu, Michael I. Jordan, Jiantao Jiao
Reinforcement learning from human feedback (RLHF) has emerged as a reliable approach to aligning large language models (LLMs) to human preferences.
1 code implementation • 24 May 2023 • Dan Iter, Reid Pryzant, Ruochen Xu, Shuohang Wang, Yang Liu, Yichong Xu, Chenguang Zhu
Our method is based on the observation that the effectiveness of in-context demonstrations negatively correlates with the perplexity of the test example by a language model that was finetuned on that demonstration.
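A minimal sketch of that scoring idea, assuming an off-the-shelf GPT-2 as a stand-in: instead of fine-tuning on each candidate demonstration as described above, the sketch simply conditions on it and ranks demonstrations by the conditional perplexity they give the test example (lower is preferred).

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def conditional_perplexity(prefix: str, target: str) -> float:
    """Perplexity of `target` given `prefix`; prefix tokens are excluded from the loss."""
    prefix_ids = tokenizer(prefix, return_tensors="pt").input_ids
    target_ids = tokenizer(target, return_tensors="pt").input_ids
    ids = torch.cat([prefix_ids, target_ids], dim=1)
    labels = ids.clone()
    labels[:, : prefix_ids.size(1)] = -100        # ignore the prefix in the loss
    with torch.no_grad():
        loss = model(ids, labels=labels).loss      # mean NLL over target tokens
    return torch.exp(loss).item()

def rank_demonstrations(demos, test_example):
    """Lower perplexity on the test example => demonstration ranked earlier."""
    return sorted(demos, key=lambda d: conditional_perplexity(d + "\n", test_example))

demos = ["Review: great movie. Sentiment: positive",
         "Review: boring plot. Sentiment: negative"]
print(rank_demonstrations(demos, "Review: loved it. Sentiment: positive")[0])
```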
1 code implementation • 23 May 2023 • Simeng Sun, Yang Liu, Shuohang Wang, Chenguang Zhu, Mohit Iyyer
PEARL outperforms zero-shot and chain-of-thought prompting on this dataset, and ablation experiments show that each stage of PEARL is critical to its performance.
no code implementations • 23 May 2023 • Yuwei Fang, Mahmoud Khademi, Chenguang Zhu, ZiYi Yang, Reid Pryzant, Yichong Xu, Yao Qian, Takuya Yoshioka, Lu Yuan, Michael Zeng, Xuedong Huang
Artificial General Intelligence (AGI) requires comprehensive understanding and generation capabilities for a variety of tasks spanning different modalities and functionalities.
no code implementations • 22 May 2023 • Ruochen Xu, Song Wang, Yang Liu, Shuohang Wang, Yichong Xu, Dan Iter, Chenguang Zhu, Michael Zeng
We hypothesize that there is a hidden query for each summary sentence in a generic summarization annotation, and we utilize a large-scale pretrained language model to recover it.
no code implementations • 22 May 2023 • Yichong Xu, Ruochen Xu, Dan Iter, Yang Liu, Shuohang Wang, Chenguang Zhu, Michael Zeng
While large models such as GPT-3 demonstrate exceptional performance in zero-shot and few-shot summarization tasks, their extensive serving and fine-tuning costs hinder their utilization in various applications.
no code implementations • 21 May 2023 • ZiYi Yang, Mahmoud Khademi, Yichong Xu, Reid Pryzant, Yuwei Fang, Chenguang Zhu, Dongdong Chen, Yao Qian, Mei Gao, Yi-Ling Chen, Robert Gmyr, Naoyuki Kanda, Noel Codella, Bin Xiao, Yu Shi, Lu Yuan, Takuya Yoshioka, Michael Zeng, Xuedong Huang
The convergence of text, visual, and audio data is a key step towards human-like artificial intelligence; however, the current Vision-Language-Speech landscape is dominated by encoder-only models, which lack generative abilities.
2 code implementations • NeurIPS 2023 • Zineng Tang, ZiYi Yang, Chenguang Zhu, Michael Zeng, Mohit Bansal
We present Composable Diffusion (CoDi), a novel generative model capable of generating any combination of output modalities, such as language, image, video, or audio, from any combination of input modalities.
Ranked #8 on Audio Generation on AudioCaps (FAD metric)
1 code implementation • 15 May 2023 • Canwen Xu, Yichong Xu, Shuohang Wang, Yang Liu, Chenguang Zhu, Julian McAuley
Large language models (LLMs) such as GPT-3 and GPT-4 are powerful, but their weights are often publicly unavailable and their immense sizes make the models difficult to tune with common hardware.
5 code implementations • 4 May 2023 • Reid Pryzant, Dan Iter, Jerry Li, Yin Tat Lee, Chenguang Zhu, Michael Zeng
Large Language Models (LLMs) have shown impressive performance as general-purpose agents, but their abilities remain highly dependent on prompts, which are hand-written with onerous trial-and-error effort.
2 code implementations • 29 Mar 2023 • Yang Liu, Dan Iter, Yichong Xu, Shuohang Wang, Ruochen Xu, Chenguang Zhu
In this work, we present G-Eval, a framework that uses large language models with chain-of-thought (CoT) prompting and a form-filling paradigm to assess the quality of NLG outputs.
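A minimal sketch of the form-filling step, assuming a hypothetical `call_llm` helper and a hand-written coherence rubric; G-Eval itself generates the evaluation steps with CoT and can weight the score by token probabilities, which this sketch omits.

```python
import re

def call_llm(prompt: str) -> str:
    """Placeholder for whatever chat/completions API is available."""
    raise NotImplementedError("wire this up to an LLM provider")

EVAL_PROMPT = """You will rate the coherence of a summary on a 1-5 scale.
Evaluation steps:
1. Read the source document and the summary.
2. Check whether the summary's sentences follow a logical order.
3. Give a single score from 1 (incoherent) to 5 (fully coherent).

Source: {source}
Summary: {summary}
Coherence score (1-5):"""

def coherence_score(source: str, summary: str) -> int:
    """Fill the evaluation form with the model's reply and parse the numeric score."""
    reply = call_llm(EVAL_PROMPT.format(source=source, summary=summary))
    match = re.search(r"[1-5]", reply)
    return int(match.group()) if match else 0
```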
no code implementations • 22 Feb 2023 • Simeng Sun, Yang Liu, Dan Iter, Chenguang Zhu, Mohit Iyyer
This motivates the use of parameter-efficient adaptation methods such as prompt tuning (PT), which adds a small number of tunable embeddings to an otherwise frozen model, and in-context learning (ICL), in which demonstrations of the task are provided to the model in natural language without any additional training.
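A minimal sketch of the prompt tuning side under these assumptions: a GPT-2 backbone, 20 soft-prompt tokens, and no task-specific head; only the prepended embeddings receive gradients, while every language model weight stays frozen.

```python
import torch
import torch.nn as nn
from transformers import AutoModelForCausalLM

class PromptTunedLM(nn.Module):
    """A frozen causal LM with a small block of trainable prompt embeddings."""
    def __init__(self, model_name: str = "gpt2", prompt_len: int = 20):
        super().__init__()
        self.lm = AutoModelForCausalLM.from_pretrained(model_name)
        for p in self.lm.parameters():               # freeze the whole backbone
            p.requires_grad = False
        dim = self.lm.get_input_embeddings().embedding_dim
        self.prompt = nn.Parameter(torch.randn(prompt_len, dim) * 0.02)

    def forward(self, input_ids):
        tok = self.lm.get_input_embeddings()(input_ids)            # (B, T, D)
        soft = self.prompt.unsqueeze(0).expand(tok.size(0), -1, -1)
        return self.lm(inputs_embeds=torch.cat([soft, tok], dim=1)).logits
```

During training, an optimizer over `model.prompt` alone (roughly 20 x 768 values for GPT-2) is all that gets updated, which is what makes the method parameter-efficient.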
no code implementations • 19 Dec 2022 • Soumya Sanyal, Yichong Xu, Shuohang Wang, ZiYi Yang, Reid Pryzant, Wenhao Yu, Chenguang Zhu, Xiang Ren
Logical reasoning over text is an important ability that requires understanding the information present in the text and its interconnections, and then reasoning through them to infer new conclusions.
4 code implementations • CVPR 2023 • Zineng Tang, ZiYi Yang, Guoxin Wang, Yuwei Fang, Yang Liu, Chenguang Zhu, Michael Zeng, Cha Zhang, Mohit Bansal
UDOP leverages the spatial correlation between textual content and document image to model image, text, and layout modalities with one uniform representation.
Ranked #5 on Visual Question Answering (VQA) on InfographicVQA (using extra training data)
1 code implementation • CVPR 2023 • Shuquan Ye, Yujia Xie, Dongdong Chen, Yichong Xu, Lu Yuan, Chenguang Zhu, Jing Liao
Through our analysis, we find one important reason is that existing large-scale VL datasets do not contain much commonsense knowledge, which motivates us to improve the commonsense of VL-models from the data perspective.
1 code implementation • 17 Nov 2022 • Yulong Chen, Yang Liu, Ruochen Xu, ZiYi Yang, Chenguang Zhu, Michael Zeng, Yue Zhang
The high annotation costs and diverse demands of various summarization tasks motivate the development of few-shot summarization.
no code implementations • 15 Nov 2022 • Ziniu Hu, Yichong Xu, Wenhao Yu, Shuohang Wang, ZiYi Yang, Chenguang Zhu, Kai-Wei Chang, Yizhou Sun
Answering open-domain questions requires world knowledge about in-context entities.
1 code implementation • 9 Nov 2022 • Yusen Zhang, Yang Liu, ZiYi Yang, Yuwei Fang, Yulong Chen, Dragomir Radev, Chenguang Zhu, Michael Zeng, Rui Zhang
We propose two simple and effective parameter-efficient approaches for the new task of mixed controllable summarization based on hard prompt tuning and soft prefix tuning.
1 code implementation • 23 Oct 2022 • Wenhao Yu, Chenguang Zhu, Zhihan Zhang, Shuohang Wang, Zhuosheng Zhang, Yuwei Fang, Meng Jiang
However, applying such methods to commonsense reasoning tasks faces two unique challenges, i.e., the lack of a general large-scale corpus for retrieval and a corresponding effective commonsense retriever.
1 code implementation • 23 Oct 2022 • Vin Sachidananda, ZiYi Yang, Chenguang Zhu
Contrastive Learning has recently achieved state-of-the-art performance in a wide range of tasks.
2 code implementations • 13 Oct 2022 • Ming Zhong, Yang Liu, Da Yin, Yuning Mao, Yizhu Jiao, PengFei Liu, Chenguang Zhu, Heng Ji, Jiawei Han
We re-frame NLG evaluation as a Boolean Question Answering (QA) task, and by guiding the model with different questions, we can use one evaluator to evaluate from multiple dimensions.
1 code implementation • 12 Oct 2022 • Zhuosheng Zhang, Shuohang Wang, Yichong Xu, Yuwei Fang, Wenhao Yu, Yang Liu, Hai Zhao, Chenguang Zhu, Michael Zeng
Leveraging task-aware annotated data as supervised signals to assist with self-supervised learning on large-scale unlabeled data has become a new trend in pre-training language models.
Ranked #1 on Sentence Completion on HellaSwag
1 code implementation • 7 Oct 2022 • Zhihan Zhang, Wenhao Yu, Chenguang Zhu, Meng Jiang
The entity knowledge is stored in the memory as latent representations, and the memory is pre-trained on Wikipedia along with encoder-decoder parameters.
no code implementations • 6 Oct 2022 • Junyi Chai, Reid Pryzant, Victor Ye Dong, Konstantin Golobokov, Chenguang Zhu, Yi Liu
Controllable text generation systems often leverage control codes to direct various properties of the output like style and length.
2 code implementations • 21 Sep 2022 • Wenhao Yu, Dan Iter, Shuohang Wang, Yichong Xu, Mingxuan Ju, Soumya Sanyal, Chenguang Zhu, Michael Zeng, Meng Jiang
We call our method generate-then-read (GenRead), which first prompts a large language model to generate contextual documents based on a given question, and then reads the generated documents to produce the final answer.
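A minimal sketch of the two-step recipe, assuming a hypothetical `call_llm` helper and made-up prompt wording; the paper additionally diversifies the generated documents (e.g., with clustering-based prompts), which this sketch skips.

```python
def call_llm(prompt: str) -> str:
    """Placeholder for any large-language-model completion call."""
    raise NotImplementedError

def generate_then_read(question: str, n_docs: int = 3) -> str:
    # Step 1 (generate): ask the LLM to write background documents
    # instead of retrieving them from an external corpus.
    docs = [call_llm(f"Generate a background document that helps answer: {question}")
            for _ in range(n_docs)]
    # Step 2 (read): answer the question conditioned on the generated documents.
    context = "\n\n".join(docs)
    return call_llm(f"Context:\n{context}\n\nQuestion: {question}\nAnswer:")
```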
1 code implementation • 21 Aug 2022 • Pengcheng He, Baolin Peng, Liyang Lu, Song Wang, Jie Mei, Yang Liu, Ruochen Xu, Hany Hassan Awadalla, Yu Shi, Chenguang Zhu, Wayne Xiong, Michael Zeng, Jianfeng Gao, Xuedong Huang
Z-Code++ creates a new state of the art on 9 out of 13 text summarization tasks across 5 languages.
1 code implementation • 2 Jun 2022 • Yuanze Lin, Yujia Xie, Dongdong Chen, Yichong Xu, Chenguang Zhu, Lu Yuan
Specifically, we observe that in most state-of-the-art knowledge-based VQA methods: 1) visual features are extracted either from the whole image or in a sliding window manner for retrieving knowledge, and the important relationship within/among object regions is neglected; 2) visual features are not well utilized in the final answering model, which is counter-intuitive to some extent.
Ranked #11 on Visual Question Answering (VQA) on OK-VQA
1 code implementation • 25 May 2022 • Yixin Liu, Ansong Ni, Linyong Nan, Budhaditya Deb, Chenguang Zhu, Ahmed H. Awadallah, Dragomir Radev
Our experimental results show that our model outperforms strong baselines with efficient attention modules, and our analysis provides further insights into our locality-aware modeling strategy.
1 code implementation • 22 May 2022 • Zhenhailong Wang, Manling Li, Ruochen Xu, Luowei Zhou, Jie Lei, Xudong Lin, Shuohang Wang, ZiYi Yang, Chenguang Zhu, Derek Hoiem, Shih-Fu Chang, Mohit Bansal, Heng Ji
The goal of this work is to build flexible video-language models that can generalize to various video-to-text tasks from few examples, such as domain-specific captioning, question answering, and future event prediction.
1 code implementation • 18 May 2022 • Reid Pryzant, ZiYi Yang, Yichong Xu, Chenguang Zhu, Michael Zeng
Semi-supervised learning has shown promise in allowing NLP models to generalize from small amounts of labeled data.
no code implementations • 3 May 2022 • ZiYi Yang, Yuwei Fang, Chenguang Zhu, Reid Pryzant, Dongdong Chen, Yu Shi, Yichong Xu, Yao Qian, Mei Gao, Yi-Ling Chen, Liyang Lu, Yujia Xie, Robert Gmyr, Noel Codella, Naoyuki Kanda, Bin Xiao, Lu Yuan, Takuya Yoshioka, Michael Zeng, Xuedong Huang
Human intelligence is multimodal; we integrate visual, linguistic, and acoustic signals to maintain a holistic worldview.
no code implementations • 13 Apr 2022 • Chenguang Zhu, Michael Zeng
Recent developments in large-scale pre-trained language models (PLMs) have significantly improved the capability of models in various NLP tasks, in terms of performance after task-specific fine-tuning and zero-shot / few-shot learning.
1 code implementation • ACL 2022 • Shuohang Wang, Yichong Xu, Yuwei Fang, Yang Liu, Siqi Sun, Ruochen Xu, Chenguang Zhu, Michael Zeng
Surprisingly, we found that REtrieving from the traINing datA (REINA) alone can lead to significant gains on multiple NLG and NLU tasks.
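A minimal sketch of the retrieve-and-concatenate idea using BM25 over a toy training set; the toy examples, the separator string, and `k` are assumptions rather than the paper's exact configuration.

```python
from rank_bm25 import BM25Okapi

# Toy "training data": (input, label) pairs the model was trained on.
train_inputs = ["a man is playing a guitar on stage", "two dogs run across a field"]
train_labels = ["man plays guitar", "dogs running"]

bm25 = BM25Okapi([t.split() for t in train_inputs])

def reina_style_augment(query: str, k: int = 1) -> str:
    """Append the top-k most similar training pairs to the model input."""
    scores = bm25.get_scores(query.split())
    top = scores.argsort()[::-1][:k]
    retrieved = " ".join(f"{train_inputs[i]} => {train_labels[i]}" for i in top)
    return f"{query} [SEP] {retrieved}"

print(reina_style_augment("a woman is playing a violin"))
```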
no code implementations • ACL 2022 • Woojeong Jin, Dong-Ho Lee, Chenguang Zhu, Jay Pujara, Xiang Ren
Pre-trained language models are still far from human performance in tasks that need understanding of properties (e.g., appearance, measurable quantity) and affordances of everyday objects in the real world since the text lacks such information due to reporting bias.
1 code implementation • NAACL (DLG4NLP) 2022 • Wenhao Yu, Chenguang Zhu, Lianhui Qin, Zhihan Zhang, Tong Zhao, Meng Jiang
A set of knowledge experts seek diverse reasoning on KG to encourage various generation outputs.
no code implementations • 18 Feb 2022 • Ripon K. Saha, Akira Ura, Sonal Mahajan, Chenguang Zhu, Linyi Li, Yang Hu, Hiroaki Yoshida, Sarfraz Khurshid, Mukul R. Prasad
In this work, we propose an AutoML technique, SapientML, that can learn from a corpus of existing datasets and their human-written pipelines, and efficiently generate a high-quality pipeline for a predictive task on a new dataset.
no code implementations • 10 Feb 2022 • Yulong Chen, Yang Liu, Li Dong, Shuohang Wang, Chenguang Zhu, Michael Zeng, Yue Zhang
However, for prompt learning, there are still two salient gaps between NLP tasks and pretraining.
2 code implementations • 29 Jan 2022 • Ming Zhong, Yang Liu, Suyu Ge, Yuning Mao, Yizhu Jiao, Xingxing Zhang, Yichong Xu, Chenguang Zhu, Michael Zeng, Jiawei Han
In this paper, we propose the first unsupervised multi-granularity summarization framework, GranuSum.
2 code implementations • CVPR 2022 • Manling Li, Ruochen Xu, Shuohang Wang, Luowei Zhou, Xudong Lin, Chenguang Zhu, Michael Zeng, Heng Ji, Shih-Fu Chang
Vision-language (V+L) pretraining models have achieved great success in supporting multimedia applications by understanding the alignments between images and text.
1 code implementation • 8 Dec 2021 • Yixin Nie, Linjie Li, Zhe Gan, Shuohang Wang, Chenguang Zhu, Michael Zeng, Zicheng Liu, Mohit Bansal, Lijuan Wang
Based on this, we ask an even bolder question: can we have an all-MLP architecture for VL modeling, where both VL fusion and the vision encoder are replaced with MLPs?
2 code implementations • 6 Dec 2021 • Yichong Xu, Chenguang Zhu, Shuohang Wang, Siqi Sun, Hao Cheng, Xiaodong Liu, Jianfeng Gao, Pengcheng He, Michael Zeng, Xuedong Huang
In particular, we focus on the task of Commonsense Reasoning, demonstrating that the proposed external attention mechanism can augment existing transformer models and significantly improve the model's reasoning capabilities.
Ranked #1 on Common Sense Reasoning on CommonsenseQA (using extra training data)
3 code implementations • CVPR 2022 • Zi-Yi Dou, Yichong Xu, Zhe Gan, JianFeng Wang, Shuohang Wang, Lijuan Wang, Chenguang Zhu, Pengchuan Zhang, Lu Yuan, Nanyun Peng, Zicheng Liu, Michael Zeng
Vision-and-language (VL) pre-training has proven to be highly effective on various VL downstream tasks.
Ranked #20 on Cross-Modal Retrieval on COCO 2014 (using extra training data)
no code implementations • 21 Oct 2021 • Baolin Peng, Chunyuan Li, Zhu Zhang, Jinchao Li, Chenguang Zhu, Jianfeng Gao
We propose SYNERGY, a hybrid learning framework where a task bot is developed in two steps: (i) Symbolic knowledge to neural networks: Large amounts of simulated dialog sessions are generated based on task-specific symbolic knowledge which is represented as a task schema consisting of dialog flows and task-oriented databases.
2 code implementations • ACL 2022 • Yusen Zhang, Ansong Ni, Ziming Mao, Chen Henry Wu, Chenguang Zhu, Budhaditya Deb, Ahmed H. Awadallah, Dragomir Radev, Rui Zhang
To the best of our knowledge, Summ$^N$ is the first multi-stage split-then-summarize framework for long input summarization.
no code implementations • Findings (ACL) 2022 • Yuwei Fang, Shuohang Wang, Yichong Xu, Ruochen Xu, Siqi Sun, Chenguang Zhu, Michael Zeng
Then we utilize a diverse set of 4 English knowledge sources to provide more comprehensive coverage of knowledge in different formats.
1 code implementation • ACL 2022 • Ziming Mao, Chen Henry Wu, Ansong Ni, Yusen Zhang, Rui Zhang, Tao Yu, Budhaditya Deb, Chenguang Zhu, Ahmed H. Awadallah, Dragomir Radev
Transformer-based models have achieved state-of-the-art performance on short-input summarization.
no code implementations • Findings (ACL) 2022 • Yang Liu, Chenguang Zhu, Michael Zeng
In this paper, we bring a new way of digesting news content by introducing the task of segmenting a news article into multiple sections and generating the corresponding summary to each section.
1 code implementation • Findings (ACL) 2022 • Wenhao Yu, Chenguang Zhu, Yuwei Fang, Donghan Yu, Shuohang Wang, Yichong Xu, Michael Zeng, Meng Jiang
In addition to training with the masked language modeling objective, we propose two novel self-supervised pre-training tasks on word- and sentence-level alignment between the input text sequence and rare word definitions to enhance the language modeling representation with dictionary knowledge.
no code implementations • ACL 2022 • Donghan Yu, Chenguang Zhu, Yuwei Fang, Wenhao Yu, Shuohang Wang, Yichong Xu, Xiang Ren, Yiming Yang, Michael Zeng
The recently proposed Fusion-in-Decoder (FiD), which is built on top of the pretrained generative model T5, achieves state-of-the-art performance in the reading module.
1 code implementation • 6 Sep 2021 • Ming Zhong, Yang Liu, Yichong Xu, Chenguang Zhu, Michael Zeng
For a dialogue, it corrupts a window of text with dialogue-inspired noise, and guides the model to reconstruct this window based on the content of the remaining conversation.
no code implementations • 1 Sep 2021 • Ruochen Xu, Yuwei Fang, Chenguang Zhu, Michael Zeng
It is often observed in knowledge-centric tasks (e.g., commonsense question answering, relation classification) that the integration of external knowledge such as entity representations into language models can help provide useful information to boost the performance.
1 code implementation • Findings (EMNLP) 2021 • Shuohang Wang, Yang Liu, Yichong Xu, Chenguang Zhu, Michael Zeng
Data annotation is a time-consuming and labor-intensive process for many NLP tasks.
1 code implementation • Findings (ACL) 2021 • Han Wang, Yang Liu, Chenguang Zhu, Linjun Shou, Ming Gong, Yichong Xu, Michael Zeng
Commonsense generation is a challenging task of generating a plausible sentence describing an everyday scenario using provided concepts.
1 code implementation • EMNLP 2021 • Wenhao Yu, Chenguang Zhu, Tong Zhao, Zhichun Guo, Meng Jiang
Generating paragraphs with diverse content is important in many applications.
1 code implementation • NAACL 2021 • Chenguang Zhu, Yang Liu, Jie Mei, Michael Zeng
We introduce MediaSum, a large-scale media interview dataset consisting of 463.6K transcripts with abstractive summaries.
no code implementations • ACL 2021 • Baolin Peng, Chunyuan Li, Zhu Zhang, Chenguang Zhu, Jinchao Li, Jianfeng Gao
For task-oriented dialog systems to be maximally useful, they must be able to process conversations in a way that is (1) generalizable with a small number of training examples for new task domains, and (2) robust to user input in various styles, modalities, or domains.
2 code implementations • Findings (ACL) 2021 • Yichong Xu, Chenguang Zhu, Ruochen Xu, Yang Liu, Michael Zeng, Xuedong Huang
However, although a KG contains rich structural information, it lacks the context to provide a more precise understanding of the concepts.
Ranked #5 on Common Sense Reasoning on CommonsenseQA (using extra training data)
1 code implementation • Asian Chapter of the Association for Computational Linguistics 2020 • Ruochen Xu, Chenguang Zhu, Yu Shi, Michael Zeng, Xuedong Huang
Cross-lingual Summarization (CLS) aims at producing a summary in the target language for an article in the source language.
3 code implementations • 9 Oct 2020 • Wenhao Yu, Chenguang Zhu, Zaitang Li, Zhiting Hu, Qingyun Wang, Heng Ji, Meng Jiang
To address this issue, researchers have considered incorporating various forms of knowledge beyond the input text into the generation models.
2 code implementations • NAACL 2021 • Yu-An Chung, Chenguang Zhu, Michael Zeng
Besides conducting a self-supervised masked language modeling task on the two individual modules using unpaired speech and text, SPLAT aligns representations from the two modules in a shared latent space using a small amount of paired speech and text.
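A minimal sketch of one way paired speech and text vectors could be pulled together in a shared latent space; this symmetric contrastive loss is a generic stand-in for illustration, not necessarily SPLAT's exact alignment objective.

```python
import torch
import torch.nn.functional as F

def alignment_loss(speech_repr: torch.Tensor, text_repr: torch.Tensor,
                   temperature: float = 0.07) -> torch.Tensor:
    """Symmetric InfoNCE loss over a batch of paired speech/text embeddings (B, D)."""
    s = F.normalize(speech_repr, dim=-1)
    t = F.normalize(text_repr, dim=-1)
    logits = s @ t.T / temperature                 # pairwise cosine similarities
    targets = torch.arange(s.size(0))              # i-th speech pairs with i-th text
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.T, targets))
```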
no code implementations • 2 Oct 2020 • Donghan Yu, Chenguang Zhu, Yiming Yang, Michael Zeng
Knowledge graphs (KGs) contain rich information about world knowledge, entities and relations.
2 code implementations • EMNLP 2021 • Xiangyu Dong, Wenhao Yu, Chenguang Zhu, Meng Jiang
Our model has a multi-step decoder that injects the entity types into the process of entity mention generation.
no code implementations • 10 Sep 2020 • Yuwei Fang, Shuohang Wang, Zhe Gan, Siqi Sun, Jingjing Liu, Chenguang Zhu
Although deep neural networks have achieved tremendous success for question answering (QA), they still suffer from heavy computational and energy costs in real product deployment.
no code implementations • 27 Jun 2020 • Beliz Gunel, Chenguang Zhu, Michael Zeng, Xuedong Huang
In this work, we propose a novel architecture that extends Transformer encoder-decoder architecture in order to improve on these shortcomings.
no code implementations • ICLR 2021 • Vin Sachidananda, ZiYi Yang, Chenguang Zhu
Due to widespread interest in machine translation and transfer learning, there are numerous algorithms for mapping multiple embeddings to a shared representation space.
no code implementations • 3 Jun 2020 • Yumo Xu, Chenguang Zhu, Baolin Peng, Michael Zeng
Dialog policy determines the next-step actions for agents and hence is central to a dialogue system.
no code implementations • SIGDIAL (ACL) 2020 • Chenguang Zhu
The natural language generation (NLG) module in a task-oriented dialogue system produces user-facing utterances conveying required information.
1 code implementation • 29 Apr 2020 • Baolin Peng, Chenguang Zhu, Michael Zeng, Jianfeng Gao
The training of spoken language understanding (SLU) models often faces the problem of data scarcity.
3 code implementations • Findings of the Association for Computational Linguistics 2020 • Chenguang Zhu, Ruochen Xu, Michael Zeng, Xuedong Huang
With the abundance of automatic meeting transcripts, meeting summarization is of great interest to both participants and other parties.
no code implementations • NAACL 2021 • Chenguang Zhu, William Hinthorn, Ruochen Xu, Qingkai Zeng, Michael Zeng, Xuedong Huang, Meng Jiang
Automatic abstractive summaries are found to often distort or fabricate facts in the article.
2 code implementations • Findings of the Association for Computational Linguistics 2020 • Baolin Peng, Chenguang Zhu, Chunyuan Li, Xiujun Li, Jinchao Li, Michael Zeng, Jianfeng Gao
It is pre-trained on a large set of annotated NLG corpus to acquire the controllable generation ability, and fine-tuned with only a few domain-specific labels to adapt to new domains.
Ranked #4 on Data-to-Text Generation on MULTIWOZ 2.1
no code implementations • Findings of the Association for Computational Linguistics 2020 • Ziyi Yang, Chenguang Zhu, Robert Gmyr, Michael Zeng, Xuedong Huang, Eric Darve
Text summarization aims to extract essential information from a piece of text and transform the text into a concise version.
no code implementations • 25 Dec 2019 • Chenguang Zhu, Ziyi Yang, Robert Gmyr, Michael Zeng, Xuedong Huang
A typical journalistic convention in news articles is to deliver the most salient information in the beginning, also known as the lead bias.
no code implementations • IJCNLP 2019 • Chenguang Zhu, Michael Zeng, Xuedong Huang
In this paper, we propose a novel multi-task learning framework, NLG-LM, for natural language generation.
no code implementations • WS 2019 • Chenguang Zhu, Michael Zeng, Xuedong Huang
In this paper, we put forward a slot-independent neural model (SIM) to track dialogue states while keeping the model complexity invariant to the number of dialogue slots.
no code implementations • 25 Sep 2019 • Chenguang Zhu, ZiYi Yang, Robert Gmyr, Michael Zeng, Xuedong Huang
For example, the pretrained model without finetuning outperforms the pointer-generator network on the CNN/DailyMail dataset.
1 code implementation • ACL 2019 • Ziyi Yang, Chenguang Zhu, Vin Sachidananda, Eric Darve
In this paper, we propose an approach for embedding imputation which uses grounded information in the form of a knowledge graph.
1 code implementation • ICLR 2019 • Ziyi Yang, Chenguang Zhu, Weizhu Chen
We model the semantic meaning of a word in a sentence based on two aspects.
6 code implementations • 10 Dec 2018 • Chenguang Zhu, Michael Zeng, Xuedong Huang
Conversational question answering (CQA) is a novel QA task that requires understanding of dialogue context.
Ranked #3 on Question Answering on CoQA (Overall metric)
1 code implementation • IJCNLP 2019 • Ziyi Yang, Chenguang Zhu, Weizhu Chen
Inspired by the Gram-Schmidt Process in geometric theory, we build an orthogonal basis of the subspace spanned by a word and its surrounding context in a sentence.
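A minimal sketch of the subspace construction via a QR decomposition (numerically equivalent to Gram-Schmidt); the `novelty_score` helper is a simplified, hypothetical rendition of measuring how much of a word lies outside the span of its context, not the paper's exact scoring.

```python
import numpy as np

def context_basis(word_vec, context_vecs):
    """Orthonormal basis of the subspace spanned by a word and its context vectors."""
    M = np.column_stack([word_vec] + list(context_vecs))   # (D, 1 + C)
    Q, _ = np.linalg.qr(M)                                  # QR ~ Gram-Schmidt
    return Q

def novelty_score(word_vec, context_vecs):
    """Fraction of the word's norm orthogonal to the span of its context."""
    Qc, _ = np.linalg.qr(np.column_stack(list(context_vecs)))
    residual = word_vec - Qc @ (Qc.T @ word_vec)            # remove context component
    return np.linalg.norm(residual) / np.linalg.norm(word_vec)

rng = np.random.default_rng(0)
word = rng.normal(size=300)
context = [rng.normal(size=300) for _ in range(5)]
print(novelty_score(word, context))
```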
3 code implementations • NAACL 2019 • Jianmo Ni, Chenguang Zhu, Weizhu Chen, Julian McAuley
In this paper we propose a retriever-reader model that learns to attend on essential terms during the question answering process.
3 code implementations • ICLR 2018 • Hsin-Yuan Huang, Chenguang Zhu, Yelong Shen, Weizhu Chen
This paper introduces a new neural structure called FusionNet, which extends existing attention approaches from three perspectives.
Ranked #26 on Question Answering on SQuAD1.1 dev