no code implementations • ACL 2022 • Chenguang Zhu, Yichong Xu, Xiang Ren, Bill Yuchen Lin, Meng Jiang, Wenhao Yu
Knowledge in natural language processing (NLP) has been a rising trend, especially after the advent of large-scale pre-trained models.
no code implementations • 24 May 2023 • Dan Iter, Reid Pryzant, Ruochen Xu, Shuohang Wang, Yang Liu, Yichong Xu, Chenguang Zhu
Our method is based on the observation that the effectiveness of in-context demonstrations negatively correlates with the perplexity of the test example by a language model that was finetuned on that demonstration.
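As a concrete sketch of this scoring rule: compute the test example's perplexity under a language model finetuned on each candidate demonstration and keep the lowest-perplexity candidates. The `demo_models` mapping and checkpoint paths below are hypothetical stand-ins for the finetuning step.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def perplexity(model, tokenizer, text):
    """Perplexity of `text` under a causal language model."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss  # mean token cross-entropy
    return torch.exp(loss).item()

def select_demonstrations(demo_models, tokenizer, test_example, k=4):
    """Rank candidate demonstrations by the perplexity that the model
    finetuned on each one assigns to the test example; lower perplexity
    correlates with a more effective demonstration."""
    scored = sorted(demo_models.items(),
                    key=lambda kv: perplexity(kv[1], tokenizer, test_example))
    return [demo for demo, _ in scored[:k]]

# Hypothetical wiring: one finetuned checkpoint per candidate demonstration.
# tokenizer = AutoTokenizer.from_pretrained("gpt2")
# demo_models = {demo: AutoModelForCausalLM.from_pretrained(path)
#                for demo, path in finetuned_checkpoints.items()}
```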
no code implementations • 23 May 2023 • Yuwei Fang, Mahmoud Khademi, Chenguang Zhu, ZiYi Yang, Reid Pryzant, Yichong Xu, Yao Qian, Takuya Yoshioka, Lu Yuan, Michael Zeng, Xuedong Huang
Artificial General Intelligence (AGI) requires comprehensive understanding and generation capabilities for a variety of tasks spanning different modalities and functionalities.
no code implementations • 22 May 2023 • Ruochen Xu, Song Wang, Yang Liu, Shuohang Wang, Yichong Xu, Dan Iter, Chenguang Zhu, Michael Zeng
We hypothesize that there is a hidden query for each summary sentence in a generic summarization annotation, and we utilize a large-scale pretrained language model to recover it.
no code implementations • 22 May 2023 • Yichong Xu, Ruochen Xu, Dan Iter, Yang Liu, Shuohang Wang, Chenguang Zhu, Michael Zeng
While large models such as GPT-3 demonstrate exceptional performance in zero-shot and few-shot summarization tasks, their extensive serving and fine-tuning costs hinder their use in various applications.
no code implementations • 21 May 2023 • ZiYi Yang, Mahmoud Khademi, Yichong Xu, Reid Pryzant, Yuwei Fang, Chenguang Zhu, Dongdong Chen, Yao Qian, Mei Gao, Yi-Ling Chen, Robert Gmyr, Naoyuki Kanda, Noel Codella, Bin Xiao, Yu Shi, Lu Yuan, Takuya Yoshioka, Michael Zeng, Xuedong Huang
The convergence of text, visual, and audio data is a key step towards human-like artificial intelligence; however, the current Vision-Language-Speech landscape is dominated by encoder-only models, which lack generative abilities.
1 code implementation • 15 May 2023 • Canwen Xu, Yichong Xu, Shuohang Wang, Yang Liu, Chenguang Zhu, Julian McAuley
Large language models (LLMs) such as GPT-3 and GPT-4 are powerful, but their weights are often publicly unavailable and their immense sizes make them difficult to tune on common hardware.
no code implementations • 29 Mar 2023 • Yang Liu, Dan Iter, Yichong Xu, Shuohang Wang, Ruochen Xu, Chenguang Zhu
In this work, we present G-Eval, a framework that uses large language models with chain-of-thought (CoT) and a form-filling paradigm to assess the quality of NLG outputs.
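A minimal sketch of a G-Eval-style scorer follows; the prompt wording is illustrative, and `get_score_probs` is a hypothetical wrapper around an LLM call that returns the probability the model assigns to each score token.

```python
PROMPT = """You will be given a source document and a summary.
Evaluation steps (chain of thought):
1. Read the source document carefully.
2. Check whether the summary is well-structured and consistent with the source.
3. Assign a coherence score from 1 (lowest) to 5 (highest).

Source: {source}
Summary: {summary}
Coherence score (1-5):"""

def g_eval_score(get_score_probs, source, summary):
    """Form-filling evaluation: the LLM fills in the score field, and the
    final rating is the probability-weighted average over score tokens
    rather than a single sampled value."""
    probs = get_score_probs(PROMPT.format(source=source, summary=summary))
    return sum(int(score) * p for score, p in probs.items())
```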
no code implementations • 19 Dec 2022 • Soumya Sanyal, Yichong Xu, Shuohang Wang, ZiYi Yang, Reid Pryzant, Wenhao Yu, Chenguang Zhu, Xiang Ren
Logical reasoning over text is an important ability that requires understanding the information present in the text and its interconnections, and reasoning over them to infer new conclusions.
1 code implementation • CVPR 2023 • Shuquan Ye, Yujia Xie, Dongdong Chen, Yichong Xu, Lu Yuan, Chenguang Zhu, Jing Liao
Through our analysis, we find one important reason is that existing large-scale VL datasets do not contain much commonsense knowledge, which motivates us to improve the commonsense of VL-models from the data perspective.
no code implementations • 15 Nov 2022 • Ziniu Hu, Yichong Xu, Wenhao Yu, Shuohang Wang, ZiYi Yang, Chenguang Zhu, Kai-Wei Chang, Yizhou Sun
Answering open-domain questions requires world knowledge about in-context entities.
1 code implementation • 12 Oct 2022 • Zhuosheng Zhang, Shuohang Wang, Yichong Xu, Yuwei Fang, Wenhao Yu, Yang Liu, Hai Zhao, Chenguang Zhu, Michael Zeng
Leveraging task-aware annotated data as supervised signals to assist with self-supervised learning on large-scale unlabeled data has become a new trend in pre-training language models.
1 code implementation • 21 Sep 2022 • Wenhao Yu, Dan Iter, Shuohang Wang, Yichong Xu, Mingxuan Ju, Soumya Sanyal, Chenguang Zhu, Michael Zeng, Meng Jiang
We call our method generate-then-read (GenRead), which first prompts a large language model to generate contextual documents based on a given question, and then reads the generated documents to produce the final answer.
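A minimal sketch of the generate-then-read pattern, assuming a hypothetical `generate` wrapper around an LLM completion call; the prompts are illustrative, not the paper's.

```python
def generate_then_read(generate, question, n_docs=3):
    """Two-step answering: (1) generate contextual documents from the
    question alone, (2) read them to produce the final answer."""
    docs = [generate(f"Generate a background document to answer "
                     f"the following question: {question}")
            for _ in range(n_docs)]
    context = "\n\n".join(docs)
    return generate(f"Refer to the passages below and answer the question.\n"
                    f"Passages: {context}\nQuestion: {question}\nAnswer:")
```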
1 code implementation • 2 Jun 2022 • Yuanze Lin, Yujia Xie, Dongdong Chen, Yichong Xu, Chenguang Zhu, Lu Yuan
Specifically, we observe that in most state-of-the-art knowledge-based VQA methods: 1) visual features are extracted either from the whole image or in a sliding-window manner for retrieving knowledge, and the important relationships within and among object regions are neglected; 2) visual features are not well utilized in the final answering model, which is somewhat counter-intuitive.
Ranked #4 on Visual Question Answering (VQA) on OK-VQA
1 code implementation • 18 May 2022 • Reid Pryzant, ZiYi Yang, Yichong Xu, Chenguang Zhu, Michael Zeng
Semi-supervised learning has shown promise in allowing NLP models to generalize from small amounts of labeled data.
no code implementations • 3 May 2022 • ZiYi Yang, Yuwei Fang, Chenguang Zhu, Reid Pryzant, Dongdong Chen, Yu Shi, Yichong Xu, Yao Qian, Mei Gao, Yi-Ling Chen, Liyang Lu, Yujia Xie, Robert Gmyr, Noel Codella, Naoyuki Kanda, Bin Xiao, Lu Yuan, Takuya Yoshioka, Michael Zeng, Xuedong Huang
Human intelligence is multimodal; we integrate visual, linguistic, and acoustic signals to maintain a holistic worldview.
1 code implementation • 5 Apr 2022 • Yusha Liu, Yichong Xu, Nihar B. Shah, Aarti Singh
Our approach addresses the two aforementioned challenges by: (i) ensuring that rankings are incorporated into the updated scores in the same manner for all papers, thereby mitigating arbitrariness, and (ii) allowing seamless use of existing interfaces and workflows designed for scores.
1 code implementation • ACL 2022 • Shuohang Wang, Yichong Xu, Yuwei Fang, Yang Liu, Siqi Sun, Ruochen Xu, Chenguang Zhu, Michael Zeng
Surprisingly, we found that REtrieving from the traINing datA (REINA) alone can lead to significant gains on multiple NLG and NLU tasks.
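A minimal sketch of REINA-style augmentation using BM25 over the training set; the retrieval scoring and the concatenation format below are illustrative assumptions.

```python
from rank_bm25 import BM25Okapi  # pip install rank-bm25

def build_index(train_inputs):
    """BM25 index over tokenized training inputs."""
    return BM25Okapi([x.lower().split() for x in train_inputs])

def augment_with_training_data(bm25, train_pairs, query, k=2):
    """Concatenate the query with its top-k retrieved (input, output)
    training pairs before feeding it to the downstream model."""
    scores = bm25.get_scores(query.lower().split())
    top = scores.argsort()[::-1][:k]
    retrieved = " ".join(f"{train_pairs[i][0]} {train_pairs[i][1]}" for i in top)
    return f"{query} [SEP] {retrieved}"
```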
1 code implementation • 29 Jan 2022 • Ming Zhong, Yang Liu, Suyu Ge, Yuning Mao, Yizhu Jiao, Xingxing Zhang, Yichong Xu, Chenguang Zhu, Michael Zeng, Jiawei Han
In this paper, we propose the first unsupervised multi-granularity summarization framework, GranuSum.
2 code implementations • 6 Dec 2021 • Yichong Xu, Chenguang Zhu, Shuohang Wang, Siqi Sun, Hao Cheng, Xiaodong Liu, Jianfeng Gao, Pengcheng He, Michael Zeng, Xuedong Huang
In particular, we focus on the task of Commonsense Reasoning, demonstrating that the proposed external attention mechanism can augment existing transformer models and significantly improve the model's reasoning capabilities.
Ranked #2 on Common Sense Reasoning on CommonsenseQA (using extra training data)
1 code implementation • CVPR 2022 • Zi-Yi Dou, Yichong Xu, Zhe Gan, JianFeng Wang, Shuohang Wang, Lijuan Wang, Chenguang Zhu, Pengchuan Zhang, Lu Yuan, Nanyun Peng, Zicheng Liu, Michael Zeng
Vision-and-language (VL) pre-training has proven to be highly effective on various VL downstream tasks.
Ranked #17 on Cross-Modal Retrieval on COCO 2014
no code implementations • Findings (ACL) 2022 • Yuwei Fang, Shuohang Wang, Yichong Xu, Ruochen Xu, Siqi Sun, Chenguang Zhu, Michael Zeng
Then we utilize a diverse set of 4 English knowledge sources to provide more comprehensive coverage of knowledge in different formats.
1 code implementation • Findings (ACL) 2022 • Wenhao Yu, Chenguang Zhu, Yuwei Fang, Donghan Yu, Shuohang Wang, Yichong Xu, Michael Zeng, Meng Jiang
In addition to training with the masked language modeling objective, we propose two novel self-supervised pre-training tasks on word- and sentence-level alignment between the input text sequence and rare word definitions to enhance the language modeling representation with dictionary knowledge.
no code implementations • ACL 2022 • Donghan Yu, Chenguang Zhu, Yuwei Fang, Wenhao Yu, Shuohang Wang, Yichong Xu, Xiang Ren, Yiming Yang, Michael Zeng
The recently proposed Fusion-in-Decoder (FiD), which is built on top of the pretrained generative model T5, achieves state-of-the-art performance in the reading module.
1 code implementation • 6 Sep 2021 • Ming Zhong, Yang Liu, Yichong Xu, Chenguang Zhu, Michael Zeng
For a dialogue, it corrupts a window of text with dialogue-inspired noise, and guides the model to reconstruct this window based on the content of the remaining conversation.
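A toy version of the window-corruption step is sketched below; the paper combines several dialogue-inspired noise types (e.g., speaker masking, turn permutation), so the single mask-token noise here is a simplifying assumption.

```python
import random

def corrupt_window(turns, window=3, mask_token="<mask>", seed=0):
    """Replace a window of consecutive dialogue turns with mask noise;
    the model is trained to reconstruct the window from the remaining
    conversation."""
    rng = random.Random(seed)
    window = min(window, len(turns))
    start = rng.randrange(0, len(turns) - window + 1)
    corrupted = turns[:start] + [mask_token] * window + turns[start + window:]
    target = turns[start:start + window]
    return corrupted, target
```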
1 code implementation • Findings (EMNLP) 2021 • Shuohang Wang, Yang Liu, Yichong Xu, Chenguang Zhu, Michael Zeng
Data annotation is a time-consuming and labor-intensive process for many NLP tasks.
1 code implementation • Findings (ACL) 2021 • Han Wang, Yang Liu, Chenguang Zhu, Linjun Shou, Ming Gong, Yichong Xu, Michael Zeng
Commonsense generation is a challenging task of generating a plausible sentence describing an everyday scenario using provided concepts.
2 code implementations • Findings (ACL) 2021 • Yichong Xu, Chenguang Zhu, Ruochen Xu, Yang Liu, Michael Zeng, Xuedong Huang
However, although a KG contains rich structural information, it lacks the context to provide a more precise understanding of the concepts.
Ranked #3 on Common Sense Reasoning on CommonsenseQA (using extra training data)
no code implementations • NeurIPS 2020 • Yichong Xu, Ruosong Wang, Lin F. Yang, Aarti Singh, Artur Dubrawski
If preferences are stochastic, and the preference probability relates to the hidden reward values, we present algorithms for PbRL, both with and without a simulator, that are able to identify the best policy up to accuracy $\varepsilon$ with high probability.
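For concreteness, a common way to relate preference probabilities to hidden rewards is a Bradley-Terry-style logistic link; this is one illustrative instantiation, not the paper's general assumption.

```latex
% Logistic (Bradley-Terry) link: the probability that trajectory \tau_1
% is preferred over \tau_2 grows with the gap in their hidden rewards.
P(\tau_1 \succ \tau_2) \;=\; \sigma\bigl(r(\tau_1) - r(\tau_2)\bigr),
\qquad \sigma(z) = \frac{1}{1 + e^{-z}}
```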
no code implementations • 3 Nov 2019 • Yichong Xu, Aparna Joshi, Aarti Singh, Artur Dubrawski
We consider a novel setting of zeroth order non-convex optimization, where in addition to querying the function value at a given point, we can also duel two points and get the point with the larger function value.
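To illustrate the oracle model, here is a toy random-search maximizer that consumes only duel outcomes; it is not the algorithm analyzed in the paper, and the noiseless `duel` simulation is an assumption.

```python
import numpy as np

def duel(f, x, y):
    """Comparison oracle: return whichever point has the larger function
    value (simulated via direct evaluation; in the paper's setting only
    the, possibly noisy, comparison outcome is observable)."""
    return x if f(x) > f(y) else y

def comparison_random_search(f, x0, sigma=0.1, iters=500, seed=0):
    """Hill-climb using duels only: propose a Gaussian perturbation and
    keep whichever of the two points the oracle prefers."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        candidate = x + sigma * rng.standard_normal(x.shape)
        x = duel(f, x, candidate)
    return x
```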
no code implementations • 16 Oct 2019 • Yuexin Wu, Yichong Xu, Aarti Singh, Yiming Yang, Artur Dubrawski
Graph Neural Networks (GNNs) for prediction tasks like node classification or edge prediction have received increasing attention in recent machine learning research on graph-structured data.
no code implementations • 14 Oct 2019 • Yichong Xu, Xi Chen, Aarti Singh, Artur Dubrawski
The Thresholding Bandit Problem (TBP) aims to find the set of arms with mean rewards greater than a given threshold.
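For background, a standard baseline for the TBP is the APT algorithm of Locatelli et al. (2016), sketched below; it is shown for illustration and is not the algorithm proposed in this paper.

```python
import numpy as np

def apt(pull, n_arms, tau, eps, budget):
    """APT: repeatedly pull the arm whose above/below-threshold status is
    least certain, then return the arms whose empirical means exceed tau."""
    counts = np.zeros(n_arms)
    sums = np.zeros(n_arms)
    for i in range(n_arms):          # initialize with one pull per arm
        sums[i] += pull(i)
        counts[i] += 1
    for _ in range(budget - n_arms):
        means = sums / counts
        gap = np.sqrt(counts) * (np.abs(means - tau) + eps)
        i = int(np.argmin(gap))      # least-confident arm
        sums[i] += pull(i)
        counts[i] += 1
    return np.flatnonzero(sums / counts >= tau)
```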
no code implementations • 25 Sep 2019 • Yuexin Wu, Yichong Xu, Aarti Singh, Artur Dubrawski, Yiming Yang
Graph Neural Networks (GNNs) for prediction tasks like node classification or edge prediction have received increasing attention in recent machine learning research on graph-structured data.
no code implementations • WS 2019 • Yichong Xu, Xiaodong Liu, Chunyuan Li, Hoifung Poon, Jianfeng Gao
We use a multi-source transfer learning approach to transfer the knowledge from MT-DNN and SciBERT to natural language understanding tasks in the medical domain.
5 code implementations • NAACL 2019 • Yichong Xu, Xiaodong Liu, Yelong Shen, Jingjing Liu, Jianfeng Gao
We propose a multi-task learning framework to learn a joint Machine Reading Comprehension (MRC) model that can be applied to a wide range of MRC tasks in different domains.
no code implementations • ICML 2018 • Yichong Xu, Hariank Muthakana, Sivaraman Balakrishnan, Aarti Singh, Artur Dubrawski
Finally, we present experiments that show the efficacy of RR and investigate its robustness to various sources of noise and model-misspecification.
1 code implementation • 16 Jun 2018 • Yichong Xu, Han Zhao, Xiaofei Shi, Jeremy Zhang, Nihar B. Shah
We then empirically show that the requisite property on the authorship graph is indeed satisfied in the submission data from the ICLR conference, and further demonstrate a simple trick to make the partitioning method more practically appealing for conference peer review.
no code implementations • ICML 2018 • Yichong Xu, Sivaraman Balakrishnan, Aarti Singh, Artur Dubrawski
In supervised learning, we typically leverage a fully labeled dataset to design methods for function estimation or prediction.
no code implementations • NeurIPS 2017 • Yichong Xu, Hongyang Zhang, Kyle Miller, Aarti Singh, Artur Dubrawski
We study the problem of interactively learning a binary classifier using noisy labeling and pairwise comparison oracles, where the comparison oracle answers which of two given instances is more likely to be positive.
no code implementations • 14 Nov 2017 • Yichong Xu, Jingjing Liu, Jianfeng Gao, Yelong Shen, Xiaodong Liu
This paper presents a novel neural model, the Dynamic Fusion Network (DFN), for machine reading comprehension (MRC).
no code implementations • 19 Apr 2017 • Yichong Xu, Hongyang Zhang, Aarti Singh, Kyle Miller, Artur Dubrawski
We study the problem of interactively learning a binary classifier using noisy labeling and pairwise comparison oracles, where the comparison oracle answers which of two given instances is more likely to be positive.
no code implementations • 24 Nov 2014 • Yichong Xu, Tianjun Xiao, Jiaxing Zhang, Kuiyuan Yang, Zheng Zhang
Even though convolutional neural networks (CNNs) have achieved near-human performance on various computer vision tasks, their ability to tolerate scale variation is limited.
no code implementations • CVPR 2015 • Tianjun Xiao, Yichong Xu, Kuiyuan Yang, Jiaxing Zhang, Yuxin Peng, Zheng Zhang
Our pipeline integrates three types of attention: the bottom-up attention that proposes candidate patches, the object-level top-down attention that selects patches relevant to a certain object, and the part-level top-down attention that localizes discriminative parts.