Search Results for author: Yichong Xu

Found 45 papers, 21 papers with code

Knowledge-Augmented Methods for Natural Language Processing

no code implementations · ACL 2022 · Chenguang Zhu, Yichong Xu, Xiang Ren, Bill Lin, Meng Jiang, Wenhao Yu

Knowledge in natural language processing (NLP) has been a rising trend, especially after the advent of large-scale pre-trained models.

Text Generation

Auto-Instruct: Automatic Instruction Generation and Ranking for Black-Box Language Models

no code implementations · 19 Oct 2023 · Zhihan Zhang, Shuohang Wang, Wenhao Yu, Yichong Xu, Dan Iter, Qingkai Zeng, Yang Liu, Chenguang Zhu, Meng Jiang

Large language models (LLMs) can perform a wide range of tasks by following natural language instructions, without the necessity of task-specific fine-tuning.

Sparse Modular Activation for Efficient Sequence Modeling

1 code implementation · NeurIPS 2023 · Liliang Ren, Yang Liu, Shuohang Wang, Yichong Xu, Chenguang Zhu, ChengXiang Zhai

To validate the effectiveness of SMA on sequence modeling, we design a novel neural architecture, SeqBoat, which employs SMA to sparsely activate a Gated Attention Unit (GAU) based on the state representations learned from an SSM.
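A minimal numpy sketch of the sparse-activation idea the snippet describes: a per-token gate computed from state representations decides which tokens are routed through an attention-like module, while the rest pass through unchanged. The names (`gate_proj`, `gau_like`) and the thresholding rule are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def gau_like(x):
    """Stand-in for a Gated Attention Unit: any per-token module works here."""
    return np.tanh(x) * x

def sparse_modular_activation(states, x, gate_proj, threshold=0.0):
    """Route only tokens whose gate (computed from SSM states) is positive.

    states: (T, d) state representations from an SSM
    x:      (T, d) token features entering the module
    """
    gate = states @ gate_proj            # (T,) gate logits per token
    active = gate > threshold            # boolean mask of activated tokens
    out = x.copy()                       # inactive tokens pass through unchanged
    out[active] = gau_like(x[active])    # module applied only where activated
    return out, active

T, d = 8, 4
states = rng.normal(size=(T, d))
x = rng.normal(size=(T, d))
out, active = sparse_modular_activation(states, x, gate_proj=rng.normal(size=d))
print("activated tokens:", active.sum(), "of", T)
```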

Chunking Long-range modeling

In-Context Demonstration Selection with Cross Entropy Difference

1 code implementation · 24 May 2023 · Dan Iter, Reid Pryzant, Ruochen Xu, Shuohang Wang, Yang Liu, Yichong Xu, Chenguang Zhu

Our method is based on the observation that the effectiveness of in-context demonstrations negatively correlates with the perplexity of the test example by a language model that was finetuned on that demonstration.
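A minimal sketch of that scoring rule: each candidate demonstration is scored by the test input's loss under a model finetuned on that demonstration minus its loss under the base model, and lower differences rank first. The helper `nll(model, text)` and the per-demonstration finetuned checkpoints are hypothetical stand-ins.

```python
def cross_entropy_difference(test_input, demos, base_model, finetuned_models, nll):
    """Rank demonstrations by cross-entropy difference: NLL under the
    demo-finetuned model minus NLL under the base model, evaluated on the
    test input. Lower is better.

    `nll(model, text)` is a user-supplied function returning the average
    negative log-likelihood of `text` under `model` (hypothetical helper).
    """
    base_loss = nll(base_model, test_input)
    scores = {
        demo: nll(finetuned_models[demo], test_input) - base_loss
        for demo in demos
    }
    return sorted(demos, key=scores.get)  # best (lowest-CED) demo first
```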

Language Modelling Text Generation

i-Code Studio: A Configurable and Composable Framework for Integrative AI

no code implementations · 23 May 2023 · Yuwei Fang, Mahmoud Khademi, Chenguang Zhu, ZiYi Yang, Reid Pryzant, Yichong Xu, Yao Qian, Takuya Yoshioka, Lu Yuan, Michael Zeng, Xuedong Huang

Artificial General Intelligence (AGI) requires comprehensive understanding and generation capabilities for a variety of tasks spanning different modalities and functionalities.

Question Answering Retrieval +4

InheritSumm: A General, Versatile and Compact Summarizer by Distilling from GPT

no code implementations · 22 May 2023 · Yichong Xu, Ruochen Xu, Dan Iter, Yang Liu, Shuohang Wang, Chenguang Zhu, Michael Zeng

While large models such as GPT-3 demonstrate exceptional performance in zero-shot and few-shot summarization tasks, their extensive serving and fine-tuning costs hinder their utilization in various applications.

LMGQS: A Large-scale Dataset for Query-focused Summarization

no code implementations · 22 May 2023 · Ruochen Xu, Song Wang, Yang Liu, Shuohang Wang, Yichong Xu, Dan Iter, Chenguang Zhu, Michael Zeng

We hypothesize that there is a hidden query for each summary sentence in a generic summarization annotation, and we utilize a large-scale pretrained language model to recover it.
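A minimal sketch of that recovery step, assuming `llm(prompt)` is a placeholder for any text-completion call; the prompt wording is illustrative, not the dataset's actual template.

```python
def recover_hidden_query(document, summary_sentence, llm):
    """Ask a pretrained LM to reconstruct the query a summary sentence answers.

    `llm(prompt)` is a placeholder for any text-completion call; the prompt
    wording here is illustrative, not the dataset's actual template.
    """
    prompt = (
        f"Document: {document}\n"
        f"Summary sentence: {summary_sentence}\n"
        "What question is this summary sentence answering?\n"
        "Question:"
    )
    return llm(prompt).strip()
```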

Language Modelling Query-focused Summarization +1

i-Code V2: An Autoregressive Generation Framework over Vision, Language, and Speech Data

no code implementations · 21 May 2023 · ZiYi Yang, Mahmoud Khademi, Yichong Xu, Reid Pryzant, Yuwei Fang, Chenguang Zhu, Dongdong Chen, Yao Qian, Mei Gao, Yi-Ling Chen, Robert Gmyr, Naoyuki Kanda, Noel Codella, Bin Xiao, Yu Shi, Lu Yuan, Takuya Yoshioka, Michael Zeng, Xuedong Huang

The convergence of text, visual, and audio data is a key step towards human-like artificial intelligence; however, the current Vision-Language-Speech landscape is dominated by encoder-only models, which lack generative abilities.

Decoder

Small Models are Valuable Plug-ins for Large Language Models

1 code implementation · 15 May 2023 · Canwen Xu, Yichong Xu, Shuohang Wang, Yang Liu, Chenguang Zhu, Julian McAuley

Large language models (LLMs) such as GPT-3 and GPT-4 are powerful, but their weights are often publicly unavailable and their immense sizes make them difficult to tune on commonly available hardware.

In-Context Learning

G-Eval: NLG Evaluation using GPT-4 with Better Human Alignment

2 code implementations · 29 Mar 2023 · Yang Liu, Dan Iter, Yichong Xu, Shuohang Wang, Ruochen Xu, Chenguang Zhu

In this work, we present G-Eval, a framework that uses large language models with chain-of-thought (CoT) and a form-filling paradigm to assess the quality of NLG outputs.
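A minimal sketch of the form-filling step: a CoT-style prompt asks for a 1-5 score on one criterion, and the final score is the probability-weighted average over the score tokens. Both the prompt text and the `token_probs` helper are assumptions, not the framework's exact interface.

```python
def g_eval_score(criterion, source, output, token_probs):
    """Score an NLG output on one criterion, G-Eval style.

    `token_probs(prompt, candidates)` is a placeholder returning the model's
    probability for each candidate next token (e.g. "1".."5").
    """
    prompt = (
        f"Evaluation criterion: {criterion}\n"
        f"Source: {source}\n"
        f"Output: {output}\n"
        "Let's evaluate step by step, then fill in the form.\n"
        "Score (1-5):"
    )
    probs = token_probs(prompt, [str(s) for s in range(1, 6)])
    total = sum(probs.values())
    # A probability-weighted score distinguishes outputs that would
    # otherwise tie at the same integer rating.
    return sum(int(tok) * p for tok, p in probs.items()) / total
```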

Dialogue Generation nlg evaluation +1

APOLLO: A Simple Approach for Adaptive Pretraining of Language Models for Logical Reasoning

no code implementations · 19 Dec 2022 · Soumya Sanyal, Yichong Xu, Shuohang Wang, ZiYi Yang, Reid Pryzant, Wenhao Yu, Chenguang Zhu, Xiang Ren

Logical reasoning over text is an important ability that requires understanding the information present in the text and its interconnections, and then reasoning through them to infer new conclusions.

Data Augmentation Language Modelling +3

Improving Commonsense in Vision-Language Models via Knowledge Graph Riddles

1 code implementation · CVPR 2023 · Shuquan Ye, Yujia Xie, Dongdong Chen, Yichong Xu, Lu Yuan, Chenguang Zhu, Jing Liao

Through our analysis, we find one important reason is that existing large-scale VL datasets do not contain much commonsense knowledge, which motivates us to improve the commonsense of VL-models from the data perspective.

Data Augmentation Retrieval

Task Compass: Scaling Multi-task Pre-training with Task Prefix

1 code implementation · 12 Oct 2022 · Zhuosheng Zhang, Shuohang Wang, Yichong Xu, Yuwei Fang, Wenhao Yu, Yang Liu, Hai Zhao, Chenguang Zhu, Michael Zeng

Leveraging task-aware annotated data as supervised signals to assist with self-supervised learning on large-scale unlabeled data has become a new trend in pre-training language models.
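A minimal sketch of the task-prefix idea named in the title: prepend a short task identifier to each example so one shared model can be pre-trained on a mixture of tasks. The prefix format and example strings are illustrative assumptions.

```python
def add_task_prefix(example, task_name):
    """Prepend a task identifier so a shared model can tell tasks apart.

    The bracketed prefix format is illustrative; any consistent marker works.
    """
    return f"[{task_name}] {example}"

mixture = (
    [add_task_prefix(x, "nli") for x in ["A man naps. A man sleeps."]]
    + [add_task_prefix(x, "qa") for x in ["Who wrote Hamlet?"]]
)
print(mixture)
```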

Common Sense Reasoning Data Augmentation +4

Generate rather than Retrieve: Large Language Models are Strong Context Generators

2 code implementations · 21 Sep 2022 · Wenhao Yu, Dan Iter, Shuohang Wang, Yichong Xu, Mingxuan Ju, Soumya Sanyal, Chenguang Zhu, Michael Zeng, Meng Jiang

We call our method generate-then-read (GenRead), which first prompts a large language model to generate contextual documents based on a given question, and then reads the generated documents to produce the final answer.
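A minimal sketch of that two-step recipe, assuming `llm(prompt)` stands in for any large-language-model call; the prompt wording is illustrative.

```python
def generate_then_read(question, llm, n_docs=3):
    """GenRead-style answering: generate contexts, then read them.

    `llm(prompt)` is a placeholder for any LLM completion call.
    """
    # Step 1: prompt the model to generate contextual documents.
    docs = [
        llm(f"Generate a background document to answer: {question}")
        for _ in range(n_docs)
    ]
    # Step 2: read the generated documents to produce the final answer.
    context = "\n\n".join(docs)
    return llm(
        f"Based on the documents below, answer the question.\n"
        f"{context}\n\nQuestion: {question}\nAnswer:"
    )
```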

Language Modelling Large Language Model +1

REVIVE: Regional Visual Representation Matters in Knowledge-Based Visual Question Answering

1 code implementation · 2 Jun 2022 · Yuanze Lin, Yujia Xie, Dongdong Chen, Yichong Xu, Chenguang Zhu, Lu Yuan

Specifically, we observe that in most state-of-the-art knowledge-based VQA methods: 1) visual features are extracted either from the whole image or in a sliding window manner for retrieving knowledge, and the important relationship within/among object regions is neglected; 2) visual features are not well utilized in the final answering model, which is counter-intuitive to some extent.

Question Answering Retrieval +1

Automatic Rule Induction for Interpretable Semi-Supervised Learning

1 code implementation · 18 May 2022 · Reid Pryzant, ZiYi Yang, Yichong Xu, Chenguang Zhu, Michael Zeng

Semi-supervised learning has shown promise in allowing NLP models to generalize from small amounts of labeled data.

Relation Extraction

Integrating Rankings into Quantized Scores in Peer Review

1 code implementation · 5 Apr 2022 · Yusha Liu, Yichong Xu, Nihar B. Shah, Aarti Singh

Our approach addresses the two aforementioned challenges by: (i) ensuring that rankings are incorporated into the updated scores in the same manner for all papers, thereby mitigating arbitrariness, and (ii) allowing existing interfaces and workflows designed for scores to be used seamlessly.

Decision Making

Human Parity on CommonsenseQA: Augmenting Self-Attention with External Attention

2 code implementations · 6 Dec 2021 · Yichong Xu, Chenguang Zhu, Shuohang Wang, Siqi Sun, Hao Cheng, Xiaodong Liu, Jianfeng Gao, Pengcheng He, Michael Zeng, Xuedong Huang

In particular, we focus on the task of Commonsense Reasoning, demonstrating that the proposed external attention mechanism can augment existing transformer models and significantly improve the model's reasoning capabilities.

 Ranked #1 on Common Sense Reasoning on CommonsenseQA (using extra training data)
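A minimal numpy sketch of one plausible reading of "external attention": input queries attend jointly over the input tokens and encoded external-knowledge tokens. This is an illustration of the idea, not the paper's exact architecture.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def attention_with_external(q, kv_input, kv_external):
    """Attend over input tokens and external knowledge tokens jointly.

    q: (T, d) queries; kv_input: (T, d); kv_external: (K, d)
    """
    kv = np.concatenate([kv_input, kv_external], axis=0)   # (T+K, d)
    attn = softmax(q @ kv.T / np.sqrt(q.shape[-1]))        # (T, T+K)
    return attn @ kv

rng = np.random.default_rng(1)
x = rng.normal(size=(5, 8))          # input token representations
knowledge = rng.normal(size=(3, 8))  # encoded external knowledge
print(attention_with_external(x, x, knowledge).shape)     # (5, 8)
```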

Common Sense Reasoning

Dict-BERT: Enhancing Language Model Pre-training with Dictionary

1 code implementation · Findings (ACL) 2022 · Wenhao Yu, Chenguang Zhu, Yuwei Fang, Donghan Yu, Shuohang Wang, Yichong Xu, Michael Zeng, Meng Jiang

In addition to training with the masked language modeling objective, we propose two novel self-supervised pre-training tasks on word- and sentence-level alignment between the input text sequence and rare word definitions to enhance language modeling representations with dictionary knowledge.
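A minimal sketch of the input-side idea: append dictionary definitions of rare words to the input sequence. Rare-word detection by corpus frequency, the `[SEP]` marker, and the toy dictionary are stand-ins for real resources.

```python
def append_rare_definitions(text, dictionary, vocab_freq, rare_threshold=5):
    """Append definitions of rare words to the input, Dict-BERT style.

    `dictionary` maps word -> definition and `vocab_freq` maps word ->
    corpus frequency; both are stand-ins for real resources.
    """
    words = text.lower().split()
    rare = [w for w in words
            if vocab_freq.get(w, 0) < rare_threshold and w in dictionary]
    defs = " ".join(f"{w}: {dictionary[w]}" for w in dict.fromkeys(rare))
    return f"{text} [SEP] {defs}" if defs else text

print(append_rare_definitions(
    "The sommelier decanted the wine",
    dictionary={"sommelier": "a wine steward", "decanted": "poured gently"},
    vocab_freq={"the": 1000, "wine": 120},
))
```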

Language Modelling Masked Language Modeling +1

KG-FiD: Infusing Knowledge Graph in Fusion-in-Decoder for Open-Domain Question Answering

no code implementations · ACL 2022 · Donghan Yu, Chenguang Zhu, Yuwei Fang, Wenhao Yu, Shuohang Wang, Yichong Xu, Xiang Ren, Yiming Yang, Michael Zeng

The recently proposed Fusion-in-Decoder (FiD), which is built on top of the pretrained generative model T5, achieves state-of-the-art performance in the reading module.

Answer Generation Decoder +5

DialogLM: Pre-trained Model for Long Dialogue Understanding and Summarization

1 code implementation · 6 Sep 2021 · Ming Zhong, Yang Liu, Yichong Xu, Chenguang Zhu, Michael Zeng

For a dialogue, it corrupts a window of text with dialogue-inspired noise, and guides the model to reconstruct this window based on the content of the remaining conversation.
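A minimal sketch of that window-based denoising objective: mask a window of consecutive turns and use the masked turns as the reconstruction target. The plain mask token here is a simplification; the snippet says the actual corruptions are dialogue-inspired.

```python
import random

def corrupt_window(turns, window=2, mask="<mask>", seed=0):
    """Build a (corrupted dialogue, reconstruction target) training pair by
    masking a window of consecutive turns. Plain masking stands in for the
    richer dialogue-inspired noise described above.
    """
    rng = random.Random(seed)
    start = rng.randrange(0, max(1, len(turns) - window + 1))
    target = turns[start:start + window]
    corrupted = turns[:start] + [mask] + turns[start + window:]
    return corrupted, target

dialogue = ["A: Hi!", "B: Hello.", "A: Lunch?", "B: Sure, noon works."]
print(corrupt_window(dialogue))
```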

abstractive question answering Denoising +2

Retrieval Enhanced Model for Commonsense Generation

1 code implementation · Findings (ACL) 2021 · Han Wang, Yang Liu, Chenguang Zhu, Linjun Shou, Ming Gong, Yichong Xu, Michael Zeng

Commonsense generation is a challenging task of generating a plausible sentence describing an everyday scenario using provided concepts.

Retrieval Sentence +1

Fusing Context Into Knowledge Graph for Commonsense Question Answering

2 code implementations · Findings (ACL) 2021 · Yichong Xu, Chenguang Zhu, Ruochen Xu, Yang Liu, Michael Zeng, Xuedong Huang

However, although a KG contains rich structural information, it lacks the context to provide a more precise understanding of the concepts.

Ranked #4 on Common Sense Reasoning on CommonsenseQA (using extra training data)

Common Sense Reasoning Knowledge Graphs +3

Preference-based Reinforcement Learning with Finite-Time Guarantees

no code implementations · NeurIPS 2020 · Yichong Xu, Ruosong Wang, Lin F. Yang, Aarti Singh, Artur Dubrawski

If preferences are stochastic, and the preference probability relates to the hidden reward values, we present algorithms for PbRL, both with and without a simulator, that are able to identify the best policy up to accuracy $\varepsilon$ with high probability.

reinforcement-learning Reinforcement Learning (RL)

Zeroth Order Non-convex optimization with Dueling-Choice Bandits

no code implementations · 3 Nov 2019 · Yichong Xu, Aparna Joshi, Aarti Singh, Artur Dubrawski

We consider a novel setting of zeroth order non-convex optimization, where in addition to querying the function value at a given point, we can also duel two points and get the point with the larger function value.
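A toy illustration of that query model (not the paper's algorithm): a random-search loop that proposes a nearby point and keeps whichever of the pair the duel reports as having the larger function value.

```python
import random

def duel_search(f, x0, step=0.5, iters=200, seed=0):
    """Toy zeroth-order maximization driven by duels: propose a nearby
    point and keep the duel's winner. Illustrates the dueling-choice query
    model only; the paper's algorithm and guarantees are more involved.
    """
    rng = random.Random(seed)
    x = x0
    for _ in range(iters):
        cand = x + rng.uniform(-step, step)
        x = cand if f(cand) > f(x) else x   # the duel: keep the larger-value point
    return x

print(duel_search(lambda x: -(x - 3.0) ** 2, x0=0.0))  # approaches 3.0
```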

Active Learning for Graph Neural Networks via Node Feature Propagation

no code implementations · 16 Oct 2019 · Yuexin Wu, Yichong Xu, Aarti Singh, Yiming Yang, Artur Dubrawski

Graph Neural Networks (GNNs) for prediction tasks like node classification or edge prediction have received increasing attention in recent machine learning research on graph-structured data.

Active Learning Clustering +3

Thresholding Bandit Problem with Both Duels and Pulls

no code implementations · 14 Oct 2019 · Yichong Xu, Xi Chen, Aarti Singh, Artur Dubrawski

The Thresholding Bandit Problem (TBP) aims to find the set of arms with mean rewards greater than a given threshold.
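A naive baseline sketch to make the problem concrete: pull every arm equally and keep the arms whose empirical mean exceeds the threshold. This only sets the stage; the paper's contribution is using duels in addition to pulls, which is not shown here.

```python
import random

def naive_tbp(arms, threshold, pulls_per_arm=500, seed=0):
    """Naive Thresholding Bandit baseline using pulls only.

    `arms` maps arm name -> a sampler taking a random.Random instance and
    returning one stochastic reward (a stand-in for a real environment).
    """
    rng = random.Random(seed)
    means = {
        a: sum(sample(rng) for _ in range(pulls_per_arm)) / pulls_per_arm
        for a, sample in arms.items()
    }
    return {a for a, m in means.items() if m > threshold}

arms = {"a": lambda r: r.gauss(0.3, 1), "b": lambda r: r.gauss(0.7, 1)}
print(naive_tbp(arms, threshold=0.5))  # expect {'b'}
```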

Active Learning Graph Neural Networks via Node Feature Propagation

no code implementations · 25 Sep 2019 · Yuexin Wu, Yichong Xu, Aarti Singh, Artur Dubrawski, Yiming Yang

Graph Neural Networks (GNNs) for prediction tasks like node classification or edge prediction have received increasing attention in recent machine learning research on graph-structured data.

Active Learning Node Classification +1

Multi-task Learning with Sample Re-weighting for Machine Reading Comprehension

5 code implementations · NAACL 2019 · Yichong Xu, Xiaodong Liu, Yelong Shen, Jingjing Liu, Jianfeng Gao

We propose a multi-task learning framework to learn a joint Machine Reading Comprehension (MRC) model that can be applied to a wide range of MRC tasks in different domains.

Machine Reading Comprehension Machine Translation +3

Nonparametric Regression with Comparisons: Escaping the Curse of Dimensionality with Ordinal Information

no code implementations · ICML 2018 · Yichong Xu, Hariank Muthakana, Sivaraman Balakrishnan, Aarti Singh, Artur Dubrawski

Finally, we present experiments that show the efficacy of RR and investigate its robustness to various sources of noise and model-misspecification.

regression

On Strategyproof Conference Peer Review

1 code implementation · 16 Jun 2018 · Yichong Xu, Han Zhao, Xiaofei Shi, Jeremy Zhang, Nihar B. Shah

We then empirically show that the requisite property on the authorship graph is indeed satisfied in the submission data from the ICLR conference, and further demonstrate a simple trick to make the partitioning method more practically appealing for conference peer review.

Regression with Comparisons: Escaping the Curse of Dimensionality with Ordinal Information

no code implementations · ICML 2018 · Yichong Xu, Sivaraman Balakrishnan, Aarti Singh, Artur Dubrawski

In supervised learning, we typically leverage a fully labeled dataset to design methods for function estimation or prediction.

regression

Noise-Tolerant Interactive Learning Using Pairwise Comparisons

no code implementations · NeurIPS 2017 · Yichong Xu, Hongyang Zhang, Kyle Miller, Aarti Singh, Artur Dubrawski

We study the problem of interactively learning a binary classifier using noisy labeling and pairwise comparison oracles, where the comparison oracle answers which one in the given two instances is more likely to be positive.
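A simplified, noiseless 1-D illustration of why comparisons help in this setting: sorting the points costs only comparison queries, after which a binary search spends O(log n) label queries to locate the decision boundary. The noise tolerance and guarantees that are the paper's actual contribution are omitted.

```python
import functools

def learn_threshold(points, compare, label):
    """Noiseless 1-D sketch: `compare(a, b)` returns the point more likely
    to be positive; `label(x)` returns 0/1. Sorting uses only comparisons;
    labels are then spent on a binary search for the first positive point.
    """
    order = sorted(points, key=functools.cmp_to_key(
        lambda a, b: -1 if compare(a, b) == b else 1))  # ascending positivity
    lo, hi = 0, len(order)
    while lo < hi:                     # O(log n) label queries
        mid = (lo + hi) // 2
        if label(order[mid]) == 1:
            hi = mid
        else:
            lo = mid + 1
    return order[lo] if lo < len(order) else None  # first positive point

pts = [0.9, 0.1, 0.5, 0.7, 0.3]
print(learn_threshold(pts, compare=max, label=lambda x: int(x >= 0.6)))  # 0.7
```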

Dynamic Fusion Networks for Machine Reading Comprehension

no code implementations · 14 Nov 2017 · Yichong Xu, Jingjing Liu, Jianfeng Gao, Yelong Shen, Xiaodong Liu

This paper presents a novel neural model, the Dynamic Fusion Network (DFN), for machine reading comprehension (MRC).

Machine Reading Comprehension

Noise-Tolerant Interactive Learning from Pairwise Comparisons

no code implementations · 19 Apr 2017 · Yichong Xu, Hongyang Zhang, Aarti Singh, Kyle Miller, Artur Dubrawski

We study the problem of interactively learning a binary classifier using noisy labeling and pairwise comparison oracles, where the comparison oracle answers which one in the given two instances is more likely to be positive.

Scale-Invariant Convolutional Neural Networks

no code implementations · 24 Nov 2014 · Yichong Xu, Tianjun Xiao, Jiaxing Zhang, Kuiyuan Yang, Zheng Zhang

Even though convolutional neural networks (CNNs) have achieved near-human performance in various computer vision tasks, their ability to tolerate scale variations is limited.

Data Augmentation General Classification

The Application of Two-level Attention Models in Deep Convolutional Neural Network for Fine-grained Image Classification

no code implementations · CVPR 2015 · Tianjun Xiao, Yichong Xu, Kuiyuan Yang, Jiaxing Zhang, Yuxin Peng, Zheng Zhang

Our pipeline integrates three types of attention: the bottom-up attention that proposes candidate patches, the object-level top-down attention that selects patches relevant to a certain object, and the part-level top-down attention that localizes discriminative parts.

Classification Fine-Grained Image Classification +2
