Search Results for author: Shuohang Wang

Found 43 papers, 25 papers with code

Language Models with Image Descriptors are Strong Few-Shot Video-Language Learners

1 code implementation22 May 2022 Zhenhailong Wang, Manling Li, Ruochen Xu, Luowei Zhou, Jie Lei, Xudong Lin, Shuohang Wang, ZiYi Yang, Chenguang Zhu, Derek Hoiem, Shih-Fu Chang, Mohit Bansal, Heng Ji

The goal of this work is to build flexible video-language models that can generalize to various video-to-text tasks from few examples, such as domain-specific captioning, question answering, and future event prediction.

Automatic Speech Recognition Question Answering +2

CLIP-Event: Connecting Text and Images with Event Structures

1 code implementation CVPR 2022 Manling Li, Ruochen Xu, Shuohang Wang, Luowei Zhou, Xudong Lin, Chenguang Zhu, Michael Zeng, Heng Ji, Shih-Fu Chang

Vision-language (V+L) pretraining models have achieved great success in supporting multimedia applications by understanding the alignments between images and text.

Contrastive Learning Event Extraction +1

MLP Architectures for Vision-and-Language Modeling: An Empirical Study

1 code implementation8 Dec 2021 Yixin Nie, Linjie Li, Zhe Gan, Shuohang Wang, Chenguang Zhu, Michael Zeng, Zicheng Liu, Mohit Bansal, Lijuan Wang

Based on this, we ask an even bolder question: can we have an all-MLP architecture for VL modeling, where both VL fusion and the vision encoder are replaced with MLPs?

Language Modelling Visual Question Answering +1

Human Parity on CommonsenseQA: Augmenting Self-Attention with External Attention

2 code implementations6 Dec 2021 Yichong Xu, Chenguang Zhu, Shuohang Wang, Siqi Sun, Hao Cheng, Xiaodong Liu, Jianfeng Gao, Pengcheng He, Michael Zeng, Xuedong Huang

In particular, we focus on the task of Commonsense Reasoning, demonstrating that the proposed external attention mechanism can augment existing transformer models and significantly improve the model's reasoning capabilities.

 Ranked #1 on Common Sense Reasoning on CommonsenseQA (using extra training data)

Common Sense Reasoning

Adversarial GLUE: A Multi-Task Benchmark for Robustness Evaluation of Language Models

no code implementations4 Nov 2021 Boxin Wang, Chejian Xu, Shuohang Wang, Zhe Gan, Yu Cheng, Jianfeng Gao, Ahmed Hassan Awadallah, Bo Li

In this paper, we present Adversarial GLUE (AdvGLUE), a new multi-task benchmark to quantitatively and thoroughly explore and evaluate the vulnerabilities of modern large-scale language models under various types of adversarial attacks.

Adversarial Attack Adversarial Robustness +1

Dict-BERT: Enhancing Language Model Pre-training with Dictionary

no code implementations Findings (ACL) 2022 Wenhao Yu, Chenguang Zhu, Yuwei Fang, Donghan Yu, Shuohang Wang, Yichong Xu, Michael Zeng, Meng Jiang

In addition to training with the masked language modeling objective, we propose two novel self-supervised pre-training tasks on word and sentence-level alignment between input text sequence and rare word definitions to enhance language modeling representation with dictionary.

Language Modelling Masked Language Modeling

KG-FiD: Infusing Knowledge Graph in Fusion-in-Decoder for Open-Domain Question Answering

no code implementations ACL 2022 Donghan Yu, Chenguang Zhu, Yuwei Fang, Wenhao Yu, Shuohang Wang, Yichong Xu, Xiang Ren, Yiming Yang, Michael Zeng

The recent proposed Fusion-in-Decoder (FiD), which is built on top of the pretrained generative model T5, achieves the state-of-the-art performance in the reading module.

Answer Generation Open-Domain Question Answering +1

NOAHQA: Numerical Reasoning with Interpretable Graph Question Answering Dataset

1 code implementation Findings (EMNLP) 2021 Qiyuan Zhang, Lei Wang, Sicheng Yu, Shuohang Wang, Yang Wang, Jing Jiang, Ee-Peng Lim

While diverse question answering (QA) datasets have been proposed and contributed significantly to the development of deep learning models for QA tasks, the existing datasets fall short in two aspects.

Graph Question Answering Question Answering

The Elastic Lottery Ticket Hypothesis

1 code implementation NeurIPS 2021 Xiaohan Chen, Yu Cheng, Shuohang Wang, Zhe Gan, Jingjing Liu, Zhangyang Wang

Based on these results, we articulate the Elastic Lottery Ticket Hypothesis (E-LTH): by mindfully replicating (or dropping) and re-ordering layers for one network, its corresponding winning ticket could be stretched (or squeezed) into a subnetwork for another deeper (or shallower) network from the same family, whose performance is nearly the same competitive as the latter's winning ticket directly found by IMP.

Adversarial Masking: Towards Understanding Robustness Trade-off for Generalization

no code implementations1 Jan 2021 Minhao Cheng, Zhe Gan, Yu Cheng, Shuohang Wang, Cho-Jui Hsieh, Jingjing Liu

By incorporating different feature maps after the masking, we can distill better features to help model generalization.

EarlyBERT: Efficient BERT Training via Early-bird Lottery Tickets

1 code implementation ACL 2021 Xiaohan Chen, Yu Cheng, Shuohang Wang, Zhe Gan, Zhangyang Wang, Jingjing Liu

Heavily overparameterized language models such as BERT, XLNet and T5 have achieved impressive success in many NLP tasks.

Model Compression

Counterfactual Variable Control for Robust and Interpretable Question Answering

1 code implementation12 Oct 2020 Sicheng Yu, Yulei Niu, Shuohang Wang, Jing Jiang, Qianru Sun

We then conduct two novel CVC inference methods (on trained models) to capture the effect of comprehensive reasoning as the final prediction.

Causal Inference Multiple-choice +2

Cross-Thought for Sentence Encoder Pre-training

1 code implementation EMNLP 2020 Shuohang Wang, Yuwei Fang, Siqi Sun, Zhe Gan, Yu Cheng, Jing Jiang, Jingjing Liu

In this paper, we propose Cross-Thought, a novel approach to pre-training sequence encoder, which is instrumental in building reusable sequence embeddings for large-scale NLP tasks such as question answering.

Information Retrieval Language Modelling +3

Multi-Fact Correction in Abstractive Text Summarization

no code implementations EMNLP 2020 Yue Dong, Shuohang Wang, Zhe Gan, Yu Cheng, Jackie Chi Kit Cheung, Jingjing Liu

Pre-trained neural abstractive summarization systems have dominated extractive strategies on news summarization performance, at least in terms of ROUGE.

Abstractive Text Summarization News Summarization +1

Contrastive Distillation on Intermediate Representations for Language Model Compression

1 code implementation EMNLP 2020 Siqi Sun, Zhe Gan, Yu Cheng, Yuwei Fang, Shuohang Wang, Jingjing Liu

Existing language model compression methods mostly use a simple L2 loss to distill knowledge in the intermediate representations of a large BERT model to a smaller one.

Knowledge Distillation Language Modelling +1

FILTER: An Enhanced Fusion Method for Cross-lingual Language Understanding

1 code implementation10 Sep 2020 Yuwei Fang, Shuohang Wang, Zhe Gan, Siqi Sun, Jingjing Liu

During inference, the model makes predictions based on the text input in the target language and its translation in the source language.

NER POS +4

Accelerating Real-Time Question Answering via Question Generation

no code implementations10 Sep 2020 Yuwei Fang, Shuohang Wang, Zhe Gan, Siqi Sun, Jingjing Liu, Chenguang Zhu

Although deep neural networks have achieved tremendous success for question answering (QA), they are still suffering from heavy computational and energy cost for real product deployment.

Data Augmentation Multi-Task Learning +2

T3: Tree-Autoencoder Constrained Adversarial Text Generation for Targeted Attack

3 code implementations EMNLP 2020 Boxin Wang, Hengzhi Pei, Boyuan Pan, Qian Chen, Shuohang Wang, Bo Li

In particular, we propose a tree-based autoencoder to embed the discrete text data into a continuous representation space, upon which we optimize the adversarial perturbation.

Adversarial Attack Adversarial Text +4

Compositional De-Attention Networks

no code implementations NeurIPS 2019 Yi Tay, Anh Tuan Luu, Aston Zhang, Shuohang Wang, Siu Cheung Hui

Attentional models are distinctly characterized by their ability to learn relative importance, i. e., assigning a different weight to input values.

Machine Translation Natural Language Inference +3

What does BERT Learn from Multiple-Choice Reading Comprehension Datasets?

no code implementations28 Oct 2019 Chenglei Si, Shuohang Wang, Min-Yen Kan, Jing Jiang

Based on our experiments on the 5 key MCRC datasets - RACE, MCTest, MCScript, MCScript2. 0, DREAM - we observe that 1) fine-tuned BERT mainly learns how keywords lead to correct prediction, instead of learning semantic understanding and reasoning; and 2) BERT does not need correct syntactic information to solve the task; 3) there exists artifacts in these datasets such that they can be solved even without the full context.

Multiple-choice Reading Comprehension

A Co-Matching Model for Multi-choice Reading Comprehension

1 code implementation ACL 2018 Shuohang Wang, Mo Yu, Shiyu Chang, Jing Jiang

Multi-choice reading comprehension is a challenging task, which involves the matching between a passage and a question-answer pair.

Reading Comprehension

A Compare-Aggregate Model for Matching Text Sequences

2 code implementations6 Nov 2016 Shuohang Wang, Jing Jiang

We particularly focus on the different comparison functions we can use to match two vectors.

Answer Selection Reading Comprehension

Cannot find the paper you are looking for? You can Submit a new open access paper.