TVT: Transferable Vision Transformer for Unsupervised Domain Adaptation

no code implementations 12 Aug 2021 Jinyu Yang, Jingjing Liu, Ning Xu, Junzhou Huang

With the recent exponential increase in applying Vision Transformer (ViT) to vision tasks, the capability of ViT in adapting cross-domain knowledge, however, remains unexplored in the literature.

Transfer Learning, Unsupervised Domain Adaptation

Adversarial VQA: A New Benchmark for Evaluating the Robustness of VQA Models

no code implementations 1 Jun 2021 Linjie Li, Jie Lei, Zhe Gan, Jingjing Liu

We hope our Adversarial VQA dataset can shed new light on robustness study in the community and serve as a valuable benchmark for future work.

Data Augmentation, Question Answering, +1

Playing Lottery Tickets with Vision and Language

no code implementations 23 Apr 2021 Zhe Gan, Yen-Chun Chen, Linjie Li, Tianlong Chen, Yu Cheng, Shuohang Wang, Jingjing Liu

In this work, we perform the first empirical study to assess whether such trainable subnetworks also exist in pre-trained V+L models.

Question Answering, Referring Expression Comprehension, +3

The Elastic Lottery Ticket Hypothesis

1 code implementation 30 Mar 2021 Xiaohan Chen, Yu Cheng, Shuohang Wang, Zhe Gan, Jingjing Liu, Zhangyang Wang

Based on these results, we articulate the Elastic Lottery Ticket Hypothesis (E-LTH): by mindfully replicating (or dropping) and re-ordering layers for one network, its corresponding winning ticket could be stretched (or squeezed) into a subnetwork for another deeper (or shallower) network from the same family, whose performance is nearly as competitive as the latter's winning ticket directly found by IMP.

Adversarial Feature Augmentation and Normalization for Visual Recognition

1 code implementation 22 Mar 2021 Tianlong Chen, Yu Cheng, Zhe Gan, JianFeng Wang, Lijuan Wang, Zhangyang Wang, Jingjing Liu

Recent advances in computer vision take advantage of adversarial data augmentation to ameliorate the generalization ability of classification models.

Classification, Data Augmentation, +1

Data-Efficient GAN Training Beyond (Just) Augmentations: A Lottery Ticket Perspective

1 code implementation 28 Feb 2021 Tianlong Chen, Yu Cheng, Zhe Gan, Jingjing Liu, Zhangyang Wang

Treating this as an inductive prior, we suggest a brand-new angle towards data-efficient GAN training: by first identifying the lottery ticket from the original GAN using the small training set of real images; and then focusing on training that sparse subnetwork by re-using the same set.

Data Augmentation

Less is More: ClipBERT for Video-and-Language Learning via Sparse Sampling

1 code implementation CVPR 2021 Jie Lei, Linjie Li, Luowei Zhou, Zhe Gan, Tamara L. Berg, Mohit Bansal, Jingjing Liu

Experiments on text-to-video retrieval and video question answering on six datasets demonstrate that ClipBERT outperforms (or is on par with) existing methods that exploit full-length videos, suggesting that end-to-end learning with just a few sparsely sampled clips is often more accurate than using densely extracted offline features from full-length videos, proving the proverbial less-is-more principle.

Ranked #2 on Visual Question Answering on MSRVTT-QA (using extra training data)

Question Answering, Video Question Answering, +2

Adversarial Masking: Towards Understanding Robustness Trade-off for Generalization

no code implementations 1 Jan 2021 Minhao Cheng, Zhe Gan, Yu Cheng, Shuohang Wang, Cho-Jui Hsieh, Jingjing Liu

By incorporating different feature maps after the masking, we can distill better features to help model generalization.

ALFA: Adversarial Feature Augmentation for Enhanced Image Recognition

no code implementations 1 Jan 2021 Tianlong Chen, Yu Cheng, Zhe Gan, Yu Hu, Zhangyang Wang, Jingjing Liu

Adversarial training is an effective method to combat adversarial attacks in order to create robust neural networks.

EarlyBERT: Efficient BERT Training via Early-bird Lottery Tickets

1 code implementation ACL 2021 Xiaohan Chen, Yu Cheng, Shuohang Wang, Zhe Gan, Zhangyang Wang, Jingjing Liu

Heavily overparameterized language models such as BERT, XLNet and T5 have achieved impressive success in many NLP tasks.

Model Compression

Wasserstein Contrastive Representation Distillation

no code implementations CVPR 2021 Liqun Chen, Dong Wang, Zhe Gan, Jingjing Liu, Ricardo Henao, Lawrence Carin

The primary goal of knowledge distillation (KD) is to encapsulate the information of a model learned from a teacher network into a student network, with the latter being more compact than the former.

Contrastive Learning, Knowledge Distillation, +2

A Closer Look at the Robustness of Vision-and-Language Pre-trained Models

no code implementations 15 Dec 2020 Linjie Li, Zhe Gan, Jingjing Liu

Large-scale pre-trained multimodal transformers, such as ViLBERT and UNITER, have propelled the state of the art in vision-and-language (V+L) research to a new level.

Cross-Thought for Sentence Encoder Pre-training

1 code implementation EMNLP 2020 Shuohang Wang, Yuwei Fang, Siqi Sun, Zhe Gan, Yu Cheng, Jing Jiang, Jingjing Liu

In this paper, we propose Cross-Thought, a novel approach to pre-training a sequence encoder, which is instrumental in building reusable sequence embeddings for large-scale NLP tasks such as question answering.

Information Retrieval, Language Modelling, +2

Multi-Fact Correction in Abstractive Text Summarization

no code implementations EMNLP 2020 Yue Dong, Shuohang Wang, Zhe Gan, Yu Cheng, Jackie Chi Kit Cheung, Jingjing Liu

Pre-trained neural abstractive summarization systems have dominated extractive strategies on news summarization performance, at least in terms of ROUGE.

Abstractive Text Summarization, Question Answering

Efficient Robust Training via Backward Smoothing

1 code implementation 3 Oct 2020 Jinghui Chen, Yu Cheng, Zhe Gan, Quanquan Gu, Jingjing Liu

Adversarial training is so far the most effective strategy in defending against adversarial examples.

Contrastive Distillation on Intermediate Representations for Language Model Compression

1 code implementation EMNLP 2020 Siqi Sun, Zhe Gan, Yu Cheng, Yuwei Fang, Shuohang Wang, Jingjing Liu

Existing language model compression methods mostly use a simple L2 loss to distill knowledge in the intermediate representations of a large BERT model to a smaller one.

Knowledge Distillation, Language Modelling, +1

Accelerating Real-Time Question Answering via Question Generation

no code implementations 10 Sep 2020 Yuwei Fang, Shuohang Wang, Zhe Gan, Siqi Sun, Jingjing Liu, Chenguang Zhu

Although deep neural networks have achieved tremendous success for question answering (QA), they are still suffering from heavy computational and energy cost for real product deployment.

Data Augmentation, Multi-Task Learning, +2

FILTER: An Enhanced Fusion Method for Cross-lingual Language Understanding

1 code implementation 10 Sep 2020 Yuwei Fang, Shuohang Wang, Zhe Gan, Siqi Sun, Jingjing Liu

During inference, the model makes predictions based on the text input in the target language and its translation in the source language.


Graph Optimal Transport for Cross-Domain Alignment

1 code implementation ICML 2020 Liqun Chen, Zhe Gan, Yu Cheng, Linjie Li, Lawrence Carin, Jingjing Liu

In GOT, cross-domain alignment is formulated as a graph matching problem, by representing entities into a dynamically-constructed graph.

Graph Matching, Image Captioning, +4

MaxVA: Fast Adaptation of Step Sizes by Maximizing Observed Variance of Gradients

1 code implementation 21 Jun 2020 Chen Zhu, Yu Cheng, Zhe Gan, Furong Huang, Jingjing Liu, Tom Goldstein

Adaptive gradient methods such as RMSProp and Adam use an exponential moving estimate of the squared gradient to compute adaptive step sizes, achieving better convergence than SGD in the face of noisy objectives.

Image Classification, Machine Translation, +2

Behind the Scene: Revealing the Secrets of Pre-trained Vision-and-Language Models

no code implementations ECCV 2020 Jize Cao, Zhe Gan, Yu Cheng, Licheng Yu, Yen-Chun Chen, Jingjing Liu

To reveal the secrets behind the scene of these powerful models, we present VALUE (Vision-And-Language Understanding Evaluation), a set of meticulously designed probing tasks (e.g., Visual Coreference Resolution, Visual Relation Detection, Linguistic Probing Tasks) generalizable to standard pre-trained V+L models, aiming to decipher the inner workings of multimodal pre-training (e.g., the implicit knowledge garnered in individual attention heads, the inherent cross-modal alignment learned through contextualized multimodal embeddings).

Coreference Resolution

Contextual Text Style Transfer

no code implementations Findings of the Association for Computational Linguistics 2020 Yu Cheng, Zhe Gan, Yizhe Zhang, Oussama Elachqar, Dianqi Li, Jingjing Liu

To realize high-quality style transfer with natural context preservation, we propose a Context-Aware Style Transfer (CAST) model, which uses two separate encoders for each input sentence and its surrounding context.

Style Transfer, Text Style Transfer

APo-VAE: Text Generation in Hyperbolic Space

no code implementations NAACL 2021 Shuyang Dai, Zhe Gan, Yu Cheng, Chenyang Tao, Lawrence Carin, Jingjing Liu

In this paper, we investigate text generation in a hyperbolic latent space to learn continuous hierarchical representations.

Hierarchical structure, Language Modelling, +1

BachGAN: High-Resolution Image Synthesis from Salient Object Layout

1 code implementation CVPR 2020 Yandong Li, Yu Cheng, Zhe Gan, Licheng Yu, Liqiang Wang, Jingjing Liu

We propose a new task towards more practical application for image generation: high-quality image synthesis from salient object layout.

Image Generation

VIOLIN: A Large-Scale Dataset for Video-and-Language Inference

1 code implementation CVPR 2020 Jingzhou Liu, Wenhu Chen, Yu Cheng, Zhe Gan, Licheng Yu, Yiming Yang, Jingjing Liu

We introduce a new task, Video-and-Language Inference, for joint multimodal understanding of video and text.

Self-Guided Adaptation: Progressive Representation Alignment for Domain Adaptive Object Detection

no code implementations 19 Mar 2020 Zongxian Li, Qixiang Ye, Chong Zhang, Jingjing Liu, Shijian Lu, Yonghong Tian

In this work, we propose a Self-Guided Adaptation (SGA) model, targeted at aligning feature representation and transferring object detection models across domains while considering the instantaneous alignment difficulty.

Object Detection, Unsupervised Domain Adaptation

Distilling Knowledge Learned in BERT for Text Generation

1 code implementation ACL 2020 Yen-Chun Chen, Zhe Gan, Yu Cheng, Jingzhou Liu, Jingjing Liu

Experiments show that the proposed approach significantly outperforms strong Transformer baselines on multiple language generation tasks such as machine translation and text summarization.

Language Modelling, Machine Translation, +2

DialoGPT: Large-Scale Generative Pre-training for Conversational Response Generation

7 code implementations 1 Nov 2019 Yizhe Zhang, Siqi Sun, Michel Galley, Yen-Chun Chen, Chris Brockett, Xiang Gao, Jianfeng Gao, Jingjing Liu, Bill Dolan

We present a large, tunable neural conversational response generation model, DialoGPT (dialogue generative pre-trained transformer).

Conversational Response Generation

Discourse-Aware Neural Extractive Text Summarization

1 code implementation ACL 2020 Jiacheng Xu, Zhe Gan, Yu Cheng, Jingjing Liu

Recently BERT has been adopted for document encoding in state-of-the-art text summarization models.

Extractive Text Summarization

Meta Module Network for Compositional Visual Reasoning

1 code implementation 8 Oct 2019 Wenhu Chen, Zhe Gan, Linjie Li, Yu Cheng, William Wang, Jingjing Liu

To design a more powerful NMN architecture for practical use, we propose Meta Module Network (MMN) centered on a novel meta module, which can take in function recipes and morph into diverse instance modules dynamically.

Visual Reasoning

FreeLB: Enhanced Adversarial Training for Natural Language Understanding

2 code implementations ICLR 2020 Chen Zhu, Yu Cheng, Zhe Gan, Siqi Sun, Tom Goldstein, Jingjing Liu

Adversarial training, which minimizes the maximal risk for label-preserving input perturbations, has proved to be effective for improving the generalization of language models.

Natural Language Understanding, Word Embeddings

UNITER: UNiversal Image-TExt Representation Learning

5 code implementations ECCV 2020 Yen-Chun Chen, Linjie Li, Licheng Yu, Ahmed El Kholy, Faisal Ahmed, Zhe Gan, Yu Cheng, Jingjing Liu

Different from previous work that applies joint random masking to both modalities, we use conditional masking on pre-training tasks (i.e., masked language/region modeling is conditioned on full observation of image/text).

Language Modelling, Question Answering, +6

What Makes A Good Story? Designing Composite Rewards for Visual Storytelling

1 code implementation 11 Sep 2019 Junjie Hu, Yu Cheng, Zhe Gan, Jingjing Liu, Jianfeng Gao, Graham Neubig

Previous storytelling approaches mostly focused on optimizing traditional metrics such as BLEU, ROUGE and CIDEr.

Visual Storytelling

Contrastively Smoothed Class Alignment for Unsupervised Domain Adaptation

no code implementations 11 Sep 2019 Shuyang Dai, Yu Cheng, Yizhe Zhang, Zhe Gan, Jingjing Liu, Lawrence Carin

Recent unsupervised approaches to domain adaptation primarily focus on minimizing the gap between the source and the target domains through refining the feature generator, in order to learn a better alignment between the two domains.

Unsupervised Domain Adaptation

Patient Knowledge Distillation for BERT Model Compression

2 code implementations IJCNLP 2019 Siqi Sun, Yu Cheng, Zhe Gan, Jingjing Liu

Pre-trained language models such as BERT have proven to be highly effective for natural language processing (NLP) tasks.

Knowledge Distillation, Model Compression

Adversarial Domain Adaptation for Machine Reading Comprehension

no code implementations IJCNLP 2019 Huazheng Wang, Zhe Gan, Xiaodong Liu, Jingjing Liu, Jianfeng Gao, Hongning Wang

In this paper, we focus on unsupervised domain adaptation for Machine Reading Comprehension (MRC), where the source domain has a large amount of labeled data, while only unlabeled passages are available in the target domain.

Machine Reading Comprehension, Representation Learning, +1

A Hybrid Retrieval-Generation Neural Conversation Model

1 code implementation 19 Apr 2019 Liu Yang, Junjie Hu, Minghui Qiu, Chen Qu, Jianfeng Gao, W. Bruce Croft, Xiaodong Liu, Yelong Shen, Jingjing Liu

In this paper, we propose a hybrid neural conversation model that combines the merits of both response retrieval and generation methods.

Text Generation

Relation-Aware Graph Attention Network for Visual Question Answering

1 code implementation ICCV 2019 Linjie Li, Zhe Gan, Yu Cheng, Jingjing Liu

In order to answer semantically-complicated questions about an image, a Visual Question Answering (VQA) model needs to fully understand the visual scene in the image, especially the interactive dynamics between different objects.

Graph Attention, Question Answering, +1

Tactical Rewind: Self-Correction via Backtracking in Vision-and-Language Navigation

1 code implementation CVPR 2019 Liyiming Ke, Xiujun Li, Yonatan Bisk, Ari Holtzman, Zhe Gan, Jingjing Liu, Jianfeng Gao, Yejin Choi, Siddhartha Srinivasa

We present the Frontier Aware Search with backTracking (FAST) Navigator, a general framework for action decoding, that achieves state-of-the-art results on the Room-to-Room (R2R) Vision-and-Language navigation challenge of Anderson et.

Vision and Language Navigation, Vision-Language Navigation

Multi-step Reasoning via Recurrent Dual Attention for Visual Dialog

no code implementations ACL 2019 Zhe Gan, Yu Cheng, Ahmed El Kholy, Linjie Li, Jingjing Liu, Jianfeng Gao

This paper presents a new model for visual dialog, Recurrent Dual Attention Network (ReDAN), using multi-step reasoning to answer a series of questions about an image.

Question Answering, Visual Dialog

Sequential Attention GAN for Interactive Image Editing

no code implementations 20 Dec 2018 Yu Cheng, Zhe Gan, Yitong Li, Jingjing Liu, Jianfeng Gao

The main challenges in this sequential and interactive image generation task are two-fold: 1) contextual consistency between a generated image and the provided textual description; 2) step-by-step region-level modification to maintain visual consistency across the generated image sequence in each session.

Text-to-Image Generation

Switch-based Active Deep Dyna-Q: Efficient Adaptive Planning for Task-Completion Dialogue Policy Learning

1 code implementation 19 Nov 2018 Yuexin Wu, Xiujun Li, Jingjing Liu, Jianfeng Gao, Yiming Yang

Training task-completion dialogue agents with reinforcement learning usually requires a large number of real user experiences.

Active Learning, Q-Learning, +1

Multi-task Learning with Sample Re-weighting for Machine Reading Comprehension

5 code implementations NAACL 2019 Yichong Xu, Xiaodong Liu, Yelong Shen, Jingjing Liu, Jianfeng Gao

We propose a multi-task learning framework to learn a joint Machine Reading Comprehension (MRC) model that can be applied to a wide range of MRC tasks in different domains.

Machine Reading Comprehension, Machine Translation, +2

Discriminative Deep Dyna-Q: Robust Planning for Dialogue Policy Learning

2 code implementations EMNLP 2018 Shang-Yu Su, Xiujun Li, Jianfeng Gao, Jingjing Liu, Yun-Nung Chen

This paper presents a Discriminative Deep Dyna-Q (D3Q) approach to improving the effectiveness and robustness of Deep Dyna-Q (DDQ), a recently proposed framework that extends the Dyna-Q algorithm to integrate planning for task-completion dialogue policy learning.

Task-Completion Dialogue Policy Learning

Microsoft Dialogue Challenge: Building End-to-End Task-Completion Dialogue Systems

2 code implementations 29 Jul 2018 Xiujun Li, Yu Wang, Siqi Sun, Sarah Panda, Jingjing Liu, Jianfeng Gao

This proposal introduces a Dialogue Challenge for building end-to-end task-completion dialogue systems, with the goal of encouraging the dialogue research community to collaborate and benchmark on standard datasets and a unified experimental environment.

Deep Dyna-Q: Integrating Planning for Task-Completion Dialogue Policy Learning

3 code implementations ACL 2018 Baolin Peng, Xiujun Li, Jianfeng Gao, Jingjing Liu, Kam-Fai Wong, Shang-Yu Su

During dialogue policy learning, the world model is constantly updated with real user experience to approach real user behavior, and in turn, the dialogue agent is optimized using both real experience and simulated experience.

Task-Completion Dialogue Policy Learning

Dynamic Fusion Networks for Machine Reading Comprehension

no code implementations 14 Nov 2017 Yichong Xu, Jingjing Liu, Jianfeng Gao, Yelong Shen, Xiaodong Liu

This paper presents a novel neural model, Dynamic Fusion Network (DFN), for machine reading comprehension (MRC).

Machine Reading Comprehension

Adversarial Advantage Actor-Critic Model for Task-Completion Dialogue Policy Learning

no code implementations 31 Oct 2017 Baolin Peng, Xiujun Li, Jianfeng Gao, Jingjing Liu, Yun-Nung Chen, Kam-Fai Wong

This paper presents a new method, adversarial advantage actor-critic (Adversarial A2C), which significantly improves the efficiency of dialogue policy learning in task-completion dialogue systems.

Task-Completion Dialogue Policy Learning

Image Disguise based on Generative Model

no code implementations 21 Oct 2017 Xintao Duan, Haoxian Song, En Zhang, Jingjing Liu

To protect image contents, most existing encryption algorithms are designed to transform an original image into a texture-like or noise-like image, which is, however, an obvious visual sign indicating the presence of an encrypted image and results in a significantly large number of attacks.

Multispectral Deep Neural Networks for Pedestrian Detection

1 code implementation 8 Nov 2016 Jingjing Liu, Shaoting Zhang, Shu Wang, Dimitris N. Metaxas

Multispectral pedestrian detection is essential for around-the-clock applications, e.g., surveillance and autonomous driving.

Autonomous Driving, Pedestrian Detection

