Pre-Train Your Loss: Easy Bayesian Transfer Learning with Informative Priors

1 code implementation 20 May 2022 Ravid Shwartz-Ziv, Micah Goldblum, Hossein Souri, Sanyam Kapoor, Chen Zhu, Yann LeCun, Andrew Gordon Wilson

Deep learning is increasingly moving towards a transfer learning paradigm whereby large foundation models are fine-tuned on downstream tasks, starting from an initialization learned on the source task.

Transfer Learning

Plug-In Inversion: Model-Agnostic Inversion for Vision with Data Augmentations

1 code implementation 31 Jan 2022 Amin Ghiasi, Hamid Kazemi, Steven Reich, Chen Zhu, Micah Goldblum, Tom Goldstein

Existing techniques for model inversion typically rely on hard-to-tune regularizers, such as total variation or feature regularization, which must be individually calibrated for each network in order to produce adequate images.

Image Classification

Complexity-Oriented Per-shot Video Coding Optimization

no code implementations 23 Dec 2021 Hongcheng Zhong, Jun Xu, Chen Zhu, Donghui Feng, Li Song

Current per-shot encoding schemes aim to improve the compression efficiency by shot-level optimization.

Understanding the Role of Self-Supervised Learning in Out-of-Distribution Detection Task

no code implementations 26 Oct 2021 Jiuhai Chen, Chen Zhu, Bin Dai

In this paper, we study how SSL can enhance the performance of the out-of-distribution (OOD) detection task.

Computer Vision OOD Detection +2

Long-Short Transformer: Efficient Transformers for Language and Vision

3 code implementations NeurIPS 2021 Chen Zhu, Wei Ping, Chaowei Xiao, Mohammad Shoeybi, Tom Goldstein, Anima Anandkumar, Bryan Catanzaro

For instance, Transformer-LS achieves 0.97 test BPC on enwik8 using half the number of parameters of the previous method, while being faster and able to handle sequences 3x as long as its full-attention version can on the same hardware.

Language Modelling

Inductive Predictions of Extreme Hydrologic Events in The Wabash River Watershed

no code implementations 25 Apr 2021 Nicholas Majeske, Bidisha Abesh, Chen Zhu, Ariful Azad

We present a machine learning method to predict extreme hydrologic events from spatially and temporally varying hydrological and meteorological data.

Time Series

The Intrinsic Dimension of Images and Its Impact on Learning

1 code implementation ICLR 2021 Phillip Pope, Chen Zhu, Ahmed Abdelkader, Micah Goldblum, Tom Goldstein

We find that common natural image datasets indeed have very low intrinsic dimension relative to the high number of pixels in the images.

Computer Vision Image Generation

Modifying Memories in Transformer Models

no code implementations 1 Dec 2020 Chen Zhu, Ankit Singh Rawat, Manzil Zaheer, Srinadh Bhojanapalli, Daliang Li, Felix Yu, Sanjiv Kumar

In this paper, we propose a new task of explicitly modifying specific factual knowledge in Transformer models while ensuring the model performance does not degrade on the unmodified facts.

Are Adversarial Examples Created Equal? A Learnable Weighted Minimax Risk for Robustness under Non-uniform Attacks

no code implementations 24 Oct 2020 Huimin Zeng, Chen Zhu, Tom Goldstein, Furong Huang

Adversarial training has proved to be an efficient method to defend against adversarial examples, being one of the few defenses that withstand strong attacks.

Robust Optimization as Data Augmentation for Large-scale Graphs

3 code implementations CVPR 2022 Kezhi Kong, Guohao Li, Mucong Ding, Zuxuan Wu, Chen Zhu, Bernard Ghanem, Gavin Taylor, Tom Goldstein

Data augmentation helps neural networks generalize better by enlarging the training set, but it remains an open question how to effectively augment graph data to enhance the performance of GNNs (Graph Neural Networks).

Data Augmentation Graph Classification +3

Reducing the Teacher-Student Gap via Spherical Knowledge Distillation

1 code implementation 15 Oct 2020 Jia Guo, Minghao Chen, Yao Hu, Chen Zhu, Xiaofei He, Deng Cai

We investigate this problem by studying the gap in confidence between teacher and student.

Knowledge Distillation

Towards Accurate Quantization and Pruning via Data-free Knowledge Transfer

no code implementations 14 Oct 2020 Chen Zhu, Zheng Xu, Ali Shafahi, Manli Shu, Amin Ghiasi, Tom Goldstein

Further, we demonstrate that the compact structure and corresponding initialization from the Lottery Ticket Hypothesis can also help in data-free training.

Data Free Quantization Transfer Learning

MaxVA: Fast Adaptation of Step Sizes by Maximizing Observed Variance of Gradients

1 code implementation 21 Jun 2020 Chen Zhu, Yu Cheng, Zhe Gan, Furong Huang, Jingjing Liu, Tom Goldstein

Adaptive gradient methods such as RMSProp and Adam use an exponential moving estimate of the squared gradient to compute adaptive step sizes, achieving better convergence than SGD in the face of noisy objectives.

Image Classification Machine Translation +3
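
The exponential moving estimate of the squared gradient mentioned in the MaxVA abstract above can be illustrated with a plain RMSProp-style update. This is a minimal sketch of the baseline mechanism only, not the MaxVA method; the function name `rmsprop_step` and the default values of `lr`, `beta`, and `eps` are assumptions chosen for illustration, and Adam's first-moment estimate and bias correction are omitted.

```python
import math


def rmsprop_step(param, grad, sq_avg, lr=0.01, beta=0.9, eps=1e-8):
    """One RMSProp-style update for a single scalar parameter.

    All defaults are illustrative assumptions, not values from the paper.
    """
    # Exponential moving estimate of the squared gradient.
    sq_avg = beta * sq_avg + (1.0 - beta) * grad * grad
    # Adaptive step: the gradient is divided by the root of that estimate,
    # so directions with large recent gradients take smaller steps.
    param = param - lr * grad / (math.sqrt(sq_avg) + eps)
    return param, sq_avg


# Toy usage: minimise f(x) = x^2, whose gradient is 2x, starting from x = 5.
x, v = 5.0, 0.0
for _ in range(500):
    x, v = rmsprop_step(x, 2.0 * x, v)
```

Because the step is normalised by the running root-mean-square of the gradient, its magnitude stays close to `lr` regardless of the gradient's scale, which is the property the abstract contrasts with plain SGD.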

Certified Defenses for Adversarial Patches

1 code implementation ICLR 2020 Ping-Yeh Chiang, Renkun Ni, Ahmed Abdelkader, Chen Zhu, Christoph Studer, Tom Goldstein

Adversarial patch attacks are among the most practical threat models against real-world computer vision systems.

Computer Vision

SetRank: A Setwise Bayesian Approach for Collaborative Ranking from Implicit Feedback

1 code implementation 23 Feb 2020 Chao Wang, HengShu Zhu, Chen Zhu, Chuan Qin, Hui Xiong

The recent development of online recommender systems has a focus on collaborative ranking from implicit feedback, such as user clicks and purchases.

Collaborative Ranking Recommendation Systems

Improving the Tightness of Convex Relaxation Bounds for Training Certifiably Robust Classifiers

no code implementations 22 Feb 2020 Chen Zhu, Renkun Ni, Ping-Yeh Chiang, Hengduo Li, Furong Huang, Tom Goldstein

Convex relaxations are effective for training and certifying neural networks against norm-bounded adversarial attacks, but they leave a large gap between certifiable and empirical robustness.

Adaptive Densely Connected Super-Resolution Reconstruction

1 code implementation 17 Dec 2019 Tangxin Xie, Xin Yang, Yu Jia, Chen Zhu, Xiaochuan Li

For better performance in single image super-resolution (SISR), we present an image super-resolution algorithm based on adaptive dense connection (ADCSR).

Image Super-Resolution Single Image Super Resolution +1

Learning from Noisy Anchors for One-stage Object Detection

1 code implementation CVPR 2020 Hengduo Li, Zuxuan Wu, Chen Zhu, Caiming Xiong, Richard Socher, Larry S. Davis

State-of-the-art object detectors rely on regressing and classifying an extensive list of possible anchors, which are divided into positive and negative samples based on their intersection-over-union (IoU) with corresponding ground-truth objects.

Classification General Classification +2
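
The IoU-based split of anchors into positive and negative samples described in the abstract above can be sketched as follows. This shows only the conventional hard-threshold assignment that the abstract refers to, not the paper's own method; the function names `iou` and `label_anchors`, the `(x1, y1, x2, y2)` box layout, and the thresholds `0.5` and `0.4` are assumptions made for illustration.

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned (x1, y1, x2, y2) boxes."""
    # Corners of the overlapping rectangle (empty if the boxes do not overlap).
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def label_anchors(anchors, gt_boxes, pos_thresh=0.5, neg_thresh=0.4):
    """Label each anchor 1 (positive), 0 (negative), or -1 (ignored).

    The thresholds are illustrative assumptions, not values from the paper.
    """
    labels = []
    for anchor in anchors:
        # Best overlap of this anchor with any ground-truth box.
        best = max((iou(anchor, gt) for gt in gt_boxes), default=0.0)
        if best >= pos_thresh:
            labels.append(1)
        elif best < neg_thresh:
            labels.append(0)
        else:
            labels.append(-1)  # Between the thresholds: left out of training.
    return labels


# Toy usage with one ground-truth box and three candidate anchors.
gt = [(0, 0, 2, 2)]
anchors = [(0, 0, 2, 2), (10, 10, 12, 12), (0, 0, 2, 4.5)]
labels = label_anchors(anchors, gt)
```

Here the first anchor coincides with the ground-truth box, the second does not overlap it at all, and the third overlaps it only partially, so the three anchors fall into the positive, negative, and ignored groups respectively.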

Deep k-NN Defense against Clean-label Data Poisoning Attacks

1 code implementation 29 Sep 2019 Neehar Peri, Neal Gupta, W. Ronny Huang, Liam Fowl, Chen Zhu, Soheil Feizi, Tom Goldstein, John P. Dickerson

Targeted clean-label data poisoning is a type of adversarial attack on machine learning systems in which an adversary injects a few correctly-labeled, minimally-perturbed samples into the training data, causing a model to misclassify a particular test sample during inference.

Adversarial Attack Data Poisoning

Improved Training of Certifiably Robust Models

no code implementations 25 Sep 2019 Chen Zhu, Renkun Ni, Ping-Yeh Chiang, Hengduo Li, Furong Huang, Tom Goldstein

Convex relaxations are effective for training and certifying neural networks against norm-bounded adversarial attacks, but they leave a large gap between certifiable and empirical (PGD) robustness.

FreeLB: Enhanced Adversarial Training for Natural Language Understanding

2 code implementations ICLR 2020 Chen Zhu, Yu Cheng, Zhe Gan, Siqi Sun, Tom Goldstein, Jingjing Liu

Adversarial training, which minimizes the maximal risk for label-preserving input perturbations, has proved to be effective for improving the generalization of language models.

Natural Language Understanding Overall - Test +1

Adversarially robust transfer learning

1 code implementation ICLR 2020 Ali Shafahi, Parsa Saadatpanah, Chen Zhu, Amin Ghiasi, Christoph Studer, David Jacobs, Tom Goldstein

By training classifiers on top of these feature extractors, we produce new models that inherit the robustness of their parent networks.

Transfer Learning

Transferable Clean-Label Poisoning Attacks on Deep Neural Nets

1 code implementation 15 May 2019 Chen Zhu, W. Ronny Huang, Ali Shafahi, Hengduo Li, Gavin Taylor, Christoph Studer, Tom Goldstein

Clean-label poisoning attacks inject innocuous-looking (and "correctly" labeled) poison images into training data, causing a model to misclassify a targeted image after being trained on this data.

Transfer Learning

Enhancing Person-Job Fit for Talent Recruitment: An Ability-aware Neural Network Approach

no code implementations 21 Dec 2018 Chuan Qin, HengShu Zhu, Tong Xu, Chen Zhu, Liang Jiang, Enhong Chen, Hui Xiong

The widespread use of online recruitment services has led to an information explosion in the job market.

Fine-grained Video Categorization with Redundancy Reduction Attention

no code implementations ECCV 2018 Chen Zhu, Xiao Tan, Feng Zhou, Xiao Liu, Kaiyu Yue, Errui Ding, Yi Ma

Specifically, it first summarizes the video by weight-summing all feature vectors in the feature maps of selected frames with a spatio-temporal soft attention, and then predicts which channels to suppress or to enhance according to this summary with a learned non-linear transform.

Video Classification

Person-Job Fit: Adapting the Right Talent for the Right Job with Joint Representation Learning

no code implementations 8 Oct 2018 Chen Zhu, HengShu Zhu, Hui Xiong, Chao Ma, Fang Xie, Pengliang Ding, Pan Li

To this end, in this paper, we propose a novel end-to-end data-driven model based on Convolutional Neural Network (CNN), namely Person-Job Fit Neural Network (PJFNN), for matching a talent qualification to the requirements of a job.

Data Visualization Representation Learning

Compressing Neural Networks using the Variational Information Bottleneck

1 code implementation ICML 2018 Bin Dai, Chen Zhu, Baining Guo, David Wipf

Neural networks can be compressed to reduce memory and computational requirements, or to increase accuracy by facilitating the use of a larger base architecture.

Recruitment Market Trend Analysis with Sequential Latent Variable Models

no code implementations 8 Dec 2017 Chen Zhu, HengShu Zhu, Hui Xiong, Pengliang Ding, Fang Xie

To this end, in this paper, we propose a new research paradigm for recruitment market analysis by leveraging unsupervised learning techniques for automatically discovering recruitment market trends based on large-scale recruitment data.

Structured Attentions for Visual Question Answering

1 code implementation ICCV 2017 Chen Zhu, Yanpeng Zhao, Shuaiyi Huang, Kewei Tu, Yi Ma

In this paper, we demonstrate the importance of encoding such relations by showing the limited effective receptive field of ResNet on two datasets, and propose to model the visual attention as a multivariate distribution over a grid-structured Conditional Random Field on image regions.

Visual Question Answering

