Search Results for author: Li-Wei Wang

Found 82 papers, 36 papers with code

Dimensionality Dependent PAC-Bayes Margin Bound

no code implementations NeurIPS 2012 Chi Jin, Li-Wei Wang

We show that our bound is strictly sharper than a previously well-known PAC-Bayes margin bound if the feature space is of finite dimension; and the two bounds tend to be equivalent as the dimension goes to infinity.

Model Selection

A Theoretical Analysis of NDCG Type Ranking Measures

no code implementations 24 Apr 2013 Yining Wang, Li-Wei Wang, Yuanzhi Li, Di He, Tie-Yan Liu, Wei Chen

We show that NDCG with logarithmic discount has consistent distinguishability although it converges to the same limit for all ranking functions.

Vocal Bursts Type Prediction
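
For readers unfamiliar with the measure named in the entry above, here is a minimal sketch of NDCG with the logarithmic discount $1/\log_2(r+1)$ at rank $r$. It is not taken from the paper: the function names `dcg` and `ndcg` are illustrative, and it assumes linear gain (the relevance label itself), whereas some formulations use $2^{rel}-1$.

```python
import math


def dcg(relevances, k=None):
    """Discounted cumulative gain of a ranked list of relevance labels.

    Assumes linear gain and the logarithmic discount 1 / log2(rank + 1),
    with ranks starting at 1.
    """
    if k is None:
        k = len(relevances)
    return sum(
        rel / math.log2(rank + 1)
        for rank, rel in enumerate(relevances[:k], start=1)
    )


def ndcg(relevances, k=None):
    """DCG normalized by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(relevances, reverse=True), k)
    # A list with no relevant items has no meaningful ranking quality.
    return dcg(relevances, k) / ideal if ideal > 0 else 0.0
```

A perfectly ordered list scores 1, and moving relevant items further down the list lowers the score.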

Efficient Algorithm for Privately Releasing Smooth Queries

no code implementations NeurIPS 2013 Ziteng Wang, Kai Fan, Jia-Qi Zhang, Li-Wei Wang

Outputting the summary runs in time $O(n^{1+\frac{d}{2d+K}})$, and the evaluation algorithm for answering a query runs in time $\tilde O (n^{\frac{d+2+\frac{2d}{K}}{2d+K}} )$.

Differentially Private Data Releasing for Smooth Queries with Synthetic Database Output

no code implementations 6 Jan 2014 Chi Jin, Ziteng Wang, Junliang Huang, Yiqiao Zhong, Li-Wei Wang

We develop an $\epsilon$-differentially private mechanism for the class of $K$-smooth queries.

Multi-scale Orderless Pooling of Deep Convolutional Activation Features

no code implementations 7 Mar 2014 Yunchao Gong, Li-Wei Wang, Ruiqi Guo, Svetlana Lazebnik

Deep convolutional neural networks (CNN) have shown their promise as a universal representation for recognition.

Classification General Classification +2

Can Image-Level Labels Replace Pixel-Level Labels for Image Parsing

no code implementations 7 Mar 2014 Zhiwu Lu, Zhen-Yong Fu, Tao Xiang, Li-Wei Wang, Ji-Rong Wen

By oversegmenting all the images into regions, we formulate noisily tagged image parsing as a weakly supervised sparse learning problem over all the regions, where the initial labels of each region are inferred from image-level labels.

Sparse Learning

A Game-theoretic Machine Learning Approach for Revenue Maximization in Sponsored Search

no code implementations 3 Jun 2014 Di He, Wei Chen, Li-Wei Wang, Tie-Yan Liu

Sponsored search is an important monetization channel for search engines, in which an auction mechanism is used to select the ads shown to users and determine the prices charged from advertisers.

BIG-bench Machine Learning Bilevel Optimization

Pairwise Constraint Propagation on Multi-View Data

no code implementations 18 Jan 2015 Zhiwu Lu, Li-Wei Wang

This paper presents a graph-based learning approach to pairwise constraint propagation on multi-view data.

graph construction Retrieval

Image classification by visual bag-of-words refinement and reduction

no code implementations 18 Jan 2015 Zhiwu Lu, Li-Wei Wang, Ji-Rong Wen

This paper presents a new framework for visual bag-of-words (BOW) refinement and reduction to overcome the drawbacks associated with the visual BOW model which has been widely used for image classification.

Classification Clustering +2

On the Depth of Deep Neural Networks: A Theoretical View

no code implementations 17 Jun 2015 Shizhao Sun, Wei Chen, Li-Wei Wang, Xiaoguang Liu, Tie-Yan Liu

First, we derive an upper bound for RA of DNN, and show that it increases with increasing depth.

Dual Learning for Machine Translation

1 code implementation NeurIPS 2016 Yingce Xia, Di He, Tao Qin, Li-Wei Wang, Nenghai Yu, Tie-Yan Liu, Wei-Ying Ma

Based on the feedback signals generated during this process (e.g., the language-model likelihood of the output of a model, and the reconstruction error of the original sentence after the primal and dual translations), we can iteratively update the two models until convergence (e.g., using the policy gradient methods).

Language Modelling Machine Translation +4

Stable Memory Allocation in the Hippocampus: Fundamental Limits and Neural Realization

no code implementations 14 Dec 2016 Wenlong Mou, Zhi Wang, Li-Wei Wang

In Valiant's neuroidal model, the hippocampus was described as a randomly connected graph, the computation on which maps input to a set of activated neuroids with stable size.

Hippocampus Memorization

Quadratic Upper Bound for Recursive Teaching Dimension of Finite VC Classes

no code implementations 18 Feb 2017 Lunjia Hu, Ruihan Wu, Tianhong Li, Li-Wei Wang

The RTD of a concept class $\mathcal C \subseteq \{0, 1\}^n$, introduced by Zilles et al. (2011), is a combinatorial complexity measure characterized by the worst-case number of examples necessary to identify a concept in $\mathcal C$ according to the recursive teaching model.

Efficient Private ERM for Smooth Objectives

no code implementations 29 Mar 2017 Jiaqi Zhang, Kai Zheng, Wenlong Mou, Li-Wei Wang

For strongly convex and smooth objectives, we prove that gradient descent with output perturbation not only achieves nearly optimal utility, but also significantly improves the running time of previous state-of-the-art private optimization algorithms, for both $\epsilon$-DP and $(\epsilon, \delta)$-DP.

Randomness in Deconvolutional Networks for Visual Representation

no code implementations 2 Apr 2017 Kun He, Jingbo Wang, Haochuan Li, Yao Shu, Mengxiao Zhang, Man Zhu, Li-Wei Wang, John E. Hopcroft

Toward a deeper understanding of the inner workings of deep neural networks, we investigate CNN (convolutional neural network) using DCN (deconvolutional network) and a randomization technique, and gain new insights into the intrinsic properties of this network architecture.

General Classification Image Reconstruction

Collect at Once, Use Effectively: Making Non-interactive Locally Private Learning Possible

no code implementations ICML 2017 Kai Zheng, Wenlong Mou, Li-Wei Wang

For learning with smooth generalized linear losses, we propose an approximate stochastic gradient oracle estimated from non-interactive LDP channel, using Chebyshev expansion.

regression

Zero-Shot Fine-Grained Classification by Deep Feature Learning with Semantics

no code implementations 4 Jul 2017 Aoxue Li, Zhiwu Lu, Li-Wei Wang, Tao Xiang, Xinqi Li, Ji-Rong Wen

In this paper, to address the two issues, we propose a two-phase framework for recognizing images from unseen fine-grained classes, i.e., zero-shot fine-grained classification.

Classification Domain Adaptation +3

Generalization Bounds of SGLD for Non-convex Learning: Two Theoretical Viewpoints

no code implementations 19 Jul 2017 Wenlong Mou, Li-Wei Wang, Xiyu Zhai, Kai Zheng

This is the first algorithm-dependent result with reasonable dependence on aggregated step sizes for non-convex learning, and has important implications to statistical learning aspects of stochastic gradient methods in complicated models such as deep learning.

Generalization Bounds Learning Theory +1

The Expressive Power of Neural Networks: A View from the Width

1 code implementation NeurIPS 2017 Zhou Lu, Hongming Pu, Feicheng Wang, Zhiqiang Hu, Li-Wei Wang

That is, there are classes of deep networks which cannot be realized by any shallow network whose size is no more than an exponential bound.

Decoding with Value Networks for Neural Machine Translation

no code implementations NeurIPS 2017 Di He, Hanqing Lu, Yingce Xia, Tao Qin, Li-Wei Wang, Tie-Yan Liu

Inspired by the success and methodology of AlphaGo, in this paper we propose using a prediction network to improve beam search, which takes the source sentence $x$, the currently available decoding output $y_1,\cdots, y_{t-1}$ and a candidate word $w$ at step $t$ as inputs and predicts the long-term value (e.g., BLEU score) of the partial target sentence if it is completed by the NMT model.

Machine Translation NMT +2

A Comparison of Word Embeddings for the Biomedical Natural Language Processing

2 code implementations 1 Feb 2018 Yanshan Wang, Sijia Liu, Naveed Afzal, Majid Rastegar-Mojarad, Li-Wei Wang, Feichen Shen, Paul Kingsbury, Hongfang Liu

First, the word embeddings trained on clinical notes and biomedical publications capture the semantics of medical terms better, find more relevant similar medical terms, and are closer to human experts' judgments than those trained on Wikipedia and news.

Information Retrieval

Learning Region Features for Object Detection

no code implementations ECCV 2018 Jiayuan Gu, Han Hu, Li-Wei Wang, Yichen Wei, Jifeng Dai

While most steps in the modern object detection methods are learnable, the region feature extraction step remains largely hand-crafted, featured by RoI pooling methods.

Object object-detection +1

Fast, Diverse and Accurate Image Captioning Guided By Part-of-Speech

no code implementations CVPR 2019 Aditya Deshpande, Jyoti Aneja, Li-Wei Wang, Alexander Schwing, D. A. Forsyth

We achieve the trifecta: (1) High accuracy for the diverse captions as evaluated by standard captioning metrics and user studies; (2) Faster computation of diverse captions compared to beam search and diverse beam search; and (3) High diversity as evaluated by counting novel sentences, distinct n-grams and mutual overlap (i.e., mBleu-4) scores.

Caption Generation Image Captioning

Towards Binary-Valued Gates for Robust LSTM Training

1 code implementation ICML 2018 Zhuohan Li, Di He, Fei Tian, Wei Chen, Tao Qin, Li-Wei Wang, Tie-Yan Liu

Long Short-Term Memory (LSTM) is one of the most widely used recurrent structures in sequence modeling.

MedSTS: A Resource for Clinical Semantic Textual Similarity

4 code implementations 28 Aug 2018 Yanshan Wang, Naveed Afzal, Sunyang Fu, Li-Wei Wang, Feichen Shen, Majid Rastegar-Mojarad, Hongfang Liu

A subset of MedSTS (MedSTS_ann) containing 1,068 sentence pairs was annotated by two medical experts with semantic similarity scores of 0-5 (low to high similarity).

Decision Making Semantic Similarity +3

Learning to Navigate for Fine-grained Classification

12 code implementations ECCV 2018 Ze Yang, Tiange Luo, Dong Wang, Zhiqiang Hu, Jun Gao, Li-Wei Wang

In consideration of intrinsic consistency between informativeness of the regions and their probability being ground-truth class, we design a novel training paradigm, which enables Navigator to detect most informative regions under the guidance from Teacher.

Fine-Grained Image Classification General Classification +1

FRAGE: Frequency-Agnostic Word Representation

2 code implementations NeurIPS 2018 Chengyue Gong, Di He, Xu Tan, Tao Qin, Li-Wei Wang, Tie-Yan Liu

Continuous word representation (aka word embedding) is a basic building block in many neural network-based models used in natural language processing tasks.

Language Modelling Machine Translation +5

Improving the Generalization of Adversarial Training with Domain Adaptation

2 code implementations ICLR 2019 Chuanbiao Song, Kun He, Li-Wei Wang, John E. Hopcroft

Our intuition is to regard adversarial training on the FGSM adversary as a domain adaptation task with a limited number of target domain samples.

Adversarial Attack Domain Adaptation

Transferrable Feature and Projection Learning with Class Hierarchy for Zero-Shot Learning

no code implementations 19 Oct 2018 Aoxue Li, Zhiwu Lu, Jiechao Guan, Tao Xiang, Li-Wei Wang, Ji-Rong Wen

Inspired by the fact that an unseen class is not exactly 'unseen' if it belongs to the same superclass as a seen class, we propose a novel inductive ZSL model that leverages superclasses as the bridge between seen and unseen classes to narrow the domain gap.

Attribute Clustering +2

Gradient Descent Finds Global Minima of Deep Neural Networks

no code implementations 9 Nov 2018 Simon S. Du, Jason D. Lee, Haochuan Li, Li-Wei Wang, Xiyu Zhai

Gradient descent finds a global minimum in training deep neural networks despite the objective function being non-convex.

Mimicking the In-Camera Color Pipeline for Camera-Aware Object Compositing

no code implementations 27 Mar 2019 Jun Gao, Xiao Li, Li-Wei Wang, Sanja Fidler, Stephen Lin

We present a method for compositing virtual objects into a photograph such that the object colors appear to have been processed by the photo's camera imaging pipeline.

Distributed Bandit Learning: Near-Optimal Regret with Efficient Communication

no code implementations ICLR 2020 Yuanhao Wang, Jiachen Hu, Xiaoyu Chen, Li-Wei Wang

We study the problem of regret minimization for distributed bandits learning, in which $M$ agents work collaboratively to minimize their total regret under the coordination of a central server.

Multi-Armed Bandits

AT-GAN: An Adversarial Generator Model for Non-constrained Adversarial Examples

no code implementations 16 Apr 2019 Xiaosen Wang, Kun He, Chuanbiao Song, Li-Wei Wang, John E. Hopcroft

In this way, AT-GAN can learn the distribution of adversarial examples that is very close to the distribution of real data.

Adversarial Attack

Hint-based Training for Non-Autoregressive Translation

no code implementations ICLR 2019 Zhuohan Li, Di He, Fei Tian, Tao Qin, Li-Wei Wang, Tie-Yan Liu

To improve the accuracy of NART models, in this paper, we propose to leverage the hints from a well-trained ART model to train the NART model.

Machine Translation Translation

Gram-Gauss-Newton Method: Learning Overparameterized Neural Networks for Regression Problems

no code implementations 28 May 2019 Tianle Cai, Ruiqi Gao, Jikai Hou, Siyu Chen, Dong Wang, Di He, Zhihua Zhang, Li-Wei Wang

First-order methods such as stochastic gradient descent (SGD) are currently the standard algorithm for training deep neural networks.

regression Second-order methods

Cross-lingual Knowledge Graph Alignment via Graph Matching Neural Network

1 code implementation ACL 2019 Kun Xu, Li-Wei Wang, Mo Yu, Yansong Feng, Yan Song, Zhiguo Wang, Dong Yu

Previous cross-lingual knowledge graph (KG) alignment studies rely on entity embeddings derived only from monolingual KG structural information, which may fail at matching entities that have different facts in two KGs.

Entity Embeddings Graph Attention +1

Equipping Experts/Bandits with Long-term Memory

no code implementations NeurIPS 2019 Kai Zheng, Haipeng Luo, Ilias Diakonikolas, Li-Wei Wang

We propose the first reduction-based approach to obtaining long-term memory guarantees for online learning in the sense of Bousquet and Warmuth, 2002, by reducing the problem to achieving typical switching regret.

Multi-Armed Bandits

Adversarially Robust Generalization Just Requires More Unlabeled Data

1 code implementation 3 Jun 2019 Runtian Zhai, Tianle Cai, Di He, Chen Dan, Kun He, John Hopcroft, Li-Wei Wang

Neural network robustness has recently been highlighted by the existence of adversarial examples.

Understanding and Improving Transformer From a Multi-Particle Dynamic System Point of View

2 code implementations ICLR 2020 Yiping Lu, Zhuohan Li, Di He, Zhiqing Sun, Bin Dong, Tao Qin, Li-Wei Wang, Tie-Yan Liu

In this paper, we provide a novel perspective towards understanding the architecture: we show that the Transformer can be mathematically interpreted as a numerical Ordinary Differential Equation (ODE) solver for a convection-diffusion equation in a multi-particle dynamic system.

Position Sentence

Convergence of Adversarial Training in Overparametrized Neural Networks

no code implementations NeurIPS 2019 Ruiqi Gao, Tianle Cai, Haochuan Li, Li-Wei Wang, Cho-Jui Hsieh, Jason D. Lee

Neural networks are vulnerable to adversarial examples, i.e., inputs that are imperceptibly perturbed from natural data and yet incorrectly classified by the network.

Representation Degeneration Problem in Training Natural Language Generation Models

1 code implementation ICLR 2019 Jun Gao, Di He, Xu Tan, Tao Qin, Li-Wei Wang, Tie-Yan Liu

We study an interesting problem in training neural network-based models for natural language generation tasks, which we call the \emph{representation degeneration problem}.

Language Modelling Machine Translation +3

Few-Shot Learning with Global Class Representations

2 code implementations ICCV 2019 Tiange Luo, Aoxue Li, Tao Xiang, Weiran Huang, Li-Wei Wang

In this paper, we propose to tackle the challenging few-shot learning (FSL) problem by learning global class representations using both base and novel class training samples.

Few-Shot Learning Generalized Few-Shot Classification

Nesterov Accelerated Gradient and Scale Invariance for Adversarial Attacks

3 code implementations ICLR 2020 Jiadong Lin, Chuanbiao Song, Kun He, Li-Wei Wang, John E. Hopcroft

SIM, in turn, is based on our discovery of the scale-invariant property of deep learning models, which we leverage to optimize the adversarial perturbations over scale copies of the input images so as to avoid "overfitting" on the white-box model being attacked and to generate more transferable adversarial examples.

Adversarial Attack

Hint-Based Training for Non-Autoregressive Machine Translation

1 code implementation IJCNLP 2019 Zhuohan Li, Zi Lin, Di He, Fei Tian, Tao Qin, Li-Wei Wang, Tie-Yan Liu

Due to the unparallelizable nature of the autoregressive factorization, AutoRegressive Translation (ART) models have to generate tokens sequentially during decoding and thus suffer from high inference latency.

Machine Translation Translation

Robust Local Features for Improving the Generalization of Adversarial Training

1 code implementation ICLR 2020 Chuanbiao Song, Kun He, Jiadong Lin, Li-Wei Wang, John E. Hopcroft

We continue to propose a new approach called Robust Local Features for Adversarial Training (RLFAT), which first learns the robust local features by adversarial training on the RBS-transformed adversarial examples, and then transfers the robust local features into the training of normal adversarial examples.

On the Anomalous Generalization of GANs

no code implementations 27 Sep 2019 Jinchen Xuan, Yunchang Yang, Ze Yang, Di He, Li-Wei Wang

Motivated by this observation, we discover two specific problems of GANs leading to anomalous generalization behaviour, which we refer to as the sample insufficiency and the pixel-wise combination.

eCNN: A Block-Based and Highly-Parallel CNN Accelerator for Edge Inference

no code implementations 13 Oct 2019 Chao-Tsung Huang, Yu-Chun Ding, Huan-Ching Wang, Chi-Wen Weng, Kai-Ping Lin, Li-Wei Wang, Li-De Chen

In this paper, we approach this goal by considering the inference flow, network model, instruction set, and processor design jointly to optimize hardware performance and image quality.

4k Denoising +3

Clinical Concept Extraction: a Methodology Review

no code implementations 24 Oct 2019 Sunyang Fu, David Chen, Huan He, Sijia Liu, Sungrim Moon, Kevin J Peterson, Feichen Shen, Li-Wei Wang, Yanshan Wang, Andrew Wen, Yiqing Zhao, Sunghwan Sohn, Hongfang Liu

Background: Concept extraction, a subdomain of natural language processing (NLP) with a focus on extracting concepts of interest, has been adopted to computationally extract clinical information from text for a wide range of applications ranging from clinical decision support to care quality improvement.

Clinical Concept Extraction Decision Making

Defective Convolutional Networks

1 code implementation 19 Nov 2019 Tiange Luo, Tianle Cai, Mengxiao Zhang, Siyu Chen, Di He, Li-Wei Wang

Robustness of convolutional neural networks (CNNs) has gained in importance on account of adversarial examples, i.e., inputs added as well-designed perturbations that are imperceptible to humans but can cause the model to predict incorrectly.

Dense RepPoints: Representing Visual Objects with Dense Point Sets

2 code implementations ECCV 2020 Ze Yang, Yinghao Xu, Han Xue, Zheng Zhang, Raquel Urtasun, Li-Wei Wang, Stephen Lin, Han Hu

We present a new object representation, called Dense RepPoints, that utilizes a large set of points to describe an object at multiple levels, including both box level and pixel level.

Object Object Detection

MACER: Attack-free and Scalable Robust Training via Maximizing Certified Radius

2 code implementations ICLR 2020 Runtian Zhai, Chen Dan, Di He, Huan Zhang, Boqing Gong, Pradeep Ravikumar, Cho-Jui Hsieh, Li-Wei Wang

Adversarial training is one of the most popular ways to learn robust models but is usually attack-dependent and time costly.

Combinatorial Semi-Bandit in the Non-Stationary Environment

no code implementations 10 Feb 2020 Wei Chen, Li-Wei Wang, Haoyu Zhao, Kai Zheng

In a special case where the reward function is linear and we have an exact oracle, we design a parameter-free algorithm that achieves nearly optimal regret both in the switching case and in the dynamic case without knowing the parameters in advance.

Memory Enhanced Global-Local Aggregation for Video Object Detection

2 code implementations CVPR 2020 Yihong Chen, Yue Cao, Han Hu, Li-Wei Wang

We argue that there are two important cues for humans to recognize objects in videos: the global semantic information and the local localization information.

Object object-detection +1

MART: Memory-Augmented Recurrent Transformer for Coherent Video Paragraph Captioning

1 code implementation ACL 2020 Jie Lei, Li-Wei Wang, Yelong Shen, Dong Yu, Tamara L. Berg, Mohit Bansal

Generating multi-sentence descriptions for videos is one of the most challenging captioning tasks due to its high requirements for not only visual relevance but also discourse-based coherence across the sentences in the paragraph.

Sentence

Improve bone age assessment by learning from anatomical local regions

no code implementations 27 May 2020 Dong Wang, Kexin Zhang, Jia Ding, Li-Wei Wang

In clinical practice, the Tanner and Whitehouse (TW2) method is widely used by radiologists to perform BAA.

Boosting Few-Shot Learning With Adaptive Margin Loss

no code implementations CVPR 2020 Aoxue Li, Weiran Huang, Xu Lan, Jiashi Feng, Zhenguo Li, Li-Wei Wang

Few-shot learning (FSL) has attracted increasing attention in recent years but remains challenging, due to the intrinsic difficulty in learning to generalize from a few examples.

Few-Shot Image Classification Few-Shot Learning +2

METASET: Exploring Shape and Property Spaces for Data-Driven Metamaterials Design

1 code implementation 1 Jun 2020 Yu-Chin Chan, Faez Ahmed, Li-Wei Wang, Wei Chen

In answer, we posit that a smaller yet diverse set of unit cells leads to scalable search and unbiased learning.

Physical Simulations Point Processes

(Locally) Differentially Private Combinatorial Semi-Bandits

no code implementations ICML 2020 Xiaoyu Chen, Kai Zheng, Zixin Zhou, Yunchang Yang, Wei Chen, Li-Wei Wang

In this paper, we study Combinatorial Semi-Bandits (CSB) that is an extension of classic Multi-Armed Bandits (MAB) under Differential Privacy (DP) and stronger Local Differential Privacy (LDP) setting.

Multi-Armed Bandits Privacy Preserving

MC-BERT: Efficient Language Pre-Training via a Meta Controller

1 code implementation 10 Jun 2020 Zhenhui Xu, Linyuan Gong, Guolin Ke, Di He, Shuxin Zheng, Li-Wei Wang, Jiang Bian, Tie-Yan Liu

Pre-trained contextual representations (e.g., BERT) have become the foundation to achieve state-of-the-art results on many NLP tasks.

Binary Classification Cloze Test +4

RepPoints V2: Verification Meets Regression for Object Detection

1 code implementation NeurIPS 2020 Yihong Chen, Zheng Zhang, Yue Cao, Li-Wei Wang, Stephen Lin, Han Hu

Though RepPoints provides high performance, we find that its heavy reliance on regression for object localization leaves room for improvement.

Instance Segmentation Object +6

Comprehensive Image Captioning via Scene Graph Decomposition

1 code implementation ECCV 2020 Yiwu Zhong, Li-Wei Wang, Jianshu Chen, Dong Yu, Yin Li

We address the challenging problem of image captioning by revisiting the representation of image scene graph.

Image Captioning Sentence

Transferred Discrepancy: Quantifying the Difference Between Representations

no code implementations 24 Jul 2020 Yunzhen Feng, Runtian Zhai, Di He, Li-Wei Wang, Bin Dong

Our experiments show that TD can provide fine-grained information for varied downstream tasks, and for the models trained from different initializations, the learned features are not the same in terms of downstream-task predictions.

Efficient Reinforcement Learning in Factored MDPs with Application to Constrained RL

no code implementations ICLR 2021 Xiaoyu Chen, Jiachen Hu, Lihong Li, Li-Wei Wang

The regret of FMDP-BF is shown to be exponentially smaller than that of optimal algorithms designed for non-factored MDPs, and improves on the best previous result for FMDPs~\citep{osband2014near} by a factor of $\sqrt{H|\mathcal{S}_i|}$, where $|\mathcal{S}_i|$ is the cardinality of the factored state subspace and $H$ is the planning horizon.

reinforcement-learning Reinforcement Learning (RL)

GraphNorm: A Principled Approach to Accelerating Graph Neural Network Training

1 code implementation 7 Sep 2020 Tianle Cai, Shengjie Luo, Keyulu Xu, Di He, Tie-Yan Liu, Li-Wei Wang

We provide an explanation by showing that InstanceNorm serves as a preconditioner for GNNs, but such preconditioning effect is weaker with BatchNorm due to the heavy batch noise in graph datasets.

Graph Classification Graph Representation Learning

Sanity-Checking Pruning Methods: Random Tickets can Win the Jackpot

1 code implementation NeurIPS 2020 Jingtong Su, Yihang Chen, Tianle Cai, Tianhao Wu, Ruiqi Gao, Li-Wei Wang, Jason D. Lee

In this paper, we conduct sanity checks for the above beliefs on several recent unstructured pruning methods and surprisingly find that: (1) A set of methods which aims to find good subnetworks of the randomly-initialized network (which we call "initial tickets"), hardly exploits any information from the training data; (2) For the pruned networks obtained by these methods, randomly changing the preserved weights in each layer, while keeping the total number of preserved weights unchanged per layer, does not affect the final performance.

Network Pruning
