Theoretically, we show that the small networks pruned using our method achieve provably lower loss than small networks of the same size trained from scratch.
The idea is to generate a set of augmented data with some random perturbations or transforms, and minimize the maximum, or worst-case, loss over the augmented data.
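As a rough illustration of the worst-case objective described above, the sketch below evaluates the maximum loss over a handful of randomly perturbed copies of one example. The linear model, the squared loss, the Gaussian perturbation, and all function names are assumptions chosen for brevity; they are not taken from the paper.

```python
import random

def squared_loss(w, x, y):
    """Squared error of a linear model w.x against target y."""
    pred = sum(wi * xi for wi, xi in zip(w, x))
    return (pred - y) ** 2

def augment(x, k, scale, rng):
    """Return k randomly perturbed copies of x (Gaussian noise is an assumed transform)."""
    return [[xi + rng.gauss(0.0, scale) for xi in x] for _ in range(k)]

def worst_case_loss(w, x, y, k=8, scale=0.1, seed=0):
    """Maximum loss over the k augmented copies of (x, y)."""
    rng = random.Random(seed)
    return max(squared_loss(w, c, y) for c in augment(x, k, scale, rng))

def mean_loss(w, x, y, k=8, scale=0.1, seed=0):
    """Average loss over the same augmented copies, for comparison."""
    rng = random.Random(seed)
    copies = augment(x, k, scale, rng)
    return sum(squared_loss(w, c, y) for c in copies) / k
```

Training would then minimize `worst_case_loss` rather than `mean_loss` with respect to `w`; the sketch only evaluates the objective and leaves the optimizer out.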
Motivated by the rising abundance of observational data with continuous treatments, we investigate the problem of estimating the average dose-response curve (ADRF).
The original contributions of this paper are summarized as follows: (1) We model the packet collision probability of broadcast or NACK transmission in VANET using combination theory and investigate the potential influence of the miss my packets (MMP) problem.
Networking and Internet Architecture
With the rising abundance of observational data with continuous treatments, we investigate the problem of estimating the average dose-response curve (ADRF).
Despite the great success of deep learning, recent works show that large deep neural networks are often highly redundant and can be significantly reduced in size.
To the best of our knowledge, this is the first work to provide in-depth analysis and discussion of applying pruning to online recommendation systems with non-stationary data distribution.
This is achieved by layerwise imitation, that is, forcing the thin network to mimic the intermediate outputs of the wide network from layer to layer.
For security reasons, it is critically important to develop models with certified robustness that can provably guarantee that the prediction cannot be altered by any possible synonymous word substitution.
Our main contributions are a novel feature selection approach, which uses multi-step transition probability to characterize the data structure, and three algorithms, proposed from the positive and negative aspects, for preserving the data structure.
With this reconstructor, we can construct prototypes for the original features using class prototypes and domain prototypes correspondingly.
Recently, Liu et al. proposed a splitting steepest descent (S2D) method that jointly optimizes the neural parameters and architectures based on progressively growing network structures by splitting neurons into multiple copies in a steepest descent fashion.
Taking them together, we formulate a novel Distribution-Aware coordinate Representation for Keypoint (DARK) method.
This differs from the existing methods based on backward elimination, which remove redundant neurons from the large network.
Randomized classifiers have been shown to provide a promising approach for achieving certified robustness against adversarial attacks in deep learning.
The idea is to generate a set of augmented data with some random perturbations or transforms and minimize the maximum, or worst-case, loss over the augmented data.
Ranked #52 on Image Classification on ImageNet
We propose multipoint quantization, a quantization method that approximates a full-precision weight vector using a linear combination of multiple vectors of low-bit numbers; this is in contrast to typical quantization methods that approximate each weight using a single low precision number.
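To make the idea of a linear combination of low-bit vectors concrete, here is a minimal sketch that greedily fits 1-bit vectors to the remaining residual. The greedy residual scheme, the sign quantizer, and the function names are assumptions for illustration and may differ from the procedure the authors use.

```python
def quantize_sign(v):
    """Map each entry to a 1-bit value in {-1, +1}."""
    return [1.0 if vi >= 0 else -1.0 for vi in v]

def multipoint_approx(w, n_points):
    """Greedily approximate w by sum_i a_i * q_i with each q_i in {-1, +1}^d."""
    residual = list(w)
    terms = []
    for _ in range(n_points):
        q = quantize_sign(residual)
        # Least-squares optimal scale for this q: a = <r, q> / <q, q>, and <q, q> = d.
        a = sum(r * qi for r, qi in zip(residual, q)) / len(q)
        terms.append((a, q))
        residual = [r - a * qi for r, qi in zip(residual, q)]
    return terms, residual

def reconstruct(terms, dim):
    """Sum the scaled low-bit vectors back into a full-precision vector."""
    out = [0.0] * dim
    for a, q in terms:
        out = [o + a * qi for o, qi in zip(out, q)]
    return out

def sq_norm(v):
    """Squared Euclidean norm."""
    return sum(vi * vi for vi in v)
```

With `n_points=1` this reduces to the usual single-point case of one scaled sign vector. Because each scale is the least-squares optimum for its vector, adding points cannot increase the squared approximation error in this sketch.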
Stochastic gradient Markov chain Monte Carlo (MCMC) algorithms have received much attention in Bayesian computing for big data problems, but they are only applicable to a small class of problems for which the parameter space has a fixed dimension and the log-posterior density is differentiable with respect to the parameters.
Interestingly, we found that the process of decoding the predicted heatmaps into the final joint coordinates in the original image space is surprisingly significant for human pose estimation performance, yet this had not been recognised before.
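For readers unfamiliar with the decoding step, the sketch below shows a plain argmax decoding: take the location of the heatmap maximum and rescale it to the original image resolution. This is a common baseline and is only an assumed illustration; it is not the refined decoding the abstract argues for, and the function name and the uniform rescaling are hypothetical.

```python
def decode_heatmap(heatmap, image_width, image_height):
    """Return (x, y) of the heatmap maximum, rescaled to image pixel coordinates."""
    rows, cols = len(heatmap), len(heatmap[0])
    best_row, best_col = 0, 0
    for r in range(rows):
        for c in range(cols):
            if heatmap[r][c] > heatmap[best_row][best_col]:
                best_row, best_col = r, c
    # Heatmaps are usually smaller than the image, so scale each axis back up.
    x = best_col * image_width / cols
    y = best_row * image_height / rows
    return x, y
```

Because the maximum is located on the coarse heatmap grid, the recovered coordinate can only take a limited set of values, which is one reason the choice of decoding scheme can matter.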
Ranked #1 on Multi-Person Pose Estimation on COCO (using extra training data)
no code implementations • 3 May 2019 • Xiong Deng, Chao Chen, Deyang Chen, Xiangbin Cai, Xiaozhe Yin, Chao Xu, Fei Sun, Caiwen Li, Yan Li, Han Xu, Mao Ye, Guo Tian, Zhen Fan, Zhipeng Hou, Minghui Qin, Yu Chen, Zhenlin Luo, Xubing Lu, Guofu Zhou, Lang Chen, Ning Wang, Ye Zhu, Xingsen Gao, Jun-Ming Liu
The limitation of commercially available single-crystal substrates and the lack of continuous strain tunability preclude the ability to take full advantage of strain engineering for further exploring novel properties and exhaustively studying fundamental physics in complex oxides.
We propose a variable selection method for high dimensional regression models, which allows for complex, nonlinear, and high-order interactions among variables.
We present visual-analytics methods to reveal and analyze this hierarchy of similar classes in relation to CNN-internal data.
We present a novel sensor fusion algorithm that first segments the depth map into different categories such as opaque/transparent/infinity (e.g., too far to measure) and then updates the depth map based on the segmentation outcome.
In this paper we present a novel real-time algorithm for simultaneous pose and shape estimation for articulated objects, such as human beings and animals.
With the widespread adoption of consumer 3D-TV technology, stereoscopic videoconferencing systems are emerging.