no code implementations • 14 Oct 2021 • Chu Han, Jiatai Lin, Jinhai Mai, Yi Wang, Qingling Zhang, Bingchao Zhao, Xin Chen, Xipeng Pan, Zhenwei Shi, Xiaowei Xu, Su Yao, Lixu Yan, Huan Lin, Zeyan Xu, Xiaomei Huang, Guoqiang Han, Changhong Liang, Zaiyi Liu
In the segmentation phase, we achieved tissue semantic segmentation by our proposed Multi-Layer Pseudo-Supervision.
However, this option traditionally hurts detection performance considerably.
To synthesize a realistic action sequence based on a single human image, it is crucial to model both motion patterns and diversity in the action video.
We introduce a new image segmentation task, termed Entity Segmentation (ES), which aims to segment all visual entities in an image without considering semantic category labels.
In this paper, we propose a weakly supervised Part-Mentored Attention Network (PMANet) composed of a Part Attention Network (PANet) for vehicle part localization with self-attention and a Part-Mentored Network (PMNet) for mentoring the global and local feature aggregation.
Scenario generation is a fundamental and crucial tool for decision-making in power systems with high-penetration renewables.
Non-intrusive load monitoring (NILM) helps disaggregate a household's main electricity consumption into the energy usage of individual appliances, thus greatly cutting down the cost of fine-grained household load monitoring.
Non-intrusive load monitoring (NILM) is a well-known single-channel blind source separation problem that aims to decompose the household energy consumption into itemised energy usage of individual appliances.
The CenterTrack tracking algorithm achieves state-of-the-art tracking performance using a simple detection model and single-frame spatial offsets to localize objects and predict their associations in a single network.
In this paper, an end-to-end deep reinforcement learning framework is proposed to solve this type of combinatorial optimization problem.
Specifically, all the feature embeddings of query and gallery images are expanded and enhanced by a linear combination of their neighbors, with the correlation prediction serving as discriminative combination weights.
An approach to reduce motion artifacts in Quantitative Susceptibility Mapping using deep learning is proposed.
This paper presents a novel patch-based adversarial attack pipeline that trains adversarial patches on 3D human meshes.
Quantitative imaging in MRI usually involves acquisition and reconstruction of a series of images at multiple echo time points, which may require more scan time and specific reconstruction techniques compared to conventional qualitative imaging.
For $C^1$-smooth strongly monotone discrete-time dynamical systems, it is shown that "convergence to linearly stable cycles" is a prevalent asymptotic behavior in the measure-theoretic sense.
We introduce the Neural Representation of Distribution (NeRD) technique, a module for convolutional neural networks (CNNs) that can estimate the feature distribution by optimizing an underlying function mapping image coordinates to the feature distribution.
This survey provides a detailed review of recent progress in single-image super-resolution from the perspective of deep learning, while also covering the initial classical methods used for image super-resolution.
We formulate knowledge distillation as a multi-task learning problem so that the teacher transfers knowledge to the student only if the student can benefit from learning such knowledge.
This paper showcases the system on the segmentation analysis using an electricity consumption data set and validates the effectiveness of the system.
In contrast to the previous methods, RANet configures the information pathways between the pixels in different regions, enabling the region interaction to exchange the regional context for enhancing all of the pixels in the image.
Our study illustrates the outstanding design of ALPR with four insights: (1) the resampling-based cascaded framework is beneficial to both speed and accuracy; (2) highly efficient license plate recognition should abandon additional character segmentation and recurrent neural networks (RNNs), and instead adopt a plain convolutional neural network (CNN); (3) in the case of CNN, taking advantage of vertex information on license plates improves the recognition performance; and (4) the weight-sharing character classifier addresses the lack of training images in small-scale datasets.
In this report, we describe the submission of the Tongji University undergraduate team to the CLOSE track of the VoxCeleb Speaker Recognition Challenge (VoxSRC) 2020 at Interspeech 2020.
Our core idea is straightforward: A diverse ensemble of low-precision, high-recall models is likely to make different false positive errors (classifying background as foreground in different parts of the image), but the true positives will tend to be consistent.
In this paper, we seek to quantify the bias in terms of the impact that different levels of motion artifacts have on the performance of neural networks engaged in a lesion segmentation task.
Multiple sclerosis (MS) lesions occupy a small fraction of the brain volume and are heterogeneous with regard to shape, size and location, which poses a great challenge for training deep learning based segmentation models.
Recently, 3D medical image reconstruction (MIR) and segmentation (MIS) based on deep neural networks have been developed with promising results, and attention mechanisms have been further designed to capture global contextual information for performance enhancement.
A learning-based posterior distribution estimation method, Probabilistic Dipole Inversion (PDI), is proposed to solve the quantitative susceptibility mapping (QSM) inverse problem in MRI with uncertainty estimation.
The efficacy of our network is verified on a collected dataset of 418 patients with 145 benign tumors and 273 malignant tumors.
The previously established LOUPE (Learning-based Optimization of the Under-sampling Pattern) framework for optimizing the k-space sampling pattern in MRI was extended in three ways: firstly, fully sampled multi-coil k-space data from the scanner, rather than simulated k-space data from magnitude MR images in LOUPE, was retrospectively under-sampled to optimize the under-sampling pattern of in-vivo k-space data; secondly, binary stochastic k-space sampling, rather than approximate stochastic k-space sampling of LOUPE during training, was applied together with a straight-through (ST) estimator to estimate the gradient of the threshold operation in a neural network; thirdly, a modified unrolled optimization network, rather than the modified U-Net in LOUPE, was used as the reconstruction network in order to reconstruct multi-coil data properly and reduce the dependency on training data.
In this paper, we propose a novel self-training approach named Crowd-SDNet that enables a typical object detector trained only with point-level annotations (i.e., objects are labeled with points) to estimate both the center points and sizes of crowded objects.
We formulate a joint optimization problem of UAV deployment, caching placement and user association for maximizing QoE of users, which is evaluated by mean opinion score (MOS).
Contrary to the traditional adversarial patch, this new form of attack is mapped into the 3D object world and back-propagates to the 2D image domain through differentiable rendering.
In this paper, we propose the first fully-automated solution to segment the whole fetal head in US volumes.
However, the existing networks usually suffer from either redundancy of convolutional layers or insufficient utilization of parameters.
Traditional convolution-based generative adversarial networks synthesize images based on hierarchical local operations, where long-range dependency relation is implicitly modeled with a Markov chain.
Efficiently synthesizing realistic, editable and high resolution US images can solve the problems.
In this paper, we explore the mask representation in instance segmentation with Point-of-Interest (PoI) features.
Brain lesion volume measured on T2 weighted MRI images is a clinically important disease marker in multiple sclerosis (MS).
Previous database systems extended their SQL dialect to support ML.
(i) This is the first work on 3D pose estimation of the fetus in the literature.
That is, the regularization strength is fixed to a predefined schedule, and manual adjustments are required to adapt to various network architectures.
In this paper, we present a unified framework to integrate icorpp's reasoning and planning components.
Our attention module utilizes the attention mechanism to selectively leverage the multilevel features integrated from different layers to refine the features at each individual layer, suppressing the non-prostate noise at shallow layers of the CNN and adding more prostate details to features at deep layers.
Training neural networks with back-propagation (BP) requires a sequential passing of activations and gradients, which forces the network modules to work in a synchronous fashion.
This paper studies the fundamental problem of extrapolating visual context using deep generative models, i.e., extending image borders with plausible structure and details.
Alternatively, the semantics of pBC+ can also be defined in terms of Markov Decision Process (MDP), which in turn allows for representing MDP in a succinct and elaboration-tolerant way as well as leveraging an MDP solver to compute pBC+.
In this paper, we propose a fully-automated framework to segment the left atrium in gadolinium-enhanced MR volumes.
This paper is devoted to investigating the methodological characteristics and performance of representative global and local scalable GPs including sparse approximations and local aggregations from four main perspectives: scalability, capability, controllability and robustness.
Learning in LPMLN is in accordance with the stable model semantics, so it learns parameters for probabilistic extensions of knowledge-rich domains where answer set programming has been shown to be useful but limited to the deterministic case, such as reachability analysis and reasoning about actions in dynamic domains.
Like the conditional randomization test of Candès et al. (2018), our test relies on the availability of an approximation to the distribution of $X \mid Z$.
Methodology Statistics Theory
In order to scale standard Gaussian process (GP) regression to large-scale datasets, aggregation models employ a factorized training process and then combine predictions from distributed experts.
In single image deblurring, the "coarse-to-fine" scheme, i.e., gradually restoring the sharp image at different resolutions in a pyramid, is very successful in both traditional optimization-based methods and recent neural-network-based approaches.
Ranked #2 on Deblurring on RealBlur-R
The proposed method integrates subspace learning, transformed IGO reconstruction and image alignment into a unified online framework, which is robust for aligning images with severe intensity distortions.
Additionally, our approach is general and can be extended to other medical image segmentation tasks, where boundary incompleteness is one of the main challenges.
Markov Logic Networks (MLN) and Probabilistic Soft Logic (PSL) are widely applied formalisms in Statistical Relational Learning, an emerging area in Artificial Intelligence that is concerned with combining logical and statistical AI.
This paper considers the two-stage capacitated facility location problem (TSCFLP) in which products manufactured in plants are delivered to customers via storage depots.
25 code implementations • 8 Dec 2015 • Dario Amodei, Rishita Anubhai, Eric Battenberg, Carl Case, Jared Casper, Bryan Catanzaro, Jingdong Chen, Mike Chrzanowski, Adam Coates, Greg Diamos, Erich Elsen, Jesse Engel, Linxi Fan, Christopher Fougner, Tony Han, Awni Hannun, Billy Jun, Patrick LeGresley, Libby Lin, Sharan Narang, Andrew Ng, Sherjil Ozair, Ryan Prenger, Jonathan Raiman, Sanjeev Satheesh, David Seetapun, Shubho Sengupta, Yi Wang, Zhiqian Wang, Chong Wang, Bo Xiao, Dani Yogatama, Jun Zhan, Zhenyao Zhu
We show that an end-to-end deep learning approach can be used to recognize either English or Mandarin Chinese speech--two vastly different languages.
Ranked #1 on Noisy Speech Recognition on CHiME clean
The nearest subspace classifier (NSS) estimates the underlying subspace within each class and assigns each data point to the class that corresponds to its nearest subspace.
In this paper, a novel pattern classification approach is proposed by regularizing the classifier learning to maximize mutual information between the classification response and the true class label.
The user's mental state has gradually become a concern in the course of human-robot interaction.
Many prevalent multi-class classification approaches can be unified and generalized by the output coding framework which usually consists of three phases: (1) coding, (2) learning binary classifiers, and (3) decoding.
Current music recommender systems typically act in a greedy fashion by recommending songs with the highest user ratings.
Due to its NP-hard nature, a hybrid discrete cuckoo search algorithm is proposed to solve this problem.
We introduce the MathGR package, written in Mathematica.
Mathematical Software Cosmology and Nongalactic Astrophysics General Relativity and Quantum Cosmology High Energy Physics - Theory Computational Physics
Bayesian reinforcement learning (BRL) encodes prior knowledge of the world in a model and represents uncertainty in model parameters by maintaining a probability distribution over them.