The new ideas in the current paper are: (a) new variants of mixup with negative as well as positive coefficients, and an extension of sample-wise mixup to pixel-wise mixup.
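As a rough illustration of the difference between sample-wise and pixel-wise mixup with signed coefficients, the sketch below blends two flattened images. It is not the paper's implementation: the function names, the coefficient range, and the convention that the two coefficients sum to 1 are assumptions made for this example.

```python
import random


def samplewise_mixup(x1, x2, lam):
    """Blend two flattened images with one scalar coefficient.

    Assumption: the coefficients are lam and (1 - lam), so a value of lam
    outside [0, 1] makes one of the two coefficients negative.
    """
    return [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]


def pixelwise_mixup(x1, x2, lams):
    """Blend two flattened images with a separate coefficient per pixel."""
    return [l * a + (1.0 - l) * b for a, b, l in zip(x1, x2, lams)]


def sample_coefficients(n, low=-0.5, high=1.5, seed=0):
    """Draw n coefficients from a hypothetical range that allows negatives."""
    rng = random.Random(seed)
    return [rng.uniform(low, high) for _ in range(n)]


if __name__ == "__main__":
    img_a = [1.0, 2.0]
    img_b = [3.0, 4.0]
    print(samplewise_mixup(img_a, img_b, -0.5))
    print(pixelwise_mixup(img_a, img_b, sample_coefficients(len(img_a))))
```

In the sample-wise case every pixel shares one coefficient; in the pixel-wise case each pixel gets its own, which is the only difference between the two functions.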
The MAFENN framework and algorithm are designed to enhance the learning capability of feedforward DL networks and their variants through simple data feedback.
In this paper, we present a novel attack method, FILM, for federated learning of language models; for the first time, we show the feasibility of recovering text from large batches of up to 128 sentences.
We study few-shot debugging of transformer-based natural language understanding models, using recently popularized test suites to not just diagnose but correct a problem.
The downlink channel covariance matrix (CCM) acquisition is the key step for the practical performance of massive multiple-input and multiple-output (MIMO) systems, including beamforming, channel tracking, and user scheduling.
Based on the identified latent directions of attributes, we propose Compositional Attribute Adjustment to adjust the latent code, resulting in better compositionality of image synthesis.
Federated learning (FL) has been increasingly considered to preserve data training privacy from eavesdropping attacks in mobile edge computing-based Internet of Things (EdgeIoT).
Gradient inversion attack (or input recovery from gradient) is an emerging threat to the security and privacy preservation of federated learning, whereby malicious eavesdroppers or participants in the protocol can (partially) recover the clients' private data.
The deep policy gradient method has demonstrated promising results in many large-scale games, where the agent learns purely from its own experience.
In this paper, we present a lightweight encoder-decoder architecture, CarNet, for efficient and high-quality crack detection.
In scientific literature, XANES/Raman data are usually plotted in line graphs, which is a visually appropriate way to represent the information when the end-user is a human reader.
Numerous detection problems in computer vision, including road crack detection, suffer from extreme foreground-background imbalance.
They also fail to sense the entire space of the input, which is critical for high-quality MR image SR. To address those problems, we propose squeeze and excitation reasoning attention networks (SERAN) for accurate MR image SR. We propose to squeeze attention from global spatial information of the input and obtain global descriptors.
Ranked #2 on Super-Resolution on IXI
In addition, we use a novel agent network named Population Invariant agent with Transformer (PIT) to realize the coordination transfer in more varieties of scenarios.
In this paper, we propose a cooperative MARL method with sequential credit assignment (SeCA) that deduces each agent's contribution to the team's success one by one to learn better cooperation.
This paper studies Semi-Supervised Domain Adaptation (SSDA), a practical yet under-investigated research topic that aims to learn a model of good performance using unlabeled samples and a few labeled samples in the target domain, with the help of labeled samples from a source domain.
Contrasted with prior work, this paper provides a complementary solution to align domains by learning the same auxiliary tasks in both domains simultaneously.
By utilizing this dataset, we propose an object-detection based image retrieval framework that models the UI context and hierarchical structure.
With the goal of tuning up the brightness, low-light image enhancement enjoys numerous applications, such as surveillance, remote sensing and computational photography.
This paper proposes a neural network for multi-level low-light image enhancement, which lets users meet various requirements by selecting different images as the brightness reference.
In this paper, we propose to enhance the temporal coherence by Consistency-Regularized Graph Neural Networks (CRGNN) with the aid of a synthesized video matting dataset.
Unsupervised domain adaptation (UDA) aims to make predictions for unlabeled data in a target domain given labeled data from a source domain.
Owing to the unremitting efforts of a few institutes, significant progress has recently been made in designing superhuman AIs in No-limit Texas Hold'em (NLTH), the primary testbed for large-scale imperfect-information game research.
To obtain a high-performance vehicle ReID model, we present a novel Distance Shrinking with Angular Marginalizing (DSAM) loss function to perform hybrid learning in both the Original Feature Space (OFS) and the Feature Angular Space (FAS) using the local verification and the global identification information.
1 code implementation • 10 Nov 2020 • Andrey Ignatov, Radu Timofte, Zhilu Zhang, Ming Liu, Haolin Wang, WangMeng Zuo, Jiawei Zhang, Ruimao Zhang, Zhanglin Peng, Sijie Ren, Linhui Dai, Xiaohong Liu, Chengqi Li, Jun Chen, Yuichi Ito, Bhavya Vasudeva, Puneesh Deora, Umapada Pal, Zhenyu Guo, Yu Zhu, Tian Liang, Chenghua Li, Cong Leng, Zhihong Pan, Baopu Li, Byung-Hoon Kim, Joonyoung Song, Jong Chul Ye, JaeHyun Baek, Magauiya Zhussip, Yeskendir Koishekenov, Hwechul Cho Ye, Xin Liu, Xueying Hu, Jun Jiang, Jinwei Gu, Kai Li, Pengliang Tan, Bingxin Hou
This paper reviews the second AIM learned ISP challenge and provides the description of the proposed solutions and results.
In this paper, we address this limitation with an efficient learning objective that considers the discriminative feature distributions between the visual objects and sentence words.
There is a fundamental trade-off between the channel representation resolution of codebooks and the overheads of feedback communications in the fifth generation new radio (5G NR) frequency division duplex (FDD) massive multiple-input and multiple-output (MIMO) systems.
To address the issue that deep neural networks (DNNs) are vulnerable to model inversion attacks, we design an objective function, which adjusts the separability of the hidden data representations, as a way to control the trade-off between data utility and vulnerability to inversion attacks.
In addition, TextHide fits well with the popular framework of fine-tuning pre-trained language models (e.g., BERT) for any sentence or sentence-pair task.
This paper introduces InstaHide, a simple encryption of training images, which can be plugged into existing distributed deep learning pipelines.
This paper presents a learning-based approach to synthesize the view from an arbitrary camera position given a sparse set of images.
We establish a benchmark suite consisting of different types of PDF document datasets that can be utilized for cross-domain DOD model training and evaluation.
The recent flourishing of deep learning across various tasks is largely credited to rich and accessible labeled data.
This paper attempts to answer the question of whether neural network pruning can be used as a tool to achieve differential privacy without losing much data utility.
This paper presents a new approach for caching in CDNs that uses machine learning to approximate the Belady MIN algorithm.
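For context, Belady's MIN is the offline eviction rule that, when the cache is full and a miss occurs, evicts the object whose next request lies farthest in the future. The sketch below counts misses under that rule. It makes simplifying assumptions that are not taken from the paper: all objects have unit size, every missed object is admitted, and the full request trace is known in advance. The function name is hypothetical.

```python
def belady_min_misses(requests, capacity):
    """Count cache misses under Belady's MIN for unit-size objects."""
    n = len(requests)

    # next_use[i] is the index of the next request for requests[i],
    # or infinity if the object is never requested again.
    next_use = [float("inf")] * n
    upcoming = {}
    for i in range(n - 1, -1, -1):
        next_use[i] = upcoming.get(requests[i], float("inf"))
        upcoming[requests[i]] = i

    cache = {}  # cached object -> index of its next request
    misses = 0
    for i, obj in enumerate(requests):
        if obj in cache:
            cache[obj] = next_use[i]  # hit: refresh the next-use index
            continue
        misses += 1
        if capacity <= 0:
            continue  # nothing can be cached
        if len(cache) >= capacity:
            # Evict the object requested farthest in the future.
            victim = max(cache, key=cache.get)
            del cache[victim]
        cache[obj] = next_use[i]
    return misses


if __name__ == "__main__":
    trace = ["a", "b", "c", "a", "b", "d", "a", "b"]
    print(belady_min_misses(trace, capacity=2))
```

Because the rule needs future requests, it cannot run online as written; it serves only as the oracle that a learned policy would try to approximate.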
We instead reformulate ZSL as a conditioned visual classification problem, i.e., classifying visual features based on the classifiers learned from the semantic descriptions.
It outperforms the current best method by a relative 6.8% for image retrieval and a relative 4.8% for caption retrieval on MS-COCO (Recall@1 using 1K test set).
Ranked #7 on Image Retrieval on Flickr30K 1K test
A key challenge is online MPT and data collection in the presence of on-board control of a UAV (e.g., patrolling velocity) for preventing battery drainage and data queue overflow of the sensing devices, while up-to-date knowledge on battery level and data queue of the devices is not available at the UAV.
However, capturing the short-term effects of drugs and therapeutic interventions on patient physiological state remains challenging.
To circumvent the limited enclave memory (128 MB with the latest Intel CPUs), we propose to place the memory buffer of the eLSM store outside the enclave and protect the buffer using a new authenticated data structure by digesting individual LSM-tree levels.
Cryptography and Security • Databases • Distributed, Parallel, and Cluster Computing • Data Structures and Algorithms
Face anti-spoofing has recently gained increased attention in both academic and industrial fields.
To address this issue, we design local and non-local attention blocks to extract features that capture the long-range dependencies between pixels and pay more attention to the challenging parts.
Convolutional nets have been shown to achieve state-of-the-art accuracy in many biomedical image analysis tasks.
This paper presents a very simple but efficient algorithm for 3D line segment detection from large-scale unorganized point clouds.
To capture the graph dynamics, we use the graph prediction stream to predict the dynamic graph structures, and the predicted structures are fed into the flow prediction stream.
To ensure scalability and separability, a softmax-like function is formulated to push apart the positive and negative support sets.
To solve these problems, we propose the very deep residual channel attention networks (RCAN).
Ranked #11 on Image Super-Resolution on BSD100 - 4x upscaling
First, a novel cost-sensitive multi-task loss function is designed to learn transferable aging features by training on the source population.
However, most existing deep hashing methods directly learn the hash functions by encoding the global semantic information, while ignoring the local spatial information of images.
Gaussian processes (GPs), or distributions over arbitrary functions in a continuous domain, can be generalized to the multi-output case: a linear model of coregionalization (LMC) is one approach.
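In one common formulation of the LMC, each output is a linear combination of independent latent GPs, so the covariance between output i at x and output j at x' is the sum over latent processes q of A[i][q] * A[j][q] * k_q(x, x'). The sketch below evaluates that expression for scalar inputs. The mixing matrix `A`, the RBF kernels, and the lengthscale values are illustrative assumptions, and the sketch covers only the case of one latent process per kernel.

```python
import math


def rbf(x, y, lengthscale):
    """Squared-exponential kernel for scalar inputs."""
    return math.exp(-0.5 * ((x - y) / lengthscale) ** 2)


def lmc_cov(i, j, x, y, A, lengthscales):
    """Covariance between output i at x and output j at y under the LMC.

    A[i][q] is the (assumed) weight of latent process q in output i, and
    lengthscales[q] parameterizes that latent process's RBF kernel.
    """
    return sum(
        A[i][q] * A[j][q] * rbf(x, y, lengthscales[q])
        for q in range(len(lengthscales))
    )


if __name__ == "__main__":
    A = [[1.0, 0.0], [0.5, 2.0]]  # two outputs, two latent processes
    lengthscales = [1.0, 2.0]
    print(lmc_cov(0, 1, 0.3, 0.9, A, lengthscales))
```

Outputs that share a latent process with nonzero weights are correlated; with a diagonal `A` the outputs would be independent GPs.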
In the scenario of real-time monitoring of hospital patients, high-quality inference of patients' health status using all information available from clinical covariates and lab tests is essential to enable successful medical interventions and improve patient outcomes.
We detect the topological properties of Chern insulators with strong Coulomb interactions using cluster perturbation theory and the variational cluster approach.
Strongly Correlated Electrons
We argue that the problem lies in the mix-up of two interpretations of the extensive form game structures: game rules or game runs, which do not always coincide.
With the prevalence of commodity depth cameras, the new paradigm of user interfaces based on 3D motion capture and recognition has dramatically changed the way humans and computers interact.
Fisheye image rectification and estimation of intrinsic parameters for real scenes have been addressed in the literature by using line information on the distorted images.
In this work, we propose a novel hash learning framework that encodes features' rank orders instead of numeric values in a number of optimal low-dimensional ranking subspaces.