In contrast to the abundant research focusing on large-scale models, the progress in lightweight semantic segmentation appears to be advancing at a comparatively slower pace.
This paper proposes a novel paradigm for facial privacy protection that unifies multiple characteristics including anonymity, diversity, reversibility and security within a single lightweight framework.
Adversarial examples (AE) with good transferability enable practical black-box attacks on diverse target models, where insider knowledge about the target models is not required.
Experimental results on Stanford2D3D Panoramic datasets show that SGAT4PASS significantly improves performance and robustness, with approximately a 2% increase in mIoU, and when small 3D disturbances occur in the data, the stability of our performance is improved by an order of magnitude.
Ranked #4 on Semantic Segmentation on Stanford2D3D Panoramic
Then we derive two specific attention modules, named SpatialFormer Semantic Attention (SFSA) and SpatialFormer Target Attention (SFTA), to enhance the target object regions while reduce the background distraction.
Instead of sampling positive and negative links and heuristically computing the features of their enclosing subgraphs, GraphLP utilizes the feature learning ability of deep-learning models to automatically extract the structural patterns of graphs for link prediction under the assumption that real-world graphs are not locally isolated.
To assist form designers, in this work we present FormLM to model online forms (by enhancing pre-trained language model with form structural information) and recommend form creation ideas (including question / options recommendations and block type suggestion).
To begin with, we investigate the efficacy of transfer learning in IIR, by comparing off-the-shelf features learned by a pre-trained deep neural network (DNN) classifier with features learned by a CL model.
In particular, the stuff layouts can take amorphous shapes and fill up the missing regions left out by the instance layouts.
Ranked #2 on Layout-to-Image Generation on Visual Genome 128x128
There has been an increasing interest in deploying non-line-of-sight (NLOS) imaging systems for recovering objects behind an obstacle.
In the focus search strategy, if there is no knowledge source benefit the optimization of a task, then all knowledge sources in the task's pool are forbidden to be utilized except the task, which helps to improve the performance of the proposed algorithm.
Instead, from a perspective on temporal grounding as a metric-learning problem, we present a Mutual Matching Network (MMN), to directly model the similarity between language queries and video moments in a joint embedding space.
Ranked #3 on Temporal Sentence Grounding on Charades-STA
Recently, person re-identification (Re-ID) has achieved great progress.
Ranked #3 on Person Re-Identification on PRCC
The purpose of this work is to highlight the content of the Microsoft Recommenders repository and show how it can be used to reduce the time involved in developing recommender systems.
no code implementations • 7 Aug 2020 • Tao Wu, Ellie Ka-In Chio, Heng-Tze Cheng, Yu Du, Steffen Rendle, Dima Kuzmin, Ritesh Agarwal, Li Zhang, John Anderson, Sarvjeet Singh, Tushar Chandra, Ed H. Chi, Wen Li, Ankit Kumar, Xiang Ma, Alex Soares, Nitin Jindal, Pei Cao
In light of these problems, we observed that most online content platforms have both a search and a recommender system that, while having heterogeneous input spaces, can be connected through their common output item space and a shared semantic representation.
3D moving object detection is one of the most critical tasks in dynamic scene analysis.
Structured convex optimization on weighted graphs finds numerous applications in machine learning and computer vision.
Users issue queries to Search Engines, and try to find the desired information in the results produced.
We tackle the challenge of disentangled representation learning in generative adversarial networks (GANs) from the perspective of regularized optimal transport (OT).
Photometric stereo (PS) techniques nowadays remain constrained to an ideal laboratory setup where modeling and calibration of lighting is amenable.
In this work, we consider nonconvex composite problems that involve inf-convolution with a Legendre function, which gives rise to an anisotropic generalization of the proximal mapping and Moreau-envelope.
Probabilistic graphical models are traditionally known for their successes in generative modeling.
Dense pixelwise prediction such as semantic segmentation is an up-to-date challenge for deep convolutional neural networks (CNNs).
In addition, the model is capable of measuring the importance of microscopic network elements, i. e., nodes and links, in terms of network regularity thereby allowing us to regulate the reconstructability of networks based on them.
We present a novel preconditioning technique for proximal optimization methods that relies on graph algorithms to construct effective preconditioners.
The second one directly recovers the depth, by formulating photometric stereo as a system of PDEs which are partially linearized using image ratios.
This paper tackles the photometric stereo problem in the presence of inaccurate lighting, obtained either by calibration or by an uncalibrated photometric stereo method.
Spectral clustering and co-clustering are well-known techniques in data analysis, and recent work has extended spectral clustering to square, symmetric tensors and hypermatrices derived from a network.
Previous studies have shown that the pure Price Of Anarchy (POA) of GSP is 1. 25 when there are two ad slots and 1. 259 when three ad slots.
Computer Science and Game Theory