However, existing methods fail to simultaneously address the three main challenges in mmWave radar point cloud reconstruction: loss of specular information, low angular resolution, and strong interference and noise.
Recent advances in attention-based multiple instance learning (MIL) have improved our insights into the tissue regions that models rely on to make predictions in digital pathology.
This paper introduces a new formulation for risk-sensitive MDPs, which assesses risk in a slightly different manner compared to the classical Markov risk measure (Ruszczyński 2010), and establishes its equivalence with a class of regularized robust MDP (RMDP) problems, including the standard RMDP as a special case.
Risk and uncertainty at each stage of the CLSC have greatly increased the complexity and reduced the process efficiency of closed-loop networks, impeding the sustainable and resilient development of industries and the circular economy.
On CIFAR-10, we obtain an FID of 2.80 by sampling in 15 steps under one-session training and the new state-of-the-art FID of 3.37 by sampling in one step with additional training.
High-Dimensional and Incomplete matrices, which usually contain a large amount of valuable latent information, can be well represented by a Latent Factor Analysis model.
Since their binarization processes are not a component of the network, the learning-based binary descriptor cannot fully utilize the advances of deep learning.
Unlike existing methods that predict human poses from RF signals directly at the signal level, we consider the structural difference between the RF signals and the human poses, propose to transform the RF signals to the pose domain at the feature level based on Optimal Transport (OT) theory, and generate human poses from the transformed features.
In this paper, we propose a Deep Hierarchical Optimal Transport method (DeepHOT) for unsupervised domain adaptation.
As pointed out by previous works, this two-step procedure results in low discriminating power, as 1-WL-GNNs by nature learn node-level representations instead of link-level.
no code implementations • 25 May 2022 • Eduardo Pérez-Pellitero, Sibi Catley-Chandar, Richard Shaw, Aleš Leonardis, Radu Timofte, Zexin Zhang, Cen Liu, Yunbo Peng, Yue Lin, Gaocheng Yu, Jin Zhang, Zhe Ma, Hongbin Wang, Xiangyu Chen, Xintao Wang, Haiwei Wu, Lin Liu, Chao Dong, Jiantao Zhou, Qingsen Yan, Song Zhang, Weiye Chen, Yuhang Liu, Zhen Zhang, Yanning Zhang, Javen Qinfeng Shi, Dong Gong, Dan Zhu, Mengdi Sun, Guannan Chen, Yang Hu, Haowei Li, Baozhu Zou, Zhen Liu, Wenjie Lin, Ting Jiang, Chengzhi Jiang, Xinpeng Li, Mingyan Han, Haoqiang Fan, Jian Sun, Shuaicheng Liu, Juan Marín-Vega, Michael Sloth, Peter Schneider-Kamp, Richard Röttger, Chunyang Li, Long Bao, Gang He, Ziyao Xu, Li Xu, Gen Zhan, Ming Sun, Xing Wen, Junlin Li, Shuang Feng, Fei Lei, Rui Liu, Junxiang Ruan, Tianhong Dai, Wei Li, Zhan Lu, Hengyan Liu, Peian Huang, Guangyu Ren, Yonglin Luo, Chang Liu, Qiang Tu, Fangya Li, Ruipeng Gang, Chenghua Li, Jinjing Li, Sai Ma, Chenming Liu, Yizhen Cao, Steven Tel, Barthelemy Heyrman, Dominique Ginhac, Chul Lee, Gahyeon Kim, Seonghyun Park, An Gia Vien, Truong Thanh Nhat Mai, Howoon Yoon, Tu Vo, Alexander Holston, Sheir Zaheer, Chan Y. Park
The challenge is composed of two tracks with an emphasis on fidelity and complexity constraints: In Track 1, participants are asked to optimize objective fidelity scores while imposing a low-complexity constraint (i.e., solutions cannot exceed a given number of operations).
Because the stock market is highly volatile, research on and prediction of stock price changes can help investors avoid risk.
We also propose a Layer-sharing technique in the deep layer that can achieve better accuracy with less computational overhead.
In this work, we propose an AutoML technique, SapientML, that can learn from a corpus of existing datasets and their human-written pipelines, and efficiently generate a high-quality pipeline for a predictive task on a new dataset.
To overcome such limitations, in this paper, we propose to utilize radio signals, which can traverse obstacles and are unaffected by lighting conditions, to achieve silhouette segmentation.
In this paper, we propose a radio-assisted human detection framework by incorporating radio information into the state-of-the-art detection methods, including anchor-based one-stage detectors and two-stage detectors.
Next, a refinement block is introduced to enhance the visual tokens with self-attention and cross-attention.
Ranked #2 on Image Retrieval on RParis (Medium)
Experimental results in an exemplary environment show that our MARL approach is able to demonstrate the effectiveness and necessity of restrictions on individual liberty for collaborative supply of public goods.
To tackle this challenge, we propose an unsupervised domain adaptation framework for device-free gesture recognition by making effective use of the unlabeled target domain data.
To enhance the robustness of the system and reduce data collection effort, we design a data augmentation framework for mmWave signals based on correlations between signal patterns and gesture variations.
We propose EdgePipe, a distributed framework for edge systems that uses pipeline parallelism to both speed up inference and enable running larger (and more accurate) models that otherwise cannot fit on single edge devices.
Aiming to make GNN improvements practical, this paper proposes an approach called NeuroBack, which builds on two insights: (1) predicting phases (i.e., values) of variables appearing in the majority (or even all) of the satisfying assignments is essential for CDCL SAT solving, and (2) it is sufficient to query the neural model only once for the predictions before the SAT solving starts.
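The second insight (query once, then solve) can be illustrated with a toy solver. The sketch below is not NeuroBack: it swaps in a minimal DPLL search for a real CDCL solver, and the `phase_hint` dictionary is a hand-written stand-in for the neural model's one-time prediction. The function names `solve` and `satisfies` are hypothetical.

```python
def solve(clauses, phase_hint, assignment=None):
    """Tiny DPLL search (a stand-in for CDCL, chosen for brevity).

    clauses: list of clauses, each a list of nonzero ints (DIMACS-style literals).
    phase_hint: dict variable -> bool, computed once before solving; it only
    decides which value is tried first when branching on a variable.
    Returns a (possibly partial) satisfying assignment, or None if unsatisfiable.
    """
    assignment = dict(assignment or {})
    simplified = []
    for clause in clauses:
        remaining, satisfied = [], False
        for lit in clause:
            var, want = abs(lit), lit > 0
            if var in assignment:
                if assignment[var] == want:
                    satisfied = True
                    break
                # Literal is false under the assignment: drop it.
            else:
                remaining.append(lit)
        if satisfied:
            continue
        if not remaining:
            return None  # every literal is false: conflict
        simplified.append(remaining)
    if not simplified:
        return assignment  # all clauses satisfied
    # Unit propagation: a one-literal clause forces its variable.
    unit = next((c[0] for c in simplified if len(c) == 1), None)
    if unit is not None:
        assignment[abs(unit)] = unit > 0
        return solve(simplified, phase_hint, assignment)
    # Branch on a variable, trying the hinted phase first.
    var = abs(simplified[0][0])
    first = phase_hint.get(var, True)
    for value in (first, not first):
        trial = dict(assignment)
        trial[var] = value
        result = solve(simplified, phase_hint, trial)
        if result is not None:
            return result
    return None


def satisfies(clauses, model):
    """Check that every clause has at least one literal made true by model."""
    return all(any(model.get(abs(l)) == (l > 0) for l in c) for c in clauses)


# Hypothetical one-time prediction, made before the search starts.
hints = {1: False, 2: True, 3: False}
formula = [[1, 2], [-1, 3], [-2, -3]]
model = solve(formula, hints)
print(model, satisfies(formula, model))
```

A wrong hint costs extra search here but does not affect correctness, because the opposite phase is still tried on backtracking.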
Multi-agent reinforcement learning tasks put a high demand on the volume of training samples.
In this paper we propose a new method to stabilize the training process of the latent variables of adversarial auto-encoders, which we name Intervention Adversarial auto-encoder (IVAAE).
The main idea is to disentangle the latent space of a pre-trained generation model and precisely control the face attributes of child images with clear semantics.
To accommodate the variety of users' preferences, we characterize each user with a set of anchors, i.e., a group of learnable latent vectors in the outfit space that are the representatives of the outfits the user likes.
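As a rough illustration of the anchor idea, the sketch below scores an outfit embedding against a user's anchors and keeps the best match. The scoring rule (maximum cosine similarity), the two-dimensional embeddings, and the names `cosine` and `preference_score` are assumptions made for illustration; the excerpt does not state how anchors are compared with outfits, and in the paper the anchors are learned rather than fixed.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def preference_score(anchors, outfit):
    """Assumed scoring rule: an outfit is as good as its closest anchor."""
    return max(cosine(anchor, outfit) for anchor in anchors)


# Toy anchors for one user with two distinct tastes (fixed here, learned in practice).
user_anchors = [[1.0, 0.0], [0.0, 1.0]]
sporty_outfit = [0.9, 0.1]  # close to the first anchor
mixed_outfit = [0.5, 0.5]   # halfway between both anchors

print(preference_score(user_anchors, sporty_outfit))
print(preference_score(user_anchors, mixed_outfit))
```

Taking the maximum over anchors, rather than averaging them, lets a user with several unrelated tastes still score highly on an outfit that matches just one of them.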
Supervised machine learning has several drawbacks that make it difficult to use in many situations.
Graph Neural Networks (GNNs) have demonstrated their effectiveness in dealing with non-Euclidean structural data.
As an emerging technology that has attracted huge attention, non-line-of-sight (NLOS) imaging can reconstruct hidden objects by analyzing the diffuse reflection on a relay surface, with broad application prospects in the fields of autonomous driving, medical imaging, and defense.
In this work, we extensively explore the above system design challenges, which motivate us to propose a comprehensive framework that synergistically handles the heterogeneous hardware accelerator design principles, system design criteria, and task scheduling mechanism.
Zero-shot learning uses semantic attributes to connect the search space of unseen objects.
Finally, a new framework for Chinese herbal recognition is proposed as a new application of APN.
Pedestrian attribute recognition is an important multi-label classification problem.
This paper presents a new neural architecture that combines a modulated Hebbian network (MOHN) with DQN, which we call modulated Hebbian plus Q network architecture (MOHQA).
We demonstrate its performance compared to a state-of-the-art approach and several ablation cases, visualize and interpret the hidden factors, and identify avenues for future improvements.
Few attempts have been made to address urban topography, which is typically an integration of complex man-made and natural features.
To deal with these problems, a novel Inner-Imaging architecture is proposed in this paper, which allows relationships between channels to meet the above requirement.
According to the characteristics of herbal images, we propose competitive attentional fusion pyramid networks to model the features of herbal images; they model the relationship between feature maps from different levels and re-weight multi-level channels with a channel-wise attention mechanism.
In order to mine features from different granularities of faces, we design a multi-scale convolutional neural network based on three granularities of the face, which mines the patient's face information from the organs, local regions, and the entire face.
This report demonstrates our solution for the Open Images 2018 Challenge.
The performance is first evaluated on a synthetic dataset that encompasses typical characteristics of condition monitoring data.
Our system, dubbed FashionNet, consists of two components, a feature network for feature extraction and a matching network for compatibility computation.
The proposed network includes two sub-networks: a two-stream late fusion network (TSLFN) that predicts the foreground at a reduced resolution, and a multi-scale refining network (MSRN) that refines the foreground at full resolution.
To evaluate the performance of our proposed method, we conduct experiments on tongue datasets of three sizes, in which a deep convolutional neural network method and a traditional digital image analysis method are each applied to extract features from tongue images.
Facial expression recognition (FER) has always been a challenging issue in computer vision.
To adapt to tongue images captured in a variety of photographic environments and to construct herbal prescriptions, a neural network framework for prescription construction is designed.
In robotic surgery, task automation and learning from demonstration combined with human supervision is an emerging trend for many new surgical robot platforms.
Intra-operative measurements of tissue shape and multi/hyperspectral information have the potential to provide surgical guidance and decision-making support.
Person re-identification is becoming a hot research topic for developing both machine learning algorithms and video surveillance applications.
In this paper, we propose an effective feature representation called Local Maximal Occurrence (LOMO), and a subspace and metric learning method called Cross-view Quadratic Discriminant Analysis (XQDA).
Ranked #88 on Person Re-Identification on DukeMTMC-reID
However, this method is generic and not limited to face recognition; it can easily be generalized to other biometrics as a post-processing module.