However, whether semantic correlations/connections exist between the visual representations in ANNs and those in BNNs remains largely unexplored, owing both to the lack of an effective tool to link and couple the two domains and to the lack of a general and effective framework for representing the visual semantics in BNNs such as human functional brain networks (FBNs).
In this work, we propose a novel and effective saliency-guided vision transformer (SGT) model to rectify shortcut learning in ViT in the absence of eye-gaze data.
During inference, the learned blurring transform can be inverted into a sharpening transform by leveraging the network's invertibility.
Our experimental results show that: 1) the learned embedding vectors can quantitatively encode the commonality and individuality of cortical folding patterns; 2) with the embeddings we can robustly infer the complicated many-to-many anatomical correspondences among different brains; and 3) our model can be successfully transferred to new populations with very limited training samples.
no code implementations • 25 May 2022 • Chong Ma, Lin Zhao, Yuzhong Chen, Lu Zhang, Zhenxiang Xiao, Haixing Dai, David Liu, Zihao Wu, Zhengliang Liu, Sheng Wang, Jiaxing Gao, Changhe Li, Xi Jiang, Tuo Zhang, Qian Wang, Dinggang Shen, Dajiang Zhu, Tianming Liu
To address this problem, we propose to infuse human experts' intelligence and domain knowledge into the training of deep neural networks.
no code implementations • 20 May 2022 • Yuzhong Chen, Zhenxiang Xiao, Lin Zhao, Lu Zhang, Haixing Dai, David Weizhong Liu, Zihao Wu, Changhe Li, Tuo Zhang, Changying Li, Dajiang Zhu, Tianming Liu, Xi Jiang
However, for data-intensive models such as the vision transformer (ViT), current fine-tuning-based FSL approaches generalize knowledge inefficiently and thus degrade downstream task performance.
The key characteristic of these ViT models is that they adopt different strategies for aggregating spatial patch information within the artificial neural networks (ANNs).
In this work, we propose a novel Twin-Transformers framework to simultaneously infer common and individual functional networks in both spatial and temporal space, in a self-supervised manner.
This paper tackles the more challenging case of a constant learning rate, and develops new analytical tools that improve the existing convergence rate by orders of magnitude.
We introduce MedMNIST v2, a large-scale MNIST-like dataset collection of standardized biomedical images, including 12 datasets for 2D and 6 datasets for 3D.
However, two major issues of the fusion between camera and LiDAR hinder its performance, i.e., how to effectively fuse these two modalities and how to precisely align them (suffering from the weak spatiotemporal synchronization problem).
Double Q-learning (Hasselt 2010) has gained significant success in practice due to its effectiveness in overcoming the overestimation issue of Q-learning.
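As a minimal sketch of the idea, assuming a tabular setting: Double Q-learning keeps two value tables and, at each step, uses one table to pick the greedy next action and the other to evaluate it. The function name `double_q_update`, the dictionary-of-lists table layout, and the default `alpha` and `gamma` values are illustrative assumptions, not details taken from the paper listed here.

```python
import random


def double_q_update(qa, qb, state, action, reward, next_state,
                    alpha=0.1, gamma=0.99, rng=random):
    """Apply one tabular Double Q-learning step in place.

    qa, qb: dicts mapping state -> list of action values (hypothetical layout).
    rng: any object with a random() method returning a float in [0, 1).
    """
    # Randomly choose which table is updated; the other one evaluates.
    if rng.random() < 0.5:
        update, evaluate = qa, qb
    else:
        update, evaluate = qb, qa

    # Select the greedy next action using the table being updated...
    next_values = update[next_state]
    best = max(range(len(next_values)), key=next_values.__getitem__)

    # ...but score that action with the other table, which decouples
    # action selection from action evaluation.
    target = reward + gamma * evaluate[next_state][best]
    update[state][action] += alpha * (target - update[state][action])
```

Only one of the two tables changes per step, so each table sees roughly half of the updates; this is the price paid for the decoupled target.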
no code implementations • 16 Dec 2020 • Defa Liu, Xianxin Wu, Fangsen Li, Yong Hu, Jianwei Huang, Yu Xu, Cong Li, Yunyi Zang, Junfeng He, Lin Zhao, Shaolong He, Chenjia Tang, Zhi Li, Lili Wang, Qingyan Wang, Guodong Liu, Zuyan Xu, Xu-Cun Ma, Qi-Kun Xue, Jiangping Hu, X. J. Zhou
These observations not only show the first direct evidence that the electronic structure of single-layer FeSe/SrTiO3 films originates from bulk FeSe through a combined effect of an electronic phase transition and an interfacial charge transfer, but also provide a quantitative basis for theoretical models in describing the electronic structure and understanding the superconducting mechanism in single-layer FeSe/SrTiO3 films.
Band Gap • Superconductivity • Strongly Correlated Electrons
Although Q-learning is one of the most successful algorithms for finding the best action-value function (and thus the optimal policy) in reinforcement learning, its implementation often suffers from large overestimation of Q-function values incurred by random sampling.
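The overestimation can be illustrated with a small simulation, offered here as a sketch under simplifying assumptions rather than as the paper's analysis: every action has true value 0, each estimate is a sample mean of Gaussian noise, and the maximum over those noisy estimates is compared with a double estimator that selects with one set of samples and evaluates with an independent one. The function name `max_bias` and all parameter values are hypothetical.

```python
import random


def max_bias(num_actions=10, num_samples=5, trials=2000, seed=0):
    """Compare single and double estimators of max_a Q(a) when all Q(a) = 0.

    Returns (single, double): the average of max_a of the noisy estimates,
    and the average of an independent estimate at the selected action.
    """
    rng = random.Random(seed)

    def noisy_estimates():
        # Sample-mean estimate of each action value; the true value is 0.
        return [
            sum(rng.gauss(0.0, 1.0) for _ in range(num_samples)) / num_samples
            for _ in range(num_actions)
        ]

    single_total = 0.0
    double_total = 0.0
    for _ in range(trials):
        est_a = noisy_estimates()
        est_b = noisy_estimates()  # independent second set of estimates
        # Single estimator: take the max of the same noisy values.
        single_total += max(est_a)
        # Double estimator: select with est_a, evaluate with est_b.
        best = max(range(num_actions), key=est_a.__getitem__)
        double_total += est_b[best]
    return single_total / trials, double_total / trials


if __name__ == "__main__":
    single, double = max_bias()
    print(f"single estimator: {single:.3f}, double estimator: {double:.3f}")
```

In this setup the single estimator averages well above the true maximum of 0, while the double estimator stays close to 0, because the noise that drives the selection is independent of the noise in the evaluation.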
For the infinite state-action space case, we establish the convergence guarantee for MomentumQ with linear function approximations and Markovian sampling.
In this paper, we propose a novel joint instance and semantic segmentation approach, called JSNet, to address the instance and semantic segmentation of 3D point clouds simultaneously.
Ranked #2 on Semantic Segmentation on ShapeNet
In this paper, building on discrimination-aware channel pruning (DCP), a state-of-the-art pruning method for classification, we propose a localization-aware auxiliary network that identifies the channels carrying key information for classification and regression, so that channel pruning can be conducted directly for object detection, which saves considerable time and computing resources.
Based on the classification results, the extreme wind speeds calculated from mixed wind hazard types are compared with those obtained from conventional methods, and the effects on structural design for different return periods are discussed.
We present a generative neural network model for slot filling based on a sequence-to-sequence (Seq2Seq) model together with a pointer network, in the situation where only sentence-level slot annotations are available in the spoken dialogue data.
In this paper, we present structure-infused copy mechanisms to facilitate copying important words and relations from the source sentence to the summary sentence.
Ranked #30 on Text Summarization on GigaWord
We present a spectrally-accurate scheme to turn a boundary integral formulation for an elliptic PDE on a single unit cell geometry into one for the fully periodic problem.
Numerical Analysis 65N38, 65N80, 76D07, 76M50
Question generation from a knowledge base (KB) is the task of generating questions related to the domain of the input KB.