Further, we design two pre-training tasks named object position regression (OPR) and spatial relation classification (SRC) to learn to reconstruct the spatial relation graph respectively.
Deep Neural Networks (DNNs) are expected to provide explanations that help users understand their black-box predictions.
Deep learning in digital pathology brings intelligence and automation as substantial enhancements to pathological analysis, the gold standard of clinical diagnosis.
Pathological captioning of Whole Slide Images (WSIs), though essential in computer-aided pathological diagnosis, has rarely been studied due to limitations in datasets and model training efficacy.
With the advancement of deep learning technologies, general-purpose large models such as GPT-4 have demonstrated exceptional capabilities across various domains.
Our approach enforces the Hessian of the neural implicit function to have a zero determinant for points near the surface.
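The zero-determinant condition can be illustrated with a minimal sketch. It is not the paper's implementation: analytic test functions stand in for the neural implicit function, the Hessian is estimated by central finite differences instead of automatic differentiation, and the helper names (`hessian`, `det3`, `singular_hessian_penalty`) are hypothetical.

```python
import math

def hessian(f, p, h=1e-3):
    """Central finite-difference Hessian of a scalar function f at point p."""
    n = len(p)

    def shifted(i, di, j=None, dj=0.0):
        q = list(p)
        q[i] += di
        if j is not None:
            q[j] += dj
        return f(q)

    f0 = f(p)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Second derivative along axis i.
        H[i][i] = (shifted(i, h) - 2.0 * f0 + shifted(i, -h)) / (h * h)
        for j in range(i + 1, n):
            # Mixed partial derivative along axes i and j.
            val = (shifted(i, h, j, h) - shifted(i, h, j, -h)
                   - shifted(i, -h, j, h) + shifted(i, -h, j, -h)) / (4.0 * h * h)
            H[i][j] = H[j][i] = val
    return H

def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def singular_hessian_penalty(f, points):
    """Mean |det(Hessian)| over sample points (assumed to lie near the surface)."""
    return sum(abs(det3(hessian(f, p))) for p in points) / len(points)

# Stand-in implicit functions (assumptions for illustration only).
def sphere_sdf(q):
    # Exact signed distance to the unit sphere: its Hessian is singular.
    return math.sqrt(sum(c * c for c in q)) - 1.0

def quadratic(q):
    # Same zero level set, but not a distance function: Hessian is 2*I.
    return sum(c * c for c in q) - 1.0

surface_points = [[0.6, 0.0, 0.8]]
print(singular_hessian_penalty(sphere_sdf, surface_points))  # close to zero
print(singular_hessian_penalty(quadratic, surface_points))   # clearly nonzero
```

Both stand-in functions share the unit sphere as their zero level set, but only the true distance function has a singular Hessian there, which is the property such a penalty would push a learned function toward.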
Weakly supervised object localization (WSOL) is one of the most popular and challenging tasks in computer vision.
no code implementations • 11 Jul 2023 • Zhouhong Gu, Zihan Li, Lin Zhang, Zhuozhi Xiong, Haoning Ye, Yikai Zhang, Wenhao Huang, Xiaoxuan Zhu, Qianyu He, Rui Xu, Sihang Jiang, Shusen Wang, Zili Wang, Hongwei Feng, Zhixu Li, Yanghua Xiao
Informal reasoning ability is the ability to reason based on common sense, experience, and intuition. Humans use informal reasoning every day to extract the most influential elements for their decision-making from a large amount of life-like information. With the rapid development of language models, the realization of artificial general intelligence has become a hopeful prospect.
1 code implementation • 9 Jun 2023 • Zhouhong Gu, Xiaoxuan Zhu, Haoning Ye, Lin Zhang, Jianchen Wang, Sihang Jiang, Zhuozhi Xiong, Zihan Li, Qianyu He, Rui Xu, Wenhao Huang, Zili Wang, Shusen Wang, Weiguo Zheng, Hongwei Feng, Yanghua Xiao
New Natural Language Processing (NLP) benchmarks are urgently needed to align with the rapid development of large language models (LLMs).
With the rapid development of geometric deep learning techniques, many mesh-based convolutional operators have been proposed to bridge irregular mesh structures and popular backbone networks.
Vulnerability detection is a critical problem in software security and attracts growing attention both from academia and industry.
no code implementations • 12 Apr 2023 • Guoyu Lu, Sheng Li, Gengchen Mai, Jin Sun, Dajiang Zhu, Lilong Chai, Haijian Sun, Xianqiao Wang, Haixing Dai, Ninghao Liu, Rui Xu, Daniel Petti, Tianming Liu, Changying Li
Artificial General Intelligence (AGI) is poised to revolutionize a variety of sectors, including healthcare, finance, transportation, and education.
This motivates us to propose a Source-free Unsupervised cross-domain method for Pulmonary nodule detection (SUP).
To address this issue, we propose a slice grouped domain attention (SGDA) module to enhance the generalization capability of the pulmonary nodule detection networks.
POSTER achieves state-of-the-art (SOTA) performance in FER by effectively combining facial landmark and image features through a two-stream pyramid cross-fusion design.
Ranked #1 on Facial Expression Recognition (FER) on AffectNet
We extensively evaluate ASIT on facial datasets such as FFHQ and CelebA-HQ, showing that our approach achieves state-of-the-art facial inversion performance.
Given a possibly false claim sentence, how can we automatically correct it with minimal editing?
Convolutional neural networks (CNNs) have been demonstrated to be highly effective in the field of pulmonary nodule detection.
Spatial-temporal data contains rich information and has been widely studied in recent years due to the rapid development of relevant applications in many fields.
To this end, we propose a plug-in algorithm for this line of work, i.e., Aligned Constrained Training (ACT), which alleviates this problem by familiarizing the model with the source-side context of the constraints.
Holding the belief that models capable of reasoning should be right for the right reasons, we propose a first-of-its-kind Explainable Knowledge-intensive Analogical Reasoning benchmark (E-KAR).
Then, we develop a generative adversarial network that combines the domain-specific features of the seen categories with the aligned domain-invariant features to synthesize samples, where the synthesized samples of the unseen categories are generated by using the corresponding word embeddings.
Inspired by the concept of self-supervised learning (e.g., setting the pretext task to generate a universal model for the downstream task), we propose a Self-Supervised Dictionary Learning (SSDL) framework to address this challenge.
Inspired by this assumption, we propose a novel method, the Multi-Decision Fusing Model (MDFM), which comprehensively considers the decisions based on multiple FEMs to enhance the efficacy and robustness of the model.
The surrogate models are used to conduct uncertainty quantification considering a stochastic permeability field, as well as to infer unknown permeability information based on limited well production data and observation data of formation properties.
Unlike ML on the edge, TinyML with a limited energy supply has higher demands on low-power execution.
First, we design a cross-task distillation scheme that encourages DSR and DE networks to learn from each other in a teacher-student role-exchanging fashion.
The size of deep neural networks (DNNs) grows rapidly as the complexity of the machine learning algorithm increases.
In this work, taking SinGAN and StyleGAN2 as examples, we show that such capability, to a large extent, is brought by the implicit positional encoding when using zero padding in the generators.
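The effect of zero padding can be seen in a toy one-dimensional convolution. This is only an illustrative sketch, not code from SinGAN or StyleGAN2; the function `conv1d` and its `padding` argument are hypothetical names, and an odd-length kernel is assumed.

```python
def conv1d(signal, kernel, padding="zero"):
    """Same-size 1D convolution with zero or circular padding (odd kernel assumed)."""
    half = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for t, w in enumerate(kernel):
            j = i + t - half
            if 0 <= j < n:
                acc += w * signal[j]
            elif padding == "circular":
                acc += w * signal[j % n]
            # With zero padding, out-of-range taps contribute nothing.
        out.append(acc)
    return out

constant = [1.0] * 5
kernel = [1.0, 1.0, 1.0]

# A position-free (constant) input becomes position-dependent near the borders.
print(conv1d(constant, kernel, padding="zero"))      # [2.0, 3.0, 3.0, 3.0, 2.0]
# Circular padding keeps the output constant, so no position cue appears.
print(conv1d(constant, kernel, padding="circular"))  # [3.0, 3.0, 3.0, 3.0, 3.0]
```

With zero padding, the border outputs differ from the interior even though the input carries no spatial information, which is the kind of implicit positional cue that stacked convolution layers can propagate inward.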
Feature reassembly, i.e., feature downsampling and upsampling, is a key operation in a number of modern convolutional network architectures, e.g., residual networks and feature pyramids.
The Standard-Model Extension (SME) is an effective-field-theoretic framework that catalogs all Lorentz-violating field operators.
General Relativity and Quantum Cosmology • High Energy Astrophysical Phenomena • High Energy Physics - Phenomenology
In this paper, we propose a hypergraph based sparse attention mechanism to tackle this issue and embed it into dictionary learning.
To tackle this issue, we propose a Dynamic Label Dictionary Learning (DLDL) algorithm to generate the soft label matrix for unlabeled data.
We construct a parallel connection structure based on the group convolution and feature aggregation to build a 3D CNN that is as wide as possible with few parameters.
By bringing together the best of both paradigms, we propose a new deep inpainting framework where texture generation is guided by a texture memory of patch samples extracted from unmasked regions.
In the weak form, high order derivatives in the PDE can be transferred to the test functions by performing integration-by-parts, which reduces computational error.
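As a worked illustration of that transfer, consider a model second-order term (an assumption for this example, not necessarily the PDE studied in the paper), with solution $u$, test function $v$, domain $\Omega$, and outward normal $n$:

```latex
\int_{\Omega} (-\Delta u)\, v \,\mathrm{d}x
  = \int_{\Omega} \nabla u \cdot \nabla v \,\mathrm{d}x
  - \int_{\partial\Omega} \frac{\partial u}{\partial n}\, v \,\mathrm{d}s
```

If $v$ is chosen to vanish on $\partial\Omega$, the boundary integral drops out and only first derivatives of $u$ remain, so a network approximating $u$ needs to be differentiated once instead of twice.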
Starting from the low-resolution point clouds, with the bilateral interpolation and max-pooling operations, the deconvolution network can progressively output high-resolution local and global feature maps.
The technique works as follows: we first encourage sparse latent representations when we train a GNN in a supervised setting, then we apply symbolic regression to components of the learned model to extract explicit physical relations.
Many deep learning based methods have been proposed for retinal vessel segmentation; however, few of them focus on the connectivity of segmented vessels, which is quite important for a practical computer-aided diagnosis system on retinal images.
The outbreak of COVID-19 caused by SARS-CoV-2 has rapidly spread worldwide and has caused over 1,400,000 infections and 80,000 deaths.
In this paper, we propose a new method, termed Lip by Speech (LIBS), whose goal is to strengthen lip reading by learning from speech recognizers.
Ranked #2 on Lipreading on CMLR
Recent advances in adversarial attacks uncover the intrinsic vulnerability of modern deep neural networks.
We introduce an approach for imposing physically motivated inductive biases on graph networks to learn interpretable representations and improved zero-shot generalization.
When trained on the CMLR dataset, the proposed CSSMCM surpasses the performance of state-of-the-art lip reading frameworks, which confirms the effectiveness of explicit modeling of tones for Chinese Mandarin lip reading.
Ranked #3 on Lipreading on CMLR
Then the synthesized flow field is used to guide the propagation of pixels to fill up the missing regions in the video.
Ranked #9 on Video Inpainting on DAVIS
CARAFE introduces little computational overhead and can be readily integrated into modern network architectures.
Dictionary learning methods can be split into two categories: (i) class-specific dictionary learning and (ii) class-shared dictionary learning.
Recently, the label consistent K-SVD (LC-KSVD) algorithm has been successfully applied in image classification.
A power plant is a complex and nonstationary system for which traditional machine learning modeling approaches fall short of expectations.
Data-driven predictive analytics are in use today across a number of industrial applications, but further integration is hindered by the requirement of similarity among model training and test data distributions.
At sub-ion-Larmor scales, we discover an overstability driven by the electron temperature gradient of kinetic-Alfvén drift waves -- the electron MTI (eMTI) -- whose growth rate is even larger than the standard MTI.
High Energy Astrophysical Phenomena • Plasma Physics