The rapid growth in the computational power and scale of modern supercomputing systems has raised great challenges for the management of exascale scientific data.
Palmprint has recently shown great potential in recognition applications, as it is a privacy-friendly and stable biometric.
This observation inspired us to propose the Partial Diffusion Model (PartDiff), which diffuses the image to an intermediate latent state instead of pure random noise, where the intermediate latent state is approximated by the latent of diffusing the low-resolution image.
no code implementations • 19 Jul 2023 • Xiaohong Liu, Xiongkuo Min, Wei Sun, Yulun Zhang, Kai Zhang, Radu Timofte, Guangtao Zhai, Yixuan Gao, Yuqin Cao, Tengchuan Kou, Yunlong Dong, Ziheng Jia, Yilin Li, Wei Wu, Shuming Hu, Sibin Deng, Pengxiang Xiao, Ying Chen, Kai Li, Kai Zhao, Kun Yuan, Ming Sun, Heng Cong, Hao Wang, Lingzhi Fu, Yusheng Zhang, Rongyu Zhang, Hang Shi, Qihang Xu, Longan Xiao, Zhiliang Ma, Mirko Agarla, Luigi Celona, Claudio Rota, Raimondo Schettini, Zhiwei Huang, Yanan Li, Xiaotao Wang, Lei Lei, Hongye Liu, Wei Hong, Ironhead Chuang, Allen Lin, Drake Guan, Iris Chen, Kae Lou, Willy Huang, Yachun Tasi, Yvonne Kao, Haotian Fan, Fangyuan Kong, Shiqi Zhou, Hao liu, Yu Lai, Shanshan Chen, Wenqi Wang, HaoNing Wu, Chaofeng Chen, Chunzheng Zhu, Zekun Guo, Shiling Zhao, Haibing Yin, Hongkui Wang, Hanene Brachemi Meftah, Sid Ahmed Fezza, Wassim Hamidouche, Olivier Déforges, Tengfei Shi, Azadeh Mansouri, Hossein Motamednia, Amir Hossein Bakhtiari, Ahmad Mahmoudi Aznaveh
61 participating teams submitted their prediction results during the development phase, with a total of 3168 submissions.
Offline reinforcement learning (RL) is a learning paradigm where an agent learns from a fixed dataset of experience.
In the graph node embedding problem, embedding spaces can vary significantly for different data types, leading to the need for different GNN model types.
Graph neural networks (GNNs) have shown promising results across various graph learning tasks, but they often assume homophily, which can result in poor performance on heterophilic graphs.
Recent work on knowledge graph completion (KGC) focused on learning embeddings of entities and relations in knowledge graphs.
We introduce the key notion of label non-uniformity, which is derived from the Wasserstein distance between the softmax distribution of the logits and the uniform distribution.
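As a rough illustration of the quantity described above, the sketch below scores a single logit vector by its distance from the uniform distribution. It assumes a 0-1 ground metric between class labels, under which the 1-Wasserstein distance reduces to the total variation distance; the listing does not state the ground metric, and the function names are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def label_non_uniformity(logits):
    """Distance between softmax(logits) and the uniform distribution.

    Assumption: a 0-1 ground metric on classes, so the 1-Wasserstein
    distance equals total variation, 0.5 * sum_i |p_i - 1/C|.
    """
    probs = softmax(logits)
    uniform = 1.0 / len(probs)
    return 0.5 * sum(abs(p - uniform) for p in probs)
```

Under this assumption the score is 0 for equal logits and approaches 1 - 1/C as the prediction becomes confident in one of C classes.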
Video quality assessment (VQA) aims to simulate the human perception of video quality, which is influenced by factors ranging from low-level color and texture details to high-level semantic content.
In graph neural networks (GNNs), both node features and labels are examples of graph signals, a key notion in graph signal processing (GSP).
LiDAR relocalization plays a crucial role in many fields, including robotics, autonomous driving, and computer vision.
Graph neural networks (GNNs) have achieved success in various inference tasks on graph-structured data.
Blind image quality assessment (BIQA) aims to automatically evaluate the perceived quality of a single image, and its performance has been improved by deep learning-based methods in recent years.
To capture the temporal and multivariate correlations among subsequences, we design a pattern discovery model that constructs correlations via diverse pattern functions.
It is nontrivial to achieve exponential stability even for time-invariant nonlinear systems with matched uncertainties and a persistent excitation (PE) condition.
With the wide application of sparse ToF sensors in mobile devices, RGB image-guided sparse depth completion has attracted extensive attention recently, but still faces some problems.
This paper presents a new image processing algorithm to determine the amount of vegetation cover present in a given area, called fractional vegetation cover.
We present the first comprehensive video polyp segmentation (VPS) study in the deep learning era.
Ranked #2 on Video Polyp Segmentation on SUN-SEG-Easy (Unseen)
In this framework, annotated masks of seen categories and pseudo masks of unseen categories serve as a prior for contrastive learning, where features from the mask regions (foreground) are pulled together, and are contrasted against those from the background, and vice versa.
In this paper, by observing that palmar creases are the key information to deep-learning-based palmprint recognition, we propose to synthesize training data by manipulating palmar creases.
Jointly exploiting information from multiple different yet complementary domains has been proven to be an effective way to perform robust object tracking.
Based on this observation, we propose the adaptive feature alignment (AFA) to generate features of arbitrary attacking strengths.
(1) We provide an in-depth investigation of the characteristics of various autoencoder models and develop an error-bounded autoencoder-based framework in terms of the SZ model.
However, these methods usually encounter a boundary-related imbalance problem, leading to limited generation capability.
To the best of our knowledge, cuSZ is the first error-bounded lossy compressor on GPUs for scientific data.
Distributed, Parallel, and Cluster Computing
(1) We propose several systematic ABFT schemes based on checksum techniques and analyze their fault protection ability and runtime thoroughly. Unlike traditional ABFT based on matrix-matrix multiplication, our schemes support any convolution implementations.
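One way to see why a checksum can be independent of the convolution implementation is linearity: the element-wise sum of all output maps must equal the convolution of the input with the element-wise sum of all kernels. The sketch below illustrates that general idea only; it is not the schemes proposed in the paper, and `conv2d` and `checksum_ok` are hypothetical helper names.

```python
def conv2d(image, kernel):
    """Valid-mode 2D cross-correlation of two nested lists."""
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = len(image) - kh + 1, len(image[0]) - kw + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(ow)]
            for i in range(oh)]


def checksum_ok(image, kernels, outputs, tol=1e-9):
    """Verify output maps against a filter checksum.

    By linearity, summing the output maps over all kernels must match
    convolving the input with the element-wise sum of the kernels.
    """
    kh, kw = len(kernels[0]), len(kernels[0][0])
    kernel_sum = [[sum(k[a][b] for k in kernels) for b in range(kw)]
                  for a in range(kh)]
    expected = conv2d(image, kernel_sum)
    for i, row in enumerate(expected):
        for j, value in enumerate(row):
            observed = sum(out[i][j] for out in outputs)
            if abs(observed - value) > tol:
                return False
    return True
```

The `outputs` argument may come from any convolution routine (direct, FFT-based, Winograd, and so on); the check only needs the input, the kernels, and the results. The tolerance `tol` is an assumption and would need tuning for floating-point data.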
In addition to the proposed method, we design an evaluation metric to assess the quality of line detection and construct a large scale dataset for the line detection task.
Ranked #2 on Line Detection on NKL
Recent advances in convolutional neural networks (CNNs) usually come at the expense of excessive computational overhead and memory footprint.
Both of them connect split nodes to the top layer of convolutional neural networks (CNNs) and deal with inhomogeneous data by jointly learning input-dependent data partitions at the split nodes and age distributions at the leaf nodes.
We consider the face recognition task, where facial images of the same identity (person) are expected to be closer in the representation space, while those of different identities are expected to be far apart.
Empirically, we verify that this new semi-supervised setting is able to further enhance the performance of the recognition network.
We evaluate the Res2Net block on all these models and demonstrate consistent performance gains over baseline models on widely-used datasets, e.g., CIFAR-100 and ImageNet.
Ranked #2 on Image Classification on GasHisSDB
In neural text generation such as neural machine translation, summarization, and image captioning, beam search is widely used to improve the output text quality.
In natural images, the scales (thickness) of object skeletons may dramatically vary among objects and object parts.
By reformulating the standard F-measure, we propose the relaxed F-measure, which is differentiable w.r.t. the posterior and can be easily appended to the back of CNNs as the loss function.
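A common way to make the F-measure differentiable is to replace hard counts with sums of predicted probabilities. The sketch below follows that idea and may differ in detail from the paper's formulation; the function names, the `beta_sq` default, and the `eps` stabilizer are assumptions.

```python
def relaxed_f_measure(probs, targets, beta_sq=1.0, eps=1e-10):
    """Soft F-measure from predicted probabilities and binary targets.

    With soft counts, F = (1 + b^2) * TP / (b^2 * (TP + FN) + (TP + FP)),
    which is algebraically equal to (1 + b^2) * P * R / (b^2 * P + R).
    """
    tp = sum(p * y for p, y in zip(probs, targets))  # soft true positives
    predicted_pos = sum(probs)                       # soft TP + FP
    actual_pos = sum(targets)                        # TP + FN
    return (1.0 + beta_sq) * tp / (beta_sq * actual_pos + predicted_pos + eps)


def f_measure_loss(probs, targets, beta_sq=1.0):
    """Loss to minimize: one minus the relaxed F-measure."""
    return 1.0 - relaxed_f_measure(probs, targets, beta_sq)
```

Because every term is a sum or ratio of the probabilities, the loss has gradients with respect to each prediction, which is what allows it to be used directly as a training objective.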
In natural images, the scales (thickness) of object skeletons may dramatically vary among objects and object parts, making object skeleton detection a challenging problem.
Ranked #2 on Object Skeleton Detection on SK-LARGE
This paper describes Oregon State University's submissions to the shared WMT'17 task "multimodal translation task I".
In order to utilize the potential benefits from their correlations, we propose a jointly trained model for learning the two tasks simultaneously via Long Short-Term Memory (LSTM) networks.
This paper presents label distribution learning forests (LDLFs) - a novel label distribution learning algorithm based on differentiable decision trees, which have several advantages: 1) Decision trees have the potential to model any general form of label distributions by a mixture of leaf node predictions.
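The "mixture of leaf node predictions" can be illustrated with a single soft decision tree: each split node routes a sample left with some probability, and the prediction is the routing-weighted average of the label distributions stored at the leaves. The sketch below assumes a fixed depth-2 tree with sigmoid routing and hypothetical function names; it is not the authors' implementation, and it omits how split scores and leaf distributions are learned.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def leaf_probabilities(split_scores):
    """Probability of reaching each of the 4 leaves of a depth-2 tree.

    split_scores holds one real-valued score per split node, in the
    order [root, left child, right child]; sigmoid(score) is the
    probability of routing a sample to the left branch.
    """
    root, left, right = (sigmoid(s) for s in split_scores)
    return [root * left,
            root * (1.0 - left),
            (1.0 - root) * right,
            (1.0 - root) * (1.0 - right)]


def predict_distribution(split_scores, leaf_distributions):
    """Mix the per-leaf label distributions by the routing probabilities."""
    weights = leaf_probabilities(split_scores)
    n_labels = len(leaf_distributions[0])
    return [sum(w * q[c] for w, q in zip(weights, leaf_distributions))
            for c in range(n_labels)]
```

Since the routing probabilities sum to one and each leaf holds a valid distribution, the output is itself a valid label distribution.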
By observing the relationship between the receptive field sizes of the different layers in the network and the skeleton scales they can capture, we introduce two scale-associated side outputs to each stage of the network.
Object skeleton is a useful cue for object detection, complementary to the object contour, as it provides a structural representation to describe the relationship among object parts.
Semantic parsing has made significant progress, but most current semantic parsers are extremely slow (CKY-based) and rather primitive in representation.
Ranked #4 on Semantic Parsing on ATIS