In this paper, we propose an asymmetric two-stream architecture taking account of the inherent differences between RGB and depth data for saliency detection.
Ranked #19 on Thermal Image Segmentation on RGB-T-Glass-Segmentation
To address this, we need a method to obtain misalignment states, aiding in the reconstruction of accurate point spread functions for data processing methods or facilitating adjustments of optical components for improved image quality.
However, the manual creation of high-quality instruction datasets is costly, leading to the adoption of automatic generation of instruction pairs by LLMs as a popular alternative in the training of open-source LLMs.
To further improve algorithm performance and alleviate local heterogeneous overfitting in Federated Learning (FL), our algorithm combines the Sharpness Aware Minimization (SAM) optimizer and local momentum.
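A generic single-client update that pairs a SAM-style perturbation with a momentum buffer can be sketched as follows. This is a minimal illustration under stated assumptions, not the algorithm from the paper: the function name `sam_momentum_step`, the hyperparameters `lr`, `rho`, and `beta`, and the toy quadratic loss are all hypothetical, and the federated aggregation step is omitted.

```python
import math
from typing import Callable, List, Tuple

Vector = List[float]


def sam_momentum_step(
    w: Vector,
    velocity: Vector,
    grad_fn: Callable[[Vector], Vector],
    lr: float = 0.1,    # assumed learning rate
    rho: float = 0.05,  # assumed SAM neighbourhood radius
    beta: float = 0.5,  # assumed momentum coefficient
) -> Tuple[Vector, Vector]:
    """One local step: SAM gradient followed by a momentum update."""
    # 1) Gradient at the current weights.
    g = grad_fn(w)
    norm = math.sqrt(sum(x * x for x in g))
    # 2) Move to the approximate worst-case point within radius rho.
    if norm > 0.0:
        w_adv = [wi + rho * gi / norm for wi, gi in zip(w, g)]
    else:
        w_adv = list(w)
    # 3) Gradient at the perturbed point is the SAM gradient.
    g_sam = grad_fn(w_adv)
    # 4) Accumulate it in the momentum buffer and update the weights.
    new_velocity = [beta * v + gs for v, gs in zip(velocity, g_sam)]
    new_w = [wi - lr * v for wi, v in zip(w, new_velocity)]
    return new_w, new_velocity


def toy_loss(w: Vector) -> float:
    """Toy quadratic loss used only for illustration."""
    return sum(x * x for x in w)


def toy_grad(w: Vector) -> Vector:
    """Gradient of the toy quadratic loss."""
    return [2.0 * x for x in w]


def run_local_training(steps: int = 200) -> Tuple[float, float]:
    """Return (initial loss, final loss) on the toy problem."""
    w = [1.0, -2.0]
    velocity = [0.0, 0.0]
    initial = toy_loss(w)
    for _ in range(steps):
        w, velocity = sam_momentum_step(w, velocity, toy_grad)
    return initial, toy_loss(w)
```

Because the SAM gradient is evaluated at a perturbed point, the iterates settle in a small neighbourhood of the minimum rather than exactly on it.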
Specifically, we group existing deep model fusion methods into four categories: (1) "Mode connectivity", which connects solutions in weight space via a path of non-increasing loss to obtain a better initialization for model fusion; (2) "Alignment", which matches units between neural networks to create better conditions for fusion; (3) "Weight average", a classical model fusion method that averages the weights of multiple models to obtain more accurate results closer to the optimal solution; (4) "Ensemble learning", which combines the outputs of diverse models and is a foundational technique for improving the accuracy and robustness of the final model.
High-accuracy, resource-intensive deep neural networks (DNNs) have been widely adopted by live video analytics (VA), where camera videos are streamed over the network to resource-rich edge/cloud servers for DNN inference.
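Of the four categories, weight averaging is the simplest to write down. The sketch below assumes every model shares the same architecture and stores its parameters as a dictionary of flat float lists; the function name `average_weights` and the optional `coeffs` argument are hypothetical and not taken from the survey.

```python
from typing import Dict, List, Optional

Weights = Dict[str, List[float]]


def average_weights(
    models: List[Weights],
    coeffs: Optional[List[float]] = None,
) -> Weights:
    """Average parameters of identically shaped models, entry by entry."""
    if not models:
        raise ValueError("need at least one model")
    if coeffs is None:
        # Uniform averaging when no coefficients are supplied.
        coeffs = [1.0 / len(models)] * len(models)
    if len(coeffs) != len(models):
        raise ValueError("one coefficient per model is required")

    fused: Weights = {}
    for name in models[0]:
        size = len(models[0][name])
        fused[name] = [
            sum(c * m[name][i] for c, m in zip(coeffs, models))
            for i in range(size)
        ]
    return fused
```

For example, `average_weights([{"w": [1.0, 3.0]}, {"w": [3.0, 5.0]}])` returns `{"w": [2.0, 4.0]}`. Plain averaging of this kind tends to work best when the models lie in a connected low-loss region, which is where the mode connectivity and alignment categories come in.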
Modern deep neural networks, particularly recent large language models, come with massive model sizes that require significant computational and storage resources.
This motivates us to develop a technique that evaluates true loss changes without retraining, so that the channels to prune can be selected more reliably and confidently.
Specifically, SFGC contains two collaborative components: (1) a training trajectory meta-matching scheme for effectively synthesizing small-scale graph-free data; (2) a graph neural feature score metric for dynamically evaluating the quality of the condensed data.
The dominant latent space further reveals a strong relevance to the key flow features located in the boundary layers downstream of the shock.
Images captured under low-light conditions are often plagued by several challenges, including diminished contrast, increased noise, loss of fine details, and unnatural color reproduction.
In the proposed model, a primary network is responsible for representing the relationship between the lift and angle of attack, while the geometry information is encoded into a hyper network to predict the unknown parameters involved in the primary network.
First, we propose adaptive ensemble distillation that assigns adaptive weights to different base models such that their varying classification capabilities contribute purposefully to the training of the lightweight model.
Therefore, in this paper, we propose a novel automated graph neural network on heterophilic graphs, namely Auto-HeG, to automatically build heterophilic GNN models with expressive learning abilities.
We then propose a resource-aware search strategy to explore the search space to find the best PINN model under different resource constraints.
To overcome these limitations, we propose SEARCH, a joint, scalable framework, to automatically devise effective CTS forecasting models.
Contrastive self-supervised learning is widely employed in visual recognition for geographic image data (remote or proximal sensing), but because of landscape heterogeneity, models can show disparate performance across spatial units.
Adapter Tuning, which freezes the pretrained language models (PLMs) and fine-tunes only a few extra modules, has become an appealing, efficient alternative to full-model fine-tuning.
The results reveal that DNNs exhibit the best production prediction accuracy compared to RF and SVM.
As the efficiency of training in the ring topology prefers devices with homogeneous resources, the classification based on the computing capacity mitigates the impact of straggler effects.
In contrast, the hard-constrained scheme produces airfoils with a wider range of geometric diversity while strictly adhering to the geometric constraints.
This work highlights the need to conduct fairness analysis for satellite imagery segmentation models and motivates the development of methods for fair transfer learning in order not to introduce disparities between places, particularly urban and rural locations.
Recent years have witnessed fast developments of graph neural networks (GNNs) that have benefited myriads of graph analytic tasks and applications.
The multiple accurate cues from multiple DFs are then simultaneously propagated to the saliency network with a multi-guidance loss.
As a by-product, a CapS dataset is constructed by augmenting an existing benchmark training set with additional image tags and captions.
Differentiable Architecture Search (DARTS) has received massive attention in recent years, mainly because it significantly reduces the computational cost through weight sharing and continuous relaxation.
Last, to enhance the embedding space learning, an additional pixel-wise metric learning module is introduced with triplet loss formulated on the pixel-level embedding of the input image.
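The standard triplet loss on embedding vectors can be sketched as below. This is an illustration only: it assumes Euclidean distance and a margin of 1.0, and it operates on a single (anchor, positive, negative) triple rather than on sampled pixel embeddings, so it is not the module described in the paper. The function names are hypothetical.

```python
import math
from typing import Sequence


def euclidean(u: Sequence[float], v: Sequence[float]) -> float:
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(
    anchor: Sequence[float],
    positive: Sequence[float],
    negative: Sequence[float],
    margin: float = 1.0,  # assumed margin value
) -> float:
    """Hinge on d(anchor, positive) - d(anchor, negative) + margin."""
    gap = euclidean(anchor, positive) - euclidean(anchor, negative)
    return max(0.0, gap + margin)
```

The loss is zero once the negative is farther from the anchor than the positive by at least the margin, so only violating triples contribute gradient.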
For the goal of automated design of high-performance deep convolutional neural networks (CNNs), Neural Architecture Search (NAS) methodology is becoming increasingly important for both academia and industries. Due to the costly stochastic gradient descent (SGD) training of CNNs for performance evaluation, most existing NAS methods are computationally expensive for real-world deployments.
Despite the success of previous works, explorations of an effective training strategy for the saliency network and of accurate matches between image-level annotations and salient objects are still inadequate.
In this study, we propose a novel heterogeneity-aware federated learning method, SplitAVG, to overcome the performance drops from data heterogeneity in federated learning.
Specifically, instead of directly training a model for task performance, we develop a novel dual model architecture: a primary model learns the desired task, and an auxiliary "generative replay model" allows aggregating knowledge from the heterogeneous clients.
We investigate this question through the lens of edge connectivity, and provide an affirmative answer by defining a connectivity concept, ZERo-cost Operation Sensitivity (ZEROS), to score the importance of candidate operations in DARTS at initialization.
A key challenge to the scalability and quality of the learned architectures is the need for differentiating through the inner-loop optimisation.
Ranked #21 on Neural Architecture Search on NAS-Bench-201, CIFAR-10
Complex backgrounds and similar appearances between objects and their surroundings are generally recognized as challenging scenarios in Salient Object Detection (SOD).
Ranked #13 on Thermal Image Segmentation on RGB-T-Glass-Segmentation
We first excavate the internal spatial correlation by designing a context reasoning unit which separately extracts comprehensive contextual information from the focal stack and RGB images.
To evaluate the performance of this IRS-assisted WPSN, we maximize its system sum throughput by jointly optimizing the energy beamforming of the PS, the transmission time allocation, and the phase shifts of the WET and WIT phases.
Inspired by our theoretical insights on trainability, we propose Critical DropEdge, a connectivity-aware and graph-adaptive sampling method, to alleviate the exponential decay problem more fundamentally.
Our bidirectional dynamic fusion strategy encourages the interaction of spatial and temporal information in a dynamic manner.
Ranked #12 on Video Polyp Segmentation on SUN-SEG-Easy (Unseen)
The success of learning-based light field saliency detection is heavily dependent on how a comprehensive dataset can be constructed for higher generalizability of models, how high dimensional light field data can be effectively exploited, and how a flexible model can be designed to achieve versatility for desktop computers and mobile devices.
A probabilistic exploration enhancement method is accordingly devised to encourage intelligent exploration during the architecture search in the latent space and to avoid local optima.
The thoracic abnormality detection challenge is organized by the Deepwise AI Lab.
no code implementations • 3 Sep 2020 • Holger R. Roth, Ken Chang, Praveer Singh, Nir Neumark, Wenqi Li, Vikash Gupta, Sharut Gupta, Liangqiong Qu, Alvin Ihsani, Bernardo C. Bizzo, Yuhong Wen, Varun Buch, Meesam Shah, Felipe Kitamura, Matheus Mendonça, Vitor Lavor, Ahmed Harouni, Colin Compas, Jesse Tetreault, Prerna Dogra, Yan Cheng, Selnur Erdal, Richard White, Behrooz Hashemian, Thomas Schultz, Miao Zhang, Adam McCarthy, B. Min Yun, Elshaimaa Sharaf, Katharina V. Hoebel, Jay B. Patel, Bryan Chen, Sean Ko, Evan Leibovitz, Etta D. Pisano, Laura Coombs, Daguang Xu, Keith J. Dreyer, Ittai Dayan, Ram C. Naidu, Mona Flores, Daniel Rubin, Jayashree Kalpathy-Cramer
Building robust deep learning-based models requires large quantities of diverse training data.
The explicitly extracted edge information goes together with saliency to give more emphasis to the salient regions and object boundaries.
Ranked #19 on RGB-D Salient Object Detection on NJU2K
Earthquake early warning systems are required to report earthquake locations and magnitudes as quickly as possible before the damaging S wave arrival to mitigate seismic hazards.
In this paper, we formulate supernet training in One-Shot NAS as a constrained continual-learning optimization problem, in which learning the current architecture should not degrade the performance of previous architectures during supernet training.
Depth data containing a preponderance of discriminative power in location have been proven beneficial for accurate saliency prediction.
Ranked #15 on RGB-D Salient Object Detection on NJU2K (using extra training data)
Our solution is based on a strong baseline with bag of tricks (BoT-BS) proposed in person ReID.
In this paper, we present a deep-learning-based method where a novel memory-oriented decoder is tailored for light field saliency detection.
The best architecture obtained by our algorithm with the same search space achieves a state-of-the-art test error rate of 2.51% on CIFAR-10 with only 7.5 hours of search time on a single GPU, and a validation perplexity of 60.02 and a test perplexity of 57.36 on PTB.
Furthermore, a kernel trick is developed to reduce computational complexity and learn a nonlinear subset of the unknown function when applying SIR to extremely high-dimensional BO.
In this paper, a non-stationary kernel is proposed which allows the surrogate model to adapt to functions whose smoothness varies with the spatial location of inputs, and a multi-level convolutional neural network (ML-CNN) is built for lung nodule classification whose hyperparameter configuration is optimized by using the proposed non-stationary kernel based Gaussian surrogate model.
By doing so, the ancient painting processing problems become natural image processing problems and models trained on natural images can be directly applied to the transferred paintings.
In this paper, we first test state-of-the-art deep learning semantic segmentation classifiers for LUCC mapping with 7 categories in the TGRA area using RapidEye 5 m resolution data.
Trivial events, e.g., coughs, laughs, and sniffs, are ubiquitous in human-to-human conversations.
These metrics are then fed into a neural network for supervised learning, whose weights are obtained by a hybrid PSO and BP algorithm.