In this paper, we propose a deep homography alignment network that precisely matches two aerial images by progressively estimating the transformation parameters.
Deep learning has played a major role in the interpretation of dermoscopic images for detecting skin lesions and abnormalities.
As an intuitive assessment metric for explanations, we report the Intersection over Union (IoU) between the visual explanation and the bounding box of the lesions.
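As an illustration, the metric can be computed as below; this is a minimal sketch assuming the explanation is a per-pixel saliency map and the lesion box is given in pixel coordinates (the function name, threshold, and box format are illustrative assumptions, not the paper's code).

```python
import numpy as np

def explanation_iou(saliency, box, threshold=0.5):
    """IoU between a thresholded saliency map and a lesion bounding box.

    saliency: 2-D array of per-pixel explanation scores in [0, 1].
    box: (x1, y1, x2, y2) pixel coordinates of the lesion bounding box.
    """
    expl_mask = saliency >= threshold          # binarize the explanation
    box_mask = np.zeros_like(expl_mask, dtype=bool)
    x1, y1, x2, y2 = box
    box_mask[y1:y2, x1:x2] = True              # rasterize the bounding box
    intersection = np.logical_and(expl_mask, box_mask).sum()
    union = np.logical_or(expl_mask, box_mask).sum()
    return intersection / union if union > 0 else 0.0
```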
In addition, when comparing performance with and without EEG features of overt speech, including the overt-speech EEG features yielded an improvement of 7.42%.
In the classification model, the CNN is responsible for spatial feature extraction and the GRU is responsible for temporal feature extraction.
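A minimal PyTorch sketch of such a CNN-GRU classifier is shown below; the layer sizes, input layout, and hyperparameters are illustrative assumptions rather than the paper's configuration.

```python
import torch
import torch.nn as nn

class CNNGRUClassifier(nn.Module):
    """CNN extracts spatial features per time window; GRU models temporal dynamics."""
    def __init__(self, in_channels=22, n_classes=4, hidden=64):
        super().__init__()
        # 1-D convolution over each time window's channel/sample axes
        self.cnn = nn.Sequential(
            nn.Conv1d(in_channels, 32, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )
        self.gru = nn.GRU(32, hidden, batch_first=True)
        self.fc = nn.Linear(hidden, n_classes)

    def forward(self, x):  # x: (batch, time_steps, channels, samples)
        b, t, c, s = x.shape
        feats = self.cnn(x.view(b * t, c, s)).squeeze(-1)  # spatial features
        feats = feats.view(b, t, -1)
        _, h = self.gru(feats)                             # temporal features
        return self.fc(h[-1])
```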
Wafer map pattern classification is a common approach to quality assurance.
In this paper, we present a video conferencing solution that uses gaze estimation to address these problems.
Moreover, our proposed TINN showed the highest accuracy of 0.93 compared to the previous methods for classifying three different types of mental imagery tasks (MI, VI, and SI).
Communication between humans and a drone using electroencephalogram (EEG) signals is one of the most challenging issues in the BCI domain.
Experimental results also show the superiority of our proposed model compared to other state-of-the-art TTS models with internal and external aligners.
Although recent work on neural vocoders has improved the quality of synthesized audio, a gap remains between generated and ground-truth audio in frequency space.
To address these problems, we propose an acne detection network that consists of three components: Composite Feature Refinement, Dynamic Context Enhancement, and Mask-Aware Multi-Attention.
With this design, we compare FBCNet with state-of-the-art (SOTA) BCI algorithms on four MI datasets: the BCI Competition IV dataset 2a (BCIC-IV-2a), the OpenBMI dataset, and two large datasets from chronic stroke patients.
Recently, research on practical brain-computer interfaces has been actively conducted, especially in ambulatory environments.
To enable a deep learning-based system to be used in the medical domain as a computer-aided diagnosis system, it is essential not only to classify diseases but also to present their locations.
We quantitatively and qualitatively evaluated the proposed method on the VQA v2 dataset and compared it with state-of-the-art methods in terms of answer prediction.
In this paper, we propose a novel framework that simultaneously considers both implicit and explicit representations of human interactions by fusing information from the local image region where the interaction actively occurs, the primitive motion given by the posture of each subject's body parts, and the co-occurrence of overall appearance changes.
As a result, it is possible to assign bipolar relevance scores to the target (positive) and hostile (negative) attributions while keeping each attribution aligned with its importance.
In this work, we introduce a novel method for retrieving aerial images that combines group convolution with an attention mechanism and metric learning, resulting in robustness to rotational variations.
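One common way to realize group convolution over rotations, sketched below, is to share a single convolution across rotated copies of the input and pool over orientations; this is a hedged illustration of the general technique, not the paper's exact architecture (the attention and metric-learning parts are omitted).

```python
import torch
import torch.nn as nn

class RotationGroupConv(nn.Module):
    """Share one convolution across 90-degree rotations of the input and
    max-pool over orientations, yielding features invariant to those rotations."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1)

    def forward(self, x):  # x: (batch, channels, height, width)
        responses = [self.conv(torch.rot90(x, k, dims=(2, 3))) for k in range(4)]
        # Undo each rotation so the feature maps are spatially aligned.
        aligned = [torch.rot90(r, -k, dims=(2, 3)) for k, r in enumerate(responses)]
        return torch.stack(aligned, dim=0).max(dim=0).values
```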
One affinity, for position and motion, is computed using the GMPHD filter, and the other, for appearance, is computed using the responses of a single-object tracker such as a kernelized correlation filter.
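A minimal sketch of how such a fusion might look is given below; the distance-based motion affinity, gating radius, and linear weighting are illustrative assumptions, not the paper's exact formulation.

```python
def association_score(track, detection, kcf_response, w_motion=0.5):
    """Fuse position/motion and appearance affinities for data association.

    track, detection: objects exposing the GMPHD-predicted state and the
        observed position (attribute names here are hypothetical).
    kcf_response: peak response of the kernelized correlation filter when
        the track's template is evaluated at the detection, in [0, 1].
    """
    # Position/motion affinity: how well the detection matches the
    # GMPHD-predicted state (here a simple distance-based score).
    dist = ((track.pred_x - detection.x) ** 2 +
            (track.pred_y - detection.y) ** 2) ** 0.5
    motion_affinity = max(0.0, 1.0 - dist / track.gate_radius)

    # Appearance affinity comes directly from the tracker response.
    appearance_affinity = kcf_response

    return w_motion * motion_affinity + (1.0 - w_motion) * appearance_affinity
```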
Our findings showed that EEG signals can differentiate visual stimuli of different semantic categories, even at the exemplar level, with high classification accuracy; this demonstrates their viability for application in a real-world BMI.
The proposed framework would help increase the classification performance of imagined speech with a small amount of data and support the implementation of an intuitive communication system.
The masking step aims to select the important features of the input data that lead it to be classified as the target class.
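A minimal sketch of the masking idea, assuming a sigmoid-activated mask applied elementwise to the input before classification (the names and the exact objective are illustrative assumptions):

```python
import torch

def masked_class_score(model, x, mask_logits, target_class):
    """Score the target class on a masked copy of the input.

    model: a trained classifier returning class logits.
    x: input tensor, e.g. (1, C, H, W).
    mask_logits: learnable tensor broadcastable to x; the sigmoid keeps the
        mask in [0, 1] so it softly selects input features.
    """
    mask = torch.sigmoid(mask_logits)
    logits = model(x * mask)          # classify only the selected features
    return logits[0, target_class]    # maximize this w.r.t. mask_logits
```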
As a result, the reconstructed signals preserved important components such as the N200 and P300, similar to the ERPs observed during standing.
As a result, the averaged classification performance of the proposed architecture for 4 classes from 16 channels was 67.50% across all subjects.
Seven participants performed two memory tasks (word-pairs and visuo-spatial) before and after the nap to assess memory consolidation during unconsciousness.
In a real-time BCI environment, a calibration procedure is particularly necessary for each user and each session.
These results showed that ear-EEG signals can predict memory-task performance and could be used to predict memory retrieval in a practical brain-computer interface.
For five-class sleep stage classification, the classification performance was 85.6% and 91.1% using the raw input data and the proposed input, respectively.
Few-shot learning aims to classify unseen classes with a few training examples.
Online temporal action localization from an untrimmed video stream is a challenging problem in computer vision.
First-person interaction recognition is a challenging task because of unstable video conditions resulting from the camera wearer's movement.
As a result, the training process of the deep network is regularized and the network becomes robust to the variance of aerial images.
Hence, we could confirm the feasibility of an EEG-based drone swarm control system for performing high-level tasks.
A brain-computer interface (BCI) provides a direct communication pathway between a user and external devices.
In addition, we proposed new policies (i.e., frequency warping, loudness control, and time-length control) for more data variation.
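A minimal sketch of such policies on a raw waveform is given below; using librosa and approximating frequency warping with a small pitch shift are assumptions for illustration, and the parameter ranges are arbitrary, not the paper's settings.

```python
import numpy as np
import librosa

def augment(y, sr, rng=None):
    """Apply the three policies: frequency warping (approximated here by a
    small pitch shift), loudness control, and time-length control."""
    rng = rng or np.random.default_rng()
    # Frequency warping: shift pitch by up to one semitone.
    y = librosa.effects.pitch_shift(y, sr=sr, n_steps=rng.uniform(-1.0, 1.0))
    # Loudness control: apply a random gain.
    y = y * rng.uniform(0.7, 1.3)
    # Time-length control: stretch or compress without changing pitch.
    y = librosa.effects.time_stretch(y, rate=rng.uniform(0.9, 1.1))
    return y
```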
Here, a model is proposed that bridges neuroscience, machine learning, and evolutionary algorithms to evolve individual soma and synaptic compartment models of neurons in a scalable manner.
To tackle this issue, in this paper, we propose an explanation method that visualizes the regions that are undesirable for classifying an image as the target class.
As Deep Neural Networks (DNNs) have demonstrated superhuman performance in a variety of fields, there is an increasing interest in understanding the complex internal mechanisms of DNNs.
Many real-world applications of reinforcement learning require an agent to select optimal actions from continuous spaces.
In this paper, we study the role of context in existing state-of-the-art detection and segmentation approaches.