We show that, under this new taxonomy, many of the applications in which transfer learning has been shown to be ineffective, or even to hinder performance, are to be expected once the source and target datasets and the techniques used are taken into account.
To this end, we present a review, in the form of a taxonomy, of existing work on skeleton-based action recognition.
This article aims to use graphics engines to simulate large amounts of training data that come with free annotations and closely resemble real-world data.
Because of the expensive data collection process, micro-expression (MiE) datasets are generally much smaller in scale than those in other computer vision fields, rendering large-scale training less feasible.
Varying the proportions of male and female faces in the training data can substantially affect behavior on the test data: the seemingly obvious choice of a 50:50 split was not the best for reducing biased behavior on female faces in this dataset, achieving only 71% unbiased behavior compared with our top rate of 84%.
A model is either pre-trained or not pre-trained.
Such GNNs are incapable of learning the relative positions of nodes within a graph.
The prevalent convolutional neural network (CNN) based image denoising methods extract features of images to restore the clean ground truth, achieving high denoising accuracy.
Recent skeleton-based action recognition methods extract features from 3D joint coordinates as spatial-temporal cues, using these representations in a graph neural network for feature fusion to boost recognition performance.
InvDN transforms the noisy input into a low-resolution clean image and a latent representation containing noise.
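InvDN's actual architecture is not reproduced here, but the core idea of an invertible split into a low-resolution band plus a latent part can be illustrated with a Haar-style squeeze (a simplification chosen for this sketch, not InvDN's real transform):

```python
import numpy as np

def haar_squeeze(x):
    """Invertibly split an image (H, W) into a half-resolution
    average band and three detail bands (the 'latent' part)."""
    a = x[0::2, 0::2]; b = x[0::2, 1::2]
    c = x[1::2, 0::2]; d = x[1::2, 1::2]
    low  = (a + b + c + d) / 2.0          # low-resolution band
    lat1 = (a - b + c - d) / 2.0          # detail / latent bands
    lat2 = (a + b - c - d) / 2.0
    lat3 = (a - b - c + d) / 2.0
    return low, (lat1, lat2, lat3)

def haar_unsqueeze(low, latents):
    """Exact inverse: reconstruct the original image."""
    lat1, lat2, lat3 = latents
    a = (low + lat1 + lat2 + lat3) / 2.0
    b = (low - lat1 + lat2 - lat3) / 2.0
    c = (low + lat1 - lat2 - lat3) / 2.0
    d = (low - lat1 - lat2 + lat3) / 2.0
    h, w = low.shape
    x = np.empty((2 * h, 2 * w))
    x[0::2, 0::2] = a; x[0::2, 1::2] = b
    x[1::2, 0::2] = c; x[1::2, 1::2] = d
    return x
```

Because the transform is exactly invertible, no information is lost in the split; a learned model can then manipulate the latent bands (e.g., resample them to discard noise) before inverting.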
Instead of constraining the translation process with a reference image, users can command the model to retouch the generated images by incorporating semantic information into the generation process.
To illustrate this, we first investigate the performance of our networks with supervised learning, then with unsupervised learning.
Smiles play a vital role in the understanding of social interactions within different communities, and reveal people's state of mind in both genuine and deceptive ways.
Sentence compression is a Natural Language Processing (NLP) task aimed at shortening original sentences and preserving their key information.
This inspired our research, which explores the performance of two pixel-to-pixel translation models, Pix2Pix and CycleGAN, in frontal facial synthesis.
Identifying the information-lossless condition for deep neural architectures is important, because tasks such as image restoration require keeping as much of the input's detailed information as possible.
For example, acted anger can be expressed when the subject is not genuinely angry, with the aim of manipulating the observer.
Between synthetic and real data there is a two-level domain gap, i.e., a content-level gap and an appearance-level gap.
We show that optimising the parameters of classification neural networks with softmax cross-entropy is equivalent to maximising the mutual information between inputs and labels under the balanced data assumption.
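The intuition behind this equivalence can be sketched in one line (assuming a $K$-class problem and that the cross-entropy loss estimates the conditional entropy $H(Y \mid X)$):

```latex
I(X;Y) = H(Y) - H(Y \mid X) = \log K - H(Y \mid X),
```

since balanced data fixes $H(Y) = \log K$, and the expected cross-entropy $\mathbb{E}[-\log q(y \mid x)]$ upper-bounds $H(Y \mid X)$, so minimising the loss maximises a lower bound on the mutual information $I(X;Y)$.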
We study the factors that influence the perception of group-level cohesion and propose methods for estimating the human-perceived cohesion on the group cohesiveness scale.
Wagner's modularity-inducing problem domain is a key contribution to the study of the evolution of modularity, in both evolutionary theory and evolutionary computation.
This paper describes our approach, called EPUTION, for the open trial of the SemEval-2018 Task 2, Multilingual Emoji Prediction.