Given a source face image and a sequence of sparse face landmarks, our goal is to generate a video of the face imitating the motion of the landmarks.
However, it has not yet been well studied how well a pure transformer-based approach can perform on image segmentation.
Here we propose a suite of Markovian model reduction methods with varying levels of complexity and apply them to spiking network models exhibiting heterogeneous dynamical regimes, ranging from homogeneous firing to strong synchrony in the gamma band.
In this paper, we propose a generalization of Adam, called Adambs, that also adapts to different training examples based on their importance to the model's convergence.
The GI unit is further improved by the SC-loss, which enhances the semantic representations over the exemplar-based semantic graph.
However, classification networks are dominated by the discriminative portion, so directly applying classification networks to scene parsing will result in inconsistent parsing predictions within one instance and among instances of the same category.
To address this issue, we propose a method called Untraceable GAN, which uses a novel source classifier to differentiate which domain an image was translated from and to determine whether the translated image still retains the characteristics of the source domain.
Therefore, it can capture partial information and enlarge the receptive field of filters simultaneously without introducing extra parameters.
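The sentence above does not name the mechanism, but a well-known technique with exactly this property, enlarging the receptive field of a filter without adding parameters, is dilated (atrous) convolution. The sketch below is only an illustration of that general idea, not the paper's actual method: a 1-D convolution whose kernel taps are spaced `dilation` samples apart, so the effective span grows while the weight count stays fixed.

```python
def dilated_conv1d(x, w, dilation=1):
    """1-D valid convolution with a dilation factor.

    The effective kernel span is (len(w) - 1) * dilation + 1, so the
    receptive field grows with the dilation rate while the number of
    parameters (len(w)) stays the same.
    """
    k = len(w)
    span = (k - 1) * dilation + 1
    out_len = len(x) - span + 1
    return [
        sum(w[j] * x[i + j * dilation] for j in range(k))
        for i in range(out_len)
    ]

x = [float(i) for i in range(10)]
w = [1.0, 1.0, 1.0]  # 3 parameters regardless of dilation

y1 = dilated_conv1d(x, w, dilation=1)  # receptive field of 3 samples
y2 = dilated_conv1d(x, w, dilation=2)  # receptive field of 5 samples, same 3 weights
```

Here `y2[0]` sums `x[0] + x[2] + x[4]`: the filter sees a wider window by skipping intermediate samples, which is the "partial information" trade-off the sentence alludes to.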
To tackle this problem, we propose a novel Context Guided Network (CGNet), a lightweight and efficient network for semantic segmentation.