Similar to the perspective, or design, position taken by Turing, we provide a solution for achieving HLR AI machines without constructing them or conducting real experiments.
Towards this end, we propose Multi-Channel Correction network (MCCNet), which can be trained to fuse the exemplar style features and input content features for efficient style transfer while naturally maintaining the coherence of input videos.
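To make the fusion step concrete, below is a minimal sketch of one way exemplar style features can be fused with input content features channel by channel; the `ChannelWiseFusion` module and its AdaIN-style re-normalization are illustrative assumptions, not the actual MCCNet design.

```python
# Hypothetical sketch: fuse exemplar style features into content features
# channel by channel. The re-normalization toward style statistics and the
# 1x1 mixing convolution are assumptions for illustration only.
import torch
import torch.nn as nn

class ChannelWiseFusion(nn.Module):
    """Fuse style feature statistics into content features, channel by channel."""
    def __init__(self, channels: int):
        super().__init__()
        # A 1x1 convolution to mix the re-normalized channels (assumption).
        self.mix = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, content: torch.Tensor, style: torch.Tensor) -> torch.Tensor:
        # content, style: (batch, channels, height, width) feature maps
        # extracted by a shared encoder (e.g., a VGG-style network).
        c_mean = content.mean(dim=(2, 3), keepdim=True)
        c_std = content.std(dim=(2, 3), keepdim=True) + 1e-6
        s_mean = style.mean(dim=(2, 3), keepdim=True)
        s_std = style.std(dim=(2, 3), keepdim=True) + 1e-6
        # Re-normalize content statistics toward the style statistics.
        fused = (content - c_mean) / c_std * s_std + s_mean
        return self.mix(fused)
```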
Currently, there are few methods that can perform both multimodal and multi-domain stylization simultaneously.
Arbitrary style transfer is a significant topic with both research value and application prospects.
However, the detection of oriented and densely packed objects remains challenging for the following inherent reasons: (1) the receptive fields of neurons are all axis-aligned and of the same shape, whereas objects usually have diverse shapes and align along various directions; (2) detection models are typically trained with generic knowledge and may not generalize well to specific objects at test time; (3) the limited availability of datasets hinders progress on this task.
In this paper, we revisit the problem of image aesthetic assessment from the self-supervised feature learning perspective.
The main reason for catastrophic forgetting is that past concept data are not available and neural weights change while new concepts are learned incrementally.
The TargetNet module is a neural network for solving a specific task and the MetaNet module aims at learning to generate functional weights for TargetNet by observing training samples.
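A minimal hypernetwork-style sketch of this arrangement is given below: MetaNet summarizes a batch of observed training samples and emits the weights of a small functional TargetNet layer. The layer sizes, the mean-pooled sample summary, and the single-linear-layer TargetNet are assumptions for illustration, not the paper's actual architecture.

```python
# Hypothetical MetaNet/TargetNet sketch: MetaNet observes training samples
# and generates the functional weights of a small TargetNet classifier.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MetaNet(nn.Module):
    def __init__(self, in_dim: int, target_in: int, target_out: int):
        super().__init__()
        self.encoder = nn.Linear(in_dim, 128)
        # Emit a flat parameter vector: TargetNet weight matrix plus bias.
        self.head = nn.Linear(128, target_in * target_out + target_out)
        self.target_in, self.target_out = target_in, target_out

    def forward(self, support_x: torch.Tensor):
        # Summarize the observed training samples by their mean embedding.
        summary = F.relu(self.encoder(support_x)).mean(dim=0)
        params = self.head(summary)
        w = params[: self.target_in * self.target_out].view(self.target_out, self.target_in)
        b = params[self.target_in * self.target_out :]
        return w, b

def target_net(x: torch.Tensor, w: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    # TargetNet here is a single functional linear layer using generated weights.
    return F.linear(x, w, b)

meta = MetaNet(in_dim=64, target_in=64, target_out=10)
support = torch.randn(32, 64)          # observed training samples
query = torch.randn(8, 64)             # task inputs
w, b = meta(support)
logits = target_net(query, w, b)       # shape (8, 10)
```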
"Ge She Zhi Zhi" is a novel saying in Chinese, stated as "To investigate things from the underlying principle(s) and to acquire knowledge in the form of mathematical representations".
In this study, we present the Gourmet Photography Dataset (GPD), which is the first large-scale dataset for aesthetic assessment of food photographs.
Aggregation structures with explicit information, such as image attributes and scene semantics, are effective and popular in intelligent systems for assessing the aesthetics of visual data.
The majority of methods directly apply supervised learning techniques to AU intensity estimation while few methods exploit unlabeled samples to improve the performance.
Facial action unit (AU) intensity estimation plays an important role in affective computing and human-computer interaction.
To alleviate this issue, we propose a knowledge-driven method for jointly learning multiple AU classifiers without any AU annotation by leveraging prior probabilities on AUs, including expression-independent and expression-dependent AU probabilities.
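One plausible way such priors could act as supervision, sketched under stated assumptions below, is to push the batch-average predicted AU probabilities for faces with a given expression toward the expression-dependent prior probabilities; the prior table, the `prior_matching_loss` helper, and the cross-entropy matching are hypothetical, not the paper's exact formulation.

```python
# Hedged sketch: use expression-dependent AU priors as weak supervision by
# matching batch-average predicted AU probabilities to the priors.
import torch
import torch.nn.functional as F

# Hypothetical priors: p(AU_j = 1 | expression e), shape (num_expressions, num_aus).
au_priors = torch.tensor([
    [0.9, 0.1, 0.7],   # e.g., "happy"
    [0.1, 0.8, 0.2],   # e.g., "sad"
])

def prior_matching_loss(au_logits: torch.Tensor, expr_labels: torch.Tensor) -> torch.Tensor:
    """au_logits: (batch, num_aus) AU predictions; expr_labels: (batch,) expression ids."""
    au_probs = torch.sigmoid(au_logits)
    loss = au_logits.new_zeros(())
    for e in expr_labels.unique():
        mask = expr_labels == e
        mean_pred = au_probs[mask].mean(dim=0)   # average predicted AU activation
        target = au_priors[e]                    # expression-dependent prior
        loss = loss + F.binary_cross_entropy(mean_pred, target)
    return loss / expr_labels.unique().numel()
```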
In this work, we introduce the notion of image retargetability to describe how well a particular image can be handled by content-aware image retargeting.
Based on their cost functions, we conclude that the G-mean of per-class accuracy rates and the BER are suitable measures because they exhibit "proper" cost behavior: a misclassification from a small class incurs a greater cost than one from a large class.
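The toy computation below illustrates this cost behavior: with one small and one large class, a single additional misclassification in the small class lowers the G-mean (and raises the BER) far more than one in the large class. The class sizes are made up for illustration.

```python
# Numerical check of the "proper cost" property of the G-mean and the BER.
import numpy as np

def per_class_accuracy(correct, total):
    return np.asarray(correct, dtype=float) / np.asarray(total, dtype=float)

def g_mean(correct, total):
    acc = per_class_accuracy(correct, total)
    return float(np.prod(acc) ** (1.0 / len(acc)))

def balanced_error_rate(correct, total):
    acc = per_class_accuracy(correct, total)
    return float(np.mean(1.0 - acc))

total = [10, 1000]                   # small class vs. large class
base = [9, 900]                      # correctly classified per class
print(g_mean(base, total), balanced_error_rate(base, total))          # 0.900, 0.100
print(g_mean([8, 900], total), balanced_error_rate([8, 900], total))  # +1 error in small class
print(g_mean([9, 899], total), balanced_error_rate([9, 899], total))  # +1 error in large class
```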
Instead of employing the minimum spanning tree (MST) and its variants, a new tree structure, "Segment-Tree", is proposed for non-local matching cost aggregation.
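For intuition, the sketch below shows the generic two-pass, tree-based non-local aggregation that such tree structures support: costs are accumulated leaf-to-root and then propagated root-to-leaf with similarity weights exp(-edge_weight / sigma). The construction of the Segment-Tree itself is omitted, and the toy graph and sigma value are assumptions for illustration.

```python
# Generic tree-based non-local cost aggregation (tree construction omitted).
import math
from collections import defaultdict

def aggregate_costs(num_nodes, root, parent, edge_weight, cost, sigma=0.1):
    """parent[v]: parent of v in the tree; edge_weight[v]: weight of edge (v, parent[v]);
    cost[v]: raw matching cost at node v for a single disparity."""
    children = defaultdict(list)
    for v in range(num_nodes):
        if v != root:
            children[parent[v]].append(v)
    sim = {v: math.exp(-edge_weight[v] / sigma) for v in range(num_nodes) if v != root}

    # Pass 1 (leaf to root): accumulate costs from subtrees.
    up = list(cost)
    order, stack = [], [root]
    while stack:                       # iterative DFS to get a top-down order
        v = stack.pop()
        order.append(v)
        stack.extend(children[v])
    for v in reversed(order):          # bottom-up accumulation
        for c in children[v]:
            up[v] += sim[c] * up[c]

    # Pass 2 (root to leaf): propagate aggregated costs back down the tree.
    agg = [0.0] * num_nodes
    agg[root] = up[root]
    for v in order:
        for c in children[v]:
            agg[c] = sim[c] * agg[v] + (1.0 - sim[c] ** 2) * up[c]
    return agg

# Toy 4-pixel chain 0-1-2-3 rooted at pixel 0.
parent = {1: 0, 2: 1, 3: 2}
edge_weight = {1: 0.05, 2: 0.4, 3: 0.05}
print(aggregate_costs(4, 0, parent, edge_weight, cost=[1.0, 0.0, 0.0, 1.0]))
```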