We introduce an approach for selecting objects in neural volumetric 3D representations, such as multi-plane images (MPI) and neural radiance fields (NeRF).
We then develop a multi-scale neural network and show that when properly trained using our new dataset, this neural network can already handle dynamic scenes to some extent.
Most of the existing work on automatic facial expression analysis focuses on discrete emotion recognition, or facial action unit detection.
We combine the advantages of these two methods by training a deep network that learns to synthesize video frames by flowing pixel values from existing ones, which we call deep voxel flow.
Ranked #2 on Video Prediction on DAVIS 2017
High-level manipulation of facial expressions in images --- such as changing a smile to a neutral expression --- is challenging because facial expression changes are highly non-linear, and vary depending on the appearance of the face.
As font is one of the core design concepts, automatic font identification and similar font suggestion from an image or photo has been on the wish list of many designers.
Ranked #1 on Font Recognition on VFR-Wild
We address a challenging fine-grain classification problem: recognizing a font style from an image of text.
We present a domain adaption framework to address a domain mismatch between synthetic training and real-world testing data.
These tools are complementary; a user may search for "graceful" fonts, select a reasonable one, and then refine the results from a list of fonts similar to the selection.
This paper addresses the large-scale visual font recognition (VFR) problem, which aims at automatic identification of the typeface, weight, and slope of the text in an image or photo without any knowledge of content.
Ranked #1 on Font Recognition on VFR-447
The style of an image plays a significant role in how it is viewed, but style has received little attention in computer vision research.