Based on the observation that the depth complexity in local image regions is lower than that over the entire image, we split an MPI into many small, tiled regions, each with only a few depth planes.
The product carbon footprint (PCF) is crucial for decarbonizing the supply chain, as it measures the direct and indirect greenhouse gas emissions caused by all activities during the product's life cycle.
Among existing approaches for avatar stylization, direct optimization methods can produce excellent results for arbitrary styles but they are unpleasantly slow.
Scene boundary detection breaks down long videos into meaningful story-telling units and plays a crucial role in high-level video understanding.
In this paper, we propose to measure the task dominance degree of a parameter by the total updates of each task on this parameter.
Virtual reality (VR) headsets provide an immersive, stereoscopic visual experience, but at the cost of blocking users from directly observing their physical environment.
Image view synthesis has seen great success in reconstructing photorealistic visuals, thanks to deep learning and various novel representations.
The delayed feedback problem is one of the imperative challenges in online advertising, which is caused by the highly diversified feedback delay of a conversion varying from a few minutes to several days.
Existing Multi-Plane Image (MPI) based view-synthesis methods generate an MPI aligned with the input view using a fixed number of planes in one forward pass.
This paper introduces a novel multi-task model called Mixture of Virtual-Kernel Experts (MVKE) to learn user preferences on various actions and topics unitedly.
Text-based visual question answering (VQA) requires to read and understand text in an image to correctly answer a given question.
Comparative results demonstrate that the proposed graph clustering algorithm is accurate yet efficient for large networks, which also means that it can be further used to evaluate the intra-cluster and inter-cluster trusts on large networks.
First, pairs of hazy and clean images are used as inputs to supervise the encoding process and obtain high-quality feature maps.
Due to higher resolutions and refresh rates, as well as more photorealistic effects, real-time rendering has become increasingly challenging for video games and emerging virtual reality headsets.
It is often observed that the probabilistic predictions given by a machine learning model can disagree with averaged actual outcomes on specific subsets of data, which is also known as the issue of miscalibration.
Recognizing text in the wild is a really challenging task because of complex backgrounds, various illuminations and diverse distortions, even with deep neural networks (convolutional neural networks and recurrent neural networks).
Then we extend the model family to a variety of bayesian online models with increasing feature embedding capabilities, such as Sparse-MLP, FM-MLP and FFM-MLP.
Recently, several discriminative learning approaches have been proposed for effective image restoration, achieving convincing trade-off between image quality and computational efficiency.
Continuous-wave time-of-flight (ToF) cameras show great promise as low-cost depth image sensors in mobile applications.
The functional difference between a diffuse wall and a mirror is well understood: one scatters back into all directions, and the other one preserves the directionality of reflected light.