In this paper, we question whether self-supervised learning provides new properties to Vision Transformer (ViT) that stand out compared to convolutional networks (convnets).
Ranked #1 on Copy Detection on Copydays strong subset
Inductive transfer learning has greatly impacted computer vision, but existing approaches in NLP still require task-specific modifications and training from scratch.
Ranked #3 on Text Classification on TREC-6
Although manipulating the latent vectors controls the synthesized outputs, editing real images with GANs suffers from either i) time-consuming optimization for projecting real images to the latent vectors or ii) inaccurate embedding through an encoder.
This paper proposes a novel method of learning by predicting view assignments with support samples (PAWS).
Ranked #312 on Image Classification on ImageNet
However, such an upgrade is not applicable to instance segmentation, due to its significantly higher output dimensions compared to object detection.
Ranked #11 on Instance Segmentation on COCO test-dev (APS metric)
Invariance and equivariance to the rotation group have been widely discussed in the 3D deep learning community for point clouds.
This assumption greatly simplifies the learning problem, factorizing the dynamics into a nonreactive world model and a low-dimensional and compact forward model of the ego-vehicle.
Ranked #1 on Autonomous Driving on CARLA Leaderboard
But wider adoption is still a challenge because there is no easy, ready-to-use deep learning library comparable to scikit-learn.
To facilitate animation and prevent the leakage of the shape of the driving object, we disentangle shape and pose of objects in the region space.
Ranked #1 on Video Reconstruction on Tai-Chi-HD (256)