Trending Research

Adversarial Open Domain Adaption for Sketch-to-Photo Synthesis

12 Apr 2021 Mukosame/Anime2Sketch

In this paper, we explore open-domain sketch-to-photo translation, which aims to synthesize a realistic photo from a freehand sketch with its class label, even if the sketches of that class are missing in the training data.

DOMAIN ADAPTATION

730
3.46 stars / hour

Do You Even Need Attention? A Stack of Feed-Forward Layers Does Surprisingly Well on ImageNet

6 May 2021 lukemelas/do-you-even-need-attention

These results indicate that aspects of vision transformers other than attention, such as the patch embedding, may be more responsible for their strong performance than previously thought.

IMAGE CLASSIFICATION

301
2.77 stars / hour

ViLT: Vision-and-Language Transformer Without Convolution or Region Supervision

5 Feb 2021 dandelin/vilt

Vision-and-Language Pretraining (VLP) has improved performance on various joint vision-and-language downstream tasks.

IMAGE-TO-TEXT RETRIEVAL, TEXT-TO-IMAGE RETRIEVAL, VISUAL QUESTION ANSWERING, VISUAL REASONING

64
2.00 stars / hour

LeViT: a Vision Transformer in ConvNet's Clothing for Faster Inference

2 Apr 2021 facebookresearch/LeViT

We design a family of image classification architectures that optimize the trade-off between accuracy and efficiency in a high-speed regime.

CLASSIFICATION, IMAGE CLASSIFICATION

141
1.79 stars / hour

MLP-Mixer: An all-MLP Architecture for Vision

4 May 2021 lucidrains/mlp-mixer-pytorch

Convolutional Neural Networks (CNNs) are the go-to model for computer vision.

Ranked #9 on Image Classification on ImageNet (using extra training data)

IMAGE CLASSIFICATION

191
1.60 stars / hour

Emerging Properties in Self-Supervised Vision Transformers

29 Apr 2021 facebookresearch/dino

In this paper, we question if self-supervised learning provides new properties to Vision Transformer (ViT) that stand out compared to convolutional networks (convnets).

COPY DETECTION, SELF-SUPERVISED IMAGE CLASSIFICATION, SELF-SUPERVISED LEARNING, SEMANTIC SEGMENTATION, VIDEO OBJECT DETECTION

1,807
1.49 stars / hour

Beyond Self-attention: External Attention using Two Linear Layers for Visual Tasks

5 May 2021 MenghaoGuo/-EANet

Attention mechanisms, especially self-attention, play an increasingly important role in deep feature representation in visual tasks.

IMAGE CLASSIFICATION, IMAGE GENERATION, POINT CLOUD CLASSIFICATION, SEMANTIC SEGMENTATION

74
0.85 stars / hour

RepMLP: Re-parameterizing Convolutions into Fully-connected Layers for Image Recognition

5 May 2021 DingXiaoH/RepMLP

We propose RepMLP, a multi-layer-perceptron-style neural network building block for image recognition, which is composed of a series of fully-connected (FC) layers.

FACE RECOGNITION, IMAGE CLASSIFICATION, SEMANTIC SEGMENTATION

77
0.85 stars / hour

An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

ICLR 2021 google-research/vision_transformer

While the Transformer architecture has become the de-facto standard for natural language processing tasks, its applications to computer vision remain limited.

Ranked #1 on Fine-Grained Image Classification on Oxford-IIIT Pets (using extra training data)

DOCUMENT IMAGE CLASSIFICATION, FINE-GRAINED IMAGE CLASSIFICATION

2,482
0.67 stars / hour