With ResNet50 backbone, our MaskConver achieves 53.6% PQ on the COCO panoptic val set, outperforming the modern convolution-based model, Panoptic FCN, by 9.3%, as well as transformer-based models such as Mask2Former (+1.7% PQ) and kMaX-DeepLab (+0.6% PQ).
Ranked #8 on Panoptic Segmentation on COCO test-dev.
In this paper, we comprehensively study three architecture design choices on ViT -- spatial reduction, doubled channels, and multiscale features -- and demonstrate that a vanilla ViT architecture can fulfill this goal without handcrafting multiscale features, maintaining the original ViT design philosophy.
In this paper, we bring them together and introduce the task of unified scene text detection and layout analysis.
Modern self-supervised learning algorithms typically enforce persistency of instance representations across views.
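As a rough illustration of what enforcing persistency of instance representations across views typically looks like, the sketch below computes a simple cosine-similarity consistency loss between embeddings of two augmented views of the same images. The encoder is passed in as a placeholder; this is a generic sketch of the idea, not the paper's specific objective.

```python
import torch
import torch.nn.functional as F

def cross_view_consistency_loss(encoder, view1, view2):
    """Encourage two augmented views of the same instances to map to
    similar (persistent) representations. Generic sketch only."""
    z1 = F.normalize(encoder(view1), dim=-1)  # (batch, dim) unit-norm embeddings
    z2 = F.normalize(encoder(view2), dim=-1)
    # Maximize cosine similarity between matching instances across views.
    return -(z1 * z2).sum(dim=-1).mean()
```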
A common practice in transfer learning is to initialize the downstream model weights by pre-training on a data-abundant upstream task.
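A minimal sketch of this practice, assuming a torchvision ResNet50 pre-trained on ImageNet as the data-abundant upstream task; the downstream class count is an arbitrary illustrative choice.

```python
import torch.nn as nn
from torchvision import models

# Upstream: weights pre-trained on data-abundant ImageNet classification.
backbone = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)

# Downstream: reuse the pre-trained weights as initialization and replace
# only the task-specific classification head (10 classes is illustrative).
backbone.fc = nn.Linear(backbone.fc.in_features, 10)
```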
We extend adversarial and virtual adversarial training to the text domain by applying perturbations to the word embeddings in a recurrent neural network rather than to the original input itself.
Ranked #22 on Sentiment Analysis on IMDb.
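The sketch below illustrates the embedding-space perturbation described in the adversarial text classification entry above: the gradient of the loss with respect to the word embeddings defines the adversarial direction, and training continues on the perturbed embeddings. It is a hedged sketch, assuming `model` maps embeddings directly to logits; it is not the paper's exact training loop.

```python
import torch
import torch.nn.functional as F

def adversarial_embedding_loss(model, embeddings, labels, epsilon=1.0):
    """Perturb word embeddings (not the raw tokens) in the direction that
    most increases the loss, then compute the loss on the perturbed
    embeddings. `model` is assumed to map embeddings to logits."""
    embeddings = embeddings.detach().requires_grad_(True)
    loss = F.cross_entropy(model(embeddings), labels)
    grad, = torch.autograd.grad(loss, embeddings)
    # L2-normalized adversarial perturbation in embedding space.
    perturbation = epsilon * grad / (grad.norm(dim=-1, keepdim=True) + 1e-12)
    return F.cross_entropy(model(embeddings + perturbation.detach()), labels)
```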
In this paper, we describe a new mobile architecture, MobileNetV2, that improves the state-of-the-art performance of mobile models on multiple tasks and benchmarks as well as across a spectrum of different model sizes.
Ranked #7 on Retinal OCT Disease Classification on OCT2017.
In our implementation, we have designed a search space where a policy consists of many sub-policies, one of which is randomly chosen for each image in each mini-batch.
Ranked #6 on Data Augmentation on ImageNet.
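A short sketch of the sampling scheme described in the data augmentation entry above: each image in a mini-batch gets one randomly chosen sub-policy, and each operation in that sub-policy fires with its own probability. The operation names, probabilities, and magnitudes here are illustrative placeholders, not the learned policy, and `apply_op` is an assumed image-transform dispatcher.

```python
import random

# Hypothetical sub-policies: each is a sequence of
# (operation name, probability, magnitude) triples.
SUB_POLICIES = [
    [("ShearX", 0.9, 4), ("Invert", 0.2, 3)],
    [("Rotate", 0.7, 2), ("TranslateY", 0.3, 9)],
    [("AutoContrast", 0.8, 1), ("Equalize", 0.9, 2)],
]

def augment_batch(images, apply_op):
    """Pick one sub-policy at random for each image in the mini-batch and
    apply its operations with their associated probabilities."""
    out = []
    for img in images:
        sub_policy = random.choice(SUB_POLICIES)
        for op_name, prob, magnitude in sub_policy:
            if random.random() < prob:
                img = apply_op(img, op_name, magnitude)
        out.append(img)
    return out
```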
Deep residual networks have emerged as a family of extremely deep architectures showing compelling accuracy and nice convergence behaviors.
Ranked #20 on Image Classification on Kuzushiji-MNIST.
The accumulated belief of the world enables the agent to track visited regions of the environment.