Search Results

MaskConver: Revisiting Pure Convolution Model for Panoptic Segmentation

tensorflow/models 11 Dec 2023

With ResNet50 backbone, our MaskConver achieves 53.6% PQ on the COCO panoptic val set, outperforming the modern convolution-based model, Panoptic FCN, by 9.3% as well as transformer-based models such as Mask2Former (+1.7% PQ) and kMaX-DeepLab (+0.6% PQ).

Decoder model +1

A Simple Single-Scale Vision Transformer for Object Localization and Instance Segmentation

tensorflow/models 17 Dec 2021

In this paper, we comprehensively study three architecture design choices on ViT -- spatial reduction, doubled channels, and multiscale features -- and demonstrate that a vanilla ViT architecture can fulfill this goal without handcrafting multiscale features, maintaining the original ViT design philosophy.

Image Classification Instance Segmentation +6

Towards End-to-End Unified Scene Text Detection and Layout Analysis

tensorflow/models CVPR 2022

In this paper, we bring them together and introduce the task of unified scene text detection and layout analysis.

Document Layout Analysis Scene Text Detection +1

Contextualized Spatio-Temporal Contrastive Learning with Self-Supervision

tensorflow/models CVPR 2022

Modern self-supervised learning algorithms typically enforce persistency of instance representations across views.

Action Recognition Contrastive Learning +4

Proper Reuse of Image Classification Features Improves Object Detection

tensorflow/models CVPR 2022

A common practice in transfer learning is to initialize the downstream model weights by pre-training on a data-abundant upstream task.

Classification Image Classification +4

Adversarial Training Methods for Semi-Supervised Text Classification

tensorflow/models 25 May 2016

We extend adversarial and virtual adversarial training to the text domain by applying perturbations to the word embeddings in a recurrent neural network rather than to the original input itself.

General Classification Semi-Supervised Text Classification +2

MobileNetV2: Inverted Residuals and Linear Bottlenecks

tensorflow/models CVPR 2018

In this paper we describe a new mobile architecture, MobileNetV2, that improves the state of the art performance of mobile models on multiple tasks and benchmarks as well as across a spectrum of different model sizes.

Image Classification Image Segmentation +4

AutoAugment: Learning Augmentation Policies from Data

tensorflow/models 24 May 2018

In our implementation, we have designed a search space where a policy consists of many sub-policies, one of which is randomly chosen for each image in each mini-batch.

Domain Generalization Fine-Grained Image Classification +1

Identity Mappings in Deep Residual Networks

tensorflow/models 16 Mar 2016

Deep residual networks have emerged as a family of extremely deep architectures showing compelling accuracy and nice convergence behaviors.

Image Classification

Cognitive Mapping and Planning for Visual Navigation

tensorflow/models CVPR 2017

The accumulated belief of the world enables the agent to track visited regions of the environment.

Visual Navigation