A Simple Single-Scale Vision Transformer for Object Localization and Instance Segmentation

tensorflow/models 17 Dec 2021

In this paper, we comprehensively study three architecture design choices on ViT -- spatial reduction, doubled channels, and multiscale features -- and demonstrate that a vanilla ViT architecture can fulfill this goal without handcrafting multiscale features, maintaining the original ViT design philosophy.

Towards End-to-End Unified Scene Text Detection and Layout Analysis

tensorflow/models CVPR 2022

In this paper, we bring scene text detection and layout analysis together and introduce the task of unified scene text detection and layout analysis.

Contextualized Spatio-Temporal Contrastive Learning with Self-Supervision

tensorflow/models CVPR 2022

Modern self-supervised learning algorithms typically enforce persistency of instance representations across views.

Proper Reuse of Image Classification Features Improves Object Detection

tensorflow/models CVPR 2022

A common practice in transfer learning is to initialize the downstream model weights by pre-training on a data-abundant upstream task.

Adversarial Training Methods for Semi-Supervised Text Classification

tensorflow/models 25 May 2016

We extend adversarial and virtual adversarial training to the text domain by applying perturbations to the word embeddings in a recurrent neural network rather than to the original input itself.
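
The idea of perturbing embeddings rather than raw tokens can be illustrated with a minimal sketch. It assumes the perturbation is the loss gradient rescaled to a fixed L2 norm `epsilon`, which is a common form of adversarial perturbation; the function names, the toy embedding values, and the made-up gradient are all hypothetical and are not taken from the tensorflow/models code.

```python
import math

def adversarial_perturbation(grad, epsilon):
    """Rescale a loss gradient (one row per word) to have overall L2 norm epsilon."""
    norm = math.sqrt(sum(g * g for row in grad for g in row))
    if norm == 0.0:
        # A zero gradient gives no direction to perturb in.
        return [[0.0 for _ in row] for row in grad]
    return [[epsilon * g / norm for g in row] for row in grad]

def perturb_embeddings(embeddings, grad, epsilon):
    """Add the perturbation to the word embeddings, not to the discrete tokens."""
    r_adv = adversarial_perturbation(grad, epsilon)
    return [
        [e + r for e, r in zip(e_row, r_row)]
        for e_row, r_row in zip(embeddings, r_adv)
    ]

# Toy values: two words with 2-dimensional embeddings (illustrative only).
embeddings = [[0.1, -0.2], [0.3, 0.4]]
# Made-up gradient of the loss with respect to the embeddings.
loss_grad = [[3.0, 0.0], [0.0, 4.0]]
perturbed = perturb_embeddings(embeddings, loss_grad, epsilon=0.5)
```

In a real model the gradient would come from backpropagation through the recurrent network, and the perturbed embeddings would be fed forward again to compute the adversarial loss term.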

MobileNetV2: Inverted Residuals and Linear Bottlenecks

tensorflow/models CVPR 2018

In this paper, we describe a new mobile architecture, MobileNetV2, that improves the state-of-the-art performance of mobile models on multiple tasks and benchmarks, as well as across a spectrum of different model sizes.

AutoAugment: Learning Augmentation Policies from Data

tensorflow/models 24 May 2018

In our implementation, we have designed a search space where a policy consists of many sub-policies, one of which is randomly chosen for each image in each mini-batch.
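
The per-image sampling described above can be sketched as follows. This is a toy illustration only: the two operations, the three sub-policies, and the representation of an image as a flat list of pixel values in [0, 1] are hypothetical stand-ins, not the learned policies or the operations from the AutoAugment search space.

```python
import random

# Hypothetical toy operations on an "image" stored as a flat list of values in [0, 1].
def invert(image):
    return [1.0 - p for p in image]

def brighten(image):
    return [min(1.0, p + 0.1) for p in image]

# A policy is a list of sub-policies; each sub-policy is a list of
# operations applied in order.
POLICY = [
    [invert],
    [brighten],
    [invert, brighten],
]

def augment_batch(batch, policy, rng):
    """Apply one randomly chosen sub-policy to each image in the mini-batch."""
    augmented = []
    for image in batch:
        sub_policy = rng.choice(policy)  # drawn independently for every image
        for op in sub_policy:
            image = op(image)
        augmented.append(image)
    return augmented

rng = random.Random(0)  # seeded only to make the sketch reproducible
batch = [[0.0, 0.5, 1.0], [0.2, 0.4, 0.6]]
augmented_batch = augment_batch(batch, POLICY, rng)
```

Because the sub-policy is redrawn for every image in every mini-batch, the same image can receive different augmentations across epochs.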

Identity Mappings in Deep Residual Networks

tensorflow/models 16 Mar 2016

Deep residual networks have emerged as a family of extremely deep architectures showing compelling accuracy and nice convergence behaviors.

Cognitive Mapping and Planning for Visual Navigation

tensorflow/models CVPR 2017

The accumulated belief of the world enables the agent to track visited regions of the environment.

Ensemble Adversarial Training: Attacks and Defenses

tensorflow/models ICLR 2018

We show that adversarial training with fast single-step attacks converges to a degenerate global minimum, wherein small curvature artifacts near the data points obfuscate a linear approximation of the loss.