Search Results for author: Yutong Lin

Found 9 papers, 8 papers with code

A Simple Baseline for Zero-shot Semantic Segmentation with Pre-trained Vision-language Model

1 code implementation • 29 Dec 2021 • Mengde Xu, Zheng Zhang, Fangyun Wei, Yutong Lin, Yue Cao, Han Hu, Xiang Bai

Recently, zero-shot image classification by vision-language pre-training has demonstrated incredible achievements: the model can classify arbitrary categories without seeing additional annotated images of those categories.

Image Classification, Language Modelling, +3

Swin Transformer V2: Scaling Up Capacity and Resolution

5 code implementations • 18 Nov 2021 • Ze Liu, Han Hu, Yutong Lin, Zhuliang Yao, Zhenda Xie, Yixuan Wei, Jia Ning, Yue Cao, Zheng Zhang, Li Dong, Furu Wei, Baining Guo

Our techniques are generally applicable for scaling up vision models, which has not been as widely explored as the scaling of NLP language models, partly due to the following difficulties in training and applications: 1) vision models often face instability issues at scale, and 2) many downstream vision tasks require high-resolution images or windows, and it is not clear how to effectively transfer models pre-trained at low resolutions to higher-resolution ones.

Ranked #1 on Object Detection on COCO test-dev (using extra training data)

Action Classification, Image Classification, +3

SimMIM: A Simple Framework for Masked Image Modeling

2 code implementations • 18 Nov 2021 • Zhenda Xie, Zheng Zhang, Yue Cao, Yutong Lin, Jianmin Bao, Zhuliang Yao, Qi Dai, Han Hu

We also leverage this approach to facilitate the training of a 3B model (SwinV2-G), which, using $40\times$ less data than in previous practice, achieves state-of-the-art results on four representative vision benchmarks.

Representation Learning, Self-Supervised Image Classification

Bootstrap Your Object Detector via Mixed Training

1 code implementation • NeurIPS 2021 • Mengde Xu, Zheng Zhang, Fangyun Wei, Yutong Lin, Yue Cao, Stephen Lin, Han Hu, Xiang Bai

We introduce MixTraining, a new training paradigm for object detection that can improve the performance of existing detectors for free.

Data Augmentation, Object Detection

Swin Transformer: Hierarchical Vision Transformer using Shifted Windows

43 code implementations • ICCV 2021 • Ze Liu, Yutong Lin, Yue Cao, Han Hu, Yixuan Wei, Zheng Zhang, Stephen Lin, Baining Guo

This paper presents a new vision Transformer, called Swin Transformer, that capably serves as a general-purpose backbone for computer vision.

Ranked #3 on Semantic Segmentation on FoodSeg103 (using extra training data)

Image Classification, Instance Segmentation, +2

Propagate Yourself: Exploring Pixel-Level Consistency for Unsupervised Visual Representation Learning

5 code implementations • CVPR 2021 • Zhenda Xie, Yutong Lin, Zheng Zhang, Yue Cao, Stephen Lin, Han Hu

We argue that the power of contrastive learning has yet to be fully unleashed, as current methods are trained only on instance-level pretext tasks, leading to representations that may be sub-optimal for downstream tasks requiring dense pixel predictions.

Contrastive Learning, Object Detection, +2
