Search Results for author: Yutong Bai

Found 22 papers, 15 papers with code

SIMILE: Introducing Sequential Information towards More Effective Imitation Learning

no code implementations ICLR 2019 Yutong Bai, Lingxi Xie

Reinforcement learning (RL) is a metaheuristic aiming at teaching an agent to interact with an environment and maximizing the reward in a complex task.

Imitation Learning OpenAI Gym +3

Finding Visual Task Vectors

1 code implementation 8 Apr 2024 Alberto Hojel, Yutong Bai, Trevor Darrell, Amir Globerson, Amir Bar

In this work, we analyze the activations of MAE-VQGAN, a recent Visual Prompting model, and find task vectors, activations that encode task-specific information.

Visual Prompting

Sequential Modeling Enables Scalable Learning for Large Vision Models

1 code implementation 1 Dec 2023 Yutong Bai, Xinyang Geng, Karttikeya Mangalam, Amir Bar, Alan Yuille, Trevor Darrell, Jitendra Malik, Alexei A Efros

We introduce a novel sequential modeling approach which enables learning a Large Vision Model (LVM) without making use of any linguistic data.

Understanding Pan-Sharpening via Generalized Inverse

no code implementations 4 Oct 2023 Shiqi Liu, Yutong Bai, Xinyang Han, Alan Yuille

By the generalized inverse theory, we derived two forms of general inverse matrix formulations that can correspond to the two prominent classes of Pan-sharpening methods, that is, component substitution and multi-resolution analysis methods.

Discovering Failure Modes of Text-guided Diffusion Models via Adversarial Search

no code implementations 1 Jun 2023 Qihao Liu, Adam Kortylewski, Yutong Bai, Song Bai, Alan Yuille

(2) We find regions in the latent space that lead to distorted images independent of the text prompt, suggesting that parts of the latent space are not well-structured.

Adversarial Attack Efficient Exploration +1

Delving into Masked Autoencoders for Multi-Label Thorax Disease Classification

1 code implementation 23 Oct 2022 Junfei Xiao, Yutong Bai, Alan Yuille, Zongwei Zhou

We hope that this study can direct future research on the application of Transformers to a larger variety of medical imaging tasks.

Computational Efficiency Transfer Learning

Making Your First Choice: To Address Cold Start Problem in Vision Active Learning

1 code implementation 5 Oct 2022 Liangyu Chen, Yutong Bai, Siyu Huang, Yongyi Lu, Bihan Wen, Alan L. Yuille, Zongwei Zhou

However, we uncover a striking contradiction to this promise: active learning fails to select data as efficiently as random selection at the first few choices.

Active Learning Contrastive Learning

Masked Autoencoders Enable Efficient Knowledge Distillers

1 code implementation CVPR 2023 Yutong Bai, Zeyu Wang, Junfei Xiao, Chen Wei, Huiyu Wang, Alan Yuille, Yuyin Zhou, Cihang Xie

For example, by distilling the knowledge from an MAE pre-trained ViT-L into a ViT-B, our method achieves 84.0% ImageNet top-1 accuracy, outperforming the baseline of directly distilling a fine-tuned ViT-L by 1.2%.

Knowledge Distillation

Can CNNs Be More Robust Than Transformers?

1 code implementation 7 Jun 2022 Zeyu Wang, Yutong Bai, Yuyin Zhou, Cihang Xie

The recent success of Vision Transformers is shaking the long dominance of Convolutional Neural Networks (CNNs) in image recognition for a decade.

Fast AdvProp

1 code implementation ICLR 2022 Jieru Mei, Yucheng Han, Yutong Bai, Yixiao Zhang, Yingwei Li, Xianhang Li, Alan Yuille, Cihang Xie

Specifically, our modifications in Fast AdvProp are guided by the hypothesis that disentangled learning with adversarial examples is the key for performance improvements, while other training recipes (e.g., paired clean and adversarial training samples, multi-step adversarial attackers) could be largely simplified.

Data Augmentation object-detection +1

Glance-and-Gaze Vision Transformer

1 code implementation NeurIPS 2021 Qihang Yu, Yingda Xia, Yutong Bai, Yongyi Lu, Alan Yuille, Wei Shen

It is motivated by the Glance and Gaze behavior of human beings when recognizing objects in natural scenes, with the ability to efficiently model both long-range dependencies and local context.

CateNorm: Categorical Normalization for Robust Medical Image Segmentation

1 code implementation 29 Mar 2021 Junfei Xiao, Lequan Yu, Zongwei Zhou, Yutong Bai, Lei Xing, Alan Yuille, Yuyin Zhou

We propose a new normalization strategy, named categorical normalization (CateNorm), to normalize the activations according to categorical statistics.

Image Segmentation Medical Image Segmentation +2

TransFG: A Transformer Architecture for Fine-grained Recognition

2 code implementations 14 Mar 2021 Ju He, Jie-Neng Chen, Shuai Liu, Adam Kortylewski, Cheng Yang, Yutong Bai, Changhu Wang

Fine-grained visual classification (FGVC), which aims at recognizing objects from subcategories, is a very challenging task due to the inherently subtle inter-class differences.

Fine-Grained Image Classification

Mask Guided Matting via Progressive Refinement Network

1 code implementation CVPR 2021 Qihang Yu, Jianming Zhang, He Zhang, Yilin Wang, Zhe Lin, Ning Xu, Yutong Bai, Alan Yuille

We propose Mask Guided (MG) Matting, a robust matting framework that takes a general coarse mask as guidance.

Image Matting

Unsupervised Part Discovery via Feature Alignment

no code implementations 1 Dec 2020 Mengqi Guo, Yutong Bai, Zhishuai Zhang, Adam Kortylewski, Alan Yuille

Specifically, given a training image, we find a set of similar images that show instances of the same object category in the same pose, through an affine alignment of their corresponding feature maps.

Object Object Recognition

C2FNAS: Coarse-to-Fine Neural Architecture Search for 3D Medical Image Segmentation

no code implementations CVPR 2020 Qihang Yu, Dong Yang, Holger Roth, Yutong Bai, Yixiao Zhang, Alan L. Yuille, Daguang Xu

3D convolution neural networks (CNN) have been proved very successful in parsing organs or tumours in 3D medical images, but it remains sophisticated and time-consuming to choose or design proper 3D networks given different task contexts.

Image Segmentation Medical Image Segmentation +3

CLEVR-Ref+: Diagnosing Visual Reasoning with Referring Expressions

3 code implementations CVPR 2019 Runtao Liu, Chenxi Liu, Yutong Bai, Alan Yuille

Yet there has been evidence that current benchmark datasets suffer from bias, and current state-of-the-art models cannot be easily evaluated on their intermediate reasoning process.

Image Segmentation object-detection +8

Semantic Part Detection via Matching: Learning to Generalize to Novel Viewpoints from Limited Training Data

1 code implementation ICCV 2019 Yutong Bai, Qing Liu, Lingxi Xie, Weichao Qiu, Yan Zheng, Alan Yuille

In particular, this enables images in the training dataset to be matched to a virtual 3D model of the object (for simplicity, we assume that the object viewpoint can be estimated by standard techniques).

Clustering Object +1
