Search Results for author: Weihao Yu

Found 10 papers, 9 papers with code

MetaFormer is Actually What You Need for Vision

2 code implementations • 22 Nov 2021 • Weihao Yu, Mi Luo, Pan Zhou, Chenyang Si, Yichen Zhou, Xinchao Wang, Jiashi Feng, Shuicheng Yan

Based on this observation, we hypothesize that the general architecture of the Transformers, rather than the specific token mixer module, is more essential to the model's performance.

Image Classification · Semantic Segmentation
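The hypothesis is easiest to see in code: the block below is a minimal sketch of a MetaFormer-style block in PyTorch, assuming a parameter-free average-pooling token mixer (as in the paper's PoolFormer instantiation). The class names and layer choices are illustrative, not the authors' released implementation.

```python
import torch
import torch.nn as nn

class PoolMixer(nn.Module):
    """Parameter-free token mixer: average pooling over spatial neighbors."""
    def __init__(self, pool_size=3):
        super().__init__()
        self.pool = nn.AvgPool2d(pool_size, stride=1, padding=pool_size // 2,
                                 count_include_pad=False)

    def forward(self, x):          # x: (B, C, H, W)
        return self.pool(x) - x    # subtract input; the residual is added outside

class MetaFormerBlock(nn.Module):
    """General structure: norm -> token mixer -> norm -> channel MLP, with residuals.
    Swapping the mixer (pooling, attention, ...) keeps the overall architecture intact."""
    def __init__(self, dim, mixer=None, mlp_ratio=4):
        super().__init__()
        self.norm1 = nn.GroupNorm(1, dim)
        self.mixer = mixer or PoolMixer()
        self.norm2 = nn.GroupNorm(1, dim)
        self.mlp = nn.Sequential(
            nn.Conv2d(dim, dim * mlp_ratio, 1), nn.GELU(),
            nn.Conv2d(dim * mlp_ratio, dim, 1),
        )

    def forward(self, x):
        x = x + self.mixer(self.norm1(x))
        x = x + self.mlp(self.norm2(x))
        return x

x = torch.randn(2, 64, 14, 14)
print(MetaFormerBlock(64)(x).shape)   # torch.Size([2, 64, 14, 14])
```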

FDA: Feature Decomposition and Aggregation for Robust Airway Segmentation

no code implementations • 7 Sep 2021 • Minghui Zhang, Xin Yu, Hanxiao Zhang, Hao Zheng, Weihao Yu, Hong Pan, Xiangran Cai, Yun Gu

Compared to other state-of-the-art transfer learning methods, our method accurately segmented more bronchi in the noisy CT scans.

Transfer Learning

LV-BERT: Exploiting Layer Variety for BERT

1 code implementation • Findings (ACL) 2021 • Weihao Yu, Zihang Jiang, Fei Chen, Qibin Hou, Jiashi Feng

In this paper, beyond this stereotyped layer pattern, we aim to improve pre-trained models by exploiting layer variety from two aspects: the layer type set and the layer order.
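As a rough illustration of what layer variety could look like, the sketch below assembles an encoder from a hand-picked sequence of layer types (self-attention, depth-wise convolution, feed-forward) instead of the fixed attention-then-FFN pattern. The specific layer set, order, and module choices are assumptions for illustration, not the architecture found by the paper's search.

```python
import torch
import torch.nn as nn

def make_layer(kind, dim, heads=4, kernel=7):
    """Build one layer from an enlarged layer-type set (illustrative choices)."""
    if kind == "att":   # multi-head self-attention
        return nn.MultiheadAttention(dim, heads, batch_first=True)
    if kind == "conv":  # depth-wise convolution over the token sequence
        return nn.Conv1d(dim, dim, kernel, padding=kernel // 2, groups=dim)
    if kind == "ffn":   # position-wise feed-forward
        return nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
    raise ValueError(kind)

class VarietyEncoder(nn.Module):
    """Stack layers in an arbitrary type order instead of repeating (att, ffn)."""
    def __init__(self, dim, order=("conv", "att", "ffn", "att", "conv", "ffn")):
        super().__init__()
        self.kinds = order
        self.layers = nn.ModuleList(make_layer(k, dim) for k in order)
        self.norms = nn.ModuleList(nn.LayerNorm(dim) for _ in order)

    def forward(self, x):                       # x: (B, T, C)
        for kind, layer, norm in zip(self.kinds, self.layers, self.norms):
            h = norm(x)
            if kind == "att":
                h, _ = layer(h, h, h)
            elif kind == "conv":
                h = layer(h.transpose(1, 2)).transpose(1, 2)
            else:
                h = layer(h)
            x = x + h                            # residual connection
        return x

print(VarietyEncoder(64)(torch.randn(2, 16, 64)).shape)  # torch.Size([2, 16, 64])
```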

Refiner: Refining Self-attention for Vision Transformers

1 code implementation • 7 Jun 2021 • Daquan Zhou, Yujun Shi, Bingyi Kang, Weihao Yu, Zihang Jiang, Yuan Li, Xiaojie Jin, Qibin Hou, Jiashi Feng

Vision Transformers (ViTs) have shown competitive accuracy in image classification tasks compared with CNNs.

Image Classification

Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet

8 code implementations • ICCV 2021 • Li Yuan, Yunpeng Chen, Tao Wang, Weihao Yu, Yujun Shi, Zihang Jiang, Francis EH Tay, Jiashi Feng, Shuicheng Yan

To overcome such limitations, we propose a new Tokens-To-Token Vision Transformer (T2T-ViT), which incorporates 1) a layer-wise Tokens-to-Token (T2T) transformation that progressively structures the image into tokens by recursively aggregating neighboring tokens into one token, so that the local structure represented by surrounding tokens can be modeled and the token length can be reduced; and 2) an efficient backbone with a deep-narrow structure for the vision transformer, motivated by CNN architecture design after an empirical study.

Image Classification · Language Modelling · +1
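A minimal sketch of a single Tokens-to-Token step follows, assuming `torch.nn.Unfold` implements the soft split that merges each k x k neighborhood of tokens into one wider token. The surrounding transformer layer and all dimensions are simplified stand-ins rather than the paper's exact module.

```python
import torch
import torch.nn as nn

class T2TStep(nn.Module):
    """One Tokens-to-Token step: transform tokens, reshape them to a grid,
    then merge each k x k neighborhood back into a single, wider token."""
    def __init__(self, dim, k=3, stride=2):
        super().__init__()
        self.attn = nn.TransformerEncoderLayer(dim, nhead=1, dim_feedforward=dim,
                                               batch_first=True)
        self.unfold = nn.Unfold(kernel_size=k, stride=stride, padding=k // 2)

    def forward(self, x, h, w):                 # x: (B, h*w, C)
        x = self.attn(x)                        # token transformation
        x = x.transpose(1, 2).reshape(x.size(0), -1, h, w)
        x = self.unfold(x)                      # (B, C*k*k, new_len): aggregate neighbors
        return x.transpose(1, 2)                # fewer but wider tokens

step = T2TStep(dim=16)
tokens = torch.randn(2, 14 * 14, 16)
out = step(tokens, 14, 14)
print(out.shape)  # torch.Size([2, 49, 144]): 7*7 tokens, each 16*3*3 channels
```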

ConvBERT: Improving BERT with Span-based Dynamic Convolution

7 code implementations • NeurIPS 2020 • Zi-Hang Jiang, Weihao Yu, Daquan Zhou, Yunpeng Chen, Jiashi Feng, Shuicheng Yan

The novel convolution heads, together with the remaining self-attention heads, form a new mixed attention block that is more efficient at both global and local context learning.

Language Understanding · Natural Language Understanding
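The mixed attention block can be sketched roughly as a channel split between a self-attention branch and a convolution branch. In the toy version below, the convolution branch is a plain depth-wise convolution standing in for the paper's span-based dynamic convolution, and all names are illustrative.

```python
import torch
import torch.nn as nn

class MixedAttention(nn.Module):
    """Split channels: half go through self-attention (global context),
    half through a light-weight convolution branch (local context)."""
    def __init__(self, dim, heads=4, kernel=9):
        super().__init__()
        assert dim % 2 == 0
        half = dim // 2
        self.attn = nn.MultiheadAttention(half, heads, batch_first=True)
        # Stand-in for span-based dynamic convolution: a depth-wise conv over tokens.
        self.conv = nn.Conv1d(half, half, kernel, padding=kernel // 2, groups=half)
        self.proj = nn.Linear(dim, dim)

    def forward(self, x):                        # x: (B, T, C)
        a, c = x.chunk(2, dim=-1)                # channel split
        a, _ = self.attn(a, a, a)                # global mixing
        c = self.conv(c.transpose(1, 2)).transpose(1, 2)  # local mixing
        return self.proj(torch.cat([a, c], dim=-1))

x = torch.randn(2, 32, 128)
print(MixedAttention(128)(x).shape)  # torch.Size([2, 32, 128])
```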

ReClor: A Reading Comprehension Dataset Requiring Logical Reasoning

1 code implementation • ICLR 2020 • Weihao Yu, Zi-Hang Jiang, Yanfei Dong, Jiashi Feng

Empirical results show that state-of-the-art models have an outstanding ability to capture biases contained in the dataset, achieving high accuracy on the EASY set.

Logical Reasoning Question Answering · Logical Reasoning Reading Comprehension · +1

Heterogeneous Graph Learning for Visual Commonsense Reasoning

1 code implementation • NeurIPS 2019 • Weijiang Yu, Jingwen Zhou, Weihao Yu, Xiaodan Liang, Nong Xiao

Our HGL consists of a primal vision-to-answer heterogeneous graph (VAHG) module and a dual question-to-answer heterogeneous graph (QAHG) module to interactively refine reasoning paths for semantic agreement.

Graph Learning · Visual Commonsense Reasoning

Knowledge-Embedded Routing Network for Scene Graph Generation

3 code implementations • CVPR 2019 • Tianshui Chen, Weihao Yu, Riquan Chen, Liang Lin

More specifically, we show that the statistical correlations between objects appearing in images and their relationships can be explicitly represented by a structured knowledge graph, and a routing mechanism is learned to propagate messages through the graph to explore their interactions.

Graph Generation · Scene Graph Generation
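A toy sketch of the underlying idea: fix the edge weights of a graph from dataset co-occurrence statistics and learn how messages propagated over those edges update the node states. The statistics matrix, the single GRU update, and all names below are invented for illustration and do not reproduce the paper's routing network.

```python
import torch
import torch.nn as nn

class StatPropagation(nn.Module):
    """One message-passing step over a graph whose edge weights come from
    dataset statistics (e.g. how often object pairs co-occur)."""
    def __init__(self, dim, cooccurrence):
        super().__init__()
        # Row-normalised co-occurrence counts act as fixed edge weights.
        self.register_buffer("adj", cooccurrence / cooccurrence.sum(1, keepdim=True))
        self.msg = nn.Linear(dim, dim)
        self.gru = nn.GRUCell(dim, dim)       # learned update: route messages into nodes

    def forward(self, node_feats):            # node_feats: (num_nodes, dim)
        messages = self.adj @ self.msg(node_feats)
        return self.gru(messages, node_feats)

num_nodes, dim = 5, 32
stats = torch.rand(num_nodes, num_nodes) + 1e-6    # stand-in co-occurrence counts
layer = StatPropagation(dim, stats)
print(layer(torch.randn(num_nodes, dim)).shape)    # torch.Size([5, 32])
```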

Deep Reasoning with Knowledge Graph for Social Relationship Understanding

1 code implementation • 2 Jul 2018 • Zhouxia Wang, Tianshui Chen, Jimmy Ren, Weihao Yu, Hui Cheng, Liang Lin

This structured knowledge can be efficiently integrated into the deep neural network architecture to promote social relationship understanding through an end-to-end trainable Graph Reasoning Model (GRM), in which a propagation mechanism is learned to pass node messages through the graph and explore the interactions between the persons of interest and the contextual objects.
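As a rough sketch of such a propagation step, the code below lets a person-pair node gather attention-weighted messages from contextual object nodes before predicting a relationship label. The attention scoring, the GRU update, and the six-class output are assumptions for illustration, not the paper's GRM.

```python
import torch
import torch.nn as nn

class PersonContextReasoning(nn.Module):
    """Toy propagation step: a person-pair node gathers messages from contextual
    object nodes, with learned attention deciding which objects matter."""
    def __init__(self, dim):
        super().__init__()
        self.score = nn.Linear(2 * dim, 1)       # attention over (pair, object) pairs
        self.update = nn.GRUCell(dim, dim)
        self.classify = nn.Linear(dim, 6)        # e.g. 6 social relationship classes

    def forward(self, pair_feat, obj_feats):     # (dim,), (num_objects, dim)
        pair = pair_feat.expand(obj_feats.size(0), -1)
        alpha = torch.softmax(self.score(torch.cat([pair, obj_feats], -1)), dim=0)
        message = (alpha * obj_feats).sum(0)     # weighted sum of object messages
        pair_feat = self.update(message.unsqueeze(0), pair_feat.unsqueeze(0)).squeeze(0)
        return self.classify(pair_feat)          # logits over relationship labels

model = PersonContextReasoning(dim=32)
print(model(torch.randn(32), torch.randn(7, 32)).shape)  # torch.Size([6])
```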
