HiViT: Hierarchical Vision Transformer Meets Masked Image Modeling

no code implementations • 30 May 2022 • Xiaosong Zhang, Yunjie Tian, Wei Huang, Qixiang Ye, Qi Dai, Lingxi Xie, Qi Tian

A key idea for efficient implementation is to discard the masked image patches (or tokens) throughout the target network (encoder), which requires the encoder to be a plain vision transformer (e.g., ViT), even though hierarchical vision transformers (e.g., Swin Transformer) have potentially better properties in formulating vision inputs.
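The token-discarding idea above can be sketched in a few lines of plain Python. This is a minimal illustration only, not the HiViT implementation: the function name `keep_visible_tokens`, the 75% mask ratio, and the string stand-ins for patch embeddings are all assumptions made for the example.

```python
import random


def keep_visible_tokens(tokens, mask_ratio, seed=0):
    """Drop a random subset of tokens and return only the visible ones.

    Hypothetical helper for illustration: the encoder would then run on
    `visible` alone, so its cost scales with the number of kept tokens.
    """
    rng = random.Random(seed)
    # Keep at least one token so the encoder input is never empty.
    num_keep = max(1, round(len(tokens) * (1 - mask_ratio)))
    # Sorted indices preserve the original spatial order of the patches.
    kept = sorted(rng.sample(range(len(tokens)), num_keep))
    return [tokens[i] for i in kept], kept


# Stand-ins for the 16 patch embeddings of a 4x4 patch grid (assumed size).
tokens = [f"patch_{i}" for i in range(16)]
visible, kept = keep_visible_tokens(tokens, mask_ratio=0.75)
```

With 16 tokens and an assumed mask ratio of 0.75, only 4 tokens reach the encoder. Dropping tokens this way is straightforward for a plain transformer, which treats its input as an unordered set of tokens plus position information; operations that rely on a fixed spatial grid are what make it harder for hierarchical designs.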

Integrally Migrating Pre-trained Transformer Encoder-decoders for Visual Object Detection

2 code implementations • 19 May 2022 • Feng Liu, Xiaosong Zhang, Zhiliang Peng, Zonghao Guo, Fang Wan, Xiangyang Ji, Qixiang Ye

Except for the backbone networks, however, other components such as the detector head and the feature pyramid network (FPN) remain trained from scratch, which hinders fully tapping the potential of representation models.

Long-tailed Distribution Adaptation

1 code implementation • 6 Oct 2021 • Zhiliang Peng, Wei Huang, Zonghao Guo, Xiaosong Zhang, Jianbin Jiao, Qixiang Ye

We propose to jointly optimize the empirical risks of the unbalanced and balanced domains and to approximate their domain divergence by intra-class and inter-class distances, with the aim of adapting models trained on the long-tailed distribution to general distributions in an interpretable way.

FreeAnchor: Learning to Match Anchors for Visual Object Detection

3 code implementations • NeurIPS 2019 • Xiaosong Zhang, Fang Wan, Chang Liu, Rongrong Ji, Qixiang Ye

In this study, we propose a learning-to-match approach to break the IoU restriction, allowing objects to match anchors in a flexible manner.
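For context, the IoU restriction mentioned above refers to the conventional hand-crafted rule that assigns an anchor to an object only when their intersection over union (IoU) exceeds a fixed threshold. The sketch below shows that baseline rule, not the learning-to-match approach itself; the function names, the 0.5 threshold, and the example boxes are assumptions chosen for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    # Clamp at zero so non-overlapping boxes give an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def assign_by_iou(anchors, gt_box, threshold=0.5):
    """Baseline rule: keep the indices of anchors whose IoU meets the threshold."""
    return [i for i, anchor in enumerate(anchors) if iou(anchor, gt_box) >= threshold]


# Hypothetical anchors and one ground-truth box.
anchors = [(0, 0, 2, 2), (1, 1, 3, 3), (0, 0, 4, 4)]
gt_box = (0, 0, 2, 2)
matched = assign_by_iou(anchors, gt_box)
```

Here only the first anchor is matched: the other two overlap the object but fall below the assumed threshold, so a fixed rule discards them regardless of how informative their features might be. A learned matching scheme is meant to relax exactly this kind of hard cut-off.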

Adversarial Samples on Android Malware Detection Systems for IoT Systems

no code implementations • 12 Feb 2019 • Xiaolei Liu, Xiaojiang Du, Xiaosong Zhang, Qingxin Zhu, Mohsen Guizani

An automated testing framework is needed to help these learning-based malware detection systems for IoT devices perform security analysis.

A Black-box Attack on Neural Networks Based on Swarm Evolutionary Algorithm

no code implementations • 26 Jan 2019 • Xiaolei Liu, Yuheng Luo, Xiaosong Zhang, Qingxin Zhu

Our experimental results show that both the MNIST images and the CIFAR-10 images can be perturbed to successfully generate a black-box attack with 100% probability on average.

Weighted-Sampling Audio Adversarial Example Attack

no code implementations • 26 Jan 2019 • Xiaolei Liu, Xiaosong Zhang, Kun Wan, Qingxin Zhu, Yufei Ding

In this paper, we propose weighted-sampling audio adversarial examples, focusing on the numbers and the weights of distortion to reinforce the attack.

