2 code implementations • 4 Oct 2022 • Chenglin Yang, Siyuan Qiao, Qihang Yu, Xiaoding Yuan, Yukun Zhu, Alan Yuille, Hartwig Adam, Liang-Chieh Chen
The tiny-MOAT family is also benchmarked on downstream tasks, serving as a baseline for the community.
Ranked #28 on Image Classification on ImageNet (using extra training data)
1 code implementation • 8 Jul 2022 • Qihang Yu, Huiyu Wang, Siyuan Qiao, Maxwell Collins, Yukun Zhu, Hartwig Adam, Alan Yuille, Liang-Chieh Chen
However, we observe that most existing transformer-based vision models simply borrow the idea from NLP, neglecting the crucial difference between languages and images, particularly the extremely large sequence length of spatially flattened pixel features.
Ranked #2 on Panoptic Segmentation on COCO test-dev
no code implementations • CVPR 2022 • Qihang Yu, Huiyu Wang, Dahun Kim, Siyuan Qiao, Maxwell Collins, Yukun Zhu, Hartwig Adam, Alan Yuille, Liang-Chieh Chen
We propose Clustering Mask Transformer (CMT-DeepLab), a transformer-based framework for panoptic segmentation designed around clustering.
Ranked #6 on Panoptic Segmentation on COCO test-dev
1 code implementation • 15 Jun 2022 • Jieru Mei, Alex Zihao Zhu, Xinchen Yan, Hang Yan, Siyuan Qiao, Yukun Zhu, Liang-Chieh Chen, Henrik Kretzschmar, Dragomir Anguelov
We therefore present the Waymo Open Dataset: Panoramic Video Panoptic Segmentation Dataset, a large-scale dataset that offers high-quality panoptic segmentation labels for autonomous driving.
no code implementations • CVPR 2022 • Dahun Kim, Jun Xie, Huiyu Wang, Siyuan Qiao, Qihang Yu, Hong-Seok Kim, Hartwig Adam, In So Kweon, Liang-Chieh Chen
We present TubeFormer-DeepLab, the first attempt to tackle multiple core video segmentation tasks in a unified manner.
3 code implementations • 12 Jul 2021 • Chenglin Yang, Siyuan Qiao, Adam Kortylewski, Alan Yuille
Self-Attention has become prevalent in computer vision models.
1 code implementation • 17 Jun 2021 • Mark Weber, Huiyu Wang, Siyuan Qiao, Jun Xie, Maxwell D. Collins, Yukun Zhu, Liangzhe Yuan, Dahun Kim, Qihang Yu, Daniel Cremers, Laura Leal-Taixe, Alan L. Yuille, Florian Schroff, Hartwig Adam, Liang-Chieh Chen
DeepLab2 is a TensorFlow library for deep labeling, aiming to provide a state-of-the-art and easy-to-use TensorFlow codebase for general dense pixel prediction problems in computer vision.
1 code implementation • CVPR 2021 • Siyuan Qiao, Yukun Zhu, Hartwig Adam, Alan Yuille, Liang-Chieh Chen
We name this joint task as Depth-aware Video Panoptic Segmentation, and propose a new evaluation metric along with two derived datasets for it, which will be made available to the public.
Depth-aware Video Panoptic Segmentation
Monocular Depth Estimation
1 code implementation • 28 Nov 2020 • Yuhui Xu, Lingxi Xie, Cihang Xie, Jieru Mei, Siyuan Qiao, Wei Shen, Hongkai Xiong, Alan Yuille
Batch normalization (BN) is a fundamental unit in modern deep networks, in which a linear transformation module was designed for improving BN's flexibility of fitting complex data distributions.
no code implementations • 23 Nov 2020 • Liang-Chieh Chen, Huiyu Wang, Siyuan Qiao
The Wide Residual Networks (Wide-ResNets), a shallow but wide model variant of the Residual Networks (ResNets) formed by stacking a small number of residual blocks with large channel sizes, have demonstrated outstanding performance on multiple dense prediction tasks.
Ranked #2 on Panoptic Segmentation on Cityscapes test (using extra training data)
6 code implementations • CVPR 2021 • Siyuan Qiao, Liang-Chieh Chen, Alan Yuille
In this paper, we explore this mechanism in the backbone design for object detection.
Ranked #2 on Object Detection on AI-TOD
1 code implementation • CVPR 2021 • Hao Ding, Siyuan Qiao, Alan Yuille, Wei Shen
The key to a successful cascade architecture for precise instance segmentation is to fully leverage the relationship between bounding box detection and mask segmentation across multiple stages.
1 code implementation • 21 Nov 2019 • Siyuan Qiao, Huiyu Wang, Chenxi Liu, Wei Shen, Alan Yuille
To address this issue, we propose BatchChannel Normalization (BCN), which uses batch knowledge to avoid the elimination singularities in the training of channel-normalized models.
no code implementations • 9 Sep 2019 • Mingqing Xiao, Adam Kortylewski, Ruihai Wu, Siyuan Qiao, Wei Shen, Alan Yuille
Despite the great success of deep convolutional neural networks in object classification, they suffer from a severe drop in generalization performance under occlusion due to the inconsistency between training and testing data.
8 code implementations • 25 Mar 2019 • Siyuan Qiao, Huiyu Wang, Chenxi Liu, Wei Shen, Alan Yuille
Batch Normalization (BN) has become an out-of-the-box technique to improve deep network training.
Ranked #61 on Instance Segmentation on COCO minival
1 code implementation • CVPR 2019 • Siyuan Qiao, Zhe Lin, Jianming Zhang, Alan Yuille
By simply replacing standard optimizers with Neural Rejuvenation, we are able to improve the performance of neural networks by a very large margin while using similar training effort and maintaining their original resource usage.
no code implementations • 29 Nov 2018 • Xutong Ren, Lingxi Xie, Chen Wei, Siyuan Qiao, Chi Su, Jiaying Liu, Qi Tian, Elliot K. Fishman, Alan L. Yuille
Computer vision is difficult, partly because the desired mathematical function connecting input and output data is often complex, fuzzy and thus hard to learn.
1 code implementation • 28 Nov 2018 • Zhishuai Zhang, Wei Shen, Siyuan Qiao, Yan Wang, Bo Wang, Alan Yuille
In this paper, we propose that the robustness of a face detector against hard faces can be improved by learning small faces on hard images.
Ranked #7 on Face Detection on WIDER Face (Medium)
no code implementations • 15 May 2018 • Chenglin Yang, Lingxi Xie, Siyuan Qiao, Alan Yuille
We focus on the problem of training a deep neural network in generations.
no code implementations • ECCV 2018 • Yan Wang, Lingxi Xie, Siyuan Qiao, Ya Zhang, Wenjun Zhang, Alan L. Yuille
Convolution is spatially-symmetric, i.e., the visual features are independent of their position in the image, which limits its ability to utilize contextual cues for visual recognition.
1 code implementation • ECCV 2018 • Siyuan Qiao, Wei Shen, Zhishuai Zhang, Bo Wang, Alan Yuille
We present Deep Co-Training, a deep learning based method inspired by the Co-Training framework.
no code implementations • ICLR 2018 • Boyang Deng, Qing Liu, Siyuan Qiao, Alan Yuille
Our models are based on the idea of encoding objects in terms of visual concepts, which are interpretable visual cues represented by the feature vectors within CNNs.
no code implementations • CVPR 2018 • Zhishuai Zhang, Siyuan Qiao, Cihang Xie, Wei Shen, Bo Wang, Alan L. Yuille
Our motivation is to enrich the semantics of object detection features within a typical deep detector, by a semantic segmentation branch and a global activation module.
no code implementations • ICML 2018 • Siyuan Qiao, Zhishuai Zhang, Wei Shen, Bo Wang, Alan Yuille
Our method introduces computation orderings to the channels within convolutional layers or blocks, based on which we gradually compute the outputs in a channel-wise manner.
no code implementations • 22 Nov 2017 • Boyang Deng, Qing Liu, Siyuan Qiao, Alan Yuille
In this work, we address these limitations of CNNs by developing novel, flexible, and interpretable models for few-shot learning.
1 code implementation • CVPR 2018 • Siyuan Qiao, Chenxi Liu, Wei Shen, Alan Yuille
In this paper, we are interested in the few-shot learning problem.
no code implementations • ICCV 2017 • Siyuan Qiao, Wei Shen, Weichao Qiu, Chenxi Liu, Alan Yuille
We argue that estimation of object scales in images is helpful for generating object proposals, especially for supermarket images where object scales are usually within a small range.