Search Results for author: Qihang Yu

Found 28 papers, 20 papers with code

MaXTron: Mask Transformer with Trajectory Attention for Video Panoptic Segmentation

1 code implementation 30 Nov 2023 Ju He, Qihang Yu, Inkyu Shin, Xueqing Deng, Xiaohui Shen, Alan Yuille, Liang-Chieh Chen

To alleviate the issue, we propose to adapt the trajectory attention for both the dense pixel features and object queries, aiming to improve the short-term and long-term tracking results, respectively.

Object Video Classification +3

Towards Open-Ended Visual Recognition with Large Language Model

1 code implementation 14 Nov 2023 Qihang Yu, Xiaohui Shen, Liang-Chieh Chen

Localizing and recognizing objects in the open-ended physical world poses a long-standing challenge within the domain of machine perception.

Language Modelling Large Language Model +2

3D TransUNet: Advancing Medical Image Segmentation through Vision Transformers

2 code implementations 11 Oct 2023 Jieneng Chen, Jieru Mei, Xianhang Li, Yongyi Lu, Qihang Yu, Qingyue Wei, Xiangde Luo, Yutong Xie, Ehsan Adeli, Yan Wang, Matthew Lungren, Lei Xing, Le Lu, Alan Yuille, Yuyin Zhou

In this paper, we extend the 2D TransUNet architecture to a 3D network by building upon the state-of-the-art nnU-Net architecture, and fully exploring Transformers' potential in both the encoder and decoder design.

Image Segmentation Medical Image Segmentation +3

Convolutions Die Hard: Open-Vocabulary Segmentation with Single Frozen Convolutional CLIP

1 code implementation NeurIPS 2023 Qihang Yu, Ju He, Xueqing Deng, Xiaohui Shen, Liang-Chieh Chen

The proposed FC-CLIP benefits from the following observations: the frozen CLIP backbone maintains the ability of open-vocabulary classification and can also serve as a strong mask generator, and the convolutional CLIP generalizes well to a larger input resolution than the one used during contrastive image-text pretraining.

Open Vocabulary Panoptic Segmentation Open Vocabulary Semantic Segmentation +1

ReMaX: Relaxing for Better Training on Efficient Panoptic Segmentation

1 code implementation NeurIPS 2023 Shuyang Sun, Weijun Wang, Qihang Yu, Andrew Howard, Philip Torr, Liang-Chieh Chen

This paper presents a new mechanism to facilitate the training of mask transformers for efficient panoptic segmentation, democratizing its deployment.

Panoptic Segmentation Segmentation

Compositor: Bottom-up Clustering and Compositing for Robust Part and Object Segmentation

1 code implementation CVPR 2023 Ju He, Jieneng Chen, Ming-Xian Lin, Qihang Yu, Alan Yuille

Compositor achieves state-of-the-art performance on PartImageNet and Pascal-Part by outperforming previous methods by around 0.9% and 1.3% on PartImageNet, 0.4% and 1.7% on Pascal-Part in terms of part and object mIoU and demonstrates better robustness against occlusion by around 4.4% and 7.1% on part and object respectively.

Clustering Object +2

Video-kMaX: A Simple Unified Approach for Online and Near-Online Video Panoptic Segmentation

no code implementations 10 Apr 2023 Inkyu Shin, Dahun Kim, Qihang Yu, Jun Xie, Hong-Seok Kim, Bradley Green, In So Kweon, Kuk-Jin Yoon, Liang-Chieh Chen

The meta architecture of the proposed Video-kMaX consists of two components: within clip segmenter (for clip-level segmentation) and cross-clip associater (for association beyond clips).

Scene Understanding Segmentation +2

kMaX-DeepLab: k-means Mask Transformer

2 code implementations 8 Jul 2022 Qihang Yu, Huiyu Wang, Siyuan Qiao, Maxwell Collins, Yukun Zhu, Hartwig Adam, Alan Yuille, Liang-Chieh Chen

However, we observe that most existing transformer-based vision models simply borrow the idea from NLP, neglecting the crucial difference between languages and images, particularly the extremely large sequence length of spatially flattened pixel features.

Clustering Object Detection +1

PartImageNet: A Large, High-Quality Dataset of Parts

1 code implementation 2 Dec 2021 Ju He, Shuo Yang, Shaokang Yang, Adam Kortylewski, Xiaoding Yuan, Jie-Neng Chen, Shuai Liu, Cheng Yang, Qihang Yu, Alan Yuille

To help address this problem, we propose PartImageNet, a large, high-quality dataset with part segmentation annotations.

Activity Recognition Few-Shot Learning +6

DeepLab2: A TensorFlow Library for Deep Labeling

4 code implementations 17 Jun 2021 Mark Weber, Huiyu Wang, Siyuan Qiao, Jun Xie, Maxwell D. Collins, Yukun Zhu, Liangzhe Yuan, Dahun Kim, Qihang Yu, Daniel Cremers, Laura Leal-Taixe, Alan L. Yuille, Florian Schroff, Hartwig Adam, Liang-Chieh Chen

DeepLab2 is a TensorFlow library for deep labeling, aiming to provide a state-of-the-art and easy-to-use TensorFlow codebase for general dense pixel prediction problems in computer vision.

Glance-and-Gaze Vision Transformer

1 code implementation NeurIPS 2021 Qihang Yu, Yingda Xia, Yutong Bai, Yongyi Lu, Alan Yuille, Wei Shen

It is motivated by the Glance and Gaze behavior of human beings when recognizing objects in natural scenes, with the ability to efficiently model both long-range dependencies and local context.

TransUNet: Transformers Make Strong Encoders for Medical Image Segmentation

20 code implementations 8 Feb 2021 Jieneng Chen, Yongyi Lu, Qihang Yu, Xiangde Luo, Ehsan Adeli, Yan Wang, Le Lu, Alan L. Yuille, Yuyin Zhou

Medical image segmentation is an essential prerequisite for developing healthcare systems, especially for disease diagnosis and treatment planning.

Cardiac Segmentation Image Segmentation +3

Mask Guided Matting via Progressive Refinement Network

1 code implementation CVPR 2021 Qihang Yu, Jianming Zhang, He Zhang, Yilin Wang, Zhe Lin, Ning Xu, Yutong Bai, Alan Yuille

We propose Mask Guided (MG) Matting, a robust matting framework that takes a general coarse mask as guidance.

Image Matting

Shape-Texture Debiased Neural Network Training

1 code implementation ICLR 2021 Yingwei Li, Qihang Yu, Mingxing Tan, Jieru Mei, Peng Tang, Wei Shen, Alan Yuille, Cihang Xie

To prevent models from exclusively attending to a single cue in representation learning, we augment training data with images with conflicting shape and texture information (e.g., an image of chimpanzee shape but with lemon texture) and, most importantly, provide the corresponding supervision from shape and texture simultaneously.

Adversarial Robustness Data Augmentation +2

Neural Architecture Search for Lightweight Non-Local Networks

2 code implementations CVPR 2020 Yingwei Li, Xiaojie Jin, Jieru Mei, Xiaochen Lian, Linjie Yang, Cihang Xie, Qihang Yu, Yuyin Zhou, Song Bai, Alan Yuille

However, embedding NL blocks in mobile neural networks has rarely been explored, mainly due to the following challenges: 1) NL blocks generally have a heavy computation cost, which makes them difficult to apply where computational resources are limited, and 2) it is an open problem to discover an optimal configuration to embed NL blocks into mobile neural networks.

Image Classification Neural Architecture Search

CAKES: Channel-wise Automatic KErnel Shrinking for Efficient 3D Networks

1 code implementation 28 Mar 2020 Qihang Yu, Yingwei Li, Jieru Mei, Yuyin Zhou, Alan L. Yuille

3D Convolution Neural Networks (CNNs) have been widely applied to 3D scene understanding, such as video analysis and volumetric image recognition.

3D Medical Imaging Segmentation Action Recognition +3

When Radiology Report Generation Meets Knowledge Graph

no code implementations 19 Feb 2020 Yixiao Zhang, Xiaosong Wang, Ziyue Xu, Qihang Yu, Alan Yuille, Daguang Xu

In addition, we proposed a new evaluation metric for radiology image reporting with the assistance of the same composed graph.

Graph Embedding Image Captioning

C2FNAS: Coarse-to-Fine Neural Architecture Search for 3D Medical Image Segmentation

no code implementations CVPR 2020 Qihang Yu, Dong Yang, Holger Roth, Yutong Bai, Yixiao Zhang, Alan L. Yuille, Daguang Xu

3D convolution neural networks (CNNs) have proven very successful in parsing organs or tumours in 3D medical images, but it remains complicated and time-consuming to choose or design proper 3D networks given different task contexts.

Image Segmentation Medical Image Segmentation +3

Thickened 2D Networks for Efficient 3D Medical Image Segmentation

no code implementations 2 Apr 2019 Qihang Yu, Yingda Xia, Lingxi Xie, Elliot K. Fishman, Alan L. Yuille

With this design, we achieve a higher performance while maintaining a lower inference latency on a few abdominal organs from CT scans, in particular when the organ has a peculiar 3D shape and thus strongly requires contextual information, demonstrating our method's effectiveness and ability in capturing 3D information.

Image Segmentation Medical Image Segmentation +2

Recurrent Saliency Transformation Network: Incorporating Multi-Stage Visual Cues for Small Organ Segmentation

2 code implementations CVPR 2018 Qihang Yu, Lingxi Xie, Yan Wang, Yuyin Zhou, Elliot K. Fishman, Alan L. Yuille

The key innovation is a saliency transformation module, which repeatedly converts the segmentation probability map from the previous iteration as spatial weights and applies these weights to the current iteration.

Organ Segmentation Pancreas Segmentation +1
