Search Results for author: Liang-Chieh Chen

Found 51 papers, 37 papers with code

SPFormer: Enhancing Vision Transformer with Superpixel Representation

no code implementations 5 Jan 2024 Jieru Mei, Liang-Chieh Chen, Alan Yuille, Cihang Xie

In this work, we introduce SPFormer, a novel Vision Transformer enhanced by superpixel representation.

Superpixels

MaskConver: Revisiting Pure Convolution Model for Panoptic Segmentation

1 code implementation 11 Dec 2023 Abdullah Rashwan, Jiageng Zhang, Ali Taalimi, Fan Yang, Xingyi Zhou, Chaochao Yan, Liang-Chieh Chen, Yeqing Li

With ResNet50 backbone, our MaskConver achieves 53.6% PQ on the COCO panoptic val set, outperforming the modern convolution-based model, Panoptic FCN, by 9.3% as well as transformer-based models such as Mask2Former (+1.7% PQ) and kMaX-DeepLab (+0.6% PQ).

Panoptic Segmentation

MaXTron: Mask Transformer with Trajectory Attention for Video Panoptic Segmentation

1 code implementation 30 Nov 2023 Ju He, Qihang Yu, Inkyu Shin, Xueqing Deng, Xiaohui Shen, Alan Yuille, Liang-Chieh Chen

To alleviate the issue, we propose to adapt the trajectory attention for both the dense pixel features and object queries, aiming to improve the short-term and long-term tracking results, respectively.

Object Video Classification +3

Towards Open-Ended Visual Recognition with Large Language Model

1 code implementation 14 Nov 2023 Qihang Yu, Xiaohui Shen, Liang-Chieh Chen

Localizing and recognizing objects in the open-ended physical world poses a long-standing challenge within the domain of machine perception.

Language Modelling Large Language Model +2

PolyMaX: General Dense Prediction with Mask Transformer

1 code implementation 9 Nov 2023 Xuan Yang, Liangzhe Yuan, Kimberly Wilber, Astuti Sharma, Xiuye Gu, Siyuan Qiao, Stephanie Debats, Huisheng Wang, Hartwig Adam, Mikhail Sirotenko, Liang-Chieh Chen

Despite this shift, methods based on the per-pixel prediction paradigm still dominate the benchmarks on the other dense prediction tasks that require continuous outputs, such as depth estimation and surface normal prediction.

Monocular Depth Estimation Semantic Segmentation +2

Superpixel Transformers for Efficient Semantic Segmentation

no code implementations 28 Sep 2023 Alex Zihao Zhu, Jieru Mei, Siyuan Qiao, Hang Yan, Yukun Zhu, Liang-Chieh Chen, Henrik Kretzschmar

Finally, we directly project the superpixel class predictions back into the pixel space using the associations between the superpixels and the image pixel features.

Autonomous Driving Segmentation +2

Convolutions Die Hard: Open-Vocabulary Segmentation with Single Frozen Convolutional CLIP

1 code implementation NeurIPS 2023 Qihang Yu, Ju He, Xueqing Deng, Xiaohui Shen, Liang-Chieh Chen

The proposed FC-CLIP benefits from the following observations: the frozen CLIP backbone maintains the ability of open-vocabulary classification and can also serve as a strong mask generator, and the convolutional CLIP generalizes well to a larger input resolution than the one used during contrastive image-text pretraining.

Open Vocabulary Panoptic Segmentation Open Vocabulary Semantic Segmentation +1

ReMaX: Relaxing for Better Training on Efficient Panoptic Segmentation

1 code implementation NeurIPS 2023 Shuyang Sun, Weijun Wang, Qihang Yu, Andrew Howard, Philip Torr, Liang-Chieh Chen

This paper presents a new mechanism to facilitate the training of mask transformers for efficient panoptic segmentation, democratizing its deployment.

Panoptic Segmentation Segmentation

Video-kMaX: A Simple Unified Approach for Online and Near-Online Video Panoptic Segmentation

no code implementations 10 Apr 2023 Inkyu Shin, Dahun Kim, Qihang Yu, Jun Xie, Hong-Seok Kim, Bradley Green, In So Kweon, Kuk-Jin Yoon, Liang-Chieh Chen

The meta architecture of the proposed Video-kMaX consists of two components: a within-clip segmenter (for clip-level segmentation) and a cross-clip associater (for association beyond clips).

Scene Understanding Segmentation +2

kMaX-DeepLab: k-means Mask Transformer

2 code implementations 8 Jul 2022 Qihang Yu, Huiyu Wang, Siyuan Qiao, Maxwell Collins, Yukun Zhu, Hartwig Adam, Alan Yuille, Liang-Chieh Chen

However, we observe that most existing transformer-based vision models simply borrow the idea from NLP, neglecting the crucial difference between languages and images, particularly the extremely large sequence length of spatially flattened pixel features.

Clustering Object Detection +1

Waymo Open Dataset: Panoramic Video Panoptic Segmentation

1 code implementation 15 Jun 2022 Jieru Mei, Alex Zihao Zhu, Xinchen Yan, Hang Yan, Siyuan Qiao, Yukun Zhu, Liang-Chieh Chen, Henrik Kretzschmar, Dragomir Anguelov

We therefore present the Waymo Open Dataset: Panoramic Video Panoptic Segmentation Dataset, a large-scale dataset that offers high-quality panoptic segmentation labels for autonomous driving.

Autonomous Driving Image Segmentation +4

DeepLab2: A TensorFlow Library for Deep Labeling

4 code implementations 17 Jun 2021 Mark Weber, Huiyu Wang, Siyuan Qiao, Jun Xie, Maxwell D. Collins, Yukun Zhu, Liangzhe Yuan, Dahun Kim, Qihang Yu, Daniel Cremers, Laura Leal-Taixe, Alan L. Yuille, Florian Schroff, Hartwig Adam, Liang-Chieh Chen

DeepLab2 is a TensorFlow library for deep labeling, aiming to provide a state-of-the-art and easy-to-use TensorFlow codebase for general dense pixel prediction problems in computer vision.

ViP-DeepLab: Learning Visual Perception with Depth-aware Video Panoptic Segmentation

1 code implementation CVPR 2021 Siyuan Qiao, Yukun Zhu, Hartwig Adam, Alan Yuille, Liang-Chieh Chen

We name this joint task as Depth-aware Video Panoptic Segmentation, and propose a new evaluation metric along with two derived datasets for it, which will be made available to the public.

Ranked #1 on Video Panoptic Segmentation on Cityscapes-VPS (using extra training data)

Depth-aware Video Panoptic Segmentation Monocular Depth Estimation +2

MaX-DeepLab: End-to-End Panoptic Segmentation with Mask Transformers

3 code implementations CVPR 2021 Huiyu Wang, Yukun Zhu, Hartwig Adam, Alan Yuille, Liang-Chieh Chen

As a result, MaX-DeepLab shows a significant 7.1% PQ gain in the box-free regime on the challenging COCO dataset, closing the gap between box-based and box-free methods for the first time.

Panoptic Segmentation

Scaling Wide Residual Networks for Panoptic Segmentation

no code implementations 23 Nov 2020 Liang-Chieh Chen, Huiyu Wang, Siyuan Qiao

The Wide Residual Networks (Wide-ResNets), a shallow but wide model variant of the Residual Networks (ResNets) by stacking a small number of residual blocks with large channel sizes, have demonstrated outstanding performance on multiple dense prediction tasks.

Ranked #2 on Panoptic Segmentation on Cityscapes test (using extra training data)

Instance Segmentation Panoptic Segmentation +1

Naive-Student: Leveraging Semi-Supervised Learning in Video Sequences for Urban Scene Segmentation

1 code implementation ECCV 2020 Liang-Chieh Chen, Raphael Gontijo Lopes, Bowen Cheng, Maxwell D. Collins, Ekin D. Cubuk, Barret Zoph, Hartwig Adam, Jonathon Shlens

We view this work as a notable step towards building a simple procedure to harness unlabeled video sequences and extra images to surpass state-of-the-art performance on core computer vision tasks.

Image Segmentation Optical Flow Estimation +4

Panoptic-DeepLab: A Simple, Strong, and Fast Baseline for Bottom-Up Panoptic Segmentation

9 code implementations CVPR 2020 Bowen Cheng, Maxwell D. Collins, Yukun Zhu, Ting Liu, Thomas S. Huang, Hartwig Adam, Liang-Chieh Chen

In this work, we introduce Panoptic-DeepLab, a simple, strong, and fast system for panoptic segmentation, aiming to establish a solid baseline for bottom-up methods that can achieve performance comparable to that of two-stage methods while yielding fast inference speed.

Ranked #6 on Panoptic Segmentation on Cityscapes test (using extra training data)

Instance Segmentation Panoptic Segmentation +1

SegSort: Segmentation by Discriminative Sorting of Segments

1 code implementation ICCV 2019 Jyh-Jing Hwang, Stella X. Yu, Jianbo Shi, Maxwell D. Collins, Tien-Ju Yang, Xiao Zhang, Liang-Chieh Chen

The proposed SegSort further produces an interpretable result, as each choice of label can be easily understood from the retrieved nearest segments.

Ranked #10 on Unsupervised Semantic Segmentation on PASCAL VOC 2012 val (using extra training data)

Clustering Metric Learning +2

Panoptic-DeepLab

2 code implementations 10 Oct 2019 Bowen Cheng, Maxwell D. Collins, Yukun Zhu, Ting Liu, Thomas S. Huang, Hartwig Adam, Liang-Chieh Chen

The semantic segmentation branch is the same as the typical design of any semantic segmentation model (e.g., DeepLab), while the instance segmentation branch is class-agnostic, involving a simple instance center regression.

Instance Segmentation Panoptic Segmentation +2

SPGNet: Semantic Prediction Guidance for Scene Parsing

no code implementations ICCV 2019 Bowen Cheng, Liang-Chieh Chen, Yunchao Wei, Yukun Zhu, Zilong Huang, JinJun Xiong, Thomas Huang, Wen-mei Hwu, Honghui Shi

The multi-scale context module refers to the operations to aggregate feature responses from a large spatial extent, while the single-stage encoder-decoder structure encodes the high-level semantic information in the encoder path and recovers the boundary information in the decoder path.

Pose Estimation Scene Parsing +2

FEELVOS: Fast End-to-End Embedding Learning for Video Object Segmentation

3 code implementations CVPR 2019 Paul Voigtlaender, Yuning Chai, Florian Schroff, Hartwig Adam, Bastian Leibe, Liang-Chieh Chen

Many of the recent successful methods for video object segmentation (VOS) are overly complicated, heavily rely on fine-tuning on the first frame, and/or are slow, and are hence of limited practical use.

Object Segmentation +3

Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation

76 code implementations ECCV 2018 Liang-Chieh Chen, Yukun Zhu, George Papandreou, Florian Schroff, Hartwig Adam

The former networks are able to encode multi-scale contextual information by probing the incoming features with filters or pooling operations at multiple rates and multiple effective fields-of-view, while the latter networks can capture sharper object boundaries by gradually recovering the spatial information.

Ranked #1 on Semantic Segmentation on PASCAL VOC 2012 test (using extra training data)

Image Classification Image Segmentation +2

MobileNetV2: Inverted Residuals and Linear Bottlenecks

148 code implementations CVPR 2018 Mark Sandler, Andrew Howard, Menglong Zhu, Andrey Zhmoginov, Liang-Chieh Chen

In this paper we describe a new mobile architecture, MobileNetV2, that improves the state of the art performance of mobile models on multiple tasks and benchmarks as well as across a spectrum of different model sizes.

Image Classification Image Segmentation +4

The Devil is in the Decoder: Classification, Regression and GANs

1 code implementation 18 Jul 2017 Zbigniew Wojna, Vittorio Ferrari, Sergio Guadarrama, Nathan Silberman, Liang-Chieh Chen, Alireza Fathi, Jasper Uijlings

Many machine vision applications, such as semantic segmentation and depth prediction, require predictions for every pixel of the input image.

Boundary Detection Depth Estimation +4

Rethinking Atrous Convolution for Semantic Image Segmentation

75 code implementations 17 Jun 2017 Liang-Chieh Chen, George Papandreou, Florian Schroff, Hartwig Adam

To handle the problem of segmenting objects at multiple scales, we design modules which employ atrous convolution in cascade or in parallel to capture multi-scale context by adopting multiple atrous rates.

Ranked #3 on Semantic Segmentation on PASCAL VOC 2012 test (using extra training data)

Dichotomous Image Segmentation Image Segmentation +3

DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs

47 code implementations 2 Jun 2016 Liang-Chieh Chen, George Papandreou, Iasonas Kokkinos, Kevin Murphy, Alan L. Yuille

ASPP probes an incoming convolutional feature layer with filters at multiple sampling rates and effective fields-of-views, thus capturing objects as well as image context at multiple scales.

Image Segmentation Semantic Segmentation

Weakly- and Semi-Supervised Learning of a Deep Convolutional Network for Semantic Image Segmentation

1 code implementation ICCV 2015 George Papandreou, Liang-Chieh Chen, Kevin P. Murphy, Alan L. Yuille

Deep convolutional neural networks (DCNNs) trained on a large number of images with strong pixel-level annotations have recently significantly pushed the state of the art in semantic image segmentation.

Image Segmentation Segmentation +1

Zoom Better to See Clearer: Human and Object Parsing with Hierarchical Auto-Zoom Net

no code implementations 21 Nov 2015 Fangting Xia, Peng Wang, Liang-Chieh Chen, Alan L. Yuille

To tackle these difficulties, we propose a "Hierarchical Auto-Zoom Net" (HAZN) for object part parsing which adapts to the local scales of objects and parts.

ABC-CNN: An Attention Based Convolutional Neural Network for Visual Question Answering

no code implementations 18 Nov 2015 Kan Chen, Jiang Wang, Liang-Chieh Chen, Haoyuan Gao, Wei Xu, Ram Nevatia

ABC-CNN determines an attention map for an image-question pair by convolving the image feature map with configurable convolutional kernels derived from the question's semantics.

Question Answering Visual Question Answering

Attention to Scale: Scale-aware Semantic Image Segmentation

no code implementations CVPR 2016 Liang-Chieh Chen, Yi Yang, Jiang Wang, Wei Xu, Alan L. Yuille

We adapt a state-of-the-art semantic image segmentation model, which we jointly train with multi-scale input images and the attention model.

Image Segmentation Segmentation +1

Weakly- and Semi-Supervised Learning of a DCNN for Semantic Image Segmentation

3 code implementations 9 Feb 2015 George Papandreou, Liang-Chieh Chen, Kevin Murphy, Alan L. Yuille

Deep convolutional neural networks (DCNNs) trained on a large number of images with strong pixel-level annotations have recently significantly pushed the state of the art in semantic image segmentation.

Image Segmentation Segmentation +2

Learning Deep Structured Models

no code implementations 9 Jul 2014 Liang-Chieh Chen, Alexander G. Schwing, Alan L. Yuille, Raquel Urtasun

Towards this goal, we propose a training algorithm that is able to learn structured models jointly with deep features that form the MRF potentials.

Multi-class Classification

Modeling Image Patches with a Generic Dictionary of Mini-Epitomes

no code implementations CVPR 2014 George Papandreou, Liang-Chieh Chen, Alan L. Yuille

As an alternative, we develop a generative model for the raw intensity of image patches and show that it can support image classification performance on par with optimized SIFT-based techniques in a bag-of-visual-words setting.

Classification General Classification +1

Beat the MTurkers: Automatic Image Labeling from Weak 3D Supervision

1 code implementation CVPR 2014 Liang-Chieh Chen, Sanja Fidler, Alan L. Yuille, Raquel Urtasun

Labeling large-scale datasets with very accurate object segmentations is an elaborate task that requires a high degree of quality control and a budget of tens or hundreds of thousands of dollars.

Autonomous Driving
