Search Results for author: Zhuotao Tian

Found 33 papers, 26 papers with code

Referencing Where to Focus: Improving Visual Grounding with Referential Query

no code implementations 26 Dec 2024 Yabing Wang, Zhuotao Tian, Qingpei Guo, Zheng Qin, Sanping Zhou, Ming Yang, Le Wang

It consists of a query adaption module, which can be seamlessly integrated into CLIP and generates the referential query that provides prior context for the decoder, along with a task-specific decoder.

Decoder Visual Grounding

VisionZip: Longer is Better but Not Necessary in Vision Language Models

1 code implementation 5 Dec 2024 Senqiao Yang, Yukang Chen, Zhuotao Tian, Chengyao Wang, Jingyao Li, Bei Yu, Jiaya Jia

To address this, we introduce VisionZip, a simple yet effective method that selects a set of informative tokens for input to the language model, reducing visual token redundancy and improving efficiency while maintaining model performance.

Video Understanding Visual Question Answering

Typicalness-Aware Learning for Failure Detection

1 code implementation 4 Nov 2024 Yijun Liu, Jiequan Cui, Zhuotao Tian, Senqiao Yang, Qingdong He, Xiaoling Wang, Jingyong Su

We observe that, with the cross-entropy loss, model predictions are optimized to align with the corresponding labels via increasing logit magnitude or refining logit direction.

Explore the Potential of CLIP for Training-Free Open Vocabulary Semantic Segmentation

1 code implementation 11 Jul 2024 Tong Shao, Zhuotao Tian, Hang Zhao, Jingyong Su

CLIP, as a vision-language model, has significantly advanced Open-Vocabulary Semantic Segmentation (OVSS) with its zero-shot capabilities.

Language Modeling Language Modelling +3

Mind the Interference: Retaining Pre-trained Knowledge in Parameter Efficient Continual Learning of Vision-Language Models

1 code implementation 7 Jul 2024 Longxiang Tang, Zhuotao Tian, Kai Li, Chunming He, Hantao Zhou, Hengshuang Zhao, Xiu Li, Jiaya Jia

To address this problem efficiently, we propose the Distribution-aware Interference-free Knowledge Integration (DIKI) framework, retaining pre-trained knowledge of VLMs from a perspective of avoiding information interference.

class-incremental learning Class Incremental Learning +2

Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs

1 code implementation 26 Jun 2024 Xin Lai, Zhuotao Tian, Yukang Chen, Senqiao Yang, Xiangru Peng, Jiaya Jia

Mathematical reasoning presents a significant challenge for Large Language Models (LLMs) due to the extensive and precise chain of reasoning required for accuracy.

Ranked #11 on Arithmetic Reasoning on GSM8K (using extra training data)

Arithmetic Reasoning GSM8K +2

Scalable Language Model with Generalized Continual Learning

1 code implementation 11 Apr 2024 Bohao Peng, Zhuotao Tian, Shu Liu, MingChang Yang, Jiaya Jia

In this study, we introduce the Scalable Language Model (SLM) to overcome these limitations within a more challenging and generalized setting, representing a significant advancement toward practical applications for continual learning.

Continual Learning Language Modeling +2

Unified Language-driven Zero-shot Domain Adaptation

1 code implementation CVPR 2024 Senqiao Yang, Zhuotao Tian, Li Jiang, Jiaya Jia

This paper introduces Unified Language-driven Zero-shot Domain Adaptation (ULDA), a novel task setting that enables a single model to adapt to diverse target domains without explicit domain-ID knowledge.

Domain Adaptation Representation Learning

OA-CNNs: Omni-Adaptive Sparse CNNs for 3D Semantic Segmentation

1 code implementation CVPR 2024 Bohao Peng, Xiaoyang Wu, Li Jiang, Yukang Chen, Hengshuang Zhao, Zhuotao Tian, Jiaya Jia

This exploration led to the creation of Omni-Adaptive 3D CNNs (OA-CNNs), a family of networks that integrates a lightweight module to greatly enhance the adaptivity of sparse CNNs at minimal computational cost.

Ranked #5 on 3D Semantic Segmentation on SemanticKITTI (val mIoU metric)

3D Semantic Segmentation LIDAR Semantic Segmentation

SaCo Loss: Sample-wise Affinity Consistency for Vision-Language Pre-training

no code implementations CVPR 2024 Sitong Wu, Haoru Tan, Zhuotao Tian, Yukang Chen, Xiaojuan Qi, Jiaya Jia

We discover that the lack of consideration for sample-wise affinity consistency across modalities in existing training objectives is the central cause.

LISA++: An Improved Baseline for Reasoning Segmentation with Large Language Model

1 code implementation 28 Dec 2023 Senqiao Yang, Tianyuan Qu, Xin Lai, Zhuotao Tian, Bohao Peng, Shu Liu, Jiaya Jia

While LISA effectively bridges the gap between segmentation and large language models to enable reasoning segmentation, it has certain limitations: it cannot distinguish different instances of the target region, and it is constrained by pre-defined textual response formats.

Instance Segmentation Language Modeling +5

Towards Large-scale 3D Representation Learning with Multi-dataset Point Prompt Training

1 code implementation CVPR 2024 Xiaoyang Wu, Zhuotao Tian, Xin Wen, Bohao Peng, Xihui Liu, Kaicheng Yu, Hengshuang Zhao

In contrast, such privilege has not yet fully benefited 3D deep learning, mainly due to the limited availability of large-scale 3D datasets.

Ranked #3 on 3D Semantic Segmentation on SemanticKITTI (val mIoU metric, using extra training data)

3D Semantic Segmentation LIDAR Semantic Segmentation +1

Boosting Few-shot 3D Point Cloud Segmentation via Query-Guided Enhancement

1 code implementation 6 Aug 2023 Zhenhua Ning, Zhuotao Tian, Guangming Lu, Wenjie Pei

Although extensive research has been conducted on 3D point cloud segmentation, effectively adapting generic models to novel categories remains a formidable challenge.

Point Cloud Segmentation Segmentation

Hierarchical Dense Correlation Distillation for Few-Shot Segmentation-Extended Abstract

no code implementations 27 Jun 2023 Bohao Peng, Zhuotao Tian, Xiaoyang Wu, Chengyao Wang, Shu Liu, Jingyong Su, Jiaya Jia

We hope our work can benefit broader industrial applications where novel classes with limited annotations are required to be decently identified.

Few-Shot Semantic Segmentation Segmentation +2

Decoupled Kullback-Leibler Divergence Loss

4 code implementations 23 May 2023 Jiequan Cui, Zhuotao Tian, Zhisheng Zhong, Xiaojuan Qi, Bei Yu, Hanwang Zhang

In this paper, we delve deeper into the Kullback-Leibler (KL) Divergence loss and mathematically prove that it is equivalent to the Decoupled Kullback-Leibler (DKL) Divergence loss that consists of 1) a weighted Mean Square Error (wMSE) loss and 2) a Cross-Entropy loss incorporating soft labels.

Adversarial Defense Adversarial Robustness +1

Learning Context-aware Classifier for Semantic Segmentation

2 code implementations 21 Mar 2023 Zhuotao Tian, Jiequan Cui, Li Jiang, Xiaojuan Qi, Xin Lai, Yixin Chen, Shu Liu, Jiaya Jia

Semantic segmentation is still a challenging task because it must parse diverse contexts in different scenes, so a fixed classifier may not adequately handle varying feature distributions during testing.

Decoder Segmentation +1

Spatial Pruned Sparse Convolution for Efficient 3D Object Detection

no code implementations 28 Sep 2022 Jianhui Liu, Yukang Chen, Xiaoqing Ye, Zhuotao Tian, Xiao Tan, Xiaojuan Qi

3D scenes are dominated by a large number of background points, which are redundant for the detection task, since it mainly needs to focus on foreground objects.

3D Object Detection Object +1

Generalized Parametric Contrastive Learning

4 code implementations 26 Sep 2022 Jiequan Cui, Zhisheng Zhong, Zhuotao Tian, Shu Liu, Bei Yu, Jiaya Jia

Based on theoretical analysis, we observe that supervised contrastive loss tends to bias high-frequency classes and thus increases the difficulty of imbalanced learning.

Contrastive Learning Domain Generalization +3

Understanding the Tricks of Deep Learning in Medical Image Segmentation: Challenges and Future Directions

1 code implementation 21 Sep 2022 Dong Zhang, Yi Lin, Hao Chen, Zhuotao Tian, Xin Yang, Jinhui Tang, Kwang Ting Cheng

Over the past few years, the rapid development of deep learning technologies for computer vision has significantly improved the performance of medical image segmentation (MedISeg).

Data Augmentation Domain Adaptation +3

DecoupleNet: Decoupled Network for Domain Adaptive Semantic Segmentation

1 code implementation 20 Jul 2022 Xin Lai, Zhuotao Tian, Xiaogang Xu, Yingcong Chen, Shu Liu, Hengshuang Zhao, LiWei Wang, Jiaya Jia

Unsupervised domain adaptation in semantic segmentation has been raised to alleviate the reliance on expensive pixel-wise annotations.

Segmentation Semantic Segmentation +2

SEA: Bridging the Gap Between One- and Two-stage Detector Distillation via SEmantic-aware Alignment

no code implementations 2 Mar 2022 Yixin Chen, Zhuotao Tian, Pengguang Chen, Shu Liu, Jiaya Jia

We revisit the one- and two-stage detector distillation tasks and present a simple and efficient semantic-aware framework to fill the gap between them.

Instance Segmentation object-detection +2

Guided Point Contrastive Learning for Semi-supervised Point Cloud Semantic Segmentation

2 code implementations ICCV 2021 Li Jiang, Shaoshuai Shi, Zhuotao Tian, Xin Lai, Shu Liu, Chi-Wing Fu, Jiaya Jia

To address the high cost and challenges of 3D point-level labeling, we present a method for semi-supervised point cloud semantic segmentation to adopt unlabeled point clouds in training to boost the model performance.

3D Semantic Segmentation Contrastive Learning +1

ResLT: Residual Learning for Long-tailed Recognition

5 code implementations 26 Jan 2021 Jiequan Cui, Shu Liu, Zhuotao Tian, Zhisheng Zhong, Jiaya Jia

From this perspective, the trivial solution of using separate branches for the head, medium, and tail classes and then summing their outputs as the final result is not feasible.

Long-tail Learning

Generalized Few-shot Semantic Segmentation

1 code implementation CVPR 2022 Zhuotao Tian, Xin Lai, Li Jiang, Shu Liu, Michelle Shu, Hengshuang Zhao, Jiaya Jia

Then, since context is essential for semantic segmentation, we propose the Context-Aware Prototype Learning (CAPL) that significantly improves performance by 1) leveraging the co-occurrence prior knowledge from support samples, and 2) dynamically enriching contextual information to the classifier, conditioned on the content of each query image.

Generalized Few-Shot Semantic Segmentation Segmentation +1

Prior Guided Feature Enrichment Network for Few-Shot Segmentation

3 code implementations 4 Aug 2020 Zhuotao Tian, Hengshuang Zhao, Michelle Shu, Zhicheng Yang, Ruiyu Li, Jiaya Jia

It consists of two novel designs: (1) a training-free prior mask generation method that not only retains generalization power but also improves model performance, and (2) a Feature Enrichment Module (FEM) that overcomes spatial inconsistency by adaptively enriching query features with support features and prior masks.

Few-Shot Semantic Segmentation Semantic Segmentation

Region Refinement Network for Salient Object Detection

no code implementations 27 Jun 2019 Zhuotao Tian, Hengshuang Zhao, Michelle Shu, Jiaze Wang, Ruiyu Li, Xiaoyong Shen, Jiaya Jia

Although salient object detection has been intensively studied, false predictions and unclear boundaries remain major issues.

Object object-detection +5

Learning Shape-Aware Embedding for Scene Text Detection

no code implementations CVPR 2019 Zhuotao Tian, Michelle Shu, Pengyuan Lyu, Ruiyu Li, Chao Zhou, Xiaoyong Shen, Jiaya Jia

We address the problem of detecting scene text in arbitrary shapes, which is a challenging task due to the high variety and complexity of the scene.

Instance Segmentation Scene Text Detection +3
