Search Results for author: Tianzhu Zhang

Found 59 papers, 15 papers with code

Instance-Adaptive and Geometric-Aware Keypoint Learning for Category-Level 6D Object Pose Estimation

1 code implementation 28 Mar 2024 Xiao Lin, Wenfei Yang, Yuan Gao, Tianzhu Zhang

(2) The second design is a Geometric-Aware Feature Aggregation module, which can efficiently integrate the local and global geometric information into keypoint features.

6D Pose Estimation using RGB Keypoint Detection

Unsupervised Template-assisted Point Cloud Shape Correspondence Network

no code implementations 25 Mar 2024 Jiacheng Deng, Jiahao Lu, Tianzhu Zhang

Unsupervised point cloud shape correspondence aims to establish point-wise correspondences between source and target point clouds.

BSNet: Box-Supervised Simulation-assisted Mean Teacher for 3D Instance Segmentation

1 code implementation 22 Mar 2024 Jiahao Lu, Jiacheng Deng, Tianzhu Zhang

To generate higher quality pseudo-labels and achieve more precise weakly supervised 3DIS results, we propose the Box-Supervised Simulation-assisted Mean Teacher for 3D Instance Segmentation (BSNet), which devises a novel pseudo-labeler called Simulation-assisted Transformer.

3D Instance Segmentation Decoder +1

Multi-modal Attribute Prompting for Vision-Language Models

no code implementations 1 Mar 2024 Xin Liu, Jiamin Wu, Tianzhu Zhang

To address this issue, we propose a Multi-modal Attribute Prompting method (MAP) by jointly exploring textual attribute prompting, visual attribute prompting, and attribute-level alignment.

Joint Attention-Guided Feature Fusion Network for Saliency Detection of Surface Defects

no code implementations 5 Feb 2024 Xiaoheng Jiang, Feng Yan, Yang Lu, Ke Wang, Shuai Guo, Tianzhu Zhang, Yanwei Pang, Jianwei Niu, Mingliang Xu

To address these issues, we propose a joint attention-guided feature fusion network (JAFFNet) for saliency detection of surface defects based on the encoder-decoder network.

Defect Detection Saliency Detection

Unifying Visual and Vision-Language Tracking via Contrastive Learning

1 code implementation 20 Jan 2024 Yinchao Ma, Yuyang Tang, Wenfei Yang, Tianzhu Zhang, Jinpeng Zhang, Mengxue Kang

Single object tracking aims to locate the target object in a video sequence according to the state specified by different modal references, including the initial bounding box (BBOX), natural language (NL), or both (NL+BBOX).

Contrastive Learning Object Tracking +2

Frequency Domain Modality-invariant Feature Learning for Visible-infrared Person Re-Identification

no code implementations 3 Jan 2024 Yulin Li, Tianzhu Zhang, Yongdong Zhang

Visible-infrared person re-identification (VI-ReID) is challenging due to the significant cross-modality discrepancies between visible and infrared images.

Metric Learning Person Re-Identification

Not Every Side Is Equal: Localization Uncertainty Estimation for Semi-Supervised 3D Object Detection

no code implementations ICCV 2023 Chuxin Wang, Wenfei Yang, Tianzhu Zhang

Semi-supervised 3D object detection from point cloud aims to train a detector with a small number of labeled data and a large number of unlabeled data.

3D Object Detection object-detection +1

TIFace: Improving Facial Reconstruction through Tensorial Radiance Fields and Implicit Surfaces

2 code implementations 15 Dec 2023 Ruijie Zhu, Jiahao Chang, Ziyang Song, Jiahuan Yu, Tianzhu Zhang

This report describes the solution that secured the first place in the "View Synthesis Challenge for Human Heads (VSCHH)" at the ICCV 2023 workshop.

Face Reconstruction Neural Rendering +1

Focus on Query: Adversarial Mining Transformer for Few-Shot Segmentation

1 code implementation NeurIPS 2023 Yuan Wang, Naisong Luo, Tianzhu Zhang

In this paper, we rethink the importance of support information and propose a new query-centric FSS model Adversarial Mining Transformer (AMFormer), which achieves accurate query image segmentation with only rough support guidance or even weak support labels.

Image Segmentation Semantic Segmentation

GUPNet++: Geometry Uncertainty Propagation Network for Monocular 3D Object Detection

1 code implementation 24 Oct 2023 Yan Lu, Xinzhu Ma, Lei Yang, Tianzhu Zhang, Yating Liu, Qi Chu, Tong He, Yonghui Li, Wanli Ouyang

It models the uncertainty propagation relationship of the geometry projection during training, improving the stability and efficiency of the end-to-end model learning.

Monocular 3D Object Detection object-detection

SE-ORNet: Self-Ensembling Orientation-aware Network for Unsupervised Point Cloud Shape Correspondence

1 code implementation CVPR 2023 Jiacheng Deng, Chuxin Wang, Jiahao Lu, Jianfeng He, Tianzhu Zhang, Jiyang Yu, Zhe Zhang

The key of our approach is to exploit an orientation estimation module with a domain adaptive discriminator to align the orientations of point cloud pairs, which significantly alleviates the mispredictions of symmetrical parts.

Ranked #2 on 3D Dense Shape Correspondence on SHREC'19 (using extra training data)

3D Dense Shape Correspondence

Structured Epipolar Matcher for Local Feature Matching

no code implementations 29 Mar 2023 Jiahao Chang, Jiahuan Yu, Tianzhu Zhang

Local feature matching is challenging due to textureless and repetitive patterns.

Adaptive Spot-Guided Transformer for Consistent Local Feature Matching

no code implementations CVPR 2023 Jiahuan Yu, Jiahao Chang, Jianfeng He, Tianzhu Zhang, Feng Wu

To deal with the above issues, we propose Adaptive Spot-Guided Transformer (ASTR) for local feature matching, which jointly models the local consistency and scale variations in a unified coarse-to-fine architecture.

Rethinking the Correlation in Few-Shot Segmentation: A Buoys View

no code implementations CVPR 2023 Yuan Wang, Rui Sun, Tianzhu Zhang

In this work, we rethink how to mitigate the false matches from the perspective of representative reference features (referred to as buoys), and propose a novel adaptive buoys correlation (ABC) network to rectify direct pairwise pixel-level correlation, including a buoys mining module and an adaptive correlation module.

Query Refinement Transformer for 3D Instance Segmentation

no code implementations ICCV 2023 Jiahao Lu, Jiacheng Deng, Chuxin Wang, Jianfeng He, Tianzhu Zhang

Additionally, we design an affiliated transformer decoder that suppresses the interference of noise background queries and helps the foreground queries focus on instance discriminative parts to predict final segmentation results.

3D Instance Segmentation Decoder +2

Adaptive Template Transformer for Mitochondria Segmentation in Electron Microscopy Images

no code implementations ICCV 2023 Yuwen Pan, Naisong Luo, Rui Sun, Meng Meng, Tianzhu Zhang, Zhiwei Xiong, Yongdong Zhang

Mitochondria, as tiny structures within the cell, are of significant importance to study cell functions for biological and clinical analysis.

Multimodal High-order Relation Transformer for Scene Boundary Detection

no code implementations ICCV 2023 Xi Wei, Zhangxiang Shi, Tianzhu Zhang, Xiaoyuan Yu, Lei Xiao

Scene boundary detection breaks down long videos into meaningful story-telling units and plays a crucial role in high-level video understanding.

Boundary Detection Decoder +2

Dynamic Generative Targeted Attacks With Pattern Injection

no code implementations CVPR 2023 Weiwei Feng, Nanqing Xu, Tianzhu Zhang, Yongdong Zhang

Concretely, the former adopts a dynamic convolution kernel and a static convolution kernel for the specific instance and the global dataset, respectively, which can inherit the advantages of both instance-specific and instance-agnostic attacks.

D2Former: Jointly Learning Hierarchical Detectors and Contextual Descriptors via Agent-Based Transformers

no code implementations CVPR 2023 Jianfeng He, Yuan Gao, Tianzhu Zhang, Zhe Zhang, Feng Wu

Second, the HKDL module can generate keypoint detectors in a hierarchical way, which is helpful for detecting keypoints with diverse levels of structures.

Camouflaged Instance Segmentation via Explicit De-Camouflaging

no code implementations CVPR 2023 Naisong Luo, Yuwen Pan, Rui Sun, Tianzhu Zhang, Zhiwei Xiong, Feng Wu

To address these challenges, we propose a novel De-camouflaging Network (DCNet) including a pixel-level camouflage decoupling module and an instance-level camouflage suppression module.

Instance Segmentation Segmentation +1

Domain Generalized Stereo Matching via Hierarchical Visual Transformation

no code implementations CVPR 2023 Tianyu Chang, Xun Yang, Tianzhu Zhang, Meng Wang

In this way, we can prevent the model from exploiting the artifacts of synthetic stereo images as shortcut features, thereby estimating the disparity maps more effectively based on the learned robust and shortcut-invariant representation.

Domain Generalization Stereo Matching

Foreground-Background Distribution Modeling Transformer for Visual Object Tracking

no code implementations ICCV 2023 Dawei Yang, Jianfeng He, Yinchao Ma, Qianjin Yu, Tianzhu Zhang

To address the above limitations, we propose a novel foreground-background distribution modeling transformer for visual object tracking (F-BDMTrack), including a fore-background agent learning (FBAL) module and a distribution-aware attention (DA2) module in a unified transformer architecture.

Object Visual Object Tracking

Alignment Before Aggregation: Trajectory Memory Retrieval Network for Video Object Segmentation

no code implementations ICCV 2023 Rui Sun, Yuan Wang, Huayu Mai, Tianzhu Zhang, Feng Wu

In this work, we reconcile the inherent tension of spatial and temporal information to retrieve memory frame information along the object trajectory, and propose a novel and coherent Trajectory Memory Retrieval Network (TMRN) to equip with the trajectory information, including a spatial alignment module and a temporal aggregation module.

Retrieval Semantic Segmentation +2

Cross-Modality Transformer for Visible-Infrared Person Re-Identification

no code implementations ECCV 2022 Kongzhu Jiang, Tianzhu Zhang, Xiang Liu, Bingqiao Qian, Yongdong Zhang, Feng Wu

To alleviate the above issues, we propose a novel Cross-Modality Transformer (CMT) to jointly explore a modality-level alignment module and an instance-level module for VI-ReID.

Decoder Person Re-Identification

A Keypoint-based Global Association Network for Lane Detection

1 code implementation CVPR 2022 Jinsheng Wang, Yinchao Ma, Shaofei Huang, Tianrui Hui, Fei Wang, Chen Qian, Tianzhu Zhang

Earlier works follow a top-down roadmap to regress predefined anchors into various shapes of lane lines, which lacks enough flexibility to fit complex shapes of lanes due to the fixed anchor shapes.

Ranked #4 on Lane Detection on TuSimple (F1 score metric)

Keypoint Estimation Lane Detection

Motion-Modulated Temporal Fragment Alignment Network for Few-Shot Action Recognition

no code implementations CVPR 2022 Jiamin Wu, Tianzhu Zhang, Zhe Zhang, Feng Wu, Yongdong Zhang

To address this issue, we propose an end-to-end Motion-modulated Temporal Fragment Alignment Network (MTFAN) by jointly exploring the task-specific motion modulation and the multi-level temporal fragment alignment for Few-Shot Action Recognition (FSAR).

Few-Shot action recognition Few Shot Action Recognition +1

Learning Dynamic Compact Memory Embedding for Deformable Visual Object Tracking

no code implementations 23 Nov 2021 Pengfei Zhu, Hongtao Yu, Kaihua Zhang, Yu Wang, Shuai Zhao, Lei Wang, Tianzhu Zhang, Qinghua Hu

To address this issue, segmentation-based trackers have been proposed that employ per-pixel matching to improve the tracking performance of deformable objects effectively.

Segmentation Visual Object Tracking +1

Uncertainty Guided Collaborative Training for Weakly Supervised Temporal Action Detection

no code implementations CVPR 2021 Wenfei Yang, Tianzhu Zhang, Xiaoyuan Yu, Tian Qi, Yongdong Zhang, Feng Wu

To alleviate this problem, we propose a novel Uncertainty Guided Collaborative Training (UGCT) strategy, which mainly includes two key designs: (1) The first design is an online pseudo label generation module, in which the RGB and FLOW streams work collaboratively to learn from each other.

Action Detection Pseudo Label

Lesion-Aware Transformers for Diabetic Retinopathy Grading

no code implementations CVPR 2021 Rui Sun, Yihao Li, Tianzhu Zhang, Zhendong Mao, Feng Wu, Yongdong Zhang

First, to the best of our knowledge, this is the first work to formulate lesion discovery as a weakly supervised lesion localization problem via a transformer decoder.

Decoder Diabetic Retinopathy Grading

Diverse Part Discovery: Occluded Person Re-identification with Part-Aware Transformer

no code implementations CVPR 2021 Yulin Li, Jianfeng He, Tianzhu Zhang, Xiang Liu, Yongdong Zhang, Feng Wu

To address these issues, we propose a novel end-to-end Part-Aware Transformer (PAT) for occluded person Re-ID through diverse part discovery via a transformer encoder-decoder architecture, including a pixel context based transformer encoder and a part prototype based transformer decoder.

Decoder Person Re-Identification

Action Unit Memory Network for Weakly Supervised Temporal Action Localization

no code implementations CVPR 2021 Wang Luo, Tianzhu Zhang, Wenfei Yang, Jingen Liu, Tao Mei, Feng Wu, Yongdong Zhang

In this paper, we present an Action Unit Memory Network (AUMN) for weakly supervised temporal action localization, which can mitigate the above two challenges by learning an action unit memory bank.

Weakly Supervised Action Localization Weakly-supervised Temporal Action Localization +1

Meta-Attack: Class-Agnostic and Model-Agnostic Physical Adversarial Attack

no code implementations ICCV 2021 Weiwei Feng, Baoyuan Wu, Tianzhu Zhang, Yong Zhang, Yongdong Zhang

To tackle these issues, we propose a class-agnostic and model-agnostic physical adversarial attack model (Meta-Attack), which is able to not only generate robust physical adversarial examples by simulating color and shape distortions, but also generalize to attacking novel images and novel DNN models by accessing a few digital and physical images.

Adversarial Attack Few-Shot Learning

Foreground Activation Maps for Weakly Supervised Object Localization

no code implementations ICCV 2021 Meng Meng, Tianzhu Zhang, Qi Tian, Yongdong Zhang, Feng Wu

To the best of our knowledge, this is the first work that can achieve remarkable performance for both tasks by optimizing them jointly via FAM for WSOL.

Classification Object +1

Task-Aware Part Mining Network for Few-Shot Learning

no code implementations ICCV 2021 Jiamin Wu, Tianzhu Zhang, Yongdong Zhang, Feng Wu

The task-aware part filters can adapt to any individual task and automatically mine task-related local parts even for an unseen task.

Few-Shot Learning

Graph Structured Network for Image-Text Matching

1 code implementation CVPR 2020 Chunxiao Liu, Zhendong Mao, Tianzhu Zhang, Hongtao Xie, Bin Wang, Yongdong Zhang

The GSMN explicitly models object, relation and attribute as a structured phrase, which not only allows to learn correspondence of object, relation and attribute separately, but also benefits to learn fine-grained correspondence of structured phrase.

Attribute Image-text matching +3

Cross-modality Person re-identification with Shared-Specific Feature Transfer

no code implementations CVPR 2020 Yan Lu, Yue Wu, Bin Liu, Tianzhu Zhang, Baopu Li, Qi Chu, Nenghai Yu

In this paper, we tackle the above limitation by proposing a novel cross-modality shared-specific feature transfer algorithm (termed cm-SSFT) to explore the potential of both the modality-shared information and the modality-specific characteristics to boost the re-identification performance.

Cross-Modality Person Re-identification Person Re-Identification

Describe and Attend to Track: Learning Natural Language guided Structural Representation and Visual Attention for Object Tracking

no code implementations 25 Nov 2018 Xiao Wang, Chenglong Li, Rui Yang, Tianzhu Zhang, Jin Tang, Bin Luo

To refine the states of the target and re-track the target when it is back to view from heavy occlusion and out of view, we elaborately design a novel subnetwork to learn the target-driven visual attentions from the guidance of both visual and natural language cues.

Object Tracking

Joint Pose and Expression Modeling for Facial Expression Recognition

no code implementations CVPR 2018 Feifei Zhang, Tianzhu Zhang, Qirong Mao, Changsheng Xu

First, the encoder-decoder structure of the generator can learn a generative and discriminative identity representation for face images.

Decoder Facial Expression Recognition +3

In Defense of Sparse Tracking: Circulant Sparse Tracker

no code implementations CVPR 2016 Tianzhu Zhang, Adel Bibi, Bernard Ghanem

Sparse representation has been introduced to visual tracking by finding the best target candidate with minimal reconstruction error within the particle filter framework.

Visual Tracking

3D Part-Based Sparse Tracker With Automatic Synchronization and Registration

no code implementations CVPR 2016 Adel Bibi, Tianzhu Zhang, Bernard Ghanem

In this paper, we present a part-based sparse tracker in a particle filter framework where both the motion and appearance model are formulated in 3D.

Occlusion Handling

Structural Sparse Tracking

no code implementations CVPR 2015 Tianzhu Zhang, Si Liu, Changsheng Xu, Shuicheng Yan, Bernard Ghanem, Narendra Ahuja, Ming-Hsuan Yang

Sparse representation has been applied to visual tracking by finding the best target candidate with minimal reconstruction error by use of target templates.

Visual Tracking
