Search Results for author: Zhen Zhao

Found 33 papers, 15 papers with code

SM$^3$: Self-Supervised Multi-task Modeling with Multi-view 2D Images for Articulated Objects

no code implementations 17 Jan 2024 Haowen Wang, Zhen Zhao, Zhao Jin, Zhengping Che, Liang Qiao, Yakun Huang, Zhipeng Fan, Xiuquan Qiao, Jian Tang

Reconstructing real-world objects and estimating their movable joint structures are pivotal technologies within the field of robotics.

SWBT: Similarity Weighted Behavior Transformer with the Imperfect Demonstration for Robotic Manipulation

no code implementations 17 Jan 2024 Kun Wu, Ning Liu, Zhen Zhao, Di Qiu, Jinming Li, Zhengping Che, Zhiyuan Xu, Qinru Qiu, Jian Tang

Imitation learning (IL), aiming to learn optimal control policies from expert demonstrations, has been an effective method for robot manipulation tasks.

Imitation Learning Robot Manipulation

Roll With the Punches: Expansion and Shrinkage of Soft Label Selection for Semi-supervised Fine-Grained Learning

1 code implementation 19 Dec 2023 Yue Duan, Zhen Zhao, Lei Qi, Luping Zhou, Lei Wang, Yinghuan Shi

While semi-supervised learning (SSL) has yielded promising results, the more realistic SSL scenario remains to be explored, in which the unlabeled data exhibits extremely high recognition difficulty, e.g., fine-grained visual classification in the context of SSL (SS-FGVC).

Fine-Grained Image Classification Pseudo Label

Alternate Diverse Teaching for Semi-supervised Medical Image Segmentation

1 code implementation 29 Nov 2023 Zhen Zhao, Zicheng Wang, Longyue Wang, Yixuan Yuan, Luping Zhou

To mitigate the confirmation bias from the diverse supervision, the core of AD-MT lies in two proposed modules: the Random Periodic Alternate (RPA) Updating Module and the Conflict-Combating Module (CCM).

Data Augmentation Image Segmentation +2

Clean Label Disentangling for Medical Image Segmentation with Noisy Labels

1 code implementation 28 Nov 2023 Zicheng Wang, Zhen Zhao, Erjian Guo, Luping Zhou

Current methods focusing on medical image segmentation suffer from incorrect annotations, which is known as the noisy label issue.

Disentanglement Image Segmentation +2

GPT4Video: A Unified Multimodal Large Language Model for Instruction-Followed Understanding and Safety-Aware Generation

no code implementations 25 Nov 2023 Zhanyu Wang, Longyue Wang, Zhen Zhao, Minghao Wu, Chenyang Lyu, Huayang Li, Deng Cai, Luping Zhou, Shuming Shi, Zhaopeng Tu

While the recent advances in Multimodal Large Language Models (MLLMs) constitute a significant leap forward in the field, these models are predominantly confined to the realm of input-side multimodal comprehension, lacking the capacity for multimodal content generation.

Instruction Following Language Modelling +7

Multi-modal In-Context Learning Makes an Ego-evolving Scene Text Recognizer

1 code implementation 22 Nov 2023 Zhen Zhao, Jingqun Tang, Chunhui Lin, Binghong Wu, Can Huang, Hao Liu, Xin Tan, Zhizhong Zhang, Yuan Xie

A straightforward solution is performing model fine-tuning tailored to a specific scenario, but it is computationally intensive and requires multiple model copies for various scenarios.

In-Context Learning Scene Text Recognition

DriveGPT4: Interpretable End-to-end Autonomous Driving via Large Language Model

no code implementations 2 Oct 2023 Zhenhua Xu, Yujia Zhang, Enze Xie, Zhen Zhao, Yong Guo, Kwan-Yee K. Wong, Zhenguo Li, Hengshuang Zhao

Multimodal large language models (MLLMs) have emerged as a prominent area of interest within the research community, given their proficiency in handling and reasoning with non-textual data, including images and videos.

Autonomous Driving Language Modelling +2

Enhancing Sample Utilization through Sample Adaptive Augmentation in Semi-Supervised Learning

1 code implementation ICCV 2023 Guan Gui, Zhen Zhao, Lei Qi, Luping Zhou, Lei Wang, Yinghuan Shi

Sample adaptive augmentation (SAA) is proposed for this stated purpose and consists of two modules: 1) sample selection module; 2) sample augmentation module.

DTF-Net: Category-Level Pose Estimation and Shape Reconstruction via Deformable Template Field

no code implementations 4 Aug 2023 Haowen Wang, Zhipeng Fan, Zhen Zhao, Zhengping Che, Zhiyuan Xu, Dong Liu, Feifei Feng, Yakun Huang, Xiuquan Qiao, Jian Tang

We introduce a pose regression module that shares the deformation features and template codes from the fields to estimate the accurate 6D pose of each object in the scene.

Object Pose Estimation

Conflict-Based Cross-View Consistency for Semi-Supervised Semantic Segmentation

1 code implementation CVPR 2023 Zicheng Wang, Zhen Zhao, Xiaoxia Xing, Dong Xu, Xiangyu Kong, Luping Zhou

In this work, we propose a new conflict-based cross-view consistency (CCVC) method based on a two-branch co-training framework which aims at enforcing the two sub-nets to learn informative features from irrelevant views.

Semi-Supervised Semantic Segmentation

Rethinking Gradient Projection Continual Learning: Stability / Plasticity Feature Space Decoupling

no code implementations CVPR 2023 Zhen Zhao, Zhizhong Zhang, Xin Tan, Jun Liu, Yanyun Qu, Yuan Xie, Lizhuang Ma

In this paper, we propose a space decoupling (SD) algorithm to decouple the feature space into a pair of complementary subspaces, i.e., the stability space I and the plasticity space R. I is established by conducting space intersection between the historic and current feature space, and thus I contains more task-shared bases.

Continual Learning

Augmentation Matters: A Simple-yet-Effective Approach to Semi-supervised Semantic Segmentation

1 code implementation CVPR 2023 Zhen Zhao, Lihe Yang, Sifan Long, Jimin Pi, Luping Zhou, Jingdong Wang

Differently, in this work, we follow a standard teacher-student framework and propose AugSeg, a simple and clean approach that focuses mainly on data perturbations to boost the SSS performance.

Semi-Supervised Semantic Segmentation

Beyond Attentive Tokens: Incorporating Token Importance and Diversity for Efficient Vision Transformers

1 code implementation CVPR 2023 Sifan Long, Zhen Zhao, Jimin Pi, Shengsheng Wang, Jingdong Wang

In this paper, we emphasize the cruciality of diverse global semantics and propose an efficient token decoupling and merging method that can jointly consider the token importance and diversity for token pruning.

Computational Efficiency Efficient ViTs

Instance-specific and Model-adaptive Supervision for Semi-supervised Semantic Segmentation

1 code implementation CVPR 2023 Zhen Zhao, Sifan Long, Jimin Pi, Jingdong Wang, Luping Zhou

Relying on the model's performance, iMAS employs a class-weighted symmetric intersection-over-union to evaluate quantitative hardness of each unlabeled instance and supervises the training on unlabeled data in a model-adaptive manner.

Segmentation Semi-Supervised Semantic Segmentation

MutexMatch: Semi-Supervised Learning with Mutex-Based Consistency Regularization

3 code implementations 27 Mar 2022 Yue Duan, Zhen Zhao, Lei Qi, Lei Wang, Luping Zhou, Yinghuan Shi, Yang Gao

The core issue in semi-supervised learning (SSL) lies in how to effectively leverage unlabeled data, whereas most existing methods tend to put a great emphasis on the utilization of high-confidence samples yet seldom fully explore the usage of low-confidence samples.

Semi-Supervised Image Classification

The Winning Solution to the iFLYTEK Challenge 2021 Cultivated Land Extraction from High-Resolution Remote Sensing Image

1 code implementation 22 Feb 2022 Zhen Zhao, Yuqiu Liu, Gang Zhang, Liang Tang, Xiaolin Hu

This report introduces our solution to the iFLYTEK challenge 2021 cultivated land extraction from high-resolution remote sensing image.

Instance Segmentation Segmentation +1

Bi-Dimensional Feature Alignment for Cross-Domain Object Detection

no code implementations 14 Nov 2020 Zhen Zhao, Yuhong Guo, Jieping Ye

Recently the problem of cross-domain object detection has started drawing attention in the computer vision community.

Object Object Detection +1

Ensemble Model with Batch Spectral Regularization and Data Blending for Cross-Domain Few-Shot Learning with Unlabeled Data

no code implementations 8 Jun 2020 Zhen Zhao, Bingyu Liu, Yuhong Guo, Jieping Ye

In this paper, we present our proposed ensemble model with batch spectral regularization and data blending mechanisms for the Track 2 problem of the cross-domain few-shot learning (CD-FSL) challenge.

cross-domain few-shot learning

Feature Transformation Ensemble Model with Batch Spectral Regularization for Cross-Domain Few-Shot Classification

no code implementations 18 May 2020 Bingyu Liu, Zhen Zhao, Zhenpeng Li, Jianan Jiang, Yuhong Guo, Jieping Ye

In this paper, we propose a feature transformation ensemble model with batch spectral regularization for the Cross-domain few-shot learning (CD-FSL) challenge.

cross-domain few-shot learning Data Augmentation +2

Adaptive Object Detection with Dual Multi-Label Prediction

no code implementations ECCV 2020 Zhen Zhao, Yuhong Guo, Haifeng Shen, Jieping Ye

In this paper, we propose a novel end-to-end unsupervised deep domain adaptation model for adaptive object detection by exploiting multi-label object recognition as a dual auxiliary task.

Image-to-Image Translation Object +4

Mutual Learning Network for Multi-Source Domain Adaptation

no code implementations 29 Mar 2020 Zhenpeng Li, Zhen Zhao, Yuhong Guo, Haifeng Shen, Jieping Ye

However, in practice the labeled data can come from multiple source domains with different distributions.

Unsupervised Domain Adaptation

Fast Inference in Capsule Networks Using Accumulated Routing Coefficients

no code implementations 15 Apr 2019 Zhen Zhao, Ashley Kleinhans, Gursharan Sandhu, Ishan Patel, K. P. Unnikrishnan

Afterward, the routing coefficients associated with the training examples are accumulated offline and used to create a set of "master" routing coefficients.

Object Rotated MNIST

Capsule Networks with Max-Min Normalization

no code implementations 22 Mar 2019 Zhen Zhao, Ashley Kleinhans, Gursharan Sandhu, Ishan Patel, K. P. Unnikrishnan

Capsule Networks (CapsNet) use the Softmax function to convert the logits of the routing coefficients into a set of normalized values that signify the assignment probabilities between capsules in adjacent layers.

CT Super-resolution GAN Constrained by the Identical, Residual, and Cycle Learning Ensemble (GAN-CIRCLE)

no code implementations 10 Aug 2018 Chenyu You, Guang Li, Yi Zhang, Xiaoliu Zhang, Hongming Shan, Shenghong Ju, Zhen Zhao, Zhuiyang Zhang, Wenxiang Cong, Michael W. Vannier, Punam K. Saha, Ge Wang

Specifically, with the generative adversarial network (GAN) as the building block, we enforce the cycle-consistency in terms of the Wasserstein distance to establish a nonlinear end-to-end mapping from noisy LR input images to denoised and deblurred HR outputs.

Computed Tomography (CT) Generative Adversarial Network +2

Structure-sensitive Multi-scale Deep Neural Network for Low-Dose CT Denoising

no code implementations 2 May 2018 Chenyu You, Qingsong Yang, Hongming Shan, Lars Gjesteby, Guang Li, Shenghong Ju, Zhuiyang Zhang, Zhen Zhao, Yi Zhang, Wenxiang Cong, Ge Wang

However, the radiation dose reduction compromises the signal-to-noise ratio (SNR), leading to strong noise and artifacts that down-grade CT image quality.

Computed Tomography (CT) Denoising
