Search Results for author: Zhizheng Zhang

Found 46 papers, 15 papers with code

RelationVLM: Making Large Vision-Language Models Understand Visual Relations

no code implementations • 19 Mar 2024 • Zhipeng Huang, Zhizheng Zhang, Zheng-Jun Zha, Yan Lu, Baining Guo

The development of Large Vision-Language Models (LVLMs) is striving to catch up with the success of Large Language Models (LLMs), yet it faces more challenges to be resolved.

Language Modelling

VisualCritic: Making LMMs Perceive Visual Quality Like Humans

no code implementations • 19 Mar 2024 • Zhipeng Huang, Zhizheng Zhang, Yiting Lu, Zheng-Jun Zha, Zhibo Chen, Baining Guo

In this paper, we explore this question and provide the answer "Yes!".

Instruction Following

SeD: Semantic-Aware Discriminator for Image Super-Resolution

1 code implementation • 29 Feb 2024 • Bingchen Li, Xin Li, Hanxin Zhu, Yeying Jin, Ruoyu Feng, Zhizheng Zhang, Zhibo Chen

In particular, one discriminator is utilized to enable the SR network to learn the distribution of real-world high-quality images in an adversarial training manner.

Image Super-Resolution

NaVid: Video-based VLM Plans the Next Step for Vision-and-Language Navigation

no code implementations • 24 Feb 2024 • Jiazhao Zhang, Kunyu Wang, Rongtao Xu, Gengze Zhou, Yicong Hong, Xiaomeng Fang, Qi Wu, Zhizheng Zhang, He Wang

Vision-and-Language Navigation (VLN) stands as a key research problem of Embodied AI, aiming at enabling agents to navigate in unseen environments following linguistic instructions.

Decision Making Instruction Following +3

Reinforced UI Instruction Grounding: Towards a Generic UI Task Automation API

no code implementations • 7 Oct 2023 • Zhizheng Zhang, Wenxuan Xie, Xiaoyi Zhang, Yan Lu

In this work, we build a multimodal model to ground natural language instructions in given UI screenshots as a generic UI task automation executor.

Decoder document understanding +1

Adaptive Frequency Filters As Efficient Global Token Mixers

2 code implementations • ICCV 2023 • Zhipeng Huang, Zhizheng Zhang, Cuiling Lan, Zheng-Jun Zha, Yan Lu, Baining Guo

With this insight, we propose the Adaptive Frequency Filtering (AFF) token mixer.

When and Why Momentum Accelerates SGD: An Empirical Study

no code implementations • 15 Jun 2023 • Jingwen Fu, Bohan Wang, Huishuai Zhang, Zhizheng Zhang, Wei Chen, Nanning Zheng

In the comparison of SGDM and SGD with the same effective learning rate and the same batch size, we observe a consistent pattern: when $\eta_{ef}$ is small, SGDM and SGD experience almost the same empirical training losses; when $\eta_{ef}$ surpasses a certain threshold, SGDM begins to perform better than SGD.

Responsible Task Automation: Empowering Large Language Models as Responsible Task Automators

no code implementations • 2 Jun 2023 • Zhizheng Zhang, Xiaoyi Zhang, Wenxuan Xie, Yan Lu

Specifically, we present Responsible Task Automation (ResponsibleTA) as a fundamental framework to facilitate responsible collaboration between LLM-based coordinators and executors for task automation with three empowered capabilities: 1) predicting the feasibility of the commands for executors; 2) verifying the completeness of executors; 3) enhancing the security (e.g., the protection of users' privacy).

Prompt Engineering

Mask-Based Modeling for Neural Radiance Fields

1 code implementation • 11 Apr 2023 • Ganlin Yang, Guoqiang Wei, Zhizheng Zhang, Yan Lu, Dong Liu

Most Neural Radiance Fields (NeRFs) exhibit limited generalization capabilities, which restrict their applicability in representing multiple scenes using a single model.

Representation Learning

Unifying Layout Generation with a Decoupled Diffusion Model

no code implementations • CVPR 2023 • Mude Hui, Zhizheng Zhang, Xiaoyi Zhang, Wenxuan Xie, Yuwang Wang, Yan Lu

Since different attributes have their individual semantics and characteristics, we propose to decouple the diffusion processes for them to improve the diversity of training samples and learn the reverse process jointly to exploit global-scope contexts for facilitating generation.

Versatile Neural Processes for Learning Implicit Neural Representations

1 code implementation • 21 Jan 2023 • Zongyu Guo, Cuiling Lan, Zhizheng Zhang, Yan Lu, Zhibo Chen

In this paper, we propose an efficient NP framework dubbed Versatile Neural Processes (VNP), which largely increases the capability of approximating functions.

Decoder

Template-guided Hierarchical Feature Restoration for Anomaly Detection

no code implementations • ICCV 2023 • Hewei Guo, Liping Ren, Jingjing Fu, Yuwang Wang, Zhizheng Zhang, Cuiling Lan, Haoqian Wang, Xinwen Hou

Aiming to detect anomalies of various sizes for complicated normal patterns, we propose a Template-guided Hierarchical Feature Restoration method, which introduces two key techniques, bottleneck compression and template-guided compensation, for anomaly-free feature restoration.

Ranked #11 on Anomaly Detection on MVTec LOCO AD

Anomaly Detection

Image Coding for Machines with Omnipotent Feature Learning

no code implementations • 5 Jul 2022 • Ruoyu Feng, Xin Jin, Zongyu Guo, Runsen Feng, Yixin Gao, Tianyu He, Zhizheng Zhang, Simeng Sun, Zhibo Chen

Learning a kind of feature that is both general (for AI tasks) and compact (for compression) is pivotal for its success.

Self-Supervised Learning

Deep Frequency Filtering for Domain Generalization

no code implementations • CVPR 2023 • Shiqi Lin, Zhizheng Zhang, Zhipeng Huang, Yan Lu, Cuiling Lan, Peng Chu, Quanzeng You, Jiang Wang, Zicheng Liu, Amey Parulkar, Viraj Navkal, Zhibo Chen

Improving the generalization ability of Deep Neural Networks (DNNs) is critical for their practical uses, which has been a longstanding challenge.

Domain Generalization Retrieval

Active Token Mixer

2 code implementations • 11 Mar 2022 • Guoqiang Wei, Zhizheng Zhang, Cuiling Lan, Yan Lu, Zhibo Chen

In this work, we propose an innovative token-mixer, dubbed Active Token Mixer (ATM), to actively incorporate flexible contextual information distributed across different channels from other tokens into the given query token.

Ranked #64 on Object Detection on COCO minival

Image Classification Instance Segmentation +2

Mask-based Latent Reconstruction for Reinforcement Learning

1 code implementation • 28 Jan 2022 • Tao Yu, Zhizheng Zhang, Cuiling Lan, Yan Lu, Zhibo Chen

For deep reinforcement learning (RL) from pixels, learning effective state representations is crucial for achieving high performance.

reinforcement-learning Reinforcement Learning (RL) +1

Learning Cross-Scale Weighted Prediction for Efficient Neural Video Compression

1 code implementation • 26 Dec 2021 • Zongyu Guo, Runsen Feng, Zhizheng Zhang, Xin Jin, Zhibo Chen

Neural video codecs have demonstrated great potential in video transmission and storage applications.

Motion Compensation Optical Flow Estimation +2

Lifelong Unsupervised Domain Adaptive Person Re-identification with Coordinated Anti-forgetting and Adaptation

no code implementations • CVPR 2022 • Zhipeng Huang, Zhizheng Zhang, Cuiling Lan, Wenjun Zeng, Peng Chu, Quanzeng You, Jiang Wang, Zicheng Liu, Zheng-Jun Zha

In this paper, to address more practical scenarios, we propose a new task, Lifelong Unsupervised Domain Adaptive (LUDA) person ReID.

Domain Adaptive Person Re-Identification Knowledge Distillation +4

SelectAugment: Hierarchical Deterministic Sample Selection for Data Augmentation

no code implementations • 6 Dec 2021 • Shiqi Lin, Zhizheng Zhang, Xin Li, Wenjun Zeng, Zhibo Chen

Data augmentation (DA) has been widely investigated to facilitate model optimization in many tasks.

Data Augmentation Fine-Grained Image Recognition +3

MMPTRACK: Large-scale Densely Annotated Multi-camera Multiple People Tracking Benchmark

no code implementations • 30 Nov 2021 • Xiaotian Han, Quanzeng You, Chunyu Wang, Zhizheng Zhang, Peng Chu, Houdong Hu, Jiang Wang, Zicheng Liu

This dataset provides a more reliable benchmark of multi-camera, multi-object tracking systems in cluttered and crowded environments.

Ranked #2 on Object Tracking on MMPTRACK

Multi-Object Tracking Multiple People Tracking +1

Confounder Identification-free Causal Visual Feature Learning

no code implementations • 26 Nov 2021 • Xin Li, Zhizheng Zhang, Guoqiang Wei, Cuiling Lan, Wenjun Zeng, Xin Jin, Zhibo Chen

In this paper, we propose a novel Confounder Identification-free Causal Visual Feature Learning (CICF) method, which obviates the need for identifying confounders.

Domain Generalization Meta-Learning

Versatile Learned Video Compression

no code implementations • NeurIPS 2021 • Runsen Feng, Zongyu Guo, Zhizheng Zhang, Zhibo Chen

We show that the flow prediction module can largely reduce the transmission cost of voxel flows.

Motion Compensation MS-SSIM +2

Pose-Guided Feature Learning with Knowledge Distillation for Occluded Person Re-Identification

no code implementations • 31 Jul 2021 • Kecheng Zheng, Cuiling Lan, Wenjun Zeng, Jiawei Liu, Zhizheng Zhang, Zheng-Jun Zha

Occluded person re-identification (ReID) aims to match person images with occlusion.

Knowledge Distillation Person Re-Identification

Dual Aspect Self-Attention based on Transformer for Remaining Useful Life Prediction

1 code implementation • 30 Jun 2021 • Zhizheng Zhang, Wen Song, Qiqiang Li

While deep learning has achieved great success in RUL prediction, existing methods have difficulties in processing long sequences and extracting information from the sensor and time step aspects.

Decoder

ToAlign: Task-oriented Alignment for Unsupervised Domain Adaptation

1 code implementation • NeurIPS 2021 • Guoqiang Wei, Cuiling Lan, Wenjun Zeng, Zhizheng Zhang, Zhibo Chen

Unsupervised domain adaptive classification intends to improve the classification performance on the unlabeled target domain.

Unsupervised Domain Adaptation

PlayVirtual: Augmenting Cycle-Consistent Virtual Trajectories for Reinforcement Learning

2 code implementations • NeurIPS 2021 • Tao Yu, Cuiling Lan, Wenjun Zeng, Mingxiao Feng, Zhizheng Zhang, Zhibo Chen

In this work, we propose a novel method, dubbed PlayVirtual, which augments cycle-consistent virtual trajectories to enhance the data efficiency for RL feature representation learning.

Ranked #1 on Continuous Control (100k environment steps) on DeepMind Finger Spin (Images)

Continuous Control (100k environment steps) Continuous Control (500k environment steps) +3

Soft then Hard: Rethinking the Quantization in Neural Image Compression

no code implementations • 12 Apr 2021 • Zongyu Guo, Zhizheng Zhang, Runsen Feng, Zhibo Chen

Quantization is one of the core components in lossy image compression.

Image Compression Quantization

Disentanglement-based Cross-Domain Feature Augmentation for Effective Unsupervised Domain Adaptive Person Re-identification

no code implementations • 25 Mar 2021 • Zhizheng Zhang, Cuiling Lan, Wenjun Zeng, Quanzeng You, Zicheng Liu, Kecheng Zheng, Zhibo Chen

Each recomposed feature, obtained based on the domain-invariant feature (which enables a reliable inheritance of identity) and an enhancement from a domain specific feature (which enables the approximation of real distributions), is thus an "ideal" augmentation.

Disentanglement Domain Adaptive Person Re-Identification +2

Learned Block-based Hybrid Image Compression

no code implementations • 17 Dec 2020 • Yaojun Wu, Xin Li, Zhizheng Zhang, Xin Jin, Zhibo Chen

Recent works on learned image compression perform encoding and decoding processes in a full-resolution manner, resulting in two problems when deployed for practical applications.

Blocking Image Compression +2

Exploiting Sample Uncertainty for Domain Adaptive Person Re-Identification

1 code implementation • 16 Dec 2020 • Kecheng Zheng, Cuiling Lan, Wenjun Zeng, Zhizheng Zhang, Zheng-Jun Zha

Based on this finding, we propose to exploit the uncertainty (measured by consistency levels) to evaluate the reliability of the pseudo-label of a sample and incorporate the uncertainty to re-weight its contribution within various ReID losses, including the identity (ID) classification loss per sample, the triplet loss, and the contrastive loss.

Clustering Domain Adaptive Person Re-Identification +3

Learning Omni-frequency Region-adaptive Representations for Real Image Super-Resolution

no code implementations • 11 Dec 2020 • Xin Li, Xin Jin, Tao Yu, Yingxue Pang, Simeng Sun, Zhizheng Zhang, Zhibo Chen

Traditional single image super-resolution (SISR) methods that focus on solving single and uniform degradation (i.e., bicubic down-sampling) typically suffer from poor performance when applied to real-world low-resolution (LR) images due to the complicated realistic degradations.

Image Super-Resolution

Causal Contextual Prediction for Learned Image Compression

no code implementations • 19 Nov 2020 • Zongyu Guo, Zhizheng Zhang, Runsen Feng, Zhibo Chen

In this paper, we propose the concept of separate entropy coding to leverage a serial decoding process for causal contextual entropy prediction in the latent space.

Image Compression MS-SSIM +1

Uncertainty-Aware Few-Shot Image Classification

no code implementations • 9 Oct 2020 • Zhizheng Zhang, Cuiling Lan, Wenjun Zeng, Zhibo Chen, Shih-Fu Chang

In this work, we propose an Uncertainty-Aware Few-Shot framework for image classification by modeling uncertainty of the similarities of query-support pairs and performing uncertainty-aware optimization.

Classification Few-Shot Image Classification +3

Beyond Triplet Loss: Meta Prototypical N-tuple Loss for Person Re-identification

no code implementations • 8 Jun 2020 • Zhizheng Zhang, Cuiling Lan, Wen-Jun Zeng, Zhibo Chen, Shih-Fu Chang

There is a lack of loss designs that enable the joint optimization of multiple instances (of multiple classes) within per-query optimization for person ReID.

Classification General Classification +3

Multi-scale Grouped Dense Network for VVC Intra Coding

no code implementations • 16 May 2020 • Xin Li, Simeng Sun, Zhizheng Zhang, Zhibo Chen

The Versatile Video Coding (H.266/VVC) standard achieves better image quality at the same number of bits than any other conventional image codec, such as BPG, JPEG, etc.

Generative Adversarial Network

Multi-Granularity Reference-Aided Attentive Feature Aggregation for Video-based Person Re-identification

no code implementations • CVPR 2020 • Zhizheng Zhang, Cuiling Lan, Wen-Jun Zeng, Zhibo Chen

In this paper, we propose an attentive feature aggregation module, namely Multi-Granularity Reference-aided Attentive Feature Aggregation (MG-RAFA), to delicately aggregate spatio-temporal features into a discriminative video-level feature representation.

Video-Based Person Re-Identification

Region Normalization for Image Inpainting

1 code implementation • 23 Nov 2019 • Tao Yu, Zongyu Guo, Xin Jin, Shilin Wu, Zhibo Chen, Weiping Li, Zhizheng Zhang, Sen Liu

In this work, we show that the mean and variance shifts caused by full-spatial feature normalization (FN) limit the image inpainting network training, and we propose a spatial region-wise normalization named Region Normalization (RN) to overcome the limitation.

Image Inpainting

On the Strong Equivalences of LPMLN Programs

no code implementations • 18 Sep 2019 • Bin Wang, Jun Shen, Shutao Zhang, Zhizheng Zhang

Firstly, we present the notions of p-strong and w-strong equivalences between LPMLN programs.

On the Strong Equivalences for LPMLN Programs

no code implementations • 9 Sep 2019 • Bin Wang, Jun Shen, Shutao Zhang, Zhizheng Zhang

In this paper, we study the strong equivalence for LPMLN programs, which is an important tool for program rewriting and theoretical investigations in the field of logic programming.

Logic in Computer Science D.1.6

A Coarse-to-Fine Framework for Learned Color Enhancement with Non-Local Attention

no code implementations • 8 Jun 2019 • Chaowei Shan, Zhizheng Zhang, Zhibo Chen

For current learned methods in this field, it is hard to account for both globally harmonious perception and local details in a single model simultaneously.

CaseNet: Content-Adaptive Scale Interaction Networks for Scene Parsing

no code implementations • 17 Apr 2019 • Xin Jin, Cuiling Lan, Wen-Jun Zeng, Zhizheng Zhang, Zhibo Chen

We achieve this by the context interaction among the features of different scales.

Position Scene Parsing

Relation-Aware Global Attention for Person Re-identification

1 code implementation • CVPR 2020 • Zhizheng Zhang, Cuiling Lan, Wen-Jun Zeng, Xin Jin, Zhibo Chen

For person re-identification (re-id), attention mechanisms have become attractive as they aim at strengthening discriminative features and suppressing irrelevant ones, which matches well the key of re-id, i.e., discriminative feature learning.

Clustering Image Classification +3

Asynchronous Episodic Deep Deterministic Policy Gradient: Towards Continuous Control in Computationally Complex Environments

1 code implementation • 3 Mar 2019 • Zhizheng Zhang, Jiale Chen, Zhibo Chen, Weiping Li

Not limited to the control tasks in computationally complex environments, AE-DDPG also achieves higher rewards and 2- to 4-fold improvement in sample efficiency on average compared to other variants of DDPG in MuJoCo environments.

Continuous Control Reinforcement Learning (RL)

Densely Semantically Aligned Person Re-Identification

no code implementations • CVPR 2019 • Zhizheng Zhang, Cuiling Lan, Wen-Jun Zeng, Zhibo Chen

We propose a densely semantically aligned person re-identification framework.

Human Detection Person Re-Identification

Learning to Run challenge solutions: Adapting reinforcement learning methods for neuromusculoskeletal environments

2 code implementations • 2 Apr 2018 • Łukasz Kidziński, Sharada Prasanna Mohanty, Carmichael Ong, Zhewei Huang, Shuchang Zhou, Anton Pechenko, Adam Stelmaszczyk, Piotr Jarosik, Mikhail Pavlov, Sergey Kolesnikov, Sergey Plis, Zhibo Chen, Zhizheng Zhang, Jiale Chen, Jun Shi, Zhuobin Zheng, Chun Yuan, Zhihui Lin, Henryk Michalewski, Piotr Miłoś, Błażej Osiński, Andrew Melnik, Malte Schilling, Helge Ritter, Sean Carroll, Jennifer Hicks, Sergey Levine, Marcel Salathé, Scott Delp

In the NIPS 2017 Learning to Run challenge, participants were tasked with building a controller for a musculoskeletal model to make it run as fast as possible through an obstacle course.

reinforcement-learning Reinforcement Learning (RL)

ESmodels: An Epistemic Specification Solver

no code implementations • 14 May 2014 • Zhizheng Zhang, Kaikai Zhao

(To appear in Theory and Practice of Logic Programming (TPLP)) ESmodels is designed and implemented as an experiment platform to investigate the semantics, language, related reasoning algorithms, and possible applications of epistemic specifications. We first give the epistemic specification language of ESmodels and its semantics.
