Search Results for author: Wei-Shi Zheng

Found 125 papers, 54 papers with code

Dual Illumination Estimation for Robust Exposure Correction

2 code implementations • 30 Oct 2019 • Qing Zhang, Yongwei Nie, Wei-Shi Zheng

By performing dual illumination estimation, we obtain two intermediate exposure correction results for the input image, with one fixing the underexposed regions and the other restoring the overexposed regions.

Multi-Exposure Image Fusion

428

Unsupervised Person Re-identification by Soft Multilabel Learning

1 code implementation • CVPR 2019 • Hong-Xing Yu, Wei-Shi Zheng, An-Cong Wu, Xiaowei Guo, Shaogang Gong, Jian-Huang Lai

To overcome this problem, we propose a deep model for the soft multilabel learning for unsupervised RE-ID.

Ranked #80 on Person Re-Identification on DukeMTMC-reID

Unsupervised Person Re-Identification

314

DreamView: Injecting View-specific Text Guidance into Text-to-3D Generation

2 code implementations • 9 Apr 2024 • Junkai Yan, Yipeng Gao, Qize Yang, Xihan Wei, Xuansong Xie, AnCong Wu, Wei-Shi Zheng

Text-to-3D generation, which synthesizes 3D assets according to an overall text description, has significantly progressed.

3D Generation Text to 3D

131

MIST: Multiple Instance Self-Training Framework for Video Anomaly Detection

1 code implementation • CVPR 2021 • Jia-Chang Feng, Fa-Ting Hong, Wei-Shi Zheng

Weakly supervised video anomaly detection (WS-VAD) is to distinguish anomalies from normal events based on discriminative representations.

Ranked #5 on Anomaly Detection In Surveillance Videos on ShanghaiTech Weakly Supervised

Anomaly Detection In Surveillance Videos Pseudo Label +1

114

DilateFormer: Multi-Scale Dilated Transformer for Visual Recognition

1 code implementation • 3 Feb 2023 • Jiayu Jiao, Yu-Ming Tang, Kun-Yu Lin, Yipeng Gao, Jinhua Ma, YaoWei Wang, Wei-Shi Zheng

In this work, we explore effective Vision Transformers to pursue a preferable trade-off between the computational complexity and size of the attended receptive field.

Instance Segmentation object-detection +2

Rethinking Temporal Fusion for Video-based Person Re-identification on Semantic and Time Aspect

2 code implementations • 28 Nov 2019 • Xinyang Jiang, Yifei Gong, Xiaowei Guo, Qize Yang, Feiyue Huang, Wei-Shi Zheng, Feng Zheng, Xing Sun

Recently, research interest in person re-identification (ReID) has gradually turned to video-based methods, which acquire a person representation by aggregating frame features of an entire video.

Video-Based Person Re-Identification

Devil's in the Details: Aligning Visual Clues for Conditional Embedding in Person Re-Identification

1 code implementation • 11 Sep 2020 • Fufu Yu, Xinyang Jiang, Yifei Gong, Shizhen Zhao, Xiaowei Guo, Wei-Shi Zheng, Feng Zheng, Xing Sun

Secondly, the Conditional Feature Embedding requires the overall feature of a query image to be dynamically adjusted based on the gallery image it matches, while most of the existing methods ignore the reference images.

Ranked #1 on Person Re-Identification on CUHK03-C

Person Re-Identification

One for More: Selecting Generalizable Samples for Generalizable ReID Model

1 code implementation • 10 Dec 2020 • Enwei Zhang, Xinyang Jiang, Hao Cheng, AnCong Wu, Fufu Yu, Ke Li, Xiaowei Guo, Feng Zheng, Wei-Shi Zheng, Xing Sun

Current training objectives of existing person Re-IDentification (ReID) models only ensure that the loss of the model decreases on the selected training batch, with no regard to the performance on samples outside the batch.

Person Re-Identification

Learning 3D Shape Feature for Texture-Insensitive Person Re-Identification

1 code implementation • CVPR 2021 • Jiaxing Chen, Xinyang Jiang, Fudong Wang, Jun Zhang, Feng Zheng, Xing Sun, Wei-Shi Zheng

In this paper, rather than relying on texture based information, we propose to improve the robustness of person ReID against clothing texture by exploiting the information of a person's 3D shape.

Ranked #4 on Person Re-Identification on PRCC

3D Reconstruction Person Re-Identification

Weakly supervised discriminative feature learning with state information for person identification

2 code implementations • CVPR 2020 • Hong-Xing Yu, Wei-Shi Zheng

We evaluate our model on unsupervised person re-identification and pose-invariant face recognition.

Face Recognition Person Identification +3

AcroFOD: An Adaptive Method for Cross-domain Few-shot Object Detection

1 code implementation • 22 Sep 2022 • Yipeng Gao, Lingxiao Yang, Yunmu Huang, Song Xie, Shiyong Li, Wei-Shi Zheng

Under the domain shift, cross-domain few-shot object detection aims to adapt object detectors in the target domain with a few annotated target data.

Cross-Domain Few-Shot Data Augmentation +2

Fully Convolutional Network Ensembles for White Matter Hyperintensities Segmentation in MR Images

2 code implementations • 14 Feb 2018 • Hongwei Li, Gongfa Jiang, Jian-Guo Zhang, Ruixuan Wang, Zhaolei Wang, Wei-Shi Zheng, Bjoern Menze

In this paper, we present a study using a deep fully convolutional network and ensemble models to automatically detect such WMH using fluid attenuation inversion recovery (FLAIR) and T1 magnetic resonance (MR) scans.

Data Augmentation

Do Not Disturb Me: Person Re-identification Under the Interference of Other Pedestrians

1 code implementation • ECCV 2020 • Shizhen Zhao, Changxin Gao, Jun Zhang, Hao Cheng, Chuchu Han, Xinyang Jiang, Xiaowei Guo, Wei-Shi Zheng, Nong Sang, Xing Sun

In the conventional person Re-ID setting, it is widely assumed that cropped person images are for each individual.

Person Re-Identification Retrieval

Cross-modal Consensus Network for Weakly Supervised Temporal Action Localization

2 code implementations • 27 Jul 2021 • Fa-Ting Hong, Jia-Chang Feng, Dan Xu, Ying Shan, Wei-Shi Zheng

In this work, we argue that the features extracted from the pretrained extractor, e.g., I3D, are not WS-TAL task-specific features, and thus feature re-calibration is needed to reduce task-irrelevant information redundancy.

Ranked #1 on Weakly-supervised Temporal Action Localization on THUMOS’14

Weakly Supervised Action Localization Weakly-supervised Temporal Action Localization +1

Cross-modal Consensus Network for Weakly Supervised Temporal Action Localization

1 code implementation • Proceedings of the 29th ACM International Conference on Multimedia 2021 • Fa-Ting Hong, Jia-Chang Feng, Dan Xu, Ying Shan, Wei-Shi Zheng

Ranked #1 on Weakly Supervised Temporal Action Localization on THUMOS14

Weakly-supervised Temporal Action Localization Weakly Supervised Temporal Action Localization

Unsupervised Learning for Optical Flow Estimation Using Pyramid Convolution LSTM

1 code implementation • 26 Jul 2019 • Shuosen Guan, Haoxin Li, Wei-Shi Zheng

Most current Convolutional Neural Network (CNN)-based methods for optical flow estimation focus on learning optical flow on synthetic datasets with ground truth, which is not practical.

Action Recognition Optical Flow Estimation

Viewpoint-Aware Loss with Angular Regularization for Person Re-Identification

1 code implementation • 3 Dec 2019 • Zhihui Zhu, Xinyang Jiang, Feng Zheng, Xiaowei Guo, Feiyue Huang, Wei-Shi Zheng, Xing Sun

Instead of one subspace for each viewpoint, our method projects the feature from different viewpoints into a unified hypersphere and effectively models the feature distribution on both the identity-level and the viewpoint-level.

Ranked #5 on Person Re-Identification on Market-1501 (using extra training data)

Person Re-Identification

Adapter Learning in Pretrained Feature Extractor for Continual Learning of Diseases

1 code implementation • 18 Apr 2023 • Wentao Zhang, Yujun Huang, Tong Zhang, Qingsong Zou, Wei-Shi Zheng, Ruixuan Wang

In particular, updating an intelligent diagnosis system with training data of new diseases would cause catastrophic forgetting of old disease knowledge.

Continual Learning

Shape-Erased Feature Learning for Visible-Infrared Person Re-Identification

1 code implementation • CVPR 2023 • Jiawei Feng, AnCong Wu, Wei-Shi Zheng

To this end, we propose a shape-erased feature learning paradigm that decorrelates modality-shared features in two orthogonal subspaces.

Person Re-Identification

Selective Hourglass Mapping for Universal Image Restoration Based on Diffusion Model

1 code implementation • 17 Mar 2024 • Dian Zheng, Xiao-Ming Wu, Shuzhou Yang, Jian Zhang, Jian-Fang Hu, Wei-Shi Zheng

Universal image restoration is a practical and potential computer vision task for real-world applications.

Image Restoration Zero-shot Generalization

Squeeze-and-Attention Networks for Semantic Segmentation

1 code implementation • CVPR 2020 • Zilong Zhong, Zhong Qiu Lin, Rene Bidart, Xiaodan Hu, Ibrahim Ben Daya, Zhifeng Li, Wei-Shi Zheng, Jonathan Li, Alexander Wong

The recent integration of attention mechanisms into segmentation networks improves their representational capabilities through a great emphasis on more informative features.

Ranked #6 on Semantic Segmentation on PASCAL VOC 2012 test

Segmentation Semantic Segmentation

AsyFOD: An Asymmetric Adaptation Paradigm for Few-Shot Domain Adaptive Object Detection

1 code implementation • CVPR 2023 • Yipeng Gao, Kun-Yu Lin, Junkai Yan, YaoWei Wang, Wei-Shi Zheng

Critically, in FSDAOD, the data-scarcity in the target domain leads to an extreme data imbalance between the source and target domains, which potentially causes over-adaptation in traditional feature alignment.

object-detection Object Detection

A Versatile Framework for Multi-scene Person Re-identification

1 code implementation • 17 Mar 2024 • Wei-Shi Zheng, Junkai Yan, Yi-Xing Peng

To overcome significant variations between images across camera views, mountains of variants of ReID models were developed for solving a number of challenges, such as resolution change, clothing change, occlusion, modality change, and so on.

Data Augmentation Person Re-Identification +1

Cross-view Asymmetric Metric Learning for Unsupervised Person Re-identification

1 code implementation • ICCV 2017 • Hong-Xing Yu, An-Cong Wu, Wei-Shi Zheng

While metric learning is important for Person re-identification (RE-ID), a significant problem in visual surveillance for cross-view pedestrian matching, existing metric models for RE-ID are mostly based on supervised learning that requires quantities of labeled samples in all pairs of camera views for training.

Ranked #117 on Person Re-Identification on Market-1501

Clustering Metric Learning +1

SIOD: Single Instance Annotated Per Category Per Image for Object Detection

1 code implementation • CVPR 2022 • Hanjun Li, Xingjia Pan, Ke Yan, Fan Tang, Wei-Shi Zheng

Object detection under imperfect data has received great attention recently.

Contrastive Learning Object +4

Unsupervised Person Re-identification by Deep Asymmetric Metric Embedding

1 code implementation • 29 Jan 2019 • Hong-Xing Yu, An-Cong Wu, Wei-Shi Zheng

In such a way, DECAMEL jointly learns the feature representation and the unsupervised asymmetric metric.

Clustering Deep Clustering +2

Estimator Meets Equilibrium Perspective: A Rectified Straight Through Estimator for Binary Neural Networks Training

1 code implementation • ICCV 2023 • Xiao-Ming Wu, Dian Zheng, Zuhao Liu, Wei-Shi Zheng

The pioneering work BinaryConnect uses Straight Through Estimator (STE) to mimic the gradients of the sign function, but it also causes the crucial inconsistency problem.

Binarization

Cross-Camera Feature Prediction for Intra-Camera Supervised Person Re-identification across Distant Scenes

1 code implementation • 29 Jul 2021 • Wenhang Ge, Chunyan Pan, AnCong Wu, Hongwei Zheng, Wei-Shi Zheng

To learn camera-invariant representation from cross-camera unpaired training data, we propose a cross-camera feature prediction method to mine cross-camera self-supervision information from camera-specific feature distribution by transforming fake cross-camera positive feature pairs and minimizing the distances of the fake pairs.

Person Re-Identification

ASAG: Building Strong One-Decoder-Layer Sparse Detectors via Adaptive Sparse Anchor Generation

1 code implementation • ICCV 2023 • Shenghao Fu, Junkai Yan, Yipeng Gao, Xiaohua Xie, Wei-Shi Zheng

We find that the architecture discrepancy between dense and sparse detectors leads to feature conflict, hampering the performance of one-decoder-layer detectors.

Learning Multi-Attention Context Graph for Group-Based Re-Identification

1 code implementation • 29 Apr 2021 • Yichao Yan, Jie Qin, Bingbing Ni, Jiaxin Chen, Li Liu, Fan Zhu, Wei-Shi Zheng, Xiaokang Yang, Ling Shao

Extensive experiments on the novel dataset as well as three existing datasets clearly demonstrate the effectiveness of the proposed framework for both group-based re-id tasks.

Person Re-Identification

Sculpting Holistic 3D Representation in Contrastive Language-Image-3D Pre-training

1 code implementation • 3 Nov 2023 • Yipeng Gao, Zeyu Wang, Wei-Shi Zheng, Cihang Xie, Yuyin Zhou

Contrastive learning has emerged as a promising paradigm for 3D open-world understanding, i.e., aligning point cloud representation to image and text embedding space individually.

Ranked #1 on Zero-shot 3D classification on Objaverse LVIS (using extra training data)

Contrastive Learning Retrieval +3

Hybrid Dynamic-static Context-aware Attention Network for Action Assessment in Long Videos

2 code implementations • 13 Aug 2020 • Ling-An Zeng, Fa-Ting Hong, Wei-Shi Zheng, Qi-Zhi Yu, Wei Zeng, Yao-Wei Wang, Jian-Huang Lai

However, most existing works focus only on video dynamic information (i.e., motion information) but ignore the specific postures that an athlete is performing in a video, which is important for action assessment in long videos.

Ranked #2 on Action Quality Assessment on Rhythmic Gymnastic

Action Assessment Action Quality Assessment

Learning to Imagine: Diversify Memory for Incremental Learning using Unlabeled Data

1 code implementation • CVPR 2022 • Yu-Ming Tang, Yi-Xing Peng, Wei-Shi Zheng

The diverse generated samples could effectively prevent DNN from forgetting when learning new tasks.

Contrastive Learning Incremental Learning

MIXGAN: Learning Concepts from Different Domains for Mixture Generation

1 code implementation • 4 Jul 2018 • Guang-Yuan Hao, Hong-Xing Yu, Wei-Shi Zheng

In this work, we present an interesting attempt at mixture generation: absorbing different image concepts (e.g., content and style) from different domains and thus generating a new domain with learned concepts.

Generative Adversarial Network Translation

When Prompt-based Incremental Learning Does Not Meet Strong Pretraining

1 code implementation • ICCV 2023 • Yu-Ming Tang, Yi-Xing Peng, Wei-Shi Zheng

However, existing prompt-based methods heavily rely on strong pretraining (typically trained on ImageNet-21k), and we find that their models could be trapped if the potential gap between the pretraining task and unknown future tasks is large.

Incremental Learning Retrieval

Rethinking CLIP-based Video Learners in Cross-Domain Open-Vocabulary Action Recognition

1 code implementation • 3 Mar 2024 • Kun-Yu Lin, Henghui Ding, Jiaming Zhou, Yi-Xing Peng, Zhilin Zhao, Chen Change Loy, Wei-Shi Zheng

To answer this, we establish a CROSS-domain Open-Vocabulary Action recognition benchmark named XOV-Action, and conduct a comprehensive evaluation of five state-of-the-art CLIP-based video learners under various types of domain gaps.

Open Vocabulary Action Recognition

Human Co-Parsing Guided Alignment for Occluded Person Re-identification

1 code implementation • IEEE Transactions on Image Processing 2022 • Shuguang Dou, Cairong Zhao, Xinyang Jiang, Shanshan Zhang, Wei-Shi Zheng, WangMeng Zuo

Most supervised methods propose to train an extra human parsing model aside from the ReID model with cross-domain human parts annotation, suffering from expensive annotation cost and domain gap; unsupervised methods integrate a feature clustering-based human parsing process into the ReID model, but lacking supervision signals brings less satisfactory segmentation results.

Ranked #3 on Person Re-Identification on Occluded-DukeMTMC

Human Parsing Person Re-Identification

Person Re-identification by Contour Sketch under Moderate Clothing Change

2 code implementations • 6 Feb 2020 • Qize Yang, An-Cong Wu, Wei-Shi Zheng

Substantial development of re-id has recently been observed, and the majority of existing models are largely dependent on color appearance and assume that pedestrians do not change their clothes across camera views.

Person Re-Identification

Discriminator-Free Generative Adversarial Attack

1 code implementation • 20 Jul 2021 • ShaoHao Lu, Yuqiao Xian, Ke Yan, Yi Hu, Xing Sun, Xiaowei Guo, Feiyue Huang, Wei-Shi Zheng

Deep Neural Networks are vulnerable to adversarial examples (Figure 1), which make DNN-based systems collapse by adding inconspicuous perturbations to the images.

Adversarial Attack Disentanglement

NECA: Neural Customizable Human Avatar

1 code implementation • 15 Mar 2024 • Junjin Xiao, Qing Zhang, Zhan Xu, Wei-Shi Zheng

The core of our approach is to represent humans in complementary dual spaces and predict disentangled neural fields of geometry, albedo, and shadow, as well as external lighting, from which we are able to derive realistic rendering with high-frequency details via volumetric rendering.

Combined Depth Space based Architecture Search For Person Re-identification

1 code implementation • CVPR 2021 • Hanjun Li, Gaojie Wu, Wei-Shi Zheng

We propose a novel search space called Combined Depth Space (CDS), based on which we search for an efficient network architecture, which we call CDNet, via a differentiable architecture search algorithm.

Image Classification Person Re-Identification

Adaptive Interaction Modeling via Graph Operations Search

1 code implementation • CVPR 2020 • Haoxin Li, Wei-Shi Zheng, Yu Tao, Haifeng Hu, Jian-Huang Lai

We propose to search the network structures with differentiable architecture search mechanism, which learns to construct adaptive structures for different videos to facilitate adaptive interaction modeling.

Action Analysis

Cross-Camera Trajectories Help Person Retrieval in a Camera Network

1 code implementation • 27 Apr 2022 • Xin Zhang, Xiaohua Xie, JianHuang Lai, Wei-Shi Zheng

To address this issue, we propose a pedestrian retrieval framework based on cross-camera trajectory generation, which integrates both temporal and spatial information.

Person Retrieval Re-Ranking +1

SNN2ANN: A Fast and Memory-Efficient Training Framework for Spiking Neural Networks

1 code implementation • 19 Jun 2022 • Jianxiong Tang, JianHuang Lai, Xiaohua Xie, Lingxiao Yang, Wei-Shi Zheng

The SNN2ANN consists of 2 components: a) a weight sharing architecture between ANN and SNN and b) spiking mapping units.

Task-oriented Self-supervised Learning for Anomaly Detection in Electroencephalography

1 code implementation • 4 Jul 2022 • Yaojia Zheng, Zhouwu Liu, Rong Mo, Ziyi Chen, Wei-Shi Zheng, Ruixuan Wang

Compared to supervised learning with labelled disease EEG data which can train a model to analyze specific diseases but would fail to monitor previously unseen statuses, anomaly detection based on only normal EEGs can detect any potential anomaly in new EEGs.

Anomaly Detection EEG +1

Weakly Supervised Text-Based Person Re-Identification

1 code implementation • ICCV 2021 • Shizhen Zhao, Changxin Gao, Yuanjie Shao, Wei-Shi Zheng, Nong Sang

Specifically, to alleviate the intra-class variations, a clustering method is utilized to generate pseudo labels for both visual and textual instances.

Clustering Person Re-Identification +1

Multimodal Action Quality Assessment

1 code implementation • 31 Jan 2024 • Ling-An Zeng, Wei-Shi Zheng

To leverage multimodal information for AQA, i.e., RGB, optical flow and audio information, we propose a Progressive Adaptive Multimodal Fusion Network (PAMFN) that separately models modality-specific information and mixed-modality information.

Action Quality Assessment Optical Flow Estimation

Learning to Detect Important People in Unlabelled Images for Semi-supervised Important People Detection

1 code implementation • CVPR 2020 • Fa-Ting Hong, Wei-Hong Li, Wei-Shi Zheng

Important people detection is to automatically detect the individuals who play the most important roles in a social event image, which requires the designed model to understand a high-level pattern.

Object Recognition Pseudo Label

Revisit PCA-based Technique for Out-of-Distribution Detection

1 code implementation • ICCV 2023 • Xiaoyuan Guan, Zhouwu Liu, Wei-Shi Zheng, Yuren Zhou, Ruixuan Wang

Out-of-distribution (OOD) detection is a desired ability to ensure the reliability and safety of intelligent systems.

Out-of-Distribution Detection Out of Distribution (OOD) Detection

Efficient and Effective Weakly-Supervised Action Segmentation via Action-Transition-Aware Boundary Alignment

1 code implementation • 28 Mar 2024 • Angchi Xu, Wei-Shi Zheng

Weakly-supervised action segmentation is a task of learning to partition a long video into several action segments, where training videos are only accompanied by transcripts (ordered list of actions).

Action Segmentation Segmentation

Long-Term Human Motion Prediction by Modeling Motion Context and Enhancing Motion Dynamic

no code implementations • 7 May 2018 • Yongyi Tang, Lin Ma, Wei Liu, Wei-Shi Zheng

Human motion prediction aims at generating future frames of human motion based on an observed sequence of skeletons.

Human motion prediction motion prediction

A Coordinate-wise Optimization Algorithm for Sparse Inverse Covariance Selection

no code implementations • 19 Nov 2017 • Ganzhao Yuan, Haoxian Tan, Wei-Shi Zheng

Sparse inverse covariance selection is a fundamental problem for analyzing dependencies in high dimensional data.

Deep CNNs for HEp-2 Cells Classification: A Cross-specimen Analysis

no code implementations • 20 Apr 2016 • Hongwei Li, Wei-Shi Zheng, JianGuo Zhang

Automatic classification of Human Epithelial Type-2 (HEp-2) cells staining patterns is an important and yet a challenging problem.

Classification General Classification

Adversarial Attribute-Image Person Re-identification

no code implementations • 5 Dec 2017 • Zhou Yin, Wei-Shi Zheng, An-Cong Wu, Hong-Xing Yu, Hai Wan, Xiaowei Guo, Feiyue Huang, Jian-Huang Lai

While attributes have been widely used for person re-identification (Re-ID) which aims at matching the same person images across disjoint camera views, they are used either as extra features or for performing multi-task learning to assist the image-image matching task.

Attribute Multi-Task Learning +1

One-pass Person Re-identification by Sketch Online Discriminant Analysis

no code implementations • 9 Nov 2017 • Wei-Hong Li, Zhuowei Zhong, Wei-Shi Zheng

While there are a few works discussing online re-id, most of them require considerable storage for all data samples that have ever been observed, and this could be unrealistic for processing data from a large camera network.

Person Re-Identification

PersonRank: Detecting Important People in Images

no code implementations • 6 Nov 2017 • Wei-Hong Li, Benchao Li, Wei-Shi Zheng

In events such as a presentation, basketball game, or speech, some individuals in images are always more important/attractive than others.

Latent Embeddings for Collective Activity Recognition

no code implementations • 20 Sep 2017 • Yongyi Tang, Peizhen Zhang, Jian-Fang Hu, Wei-Shi Zheng

Rather than simply recognizing the action of a person individually, collective activity recognition aims to find out what a group of people is doing in a collective scene.

Activity Recognition

Online Hashing

no code implementations • 6 Apr 2017 • Long-Kai Huang, Qiang Yang, Wei-Shi Zheng

Specifically, a new loss function is proposed to measure the similarity loss between a pair of data samples in hamming space.

Robust Depth-based Person Re-identification

no code implementations • 28 Mar 2017 • Ancong Wu, Wei-Shi Zheng, Jian-Huang Lai

More specifically, we exploit depth voxel covariance descriptor and further propose a locally rotation invariant depth shape descriptor called Eigen-depth feature to describe pedestrian body shape.

Person Re-Identification

Person Re-Identification by Camera Correlation Aware Feature Augmentation

no code implementations • 26 Mar 2017 • Ying-Cong Chen, Xiatian Zhu, Wei-Shi Zheng, Jian-Huang Lai

The challenge of person re-identification (re-id) is to match individual images of the same person captured by different non-overlapping camera views against significant and unknown cross-view feature distortion.

Person Re-Identification

Embedding Deep Metric for Person Re-identification: A Study Against Large Variations

no code implementations • 1 Nov 2016 • Hailin Shi, Yang Yang, Xiangyu Zhu, Shengcai Liao, Zhen Lei, Wei-Shi Zheng, Stan Z. Li

From this point of view, selecting suitable positive (i.e. intra-class) training samples within a local range is critical for training the CNN embedding, especially when the data has large intra-class variations.

Person Re-Identification

Top-push Video-based Person Re-identification

no code implementations • CVPR 2016 • Jin-Jie You, An-Cong Wu, Xiang Li, Wei-Shi Zheng

Since only limited information can be exploited from still images, it is hard (if not impossible) to overcome the occlusion, pose and camera-view change, and lighting variation problems.

Video-Based Person Re-Identification

An Enhanced Deep Feature Representation for Person Re-identification

no code implementations • 26 Apr 2016 • Shangxuan Wu, Ying-Cong Chen, Xiang Li, An-Cong Wu, Jin-Jie You, Wei-Shi Zheng

In this paper, we focus on the feature representation and claim that hand-crafted histogram features can be complementary to Convolutional Neural Network (CNN) features.

Metric Learning Person Re-Identification

Human Re-identification by Matching Compositional Template with Cluster Sampling

no code implementations • 1 Feb 2015 • Yuanlu Xu, Liang Lin, Wei-Shi Zheng, Xiaobai Liu

This paper aims at a newly arising task in visual surveillance: re-identifying people at a distance by matching body information, given several reference examples.

Person Re-Identification

Adversarial Open-World Person Re-Identification

no code implementations • ECCV 2018 • Xiang Li, An-Cong Wu, Wei-Shi Zheng

The main idea is learning to attack feature extractor on the target people by using GAN to generate very target-like images (imposters), and in the meantime the model will make the feature extractor learn to tolerate the attack by discriminative learning so as to realize group-based verification.

Person Re-Identification

Improving Fast Segmentation With Teacher-student Learning

no code implementations • 19 Oct 2018 • Jiafeng Xie, Bing Shuai, Jian-Fang Hu, Jingyang Lin, Wei-Shi Zheng

Recently, segmentation neural networks have been significantly improved by demonstrating very promising accuracies on public benchmarks.

Segmentation

Accelerating Large Scale Knowledge Distillation via Dynamic Importance Sampling

no code implementations • 3 Dec 2018 • Minghan Li, Tanli Zuo, Ruicheng Li, Martha White, Wei-Shi Zheng

Knowledge distillation is an effective technique that transfers knowledge from a large teacher model to a shallow student.

Knowledge Distillation Machine Translation +2

Group-Attention Single-Shot Detector (GA-SSD): Finding Pulmonary Nodules in Large-Scale CT Images

no code implementations • 18 Dec 2018 • Jiechao Ma, Xiang Li, Hongwei Li, Bjoern H. Menze, Sen Liang, Rongguo Zhang, Wei-Shi Zheng

In this paper, we propose a novel and effective abnormality detector implementing the attention mechanism and group convolution on 3D single-shot detector (SSD) called group-attention SSD (GA-SSD).

Computed Tomography (CT) Finding Pulmonary Nodules In Large-Scale Ct Images

Deep Bilinear Learning for RGB-D Action Recognition

no code implementations • ECCV 2018 • Jian-Fang Hu, Wei-Shi Zheng, Jia-Hui Pan, Jian-Huang Lai, Jian-Guo Zhang

In this paper, we focus on exploring modality-temporal mutual information for RGB-D action recognition.

Action Recognition Temporal Action Localization

A Matrix Splitting Method for Composite Function Minimization

no code implementations • CVPR 2017 • Ganzhao Yuan, Wei-Shi Zheng, Bernard Ghanem

Incorporating a new Gaussian elimination procedure, the matrix splitting method achieves state-of-the-art performance.

Multi-Scale Learning for Low-Resolution Person Re-Identification

no code implementations • ICCV 2015 • Xiang Li, Wei-Shi Zheng, Xiaojuan Wang, Tao Xiang, Shaogang Gong

In real-world person re-identification (re-id), images of people captured at very different resolutions from different locations need to be matched.

Person Re-Identification

Partial Person Re-Identification

no code implementations • ICCV 2015 • Wei-Shi Zheng, Xiang Li, Tao Xiang, Shengcai Liao, Jian-Huang Lai, Shaogang Gong

We address a new partial person re-identification (re-id) problem, where only a partial observation of a person is available for matching across different non-overlapping camera views.

Person Re-Identification

RGB-Infrared Cross-Modality Person Re-Identification

no code implementations • ICCV 2017 • Ancong Wu, Wei-Shi Zheng, Hong-Xing Yu, Shaogang Gong, Jian-Huang Lai

To that end, matching RGB images with infrared images is required, which are heterogeneous with very different visual characteristics.

Ranked #4 on Cross-Modal Person Re-Identification on SYSU-MM01 (mAP (All-search & Single-shot) metric)

Cross-Modality Person Re-identification Cross-Modal Person Re-Identification

A Decomposition Algorithm for the Sparse Generalized Eigenvalue Problem

no code implementations • CVPR 2019 • Ganzhao Yuan, Li Shen, Wei-Shi Zheng

The sparse generalized eigenvalue problem arises in a number of standard and modern statistical learning models, including sparse principal component analysis, sparse Fisher discriminant analysis, and sparse canonical correlation analysis.

Numerical Analysis

Pedestrian re-identification based on Tree branch network with local and global learning

no code implementations • 31 Mar 2019 • Hui Li, Meng Yang, Zhihui Lai, Wei-Shi Zheng, Zitong Yu

Deep part-based methods in recent literature have revealed the great potential of learning local part-level representation for pedestrian image in the task of person re-identification.

Person Re-Identification

Weakly Supervised Person Re-Identification

no code implementations • CVPR 2019 • Jingke Meng, Sheng Wu, Wei-Shi Zheng

In the conventional person re-id setting, it is assumed that the labeled images are the person images within the bounding box for each individual; this labeling across multiple nonoverlapping camera views from raw video surveillance is costly and time-consuming.

Multi-Label Learning Person Re-Identification

Paper
Add Code

Learning to Learn Relation for Important People Detection in Still Images

1 code implementation • CVPR 2019 • Wei-Hong Li, Fa-Ting Hong, Wei-Shi Zheng

In this work, we propose a deep imPOrtance relatIon NeTwork (POINT) that combines both relation modeling and feature learning.

Relation Relation Network

Paper
Code

A Large-scale Varying-view RGB-D Action Dataset for Arbitrary-view Human Action Recognition

no code implementations • 24 Apr 2019 • Yanli Ji, Feixiang Xu, Yang Yang, Fumin Shen, Heng Tao Shen, Wei-Shi Zheng

Besides, we propose a View-guided Skeleton CNN (VS-CNN) to tackle the problem of arbitrary-view action recognition.

Ranked #1 on Skeleton Based Action Recognition on Varying-view RGB-D Action-Skeleton

Action Analysis Action Recognition +2

Paper
Add Code

Weakly Supervised Open-set Domain Adaptation by Dual-domain Collaboration

no code implementations • CVPR 2019 • Shuhan Tan, Jiening Jiao, Wei-Shi Zheng

Thus, it is meaningful to let partially labeled domains learn from each other to classify all the unlabeled samples in each domain under an open-set setting.

Domain Adaptation Transfer Learning

Paper
Add Code

Towards Photo-Realistic Visible Watermark Removal with Conditional Generative Adversarial Networks

no code implementations • 30 May 2019 • Xiang Li, Chan Lu, Danni Cheng, Wei-Hong Li, Mei Cao, Bo Liu, Jiechao Ma, Wei-Shi Zheng

Visible watermark plays an important role in image copyright protection and the robustness of a visible watermark to an attack is shown to be essential.

Image-to-Image Translation

Paper
Add Code

Deep Dual Relation Modeling for Egocentric Interaction Recognition

no code implementations • CVPR 2019 • Haoxin Li, Yijun Cai, Wei-Shi Zheng

To exploit the strong relations for egocentric interaction recognition, we introduce a dual relation modeling framework which learns to model the relations between the camera wearer and the interactor based on the individual action representations of the two persons.

Relation

Paper
Add Code

Cross-view Relation Networks for Mammogram Mass Detection

no code implementations • 1 Jul 2019 • Jiechao Ma, Sen Liang, Xiang Li, Hongwei Li, Bjoern H. Menze, Rongguo Zhang, Wei-Shi Zheng

Mammogram is the most effective imaging modality for the mass lesion detection of breast cancer at the early stage.

Lesion Detection Relation

Paper
Add Code

Enhancing Underexposed Photos using Perceptually Bidirectional Similarity

no code implementations • 25 Jul 2019 • Qing Zhang, Yongwei Nie, Lei Zhu, Chunxia Xiao, Wei-Shi Zheng

To obtain high-quality results free of these artifacts, we present a novel underexposed photo enhancement approach that is able to maintain the perceptual consistency.

Video Enhancement

Paper
Add Code

Jointly learning heterogeneous features for rgb-d activity recognition

no code implementations • IEEE Transactions on Pattern Analysis and Machine Intelligence (Volume: 39, Issue: 11, Nov. 1 2017) 2016 • Jian-Fang Hu, Wei-Shi Zheng, Jian-Huang Lai, Jian-Guo Zhang

The proposed model formed in a unified framework is capable of: 1) jointly mining a set of subspaces with the same dimensionality to exploit latent shared features across different feature channels, 2) meanwhile, quantifying the shared and feature-specific components of features in the subspaces, and 3) transferring feature-specific intermediate transforms (i-transforms) for learning fusion of heterogeneous features across datasets.

Ranked #8 on Skeleton Based Action Recognition on SYSU 3D

Activity Recognition Benchmarking +3

Paper
Add Code

Early action prediction by soft regression

no code implementations • IEEE Transactions on Pattern Analysis and Machine Intelligence 2018 • Jian-Fang Hu, Wei-Shi Zheng, Lianyang Ma, Gang Wang, Jian-Huang Lai, Jian-Guo Zhang

Our formulation of soft regression framework 1) overcomes a usual assumption in existing early action prediction systems that the progress level of on-going sequence is given in the testing stage; and 2) presents a theoretical framework to better resolve the ambiguity and uncertainty of subsequences at early performing stage.

Ranked #70 on Skeleton Based Action Recognition on NTU RGB+D 120

Early Action Prediction regression +1

Paper
Add Code

DSRGAN: Explicitly Learning Disentangled Representation of Underlying Structure and Rendering for Image Generation without Tuple Supervision

no code implementations • 30 Sep 2019 • Guang-Yuan Hao, Hong-Xing Yu, Wei-Shi Zheng

We focus on explicitly learning disentangled representation for natural image generation, where the underlying spatial structure and the rendering on the structure can be independently controlled respectively, yet using no tuple supervision.

Image Generation

Paper
Add Code

Batch Face Alignment using a Low-rank GAN

no code implementations • 21 Oct 2019 • Jiabo Huang, Xiaohua Xie, Wei-Shi Zheng

This paper studies the problem of aligning a set of face images of the same individual into a normalized image while removing the outliers like partial occlusion, extreme facial expression as well as significant illumination variation.

Face Alignment Generative Adversarial Network

Paper
Add Code

Weakly Supervised Tracklet Person Re-Identification by Deep Feature-wise Mutual Learning

no code implementations • 31 Oct 2019 • Zhirui Chen, Jianheng Li, Wei-Shi Zheng

The scalability problem caused by the difficulty in annotating Person Re-identification (Re-ID) datasets has become a crucial bottleneck in the development of Re-ID. To address this problem, many unsupervised Re-ID methods have recently been proposed. Nevertheless, most of these models require transfer from another auxiliary fully supervised dataset, which is still expensive to obtain. In this work, we propose a Re-ID model based on Weakly Supervised Tracklets (WST) data from various camera views, which can be inexpensively acquired by combining the fragmented tracklets of the same person in the same camera view over a period of time. We formulate our weakly supervised tracklets Re-ID model by a novel method, named deep feature-wise mutual learning (DFML), which consists of Mutual Learning on Feature Extractors (MLFE) and Mutual Learning on Feature Classifiers (MLFC). We propose MLFE by leveraging two feature extractors to learn from each other to extract more robust and discriminative features. On the other hand, we propose MLFC by adapting discriminative features from various camera views to each classifier.

Person Re-Identification

Paper
Add Code

MINI-Net: Multiple Instance Ranking Network for Video Highlight Detection

no code implementations • ECCV 2020 • Fa-Ting Hong, Xuanteng Huang, Wei-Hong Li, Wei-Shi Zheng

We address the weakly supervised video highlight detection problem for learning to detect segments that are more attractive in training videos given their video event label but without expensive supervision of manually annotating highlight segments.

Highlight Detection

Paper
Add Code

An Asymmetric Modeling for Action Assessment

no code implementations • ECCV 2020 • Jibin Gao, Wei-Shi Zheng, Jia-Hui Pan, Chengying Gao, Yao-Wei Wang, Wei Zeng, Jian-Huang Lai

However, existing methods for action assessment are mostly limited to individual actions, especially lacking modeling of the asymmetric relations among agents (e.g., between persons and objects); and this limitation undermines their ability to assess actions containing asymmetrically interactive motion patterns, since there always exists subordination between agents in many interactive actions.

Action Assessment

Paper
Add Code

Contextual Heterogeneous Graph Network for Human-Object Interaction Detection

no code implementations • ECCV 2020 • Hai Wang, Wei-Shi Zheng, Ling Yingbiao

However, previous graph models regard humans and objects as the same kind of node and do not consider that the messages passed between different entities are not the same.

Graph Attention Human-Object Interaction Detection +1

Paper
Add Code

Towards Unbiased COVID-19 Lesion Localisation and Segmentation via Weakly Supervised Learning

1 code implementation • 1 Mar 2021 • Yang Yang, Jiancong Chen, Ruixuan Wang, Ting Ma, Lingwei Wang, Jie Chen, Wei-Shi Zheng, Tong Zhang

Despite tremendous efforts, it is very challenging to generate a robust model to assist in the accurate quantification assessment of COVID-19 on chest CT images.

Generative Adversarial Network Weakly-supervised Learning

Paper
Code

Preserving Earlier Knowledge in Continual Learning with the Help of All Previous Feature Extractors

no code implementations • 28 Apr 2021 • Zhuoyun Li, Changhong Zhong, Sijia Liu, Ruixuan Wang, Wei-Shi Zheng

In order to reduce the forgetting of particularly earlier learned old knowledge and improve the overall continual learning performance, we propose a simple yet effective fusion mechanism by including all the previously learned feature extractors into the intelligent model.

Continual Learning

Paper
Add Code

Augmented 2D-TAN: A Two-stage Approach for Human-centric Spatio-Temporal Video Grounding

no code implementations • 20 Jun 2021 • Chaolei Tan, Zihang Lin, Jian-Fang Hu, Xiang Li, Wei-Shi Zheng

We propose an effective two-stage approach to tackle the problem of language-based Human-centric Spatio-Temporal Video Grounding (HC-STVG) task.

Spatio-Temporal Video Grounding Video Grounding

Paper
Add Code

Fine-Grained Shape-Appearance Mutual Learning for Cloth-Changing Person Re-Identification

no code implementations • CVPR 2021 • Peixian Hong, Tao Wu, AnCong Wu, Xintong Han, Wei-Shi Zheng

Recently, person re-identification (Re-ID) has achieved great progress.

Ranked #3 on Person Re-Identification on PRCC

Cloth-Changing Person Re-Identification

Paper
Add Code

Graph-Based High-Order Relation Modeling for Long-Term Action Recognition

no code implementations • CVPR 2021 • Jiaming Zhou, Kun-Yu Lin, Haoxin Li, Wei-Shi Zheng

In this paper, we propose a Graph-based High-order Relation Modeling (GHRM) module to exploit the high-order relations in the long-term actions for long-term action recognition.

Ranked #5 on Video Classification on Breakfast

Action Recognition Long-video Activity Recognition +3

Paper
Add Code

Discriminative Distillation to Reduce Class Confusion in Continual Learning

no code implementations • 11 Aug 2021 • Changhong Zhong, Zhiying Cui, Ruixuan Wang, Wei-Shi Zheng

Successful continual learning of new knowledge would enable intelligent systems to recognize more and more classes of objects.

Continual Learning Image Classification

Paper
Add Code

Understanding of Kernels in CNN Models by Suppressing Irrelevant Visual Features in Images

1 code implementation • 25 Aug 2021 • Jia-Xin Zhuang, Wanying Tao, Jianfei Xing, Wei Shi, Ruixuan Wang, Wei-Shi Zheng

In this paper, a simple yet effective optimization method is proposed to interpret the activation of any kernel of interest in CNN models.

Paper
Code

Learning To Know Where To See: A Visibility-Aware Approach for Occluded Person Re-Identification

no code implementations • ICCV 2021 • Jinrui Yang, Jiawei Zhang, Fufu Yu, Xinyang Jiang, Mengdan Zhang, Xing Sun, Ying-Cong Chen, Wei-Shi Zheng

Several mainstream methods utilize extra cues (e.g., human pose information) to distinguish human parts from obstacles to alleviate the occlusion problem.

Person Re-Identification

Paper
Add Code

Predictive Feature Learning for Future Segmentation Prediction

no code implementations • ICCV 2021 • Zihang Lin, Jiangxin Sun, Jian-Fang Hu, QiZhi Yu, Jian-Huang Lai, Wei-Shi Zheng

In the latent feature learned by the autoencoder, global structures are enhanced and local details are suppressed so that it is more predictive.

Segmentation

Paper
Add Code

Action-guided 3D Human Motion Prediction

no code implementations • NeurIPS 2021 • Jiangxin Sun, Zihang Lin, Xintong Han, Jian-Fang Hu, Jia Xu, Wei-Shi Zheng

The ability of forecasting future human motion is important for human-machine interaction systems to understand human behaviors and make interaction.

Human motion prediction motion prediction

Paper
Add Code

Letter-level Online Writer Identification

no code implementations • 6 Dec 2021 • Zelin Chen, Hong-Xing Yu, AnCong Wu, Wei-Shi Zheng

To make the application of writer-id more practical (e.g., on mobile devices), we focus on a novel problem, letter-level online writer-id, which requires only a few trajectories of written letters as identification cues.

Paper
Add Code

Unpaired Image Captioning by Image-level Weakly-Supervised Visual Concept Recognition

no code implementations • 7 Mar 2022 • Peipei Zhu, Xiao Wang, Yong Luo, Zhenglong Sun, Wei-Shi Zheng, YaoWei Wang, Changwen Chen

The image-level labels are utilized to train a weakly-supervised object recognition model to extract object information (e.g., instance) in an image, and the extracted instances are adopted to infer the relationships among different objects based on an enhanced graph neural network (GNN).

Image Captioning Object +1

Paper
Add Code

Continual Learning with Bayesian Model based on a Fixed Pre-trained Feature Extractor

no code implementations • 28 Apr 2022 • Yang Yang, Zhiying Cui, Junjie Xu, Changhong Zhong, Wei-Shi Zheng, Ruixuan Wang

In this case, updating the intelligent system with data of new diseases would inevitably downgrade its performance on previously learned diseases.

Class Incremental Learning Image Classification +1

Paper
Add Code

Likert Scoring With Grade Decoupling for Long-Term Action Assessment

no code implementations • CVPR 2022 • Angchi Xu, Ling-An Zeng, Wei-Shi Zheng

Long-term action quality assessment is a task of evaluating how well an action is performed, namely, estimating a quality score from a long video.

Ranked #1 on Action Quality Assessment on Rhythmic Gymnastic

Action Assessment Action Quality Assessment

Paper
Add Code

Weakly-Supervised Temporal Action Localization by Progressive Complementary Learning

1 code implementation • 22 Jun 2022 • Jia-Run Du, Jia-Chang Feng, Kun-Yu Lin, Fa-Ting Hong, Xiao-Ming Wu, Zhongang Qi, Ying Shan, Wei-Shi Zheng

Accordingly, we first exclude these surely non-existent categories by a complementary learning loss.

Multiple Instance Learning Representation Learning +3

Paper
Code

STVGFormer: Spatio-Temporal Video Grounding with Static-Dynamic Cross-Modal Understanding

no code implementations • 6 Jul 2022 • Zihang Lin, Chaolei Tan, Jian-Fang Hu, Zhi Jin, Tiancai Ye, Wei-Shi Zheng

The static branch performs cross-modal understanding in a single frame and learns to localize the target object spatially according to intra-frame visual cues like object appearances.

Ranked #2 on Spatio-Temporal Video Grounding on HC-STVG2

Spatio-Temporal Video Grounding Video Grounding

Paper
Add Code

PCCT: Progressive Class-Center Triplet Loss for Imbalanced Medical Image Classification

no code implementations • 11 Jul 2022 • Kanghao Chen, Weixian Lei, Rong Zhang, Shen Zhao, Wei-Shi Zheng, Ruixuan Wang

For the class-center involved triplet loss, the positive and negative samples in each triplet are replaced by their corresponding class centers, which enforces data representations of the same class closer to the class center.

Image Classification Medical Image Classification

Paper
Add Code

Learning Discriminative Representation via Metric Learning for Imbalanced Medical Image Classification

no code implementations • 14 Jul 2022 • Chenghua Zeng, Huijuan Lu, Kanghao Chen, Ruixuan Wang, Wei-Shi Zheng

Data imbalance between common and rare diseases during model training often causes intelligent diagnosis systems to have biased predictions towards common diseases.

Image Classification Medical Image Classification +1

Paper
Add Code

Text-Adaptive Multiple Visual Prototype Matching for Video-Text Retrieval

no code implementations • 27 Sep 2022 • Chengzhi Lin, AnCong Wu, Junwei Liang, Jun Zhang, Wenhang Ge, Wei-Shi Zheng, Chunhua Shen

To address this problem, we propose a Text-Adaptive Multiple Visual Prototype Matching model, which automatically captures multiple prototypes to describe a video by adaptive aggregation of video token features.

Cross-Modal Retrieval Retrieval +2

Paper
Add Code

Adaptively Integrated Knowledge Distillation and Prediction Uncertainty for Continual Learning

no code implementations • 18 Jan 2023 • Kanghao Chen, Sijia Liu, Ruixuan Wang, Wei-Shi Zheng

The first one is to adaptively integrate multiple levels of old knowledge and transfer it to each block level in the new model.

Continual Learning Knowledge Distillation

Paper
Add Code

PSLT: A Light-weight Vision Transformer with Ladder Self-Attention and Progressive Shift

no code implementations • 7 Apr 2023 • Gaojie Wu, Wei-Shi Zheng, Yutong Lu, Qi Tian

In this work, we propose a ladder self-attention block with multiple branches and a progressive shift mechanism to develop a light-weight transformer backbone that requires less computing resources (e.g., a relatively small number of parameters and FLOPs), termed Progressive Shift Ladder Transformer (PSLT).

Image Classification Person Re-Identification

Paper
Add Code

Pyramid Texture Filtering

no code implementations • 11 May 2023 • Qing Zhang, Hao Jiang, Yongwei Nie, Wei-Shi Zheng

We present a simple but effective technique to smooth out textures while preserving the prominent structures.

Image Enhancement Tone Mapping

Paper
Add Code

Collaborative Static and Dynamic Vision-Language Streams for Spatio-Temporal Video Grounding

no code implementations • CVPR 2023 • Zihang Lin, Chaolei Tan, Jian-Fang Hu, Zhi Jin, Tiancai Ye, Wei-Shi Zheng

The static stream performs cross-modal understanding in a single frame and learns to attend to the target object spatially according to intra-frame visual cues like object appearances.

Object Spatio-Temporal Video Grounding +1

Paper
Add Code

Generating Anomalies for Video Anomaly Detection With Prompt-Based Feature Mapping

no code implementations • CVPR 2023 • Zuhao Liu, Xiao-Ming Wu, Dian Zheng, Kun-Yu Lin, Wei-Shi Zheng

There also exists a scene gap between virtual and real scenarios, including scene-specific anomalies (events that are abnormal in one scene but normal in another) and scene-specific attributes, such as the viewpoint of the surveillance camera.

Anomaly Detection In Surveillance Videos Video Anomaly Detection

Paper
Add Code

Hierarchical Semantic Correspondence Networks for Video Paragraph Grounding

no code implementations • CVPR 2023 • Chaolei Tan, Zihang Lin, Jian-Fang Hu, Wei-Shi Zheng, JianHuang Lai

Specifically, we develop a hierarchical encoder that encodes the multi-modal inputs into semantics-aligned representations at different levels.

Sentence Video Grounding

Paper
Add Code

Class Attention to Regions of Lesion for Imbalanced Medical Image Recognition

no code implementations • 19 Jul 2023 • Jia-Xin Zhuang, Jiabin Cai, JianGuo Zhang, Wei-Shi Zheng, Ruixuan Wang

The CARE framework needs bounding boxes to represent the lesion regions of rare diseases.

Image Classification Medical Image Classification

Paper
Add Code

Event-Guided Procedure Planning from Instructional Videos with Text Supervision

no code implementations • ICCV 2023 • An-Lan Wang, Kun-Yu Lin, Jia-Run Du, Jingke Meng, Wei-Shi Zheng

In this work, we focus on the task of procedure planning from instructional videos with text supervision, where a model aims to predict an action sequence to transform the initial visual state into the goal visual state.

Paper
Add Code

DiffuVolume: Diffusion Model for Volume based Stereo Matching

no code implementations • 30 Aug 2023 • Dian Zheng, Xiao-Ming Wu, Zuhao Liu, Jingke Meng, Wei-Shi Zheng

Our method, termed DiffuVolume, considers the diffusion model as a cost volume filter, which will recurrently remove the redundant information from the cost volume.

Stereo Matching Zero-shot Generalization

Paper
Add Code

Continual Action Assessment via Task-Consistent Score-Discriminative Feature Distribution Modeling

no code implementations • 29 Sep 2023 • Yuan-Ming Li, Ling-An Zeng, Jing-Ke Meng, Wei-Shi Zheng

Our idea for modeling Continual-AQA is to sequentially learn a task-consistent score-discriminative feature distribution, in which the latent features express a strong correlation with the score labels regardless of the task or action types.

Action Assessment Action Quality Assessment +1

Paper
Add Code

Cross-Modal Adaptive Dual Association for Text-to-Image Person Retrieval

no code implementations • 4 Dec 2023 • Dixuan Lin, Yixing Peng, Jingke Meng, Wei-Shi Zheng

In this work, we show the discrepancy between image-to-text association and text-to-image association and propose CADA: Cross-Modal Adaptive Dual Association that finely builds bidirectional image-text detailed associations.

Ranked #1 on Text based Person Retrieval on RSTPReid (mAP metric)

Attribute Cross-Modal Person Re-Identification +5

Paper
Add Code

Transformer for Object Re-Identification: A Survey

no code implementations • 13 Jan 2024 • Mang Ye, Shuoyi Chen, Chenyue Li, Wei-Shi Zheng, David Crandall, Bo Du

Object Re-Identification (Re-ID) aims to identify and retrieve specific objects from varying viewpoints.

Object

Paper
Add Code

ActionHub: A Large-scale Action Video Description Dataset for Zero-shot Action Recognition

no code implementations • 22 Jan 2024 • Jiaming Zhou, Junwei Liang, Kun-Yu Lin, Jinrui Yang, Wei-Shi Zheng

With the proposed ActionHub dataset, we further propose a novel Cross-modality and Cross-action Modeling (CoCo) framework for ZSAR, which consists of a Dual Cross-modality Alignment module and a Cross-action Invariance Mining module.

Action Recognition Video Description +1

Paper
Add Code

Siamese Learning with Joint Alignment and Regression for Weakly-Supervised Video Paragraph Grounding

no code implementations • 18 Mar 2024 • Chaolei Tan, JianHuang Lai, Wei-Shi Zheng, Jian-Fang Hu

Different from previous weakly-supervised grounding frameworks based on multiple instance learning or reconstruction learning for two-stage candidate ranking, we propose a novel siamese learning framework that jointly learns the cross-modal feature alignment and temporal coordinate regression without timestamp labels to achieve concise one-stage localization for WSVPG.

Multiple Instance Learning

Paper
Add Code

Ranking Distillation for Open-Ended Video Question Answering with Insufficient Labels

no code implementations • 21 Mar 2024 • Tianming Liang, Chaolei Tan, Beihao Xia, Wei-Shi Zheng, Jian-Fang Hu

This paper focuses on open-ended video question answering, which aims to find the correct answers from a large answer set in response to a video-related question.

Multi-Label Classification Question Answering +1

Paper
Add Code
