Search Results for author: Wei-Shi Zheng

Found 125 papers, 54 papers with code

Dual Illumination Estimation for Robust Exposure Correction

2 code implementations 30 Oct 2019 Qing Zhang, Yongwei Nie, Wei-Shi Zheng

By performing dual illumination estimation, we obtain two intermediate exposure correction results for the input image, one of which fixes the underexposed regions while the other restores the overexposed regions.

Multi-Exposure Image Fusion

DreamView: Injecting View-specific Text Guidance into Text-to-3D Generation

2 code implementations 9 Apr 2024 Junkai Yan, Yipeng Gao, Qize Yang, Xihan Wei, Xuansong Xie, AnCong Wu, Wei-Shi Zheng

Text-to-3D generation, which synthesizes 3D assets according to an overall text description, has significantly progressed.

3D Generation Text to 3D

DilateFormer: Multi-Scale Dilated Transformer for Visual Recognition

1 code implementation 3 Feb 2023 Jiayu Jiao, Yu-Ming Tang, Kun-Yu Lin, Yipeng Gao, Jinhua Ma, YaoWei Wang, Wei-Shi Zheng

In this work, we explore effective Vision Transformers to pursue a preferable trade-off between the computational complexity and size of the attended receptive field.

Instance Segmentation object-detection +2

Rethinking Temporal Fusion for Video-based Person Re-identification on Semantic and Time Aspect

2 code implementations 28 Nov 2019 Xinyang Jiang, Yifei Gong, Xiaowei Guo, Qize Yang, Feiyue Huang, Wei-Shi Zheng, Feng Zheng, Xing Sun

Recently, the research interest of person re-identification (ReID) has gradually turned to video-based methods, which acquire a person representation by aggregating frame features of an entire video.

Video-Based Person Re-Identification

Devil's in the Details: Aligning Visual Clues for Conditional Embedding in Person Re-Identification

1 code implementation 11 Sep 2020 Fufu Yu, Xinyang Jiang, Yifei Gong, Shizhen Zhao, Xiaowei Guo, Wei-Shi Zheng, Feng Zheng, Xing Sun

Secondly, the Conditional Feature Embedding requires the overall feature of a query image to be dynamically adjusted based on the gallery image it matches, while most of the existing methods ignore the reference images.

Person Re-Identification

One for More: Selecting Generalizable Samples for Generalizable ReID Model

1 code implementation 10 Dec 2020 Enwei Zhang, Xinyang Jiang, Hao Cheng, AnCong Wu, Fufu Yu, Ke Li, Xiaowei Guo, Feng Zheng, Wei-Shi Zheng, Xing Sun

Current training objectives of existing person Re-IDentification (ReID) models only ensure that the loss of the model decreases on the selected training batch, with no regard to the performance on samples outside the batch.

Person Re-Identification

Learning 3D Shape Feature for Texture-Insensitive Person Re-Identification

1 code implementation CVPR 2021 Jiaxing Chen, Xinyang Jiang, Fudong Wang, Jun Zhang, Feng Zheng, Xing Sun, Wei-Shi Zheng

In this paper, rather than relying on texture based information, we propose to improve the robustness of person ReID against clothing texture by exploiting the information of a person's 3D shape.

3D Reconstruction Person Re-Identification

AcroFOD: An Adaptive Method for Cross-domain Few-shot Object Detection

1 code implementation 22 Sep 2022 Yipeng Gao, Lingxiao Yang, Yunmu Huang, Song Xie, Shiyong Li, Wei-Shi Zheng

Under the domain shift, cross-domain few-shot object detection aims to adapt object detectors in the target domain with a few annotated target data.

Cross-Domain Few-Shot Data Augmentation +2

Fully Convolutional Network Ensembles for White Matter Hyperintensities Segmentation in MR Images

2 code implementations 14 Feb 2018 Hongwei Li, Gongfa Jiang, Jian-Guo Zhang, Ruixuan Wang, Zhaolei Wang, Wei-Shi Zheng, Bjoern Menze

In this paper, we present a study using deep fully convolutional network and ensemble models to automatically detect such WMH using fluid attenuation inversion recovery (FLAIR) and T1 magnetic resonance (MR) scans.

Data Augmentation

Cross-modal Consensus Network for Weakly Supervised Temporal Action Localization

2 code implementations 27 Jul 2021 Fa-Ting Hong, Jia-Chang Feng, Dan Xu, Ying Shan, Wei-Shi Zheng

In this work, we argue that the features extracted from the pretrained extractor, e.g., I3D, are not WS-TAL task-specific features, and thus feature re-calibration is needed to reduce task-irrelevant information redundancy.

Weakly Supervised Action Localization Weakly-supervised Temporal Action Localization +1

Unsupervised Learning for Optical Flow Estimation Using Pyramid Convolution LSTM

1 code implementation 26 Jul 2019 Shuosen Guan, Haoxin Li, Wei-Shi Zheng

Most current Convolutional Neural Network (CNN) based methods for optical flow estimation focus on learning optical flow on synthetic datasets with ground truth, which is not practical.

Action Recognition Optical Flow Estimation

Viewpoint-Aware Loss with Angular Regularization for Person Re-Identification

1 code implementation 3 Dec 2019 Zhihui Zhu, Xinyang Jiang, Feng Zheng, Xiaowei Guo, Feiyue Huang, Wei-Shi Zheng, Xing Sun

Instead of one subspace for each viewpoint, our method projects the feature from different viewpoints into a unified hypersphere and effectively models the feature distribution on both the identity-level and the viewpoint-level.

Ranked #5 on Person Re-Identification on Market-1501 (using extra training data)

Person Re-Identification

Adapter Learning in Pretrained Feature Extractor for Continual Learning of Diseases

1 code implementation 18 Apr 2023 Wentao Zhang, Yujun Huang, Tong Zhang, Qingsong Zou, Wei-Shi Zheng, Ruixuan Wang

In particular, updating an intelligent diagnosis system with training data of new diseases would cause catastrophic forgetting of old disease knowledge.

Continual Learning

Shape-Erased Feature Learning for Visible-Infrared Person Re-Identification

1 code implementation CVPR 2023 Jiawei Feng, AnCong Wu, Wei-Shi Zheng

To this end, we propose a shape-erased feature learning paradigm that decorrelates modality-shared features in two orthogonal subspaces.

Person Re-Identification

Squeeze-and-Attention Networks for Semantic Segmentation

1 code implementation CVPR 2020 Zilong Zhong, Zhong Qiu Lin, Rene Bidart, Xiaodan Hu, Ibrahim Ben Daya, Zhifeng Li, Wei-Shi Zheng, Jonathan Li, Alexander Wong

The recent integration of attention mechanisms into segmentation networks improves their representational capabilities through a great emphasis on more informative features.

Segmentation Semantic Segmentation

AsyFOD: An Asymmetric Adaptation Paradigm for Few-Shot Domain Adaptive Object Detection

1 code implementation CVPR 2023 Yipeng Gao, Kun-Yu Lin, Junkai Yan, YaoWei Wang, Wei-Shi Zheng

Critically, in FSDAOD, the data-scarcity in the target domain leads to an extreme data imbalance between the source and target domains, which potentially causes over-adaptation in traditional feature alignment.

object-detection Object Detection

A Versatile Framework for Multi-scene Person Re-identification

1 code implementation 17 Mar 2024 Wei-Shi Zheng, Junkai Yan, Yi-Xing Peng

To overcome significant variations between images across camera views, mountains of variants of ReID models were developed for solving a number of challenges, such as resolution change, clothing change, occlusion, modality change, and so on.

Data Augmentation Person Re-Identification +1

Cross-view Asymmetric Metric Learning for Unsupervised Person Re-identification

1 code implementation ICCV 2017 Hong-Xing Yu, An-Cong Wu, Wei-Shi Zheng

While metric learning is important for Person re-identification (RE-ID), a significant problem in visual surveillance for cross-view pedestrian matching, existing metric models for RE-ID are mostly based on supervised learning that requires quantities of labeled samples in all pairs of camera views for training.

Clustering Metric Learning +1

Unsupervised Person Re-identification by Deep Asymmetric Metric Embedding

1 code implementation 29 Jan 2019 Hong-Xing Yu, An-Cong Wu, Wei-Shi Zheng

In such a way, DECAMEL jointly learns the feature representation and the unsupervised asymmetric metric.

Clustering Deep Clustering +2

Estimator Meets Equilibrium Perspective: A Rectified Straight Through Estimator for Binary Neural Networks Training

1 code implementation ICCV 2023 Xiao-Ming Wu, Dian Zheng, Zuhao Liu, Wei-Shi Zheng

The pioneering work BinaryConnect uses Straight Through Estimator (STE) to mimic the gradients of the sign function, but it also causes the crucial inconsistency problem.

Binarization

Cross-Camera Feature Prediction for Intra-Camera Supervised Person Re-identification across Distant Scenes

1 code implementation 29 Jul 2021 Wenhang Ge, Chunyan Pan, AnCong Wu, Hongwei Zheng, Wei-Shi Zheng

To learn camera-invariant representation from cross-camera unpaired training data, we propose a cross-camera feature prediction method to mine cross-camera self-supervision information from camera-specific feature distribution by transforming fake cross-camera positive feature pairs and minimizing the distances of the fake pairs.

Person Re-Identification

ASAG: Building Strong One-Decoder-Layer Sparse Detectors via Adaptive Sparse Anchor Generation

1 code implementation ICCV 2023 Shenghao Fu, Junkai Yan, Yipeng Gao, Xiaohua Xie, Wei-Shi Zheng

We find that the architecture discrepancy between dense and sparse detectors leads to feature conflict, hampering the performance of one-decoder-layer detectors.

Learning Multi-Attention Context Graph for Group-Based Re-Identification

1 code implementation 29 Apr 2021 Yichao Yan, Jie Qin, Bingbing Ni, Jiaxin Chen, Li Liu, Fan Zhu, Wei-Shi Zheng, Xiaokang Yang, Ling Shao

Extensive experiments on the novel dataset as well as three existing datasets clearly demonstrate the effectiveness of the proposed framework for both group-based re-id tasks.

Person Re-Identification

Sculpting Holistic 3D Representation in Contrastive Language-Image-3D Pre-training

1 code implementation 3 Nov 2023 Yipeng Gao, Zeyu Wang, Wei-Shi Zheng, Cihang Xie, Yuyin Zhou

Contrastive learning has emerged as a promising paradigm for 3D open-world understanding, i.e., aligning point cloud representation to image and text embedding space individually.

Ranked #1 on Zero-shot 3D classification on Objaverse LVIS (using extra training data)

Contrastive Learning Retrieval +3

Hybrid Dynamic-static Context-aware Attention Network for Action Assessment in Long Videos

2 code implementations 13 Aug 2020 Ling-An Zeng, Fa-Ting Hong, Wei-Shi Zheng, Qi-Zhi Yu, Wei Zeng, Yao-Wei Wang, Jian-Huang Lai

However, most existing works focus only on video dynamic information (i.e., motion information) but ignore the specific postures that an athlete is performing in a video, which is important for action assessment in long videos.

Action Assessment Action Quality Assessment

MIXGAN: Learning Concepts from Different Domains for Mixture Generation

1 code implementation 4 Jul 2018 Guang-Yuan Hao, Hong-Xing Yu, Wei-Shi Zheng

In this work, we present an interesting attempt on mixture generation: absorbing different image concepts (e.g., content and style) from different domains and thus generating a new domain with learned concepts.

Generative Adversarial Network Translation

When Prompt-based Incremental Learning Does Not Meet Strong Pretraining

1 code implementation ICCV 2023 Yu-Ming Tang, Yi-Xing Peng, Wei-Shi Zheng

However, existing prompt-based methods heavily rely on strong pretraining (typically trained on ImageNet-21k), and we find that their models could be trapped if the potential gap between the pretraining task and unknown future tasks is large.

Incremental Learning Retrieval

Rethinking CLIP-based Video Learners in Cross-Domain Open-Vocabulary Action Recognition

1 code implementation 3 Mar 2024 Kun-Yu Lin, Henghui Ding, Jiaming Zhou, Yi-Xing Peng, Zhilin Zhao, Chen Change Loy, Wei-Shi Zheng

To answer this, we establish a CROSS-domain Open-Vocabulary Action recognition benchmark named XOV-Action, and conduct a comprehensive evaluation of five state-of-the-art CLIP-based video learners under various types of domain gaps.

Open Vocabulary Action Recognition

Human Co-Parsing Guided Alignment for Occluded Person Re-identification

1 code implementation IEEE Transactions on Image Processing 2022 Shuguang Dou, Cairong Zhao, Xinyang Jiang, Shanshan Zhang, Wei-Shi Zheng, WangMeng Zuo

Most supervised methods propose to train an extra human parsing model aside from the ReID model with cross-domain human parts annotation, suffering from expensive annotation cost and domain gap. Unsupervised methods integrate a feature clustering-based human parsing process into the ReID model, but the lack of supervision signals leads to less satisfactory segmentation results.

Human Parsing Person Re-Identification

Person Re-identification by Contour Sketch under Moderate Clothing Change

2 code implementations 6 Feb 2020 Qize Yang, An-Cong Wu, Wei-Shi Zheng

Substantial development of re-id has recently been observed, and the majority of existing models are largely dependent on color appearance and assume that pedestrians do not change their clothes across camera views.

Person Re-Identification

Discriminator-Free Generative Adversarial Attack

1 code implementation 20 Jul 2021 ShaoHao Lu, Yuqiao Xian, Ke Yan, Yi Hu, Xing Sun, Xiaowei Guo, Feiyue Huang, Wei-Shi Zheng

Deep Neural Networks are vulnerable to adversarial examples (Figure 1), which cause DNN-based systems to collapse by adding inconspicuous perturbations to the images.

Adversarial Attack Disentanglement

NECA: Neural Customizable Human Avatar

1 code implementation 15 Mar 2024 Junjin Xiao, Qing Zhang, Zhan Xu, Wei-Shi Zheng

The core of our approach is to represent humans in complementary dual spaces and predict disentangled neural fields of geometry, albedo, shadow, as well as an external lighting, from which we are able to derive realistic rendering with high-frequency details via volumetric rendering.

Combined Depth Space based Architecture Search For Person Re-identification

1 code implementation CVPR 2021 Hanjun Li, Gaojie Wu, Wei-Shi Zheng

We propose a novel search space called Combined Depth Space (CDS), based on which we search for an efficient network architecture, which we call CDNet, via a differentiable architecture search algorithm.

Image Classification Person Re-Identification

Adaptive Interaction Modeling via Graph Operations Search

1 code implementation CVPR 2020 Haoxin Li, Wei-Shi Zheng, Yu Tao, Haifeng Hu, Jian-Huang Lai

We propose to search the network structures with a differentiable architecture search mechanism, which learns to construct adaptive structures for different videos to facilitate adaptive interaction modeling.

Action Analysis

Cross-Camera Trajectories Help Person Retrieval in a Camera Network

1 code implementation 27 Apr 2022 Xin Zhang, Xiaohua Xie, JianHuang Lai, Wei-Shi Zheng

To address this issue, we propose a pedestrian retrieval framework based on cross-camera trajectory generation, which integrates both temporal and spatial information.

Person Retrieval Re-Ranking +1

SNN2ANN: A Fast and Memory-Efficient Training Framework for Spiking Neural Networks

1 code implementation 19 Jun 2022 Jianxiong Tang, JianHuang Lai, Xiaohua Xie, Lingxiao Yang, Wei-Shi Zheng

The SNN2ANN consists of 2 components: a) a weight sharing architecture between ANN and SNN and b) spiking mapping units.

Task-oriented Self-supervised Learning for Anomaly Detection in Electroencephalography

1 code implementation 4 Jul 2022 Yaojia Zheng, Zhouwu Liu, Rong Mo, Ziyi Chen, Wei-Shi Zheng, Ruixuan Wang

Compared to supervised learning with labelled disease EEG data which can train a model to analyze specific diseases but would fail to monitor previously unseen statuses, anomaly detection based on only normal EEGs can detect any potential anomaly in new EEGs.

Anomaly Detection EEG +1

Weakly Supervised Text-Based Person Re-Identification

1 code implementation ICCV 2021 Shizhen Zhao, Changxin Gao, Yuanjie Shao, Wei-Shi Zheng, Nong Sang

Specifically, to alleviate the intra-class variations, a clustering method is utilized to generate pseudo labels for both visual and textual instances.

Clustering Person Re-Identification +1

Multimodal Action Quality Assessment

1 code implementation 31 Jan 2024 Ling-An Zeng, Wei-Shi Zheng

To leverage multimodal information for AQA, i.e., RGB, optical flow and audio information, we propose a Progressive Adaptive Multimodal Fusion Network (PAMFN) that separately models modality-specific information and mixed-modality information.

Action Quality Assessment Optical Flow Estimation

Learning to Detect Important People in Unlabelled Images for Semi-supervised Important People Detection

1 code implementation CVPR 2020 Fa-Ting Hong, Wei-Hong Li, Wei-Shi Zheng

Important people detection is to automatically detect the individuals who play the most important roles in a social event image, which requires the designed model to understand a high-level pattern.

Object Recognition Pseudo Label

Efficient and Effective Weakly-Supervised Action Segmentation via Action-Transition-Aware Boundary Alignment

1 code implementation 28 Mar 2024 Angchi Xu, Wei-Shi Zheng

Weakly-supervised action segmentation is a task of learning to partition a long video into several action segments, where training videos are only accompanied by transcripts (ordered list of actions).

Action Segmentation Segmentation

Long-Term Human Motion Prediction by Modeling Motion Context and Enhancing Motion Dynamic

no code implementations 7 May 2018 Yongyi Tang, Lin Ma, Wei Liu, Wei-Shi Zheng

Human motion prediction aims at generating future frames of human motion based on an observed sequence of skeletons.

Human motion prediction motion prediction

A Coordinate-wise Optimization Algorithm for Sparse Inverse Covariance Selection

no code implementations 19 Nov 2017 Ganzhao Yuan, Haoxian Tan, Wei-Shi Zheng

Sparse inverse covariance selection is a fundamental problem for analyzing dependencies in high dimensional data.

Deep CNNs for HEp-2 Cells Classification: A Cross-specimen Analysis

no code implementations 20 Apr 2016 Hongwei Li, Wei-Shi Zheng, JianGuo Zhang

Automatic classification of Human Epithelial Type-2 (HEp-2) cells staining patterns is an important yet challenging problem.

Classification General Classification

Adversarial Attribute-Image Person Re-identification

no code implementations 5 Dec 2017 Zhou Yin, Wei-Shi Zheng, An-Cong Wu, Hong-Xing Yu, Hai Wan, Xiaowei Guo, Feiyue Huang, Jian-Huang Lai

While attributes have been widely used for person re-identification (Re-ID) which aims at matching the same person images across disjoint camera views, they are used either as extra features or for performing multi-task learning to assist the image-image matching task.

Attribute Multi-Task Learning +1

One-pass Person Re-identification by Sketch Online Discriminant Analysis

no code implementations 9 Nov 2017 Wei-Hong Li, Zhuowei Zhong, Wei-Shi Zheng

While there are a few works discussing online re-id, most of them require considerable storage of all data samples observed so far, which could be unrealistic for processing data from a large camera network.

Person Re-Identification

PersonRank: Detecting Important People in Images

no code implementations 6 Nov 2017 Wei-Hong Li, Benchao Li, Wei-Shi Zheng

In events such as a presentation, basketball game, or speech, some individuals in images are always more important or attractive than others.

Latent Embeddings for Collective Activity Recognition

no code implementations 20 Sep 2017 Yongyi Tang, Peizhen Zhang, Jian-Fang Hu, Wei-Shi Zheng

Rather than simply recognizing the action of a person individually, collective activity recognition aims to find out what a group of people is doing in a collective scene.

Activity Recognition

Online Hashing

no code implementations 6 Apr 2017 Long-Kai Huang, Qiang Yang, Wei-Shi Zheng

Specifically, a new loss function is proposed to measure the similarity loss between a pair of data samples in Hamming space.

Robust Depth-based Person Re-identification

no code implementations 28 Mar 2017 Ancong Wu, Wei-Shi Zheng, Jian-Huang Lai

More specifically, we exploit depth voxel covariance descriptor and further propose a locally rotation invariant depth shape descriptor called Eigen-depth feature to describe pedestrian body shape.

Person Re-Identification

Person Re-Identification by Camera Correlation Aware Feature Augmentation

no code implementations 26 Mar 2017 Ying-Cong Chen, Xiatian Zhu, Wei-Shi Zheng, Jian-Huang Lai

The challenge of person re-identification (re-id) is to match individual images of the same person captured by different non-overlapping camera views against significant and unknown cross-view feature distortion.

Person Re-Identification

Embedding Deep Metric for Person Re-identification: A Study Against Large Variations

no code implementations 1 Nov 2016 Hailin Shi, Yang Yang, Xiangyu Zhu, Shengcai Liao, Zhen Lei, Wei-Shi Zheng, Stan Z. Li

From this point of view, selecting suitable positive (i.e., intra-class) training samples within a local range is critical for training the CNN embedding, especially when the data has large intra-class variations.

Person Re-Identification

Top-push Video-based Person Re-identification

no code implementations CVPR 2016 Jin-Jie You, An-Cong Wu, Xiang Li, Wei-Shi Zheng

Since only limited information can be exploited from still images, it is hard (if not impossible) to overcome the occlusion, pose and camera-view change, and lighting variation problems.

Video-Based Person Re-Identification

An Enhanced Deep Feature Representation for Person Re-identification

no code implementations 26 Apr 2016 Shangxuan Wu, Ying-Cong Chen, Xiang Li, An-Cong Wu, Jin-Jie You, Wei-Shi Zheng

In this paper, we focus on the feature representation and claim that hand-crafted histogram features can be complementary to Convolutional Neural Network (CNN) features.

Metric Learning Person Re-Identification

Human Re-identification by Matching Compositional Template with Cluster Sampling

no code implementations 1 Feb 2015 Yuanlu Xu, Liang Lin, Wei-Shi Zheng, Xiaobai Liu

This paper aims at a newly emerging task in visual surveillance: re-identifying people at a distance by matching body information, given several reference examples.

Person Re-Identification

Adversarial Open-World Person Re-Identification

no code implementations ECCV 2018 Xiang Li, An-Cong Wu, Wei-Shi Zheng

The main idea is learning to attack feature extractor on the target people by using GAN to generate very target-like images (imposters), and in the meantime the model will make the feature extractor learn to tolerate the attack by discriminative learning so as to realize group-based verification.

Person Re-Identification

Improving Fast Segmentation With Teacher-student Learning

no code implementations 19 Oct 2018 Jiafeng Xie, Bing Shuai, Jian-Fang Hu, Jingyang Lin, Wei-Shi Zheng

Recently, segmentation neural networks have been significantly improved by demonstrating very promising accuracies on public benchmarks.

Segmentation

Accelerating Large Scale Knowledge Distillation via Dynamic Importance Sampling

no code implementations 3 Dec 2018 Minghan Li, Tanli Zuo, Ruicheng Li, Martha White, Wei-Shi Zheng

Knowledge distillation is an effective technique that transfers knowledge from a large teacher model to a shallow student.

Knowledge Distillation Machine Translation +2

Group-Attention Single-Shot Detector (GA-SSD): Finding Pulmonary Nodules in Large-Scale CT Images

no code implementations 18 Dec 2018 Jiechao Ma, Xiang Li, Hongwei Li, Bjoern H. Menze, Sen Liang, Rongguo Zhang, Wei-Shi Zheng

In this paper, we propose a novel and effective abnormality detector implementing the attention mechanism and group convolution on 3D single-shot detector (SSD) called group-attention SSD (GA-SSD).

Computed Tomography (CT) Finding Pulmonary Nodules In Large-Scale Ct Images

A Matrix Splitting Method for Composite Function Minimization

no code implementations CVPR 2017 Ganzhao Yuan, Wei-Shi Zheng, Bernard Ghanem

Incorporating a new Gaussian elimination procedure, the matrix splitting method achieves state-of-the-art performance.

Multi-Scale Learning for Low-Resolution Person Re-Identification

no code implementations ICCV 2015 Xiang Li, Wei-Shi Zheng, Xiaojuan Wang, Tao Xiang, Shaogang Gong

In real-world person re-identification (re-id), images of people captured at very different resolutions from different locations need to be matched.

Person Re-Identification

Partial Person Re-Identification

no code implementations ICCV 2015 Wei-Shi Zheng, Xiang Li, Tao Xiang, Shengcai Liao, Jian-Huang Lai, Shaogang Gong

We address a new partial person re-identification (re-id) problem, where only a partial observation of a person is available for matching across different non-overlapping camera views.

Person Re-Identification

RGB-Infrared Cross-Modality Person Re-Identification

no code implementations ICCV 2017 Ancong Wu, Wei-Shi Zheng, Hong-Xing Yu, Shaogang Gong, Jian-Huang Lai

To that end, matching RGB images with infrared images is required, which are heterogeneous with very different visual characteristics.

Ranked #4 on Cross-Modal Person Re-Identification on SYSU-MM01 (mAP (All-search & Single-shot) metric)

Cross-Modality Person Re-identification Cross-Modal Person Re-Identification

A Decomposition Algorithm for the Sparse Generalized Eigenvalue Problem

no code implementations CVPR 2019 Ganzhao Yuan, Li Shen, Wei-Shi Zheng

The sparse generalized eigenvalue problem arises in a number of standard and modern statistical learning models, including sparse principal component analysis, sparse Fisher discriminant analysis, and sparse canonical correlation analysis.

Numerical Analysis

Pedestrian re-identification based on Tree branch network with local and global learning

no code implementations 31 Mar 2019 Hui Li, Meng Yang, Zhihui Lai, Wei-Shi Zheng, Zitong Yu

Deep part-based methods in recent literature have revealed the great potential of learning local part-level representation for pedestrian image in the task of person re-identification.

Person Re-Identification

Weakly Supervised Person Re-Identification

no code implementations CVPR 2019 Jingke Meng, Sheng Wu, Wei-Shi Zheng

In the conventional person re-id setting, it is assumed that the labeled images are the person images within the bounding box for each individual; this labeling across multiple nonoverlapping camera views from raw video surveillance is costly and time-consuming.

Multi-Label Learning Person Re-Identification

Learning to Learn Relation for Important People Detection in Still Images

1 code implementation CVPR 2019 Wei-Hong Li, Fa-Ting Hong, Wei-Shi Zheng

In this work, we propose a deep imPOrtance relatIon NeTwork (POINT) that combines both relation modeling and feature learning.

Relation Relation Network

Weakly Supervised Open-set Domain Adaptation by Dual-domain Collaboration

no code implementations CVPR 2019 Shuhan Tan, Jiening Jiao, Wei-Shi Zheng

Thus, it is meaningful to let partially labeled domains learn from each other to classify all the unlabeled samples in each domain under an open-set setting.

Domain Adaptation Transfer Learning

Towards Photo-Realistic Visible Watermark Removal with Conditional Generative Adversarial Networks

no code implementations 30 May 2019 Xiang Li, Chan Lu, Danni Cheng, Wei-Hong Li, Mei Cao, Bo Liu, Jiechao Ma, Wei-Shi Zheng

Visible watermark plays an important role in image copyright protection and the robustness of a visible watermark to an attack is shown to be essential.

Image-to-Image Translation

Deep Dual Relation Modeling for Egocentric Interaction Recognition

no code implementations CVPR 2019 Haoxin Li, Yijun Cai, Wei-Shi Zheng

To exploit the strong relations for egocentric interaction recognition, we introduce a dual relation modeling framework which learns to model the relations between the camera wearer and the interactor based on the individual action representations of the two persons.

Relation

Cross-view Relation Networks for Mammogram Mass Detection

no code implementations 1 Jul 2019 Jiechao Ma, Sen Liang, Xiang Li, Hongwei Li, Bjoern H. Menze, Rongguo Zhang, Wei-Shi Zheng

Mammogram is the most effective imaging modality for the mass lesion detection of breast cancer at the early stage.

Lesion Detection Relation

Enhancing Underexposed Photos using Perceptually Bidirectional Similarity

no code implementations 25 Jul 2019 Qing Zhang, Yongwei Nie, Lei Zhu, Chunxia Xiao, Wei-Shi Zheng

To obtain high-quality results free of these artifacts, we present a novel underexposed photo enhancement approach that is able to maintain the perceptual consistency.

Video Enhancement

Jointly learning heterogeneous features for rgb-d activity recognition

no code implementations IEEE Transactions on Pattern Analysis and Machine Intelligence (Volume: 39, Issue: 11, Nov. 1 2017) 2016 Jian-Fang Hu, Wei-Shi Zheng, Jian-Huang Lai, Jian-Guo Zhang

The proposed model formed in a unified framework is capable of: 1) jointly mining a set of subspaces with the same dimensionality to exploit latent shared features across different feature channels, 2) meanwhile, quantifying the shared and feature-specific components of features in the subspaces, and 3) transferring feature-specific intermediate transforms (i-transforms) for learning fusion of heterogeneous features across datasets.

Activity Recognition Benchmarking +3

Early action prediction by soft regression

no code implementations IEEE Transactions on Pattern Analysis and Machine Intelligence 2018 Jian-Fang Hu, Wei-Shi Zheng, Lianyang Ma, Gang Wang, Jian-Huang Lai, Jian-Guo Zhang

Our formulation of soft regression framework 1) overcomes a usual assumption in existing early action prediction systems that the progress level of on-going sequence is given in the testing stage; and 2) presents a theoretical framework to better resolve the ambiguity and uncertainty of subsequences at early performing stage.

Early Action Prediction regression +1

DSRGAN: Explicitly Learning Disentangled Representation of Underlying Structure and Rendering for Image Generation without Tuple Supervision

no code implementations 30 Sep 2019 Guang-Yuan Hao, Hong-Xing Yu, Wei-Shi Zheng

We focus on explicitly learning disentangled representation for natural image generation, where the underlying spatial structure and the rendering on the structure can be controlled independently, yet using no tuple supervision.

Image Generation

Batch Face Alignment using a Low-rank GAN

no code implementations 21 Oct 2019 Jiabo Huang, Xiaohua Xie, Wei-Shi Zheng

This paper studies the problem of aligning a set of face images of the same individual into a normalized image while removing the outliers like partial occlusion, extreme facial expression as well as significant illumination variation.

Face Alignment Generative Adversarial Network

Weakly Supervised Tracklet Person Re-Identification by Deep Feature-wise Mutual Learning

no code implementations 31 Oct 2019 Zhirui Chen, Jianheng Li, Wei-Shi Zheng

The scalability problem caused by the difficulty in annotating Person Re-identification (Re-ID) datasets has become a crucial bottleneck in the development of Re-ID. To address this problem, many unsupervised Re-ID methods have recently been proposed. Nevertheless, most of these models require transfer from another auxiliary fully supervised dataset, which is still expensive to obtain. In this work, we propose a Re-ID model based on Weakly Supervised Tracklets (WST) data from various camera views, which can be inexpensively acquired by combining the fragmented tracklets of the same person in the same camera view over a period of time. We formulate our weakly supervised tracklets Re-ID model by a novel method, named deep feature-wise mutual learning (DFML), which consists of Mutual Learning on Feature Extractors (MLFE) and Mutual Learning on Feature Classifiers (MLFC). We propose MLFE by leveraging two feature extractors to learn from each other to extract more robust and discriminative features. On the other hand, we propose MLFC by adapting discriminative features from various camera views to each classifier.

Person Re-Identification

MINI-Net: Multiple Instance Ranking Network for Video Highlight Detection

no code implementations ECCV 2020 Fa-Ting Hong, Xuanteng Huang, Wei-Hong Li, Wei-Shi Zheng

We address the weakly supervised video highlight detection problem for learning to detect segments that are more attractive in training videos given their video event label but without expensive supervision of manually annotating highlight segments.

Highlight Detection

An Asymmetric Modeling for Action Assessment

no code implementations ECCV 2020 Jibin Gao, Wei-Shi Zheng, Jia-Hui Pan, Chengying Gao, Yao-Wei Wang, Wei Zeng, Jian-Huang Lai

However, existing methods for action assessment are mostly limited to individual actions, especially lacking modeling of the asymmetric relations among agents (e.g., between persons and objects); and this limitation undermines their ability to assess actions containing asymmetrically interactive motion patterns, since there always exists subordination between agents in many interactive actions.

Action Assessment

Contextual Heterogeneous Graph Network for Human-Object Interaction Detection

no code implementations ECCV 2020 Hai Wang, Wei-Shi Zheng, Ling Yingbiao

However, previous graph models regard humans and objects as the same kind of node and do not consider that the messages passed between different entities are not the same.

Graph Attention Human-Object Interaction Detection +1

Towards Unbiased COVID-19 Lesion Localisation and Segmentation via Weakly Supervised Learning

1 code implementation 1 Mar 2021 Yang Yang, Jiancong Chen, Ruixuan Wang, Ting Ma, Lingwei Wang, Jie Chen, Wei-Shi Zheng, Tong Zhang

Despite tremendous efforts, it is very challenging to generate a robust model to assist in the accurate quantification assessment of COVID-19 on chest CT images.

Generative Adversarial Network Weakly-supervised Learning

Preserving Earlier Knowledge in Continual Learning with the Help of All Previous Feature Extractors

no code implementations 28 Apr 2021 Zhuoyun Li, Changhong Zhong, Sijia Liu, Ruixuan Wang, Wei-Shi Zheng

To reduce the forgetting of old knowledge, particularly knowledge learned earlier, and improve the overall continual learning performance, we propose a simple yet effective fusion mechanism that includes all the previously learned feature extractors in the intelligent model.

Continual Learning

Augmented 2D-TAN: A Two-stage Approach for Human-centric Spatio-Temporal Video Grounding

no code implementations 20 Jun 2021 Chaolei Tan, Zihang Lin, Jian-Fang Hu, Xiang Li, Wei-Shi Zheng

We propose an effective two-stage approach to tackle the language-based Human-centric Spatio-Temporal Video Grounding (HC-STVG) task.

Spatio-Temporal Video Grounding Video Grounding

Graph-Based High-Order Relation Modeling for Long-Term Action Recognition

no code implementations CVPR 2021 Jiaming Zhou, Kun-Yu Lin, Haoxin Li, Wei-Shi Zheng

In this paper, we propose a Graph-based High-order Relation Modeling (GHRM) module to exploit the high-order relations in the long-term actions for long-term action recognition.

Action Recognition Long-video Activity Recognition +3

Discriminative Distillation to Reduce Class Confusion in Continual Learning

no code implementations 11 Aug 2021 Changhong Zhong, Zhiying Cui, Ruixuan Wang, Wei-Shi Zheng

Successful continual learning of new knowledge would enable intelligent systems to recognize more and more classes of objects.

Continual Learning Image Classification

Understanding of Kernels in CNN Models by Suppressing Irrelevant Visual Features in Images

1 code implementation 25 Aug 2021 Jia-Xin Zhuang, Wanying Tao, Jianfei Xing, Wei Shi, Ruixuan Wang, Wei-Shi Zheng

In this paper, a simple yet effective optimization method is proposed to interpret the activation of any kernel of interest in CNN models.

Learning To Know Where To See: A Visibility-Aware Approach for Occluded Person Re-Identification

no code implementations ICCV 2021 Jinrui Yang, Jiawei Zhang, Fufu Yu, Xinyang Jiang, Mengdan Zhang, Xing Sun, Ying-Cong Chen, Wei-Shi Zheng

Several mainstream methods utilize extra cues (e.g., human pose information) to distinguish human parts from obstacles to alleviate the occlusion problem.

Person Re-Identification

Predictive Feature Learning for Future Segmentation Prediction

no code implementations ICCV 2021 Zihang Lin, Jiangxin Sun, Jian-Fang Hu, QiZhi Yu, Jian-Huang Lai, Wei-Shi Zheng

In the latent feature learned by the autoencoder, global structures are enhanced and local details are suppressed so that it is more predictive.

Segmentation

Action-guided 3D Human Motion Prediction

no code implementations NeurIPS 2021 Jiangxin Sun, Zihang Lin, Xintong Han, Jian-Fang Hu, Jia Xu, Wei-Shi Zheng

The ability to forecast future human motion is important for human-machine interaction systems to understand human behaviors and interact with people.

Human motion prediction motion prediction

Letter-level Online Writer Identification

no code implementations 6 Dec 2021 Zelin Chen, Hong-Xing Yu, AnCong Wu, Wei-Shi Zheng

To make the application of writer-id more practical (e.g., on mobile devices), we focus on a novel problem, letter-level online writer-id, which requires only a few trajectories of written letters as identification cues.

Unpaired Image Captioning by Image-level Weakly-Supervised Visual Concept Recognition

no code implementations 7 Mar 2022 Peipei Zhu, Xiao Wang, Yong Luo, Zhenglong Sun, Wei-Shi Zheng, YaoWei Wang, Changwen Chen

The image-level labels are utilized to train a weakly-supervised object recognition model to extract object information (e.g., instance) in an image, and the extracted instances are adopted to infer the relationships among different objects based on an enhanced graph neural network (GNN).

Image Captioning Object +1

Continual Learning with Bayesian Model based on a Fixed Pre-trained Feature Extractor

no code implementations 28 Apr 2022 Yang Yang, Zhiying Cui, Junjie Xu, Changhong Zhong, Wei-Shi Zheng, Ruixuan Wang

In this case, updating the intelligent system with data of new diseases would inevitably degrade its performance on previously learned diseases.

Class Incremental Learning Image Classification +1

Likert Scoring With Grade Decoupling for Long-Term Action Assessment

no code implementations CVPR 2022 Angchi Xu, Ling-An Zeng, Wei-Shi Zheng

Long-term action quality assessment is a task of evaluating how well an action is performed, namely, estimating a quality score from a long video.

Action Assessment Action Quality Assessment

STVGFormer: Spatio-Temporal Video Grounding with Static-Dynamic Cross-Modal Understanding

no code implementations 6 Jul 2022 Zihang Lin, Chaolei Tan, Jian-Fang Hu, Zhi Jin, Tiancai Ye, Wei-Shi Zheng

The static branch performs cross-modal understanding in a single frame and learns to localize the target object spatially according to intra-frame visual cues like object appearances.

Spatio-Temporal Video Grounding Video Grounding

PCCT: Progressive Class-Center Triplet Loss for Imbalanced Medical Image Classification

no code implementations 11 Jul 2022 Kanghao Chen, Weixian Lei, Rong Zhang, Shen Zhao, Wei-Shi Zheng, Ruixuan Wang

For the class-center involved triplet loss, the positive and negative samples in each triplet are replaced by their corresponding class centers, which pulls data representations of the same class closer to their class center.

Image Classification Medical Image Classification

Learning Discriminative Representation via Metric Learning for Imbalanced Medical Image Classification

no code implementations 14 Jul 2022 Chenghua Zeng, Huijuan Lu, Kanghao Chen, Ruixuan Wang, Wei-Shi Zheng

Data imbalance between common and rare diseases during model training often causes intelligent diagnosis systems to have biased predictions towards common diseases.

Image Classification Medical Image Classification +1

Text-Adaptive Multiple Visual Prototype Matching for Video-Text Retrieval

no code implementations 27 Sep 2022 Chengzhi Lin, AnCong Wu, Junwei Liang, Jun Zhang, Wenhang Ge, Wei-Shi Zheng, Chunhua Shen

To address this problem, we propose a Text-Adaptive Multiple Visual Prototype Matching model, which automatically captures multiple prototypes to describe a video by adaptive aggregation of video token features.

Cross-Modal Retrieval Retrieval +2

Adaptively Integrated Knowledge Distillation and Prediction Uncertainty for Continual Learning

no code implementations 18 Jan 2023 Kanghao Chen, Sijia Liu, Ruixuan Wang, Wei-Shi Zheng

The first one is to adaptively integrate multiple levels of old knowledge and transfer it to each block level in the new model.

Continual Learning Knowledge Distillation

PSLT: A Light-weight Vision Transformer with Ladder Self-Attention and Progressive Shift

no code implementations 7 Apr 2023 Gaojie Wu, Wei-Shi Zheng, Yutong Lu, Qi Tian

In this work, we propose a ladder self-attention block with multiple branches and a progressive shift mechanism to develop a light-weight transformer backbone that requires fewer computing resources (e.g., a relatively small number of parameters and FLOPs), termed Progressive Shift Ladder Transformer (PSLT).

Image Classification Person Re-Identification

Pyramid Texture Filtering

no code implementations 11 May 2023 Qing Zhang, Hao Jiang, Yongwei Nie, Wei-Shi Zheng

We present a simple but effective technique to smooth out textures while preserving the prominent structures.

Image Enhancement Tone Mapping

Collaborative Static and Dynamic Vision-Language Streams for Spatio-Temporal Video Grounding

no code implementations CVPR 2023 Zihang Lin, Chaolei Tan, Jian-Fang Hu, Zhi Jin, Tiancai Ye, Wei-Shi Zheng

The static stream performs cross-modal understanding in a single frame and learns to attend to the target object spatially according to intra-frame visual cues like object appearances.

Object Spatio-Temporal Video Grounding +1

Generating Anomalies for Video Anomaly Detection With Prompt-Based Feature Mapping

no code implementations CVPR 2023 Zuhao Liu, Xiao-Ming Wu, Dian Zheng, Kun-Yu Lin, Wei-Shi Zheng

There also exists a scene gap between virtual and real scenarios, including scene-specific anomalies (events that are abnormal in one scene but normal in another) and scene-specific attributes, such as the viewpoint of the surveillance camera.

Anomaly Detection In Surveillance Videos Video Anomaly Detection

Hierarchical Semantic Correspondence Networks for Video Paragraph Grounding

no code implementations CVPR 2023 Chaolei Tan, Zihang Lin, Jian-Fang Hu, Wei-Shi Zheng, JianHuang Lai

Specifically, we develop a hierarchical encoder that encodes the multi-modal inputs into semantics-aligned representations at different levels.

Sentence Video Grounding

Event-Guided Procedure Planning from Instructional Videos with Text Supervision

no code implementations ICCV 2023 An-Lan Wang, Kun-Yu Lin, Jia-Run Du, Jingke Meng, Wei-Shi Zheng

In this work, we focus on the task of procedure planning from instructional videos with text supervision, where a model aims to predict an action sequence to transform the initial visual state into the goal visual state.

DiffuVolume: Diffusion Model for Volume based Stereo Matching

no code implementations 30 Aug 2023 Dian Zheng, Xiao-Ming Wu, Zuhao Liu, Jingke Meng, Wei-Shi Zheng

Our method, termed DiffuVolume, considers the diffusion model as a cost volume filter, which will recurrently remove the redundant information from the cost volume.

Stereo Matching Zero-shot Generalization

Continual Action Assessment via Task-Consistent Score-Discriminative Feature Distribution Modeling

no code implementations 29 Sep 2023 Yuan-Ming Li, Ling-An Zeng, Jing-Ke Meng, Wei-Shi Zheng

Our idea for modeling Continual-AQA is to sequentially learn a task-consistent score-discriminative feature distribution, in which the latent features express a strong correlation with the score labels regardless of the task or action types.

Action Assessment Action Quality Assessment +1

Cross-Modal Adaptive Dual Association for Text-to-Image Person Retrieval

no code implementations 4 Dec 2023 Dixuan Lin, Yixing Peng, Jingke Meng, Wei-Shi Zheng

In this work, we show the discrepancy between image-to-text association and text-to-image association and propose CADA: Cross-Modal Adaptive Dual Association that finely builds bidirectional image-text detailed associations.

Attribute Cross-Modal Person Re-Identification +5

Transformer for Object Re-Identification: A Survey

no code implementations 13 Jan 2024 Mang Ye, Shuoyi Chen, Chenyue Li, Wei-Shi Zheng, David Crandall, Bo Du

Object Re-Identification (Re-ID) aims to identify and retrieve specific objects from varying viewpoints.

Object

ActionHub: A Large-scale Action Video Description Dataset for Zero-shot Action Recognition

no code implementations 22 Jan 2024 Jiaming Zhou, Junwei Liang, Kun-Yu Lin, Jinrui Yang, Wei-Shi Zheng

With the proposed ActionHub dataset, we further propose a novel Cross-modality and Cross-action Modeling (CoCo) framework for ZSAR, which consists of a Dual Cross-modality Alignment module and a Cross-action Invariance Mining module.

Action Recognition Video Description +1

Siamese Learning with Joint Alignment and Regression for Weakly-Supervised Video Paragraph Grounding

no code implementations 18 Mar 2024 Chaolei Tan, JianHuang Lai, Wei-Shi Zheng, Jian-Fang Hu

Different from previous weakly-supervised grounding frameworks based on multiple instance learning or reconstruction learning for two-stage candidate ranking, we propose a novel siamese learning framework that jointly learns the cross-modal feature alignment and temporal coordinate regression without timestamp labels to achieve concise one-stage localization for WSVPG.

Multiple Instance Learning

Ranking Distillation for Open-Ended Video Question Answering with Insufficient Labels

no code implementations 21 Mar 2024 Tianming Liang, Chaolei Tan, Beihao Xia, Wei-Shi Zheng, Jian-Fang Hu

This paper focuses on open-ended video question answering, which aims to find the correct answers from a large answer set in response to a video-related question.

Multi-Label Classification Question Answering +1
