StyleFlow For Content-Fixed Image to Image Translation

1 code implementation 5 Jul 2022 Weichen Fan, Jinghuan Chen, Jiabin Ma, Jun Hou, Shuai Yi

We evaluate our model in several I2I translation benchmarks, and the results show that the proposed model has advantages over previous methods in both strongly constrained and normally constrained tasks.

Colorization Image-to-Image Translation +2

Better Teacher Better Student: Dynamic Prior Knowledge for Knowledge Distillation

1 code implementation 13 Jun 2022 Zengyu Qiu, Xinzhu Ma, Kunlin Yang, Chunya Liu, Jun Hou, Shuai Yi, Wanli Ouyang

Besides, our DPK makes the performance of the student model positively correlated with that of the teacher model, which means that we can further boost the accuracy of students by applying larger teachers.

Image Classification Knowledge Distillation +3

Federated Unsupervised Domain Adaptation for Face Recognition

no code implementations 9 Apr 2022 Weiming Zhuang, Xin Gan, Yonggang Wen, Xuesen Zhang, Shuai Zhang, Shuai Yi

To address this problem, we propose federated unsupervised domain adaptation for face recognition, FedFR.

Clustering Face Recognition +2

Pyramid Fusion Transformer for Semantic Segmentation

no code implementations 11 Jan 2022 Zipeng Qin, Jianbo Liu, Xiaolin Zhang, Maoqing Tian, Aojun Zhou, Shuai Yi, Hongsheng Li

The recently proposed MaskFormer gives a refreshed perspective on the task of semantic segmentation: it shifts from the popular pixel-level classification paradigm to a mask-level classification method.

Decoder Segmentation +1

Encoder-decoder with Multi-level Attention for 3D Human Shape and Pose Estimation

1 code implementation ICCV 2021 Ziniu Wan, Zhengjia Li, Maoqing Tian, Jianbo Liu, Shuai Yi, Hongsheng Li

To this end, we propose Multi-level Attention Encoder-Decoder Network (MAED), including a Spatial-Temporal Encoder (STE) and a Kinematic Topology Decoder (KTD) to model multi-level attentions in a unified framework.

3D Absolute Human Pose Estimation Decoder

GroupFormer: Group Activity Recognition with Clustered Spatial-Temporal Transformer

1 code implementation ICCV 2021 Shuaicheng Li, Qianggang Cao, Lingbo Liu, Kunlin Yang, Shinan Liu, Jun Hou, Shuai Yi

It captures spatial-temporal contextual information jointly to augment the individual and group representations effectively with a clustered spatial-temporal transformer.

Group Activity Recognition

Collaborative Unsupervised Visual Representation Learning from Decentralized Data

1 code implementation ICCV 2021 Weiming Zhuang, Xin Gan, Yonggang Wen, Shuai Zhang, Shuai Yi

In this framework, each party trains models from unlabeled data independently using contrastive learning with an online network and a target network.

Contrastive Learning Federated Learning +3

Unsupervised Domain Adaptive 3D Detection with Multi-Level Consistency

1 code implementation ICCV 2021 Zhipeng Luo, Zhongang Cai, Changqing Zhou, Gongjie Zhang, Haiyu Zhao, Shuai Yi, Shijian Lu, Hongsheng Li, Shanghang Zhang, Ziwei Liu

In addition, existing 3D domain adaptive detection methods often assume prior access to the target domain annotations, which is rarely feasible in the real world.

3D Object Detection Autonomous Driving +1

Domain Consistency Regularization for Unsupervised Multi-source Domain Adaptive Classification

no code implementations 16 Jun 2021 Zhipeng Luo, Xiaobing Zhang, Shijian Lu, Shuai Yi

Compared with single-source unsupervised domain adaptation (SUDA), domain shift in MUDA exists not only between the source and target domains but also among multiple source domains.

Classification Multi-Source Unsupervised Domain Adaptation +2

Towards Unsupervised Domain Adaptation for Deep Face Recognition under Privacy Constraints via Federated Learning

no code implementations 17 May 2021 Weiming Zhuang, Xin Gan, Yonggang Wen, Xuesen Zhang, Shuai Zhang, Shuai Yi

To this end, FedFR forms an end-to-end training pipeline: (1) pre-train in the source domain; (2) predict pseudo labels by clustering in the target domain; (3) conduct domain-constrained federated learning across two domains.

Clustering Face Recognition +2

Unsupervised 3D Shape Completion through GAN Inversion

no code implementations CVPR 2021 Junzhe Zhang, Xinyi Chen, Zhongang Cai, Liang Pan, Haiyu Zhao, Shuai Yi, Chai Kiat Yeo, Bo Dai, Chen Change Loy

In contrast to previous fully supervised approaches, in this paper we present ShapeInversion, which introduces Generative Adversarial Network (GAN) inversion to shape completion for the first time.

Generative Adversarial Network valid

Variational Relational Point Completion Network

1 code implementation CVPR 2021 Liang Pan, Xinyi Chen, Zhongang Cai, Junzhe Zhang, Haiyu Zhao, Shuai Yi, Ziwei Liu

In particular, we propose a dual-path architecture to enable principled probabilistic modeling across partial and complete clouds.

Point Cloud Completion

Delving into Localization Errors for Monocular 3D Object Detection

1 code implementation CVPR 2021 Xinzhu Ma, Yinmin Zhang, Dan Xu, Dongzhan Zhou, Shuai Yi, Haojie Li, Wanli Ouyang

Estimating 3D bounding boxes from monocular images is an essential component in autonomous driving, while accurate 3D object detection from this kind of data is very challenging.

3D Object Detection From Monocular Images Autonomous Driving +3

Towards Overcoming False Positives in Visual Relationship Detection

no code implementations 23 Dec 2020 Daisheng Jin, Xiao Ma, Chongzhi Zhang, Yizhuo Zhou, Jiashu Tao, Mingyuan Zhang, Haiyu Zhao, Shuai Yi, Zhoujun Li, Xianglong Liu, Hongsheng Li

We observe that during training, the relationship proposal distribution is highly imbalanced: most of the negative relationship proposals are easy to identify, e.g., those from inaccurate object detection, which leads to the under-fitting of low-frequency difficult proposals.

Decoder Graph Attention +5

REFINE: Prediction Fusion Network for Panoptic Segmentation

no code implementations 15 Dec 2020 Jiawei Ren, Cunjun Yu, Zhongang Cai, Mingyuan Zhang, Chongsong Chen, Haiyu Zhao, Shuai Yi, Hongsheng Li

Panoptic segmentation aims at generating pixel-wise class and instance predictions for each pixel in the input image, which is a challenging task and far more complicated than naively fusing the semantic and instance segmentation results.

Instance Segmentation Panoptic Segmentation +1

BiPointNet: Binary Neural Network for Point Clouds

1 code implementation ICLR 2021 Haotong Qin, Zhongang Cai, Mingyuan Zhang, Yifu Ding, Haiyu Zhao, Shuai Yi, Xianglong Liu, Hao Su

To alleviate the resource constraint for real-time point cloud applications that run on edge devices, in this paper we present BiPointNet, the first model binarization approach for efficient deep learning on point clouds.


Performance Optimization for Federated Person Re-identification via Benchmark Analysis

2 code implementations 26 Aug 2020 Weiming Zhuang, Yonggang Wen, Xuesen Zhang, Xin Gan, Daiying Yin, Dongzhan Zhou, Shuai Zhang, Shuai Yi

Then we propose two optimization methods: (1) To address the unbalanced weight problem, we propose a new method to dynamically change the weights according to the scale of model changes in clients in each training round; (2) To facilitate convergence, we adopt knowledge distillation to refine the server model with knowledge generated from client models on a public dataset.

Federated Learning Knowledge Distillation +2
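
The first optimization in the entry above (client weights that follow the scale of each client's model change in a round) can be illustrated with a minimal sketch. It makes several assumptions that the snippet does not confirm: parameters are flat lists of floats, the "scale of model change" is measured with an L2 norm, and the names `l2_change` and `aggregate` are hypothetical. The paper's own measure and aggregation rule may differ.

```python
import math


def l2_change(old, new):
    """L2 norm of the difference between two flat parameter vectors.

    Assumption: the L2 norm stands in for the 'scale of model changes';
    it is an illustrative choice, not taken from the paper.
    """
    return math.sqrt(sum((n - o) ** 2 for o, n in zip(old, new)))


def aggregate(global_params, client_params):
    """Merge client models, weighting each by how far it moved this round.

    Returns the merged parameter vector and the per-client weights.
    """
    changes = [l2_change(global_params, p) for p in client_params]
    total = sum(changes)
    if total == 0.0:
        # No client moved: fall back to uniform weights.
        weights = [1.0 / len(client_params)] * len(client_params)
    else:
        weights = [c / total for c in changes]
    merged = [
        sum(w * p[i] for w, p in zip(weights, client_params))
        for i in range(len(global_params))
    ]
    return merged, weights


if __name__ == "__main__":
    server = [0.0, 0.0]
    clients = [[1.0, 0.0], [0.0, 3.0]]
    merged, weights = aggregate(server, clients)
    print(weights, merged)
```

In this toy run the second client moved three times as far as the first, so it receives three times the weight in the merged model. Weights are recomputed every round, which is what makes the scheme dynamic rather than fixed by dataset size.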

MessyTable: Instance Association in Multiple Camera Views

no code implementations ECCV 2020 Zhongang Cai, Junzhe Zhang, Daxuan Ren, Cunjun Yu, Haiyu Zhao, Shuai Yi, Chai Kiat Yeo, Chen Change Loy

We present an interesting and challenging dataset that features a large number of scenes with messy tables captured from multiple camera views.

Balanced Meta-Softmax for Long-Tailed Visual Recognition

1 code implementation NeurIPS 2020 Jiawei Ren, Cunjun Yu, Shunan Sheng, Xiao Ma, Haiyu Zhao, Shuai Yi, Hongsheng Li

In our experiments, we demonstrate that Balanced Meta-Softmax outperforms state-of-the-art long-tailed classification solutions on both visual recognition and instance segmentation tasks.

General Classification Instance Segmentation +2

Spatio-Temporal Graph Transformer Networks for Pedestrian Trajectory Prediction

1 code implementation ECCV 2020 Cunjun Yu, Xiao Ma, Jiawei Ren, Haiyu Zhao, Shuai Yi

In this paper, we present STAR, a Spatio-Temporal grAph tRansformer framework, which tackles trajectory prediction using only attention mechanisms.

Autonomous Driving Pedestrian Trajectory Prediction +1

MagnifierNet: Towards Semantic Adversary and Fusion for Person Re-identification

1 code implementation 25 Feb 2020 Yushi Lan, Yu-An Liu, Maoqing Tian, Xinchi Zhou, Xuesen Zhang, Shuai Yi, Hongsheng Li

Meanwhile, we introduce a "Semantic Fusion Branch" to filter out irrelevant noise by selectively fusing semantic region information sequentially.

Person Re-Identification

GTC: Guided Training of CTC Towards Efficient and Accurate Scene Text Recognition

no code implementations 4 Feb 2020 Wenyang Hu, Xiaocong Cai, Jun Hou, Shuai Yi, Zhiping Lin

Extensive experiments on standard benchmarks demonstrate that our end-to-end model achieves a new state of the art for regular and irregular scene text recognition and has 6 times shorter inference time than attention-based methods.

Decoder Scene Text Recognition

EcoNAS: Finding Proxies for Economical Neural Architecture Search

no code implementations CVPR 2020 Dongzhan Zhou, Xinchi Zhou, Wenwei Zhang, Chen Change Loy, Shuai Yi, Xuesen Zhang, Wanli Ouyang

While many methods have been proposed to improve the efficiency of NAS, the search process is still laborious because training and evaluating plausible architectures over a large search space is time-consuming.

Neural Architecture Search

FishNet: A Versatile Backbone for Image, Region, and Pixel Level Prediction

6 code implementations NeurIPS 2018 Shuyang Sun, Jiangmiao Pang, Jianping Shi, Shuai Yi, Wanli Ouyang

The basic principles in designing convolutional neural network (CNN) structures for predicting objects on different levels, e.g., image-level, region-level, and pixel-level, are diverging.

Image Classification

FD-GAN: Pose-guided Feature Distilling GAN for Robust Person Re-identification

2 code implementations NeurIPS 2018 Yixiao Ge, Zhuowan Li, Haiyu Zhao, Guojun Yin, Shuai Yi, Xiaogang Wang, Hongsheng Li

Our proposed FD-GAN achieves state-of-the-art performance on three person reID datasets, which demonstrates the effectiveness and robust feature distilling capability of the proposed FD-GAN.

Generative Adversarial Network Person Re-Identification

Learning Monocular Depth by Distilling Cross-domain Stereo Networks

1 code implementation ECCV 2018 Xiaoyang Guo, Hongsheng Li, Shuai Yi, Jimmy Ren, Xiaogang Wang

Monocular depth estimation aims at estimating a pixelwise depth map for a single image, which has wide applications in scene understanding and autonomous driving.

Autonomous Driving Monocular Depth Estimation +3

Deep Group-shuffling Random Walk for Person Re-identification

1 code implementation CVPR 2018 Yantao Shen, Hongsheng Li, Tong Xiao, Shuai Yi, Dapeng Chen, Xiaogang Wang

Person re-identification aims at finding a person of interest in an image gallery by comparing the probe image of this person with all the gallery images.

Person Re-Identification Retrieval

Person Re-identification with Deep Similarity-Guided Graph Neural Network

no code implementations ECCV 2018 Yantao Shen, Hongsheng Li, Shuai Yi, Dapeng Chen, Xiaogang Wang

However, existing person re-identification models mostly estimate the similarities of different image pairs of probe and gallery images independently while ignoring the relationship information between different probe-gallery pairs.

Graph Neural Network Person Re-Identification +1

Video Person Re-Identification With Competitive Snippet-Similarity Aggregation and Co-Attentive Snippet Embedding

no code implementations CVPR 2018 Dapeng Chen, Hongsheng Li, Tong Xiao, Shuai Yi, Xiaogang Wang

The attention weights are obtained based on a query feature, which is learned from the whole probe snippet by an LSTM network, making the resulting embeddings less affected by noisy frames.

Video-Based Person Re-Identification

Hierarchical Deep Recurrent Architecture for Video Understanding

1 code implementation 11 Jul 2017 Luming Tang, Boyang Deng, Haiyu Zhao, Shuai Yi

The proposed framework contains a hierarchical deep architecture, including a frame-level sequence modeling part and a video-level classification part.

Classification General Classification +2

Pedestrian Travel Time Estimation in Crowded Scenes

no code implementations ICCV 2015 Shuai Yi, Hongsheng Li, Xiaogang Wang

In this paper, we target the problem of estimating the statistics of pedestrian travel time within a period from an entrance to a destination in a crowded scene.

Blocking Scene Understanding +1

Understanding Pedestrian Behaviors From Stationary Crowd Groups

no code implementations CVPR 2015 Shuai Yi, Hongsheng Li, Xiaogang Wang

Pedestrian behavior modeling and analysis is important for crowd scene understanding and has various applications in video surveillance.

Event Detection Scene Understanding

L0 Regularized Stationary Time Estimation for Crowd Group Analysis

no code implementations CVPR 2014 Shuai Yi, Xiaogang Wang, Cewu Lu, Jiaya Jia

We tackle stationary crowd analysis in this paper, which is as important as modeling mobile groups in crowd scenes and finds many applications in surveillance.

