Search Results for author: Guodong Guo

Found 74 papers, 22 papers with code

GINet: Graph Interaction Network for Scene Parsing

1 code implementation • ECCV 2020 • Tianyi Wu, Yu Lu, Yu Zhu, Chuang Zhang, Ming Wu, Zhanyu Ma, Guodong Guo

GI unit is further improved by the SC-loss to enhance the semantic representations over the exemplar-based semantic graph.

Scene Parsing

8,238

Paper
Code

Fully Transformer Networks for Semantic Image Segmentation

1 code implementation • 8 Jun 2021 • Sitong Wu, Tianyi Wu, Fangjian Lin, Shengwei Tian, Guodong Guo

Transformers have shown impressive performance in various natural language processing and computer vision tasks, due to the capability of modeling long-range dependencies.

Face Parsing Image Segmentation +2

1,183

Paper
Code

Pale Transformer: A General Vision Transformer Backbone with Pale-Shaped Attention

2 code implementations • 28 Dec 2021 • Sitong Wu, Tianyi Wu, Haoru Tan, Guodong Guo

To reduce the quadratic computation complexity caused by the global self-attention, various methods constrain the range of attention within a local region to improve its efficiency.

Instance Segmentation object-detection +2

1,183

Paper
Code

Coarse-to-Fine Cascaded Networks with Smooth Predicting for Video Facial Expression Recognition

1 code implementation • 24 Mar 2022 • Fanglei Xue, Zichang Tan, Yu Zhu, Zhongsong Ma, Guodong Guo

To be specific, the universal features denote the general characteristic of facial emotions within a period and the unique features denote the specific characteristic at this moment.

Facial Expression Recognition Facial Expression Recognition (FER)

1,183

Paper
Code

Anti-UAV: A Large Multi-Modal Benchmark for UAV Tracking

1 code implementation • 21 Jan 2021 • Nan Jiang, Kuiran Wang, Xiaoke Peng, Xuehui Yu, Qiang Wang, Junliang Xing, Guorong Li, Jian Zhao, Guodong Guo, Zhenjun Han

The release of such a large-scale dataset could be a useful initial step in research on tracking UAVs.

210

Paper
Code

Nested Collaborative Learning for Long-Tailed Visual Recognition

1 code implementation • CVPR 2022 • Jun Li, Zichang Tan, Jun Wan, Zhen Lei, Guodong Guo

NCL consists of two core components, namely Nested Individual Learning (NIL) and Nested Balanced Online Distillation (NBOD), which focus on the individual supervised learning for each single expert and the knowledge transferring among multiple experts, respectively.

Ranked #6 on Long-tail Learning on CIFAR-10-LT (ρ=50)

Image Classification Long-tail Learning

Paper
Code

Q-ViT: Accurate and Fully Quantized Low-bit Vision Transformer

1 code implementation • 13 Oct 2022 • Yanjing Li, Sheng Xu, Baochang Zhang, Xianbin Cao, Peng Gao, Guodong Guo

The large pre-trained vision transformers (ViTs) have demonstrated remarkable performance on various visual tasks, but suffer from expensive computational and memory cost problems when deployed on resource-constrained devices.

Quantization

Paper
Code

SKFlow: Learning Optical Flow with Super Kernels

1 code implementation • 29 May 2022 • Shangkun Sun, Yuanqi Chen, Yu Zhu, Guodong Guo, Ge Li

In this paper, we propose the Super Kernel Flow Network (SKFlow), a CNN architecture to ameliorate the impacts of occlusions on optical flow estimation.

Optical Flow Estimation

Paper
Code

Vision Transformer with Attentive Pooling for Robust Facial Expression Recognition

1 code implementation • 11 Dec 2022 • Fanglei Xue, Qiangchang Wang, Zichang Tan, Zhongsong Ma, Guodong Guo

The proposed APP is employed to select the most informative patches on CNN features, and ATP discards unimportant tokens in ViT.

Ranked #5 on Facial Expression Recognition (FER) on RAF-DB

Facial Expression Recognition Facial Expression Recognition (FER) +1

Paper
Code

How is Gaze Influenced by Image Transformations? Dataset and Model

1 code implementation • 16 May 2019 • Zhaohui Che, Ali Borji, Guangtao Zhai, Xiongkuo Min, Guodong Guo, Patrick Le Callet

Data size is the bottleneck for developing deep saliency models, because collecting eye-movement data is very time consuming and expensive.

Data Augmentation Generative Adversarial Network +1

Paper
Code

EAN: Event Adaptive Network for Enhanced Action Recognition

1 code implementation • 22 Jul 2021 • Yuan Tian, Yichao Yan, Guangtao Zhai, Guodong Guo, Zhiyong Gao

In this paper, we propose a unified action recognition framework to investigate the dynamic nature of video content by introducing the following designs.

Ranked #14 on Action Recognition on Something-Something V1

Action Recognition

Paper
Code

Depth-conditioned Dynamic Message Propagation for Monocular 3D Object Detection

1 code implementation • CVPR 2021 • Li Wang, Liang Du, Xiaoqing Ye, Yanwei Fu, Guodong Guo, Xiangyang Xue, Jianfeng Feng, Li Zhang

The objective of this paper is to learn context- and depth-aware feature representation to solve the problem of monocular 3D object detection.

Ranked #13 on Monocular 3D Object Detection on KITTI Cars Moderate

Monocular 3D Object Detection object-detection

Paper
Code

Q-DETR: An Efficient Low-Bit Quantized Detection Transformer

1 code implementation • CVPR 2023 • Sheng Xu, Yanjing Li, Mingbao Lin, Peng Gao, Guodong Guo, Jinhu Lu, Baochang Zhang

At the upper level, we introduce a new foreground-aware query matching scheme to effectively transfer the teacher information to distillation-desired features to minimize the conditional information entropy.

object-detection Object Detection +1

Paper
Code

Recurrent Bilinear Optimization for Binary Neural Networks

2 code implementations • 4 Sep 2022 • Sheng Xu, Yanjing Li, Tiancheng Wang, Teli Ma, Baochang Zhang, Peng Gao, Yu Qiao, Jinhu Lv, Guodong Guo

To address this issue, Recurrent Bilinear Optimization is proposed to improve the learning process of BNNs (RBONNs) by associating the intrinsic bilinear variables in the back propagation process.

object-detection Object Detection

Paper
Code

Self-Conditioned Probabilistic Learning of Video Rescaling

1 code implementation • ICCV 2021 • Yuan Tian, Guo Lu, Xiongkuo Min, Zhaohui Che, Guangtao Zhai, Guodong Guo, Zhiyong Gao

After optimization, the downscaled video by our framework preserves more meaningful information, which is beneficial for both the upscaling step and the downstream tasks, e.g., the video action recognition task.

Video Compression Video Super-Resolution

Paper
Code

iffDetector: Inference-aware Feature Filtering for Object Detection

1 code implementation • 23 Jun 2020 • Mingyuan Mao, Yuxin Tian, Baochang Zhang, Qixiang Ye, Wanquan Liu, Guodong Guo, David Doermann

In this paper, we propose a new feature optimization approach to enhance features and suppress background noise in both the training and inference stages.

Object object-detection +1

Paper
Code

Perceptual Attacks of No-Reference Image Quality Models with Human-in-the-Loop

1 code implementation • 3 Oct 2022 • Weixia Zhang, Dingquan Li, Xiongkuo Min, Guangtao Zhai, Guodong Guo, Xiaokang Yang, Kede Ma

No-reference image quality assessment (NR-IQA) aims to quantify how humans perceive visual distortions of digital images without access to their undistorted references.

No-Reference Image Quality Assessment NR-IQA

Paper
Code

Adaptive Sparse ViT: Towards Learnable Adaptive Token Pruning by Fully Exploiting Self-Attention

1 code implementation • 28 Sep 2022 • Xiangcheng Liu, Tianyi Wu, Guodong Guo

The learnable thresholds are optimized in budget-aware training to balance accuracy and complexity, performing the corresponding pruning configurations for different input instances.

Ranked #6 on Efficient ViTs on ImageNet-1K (With LV-ViT-S)

Efficient ViTs Informativeness

Paper
Code

Looking Here or There? Gaze Following in 360-Degree Images

1 code implementation • ICCV 2021 • Yunhao Li, Wei Shen, Zhongpai Gao, Yucheng Zhu, Guangtao Zhai, Guodong Guo

Specifically, the local region is obtained as a 2D cone-shaped field along the 2D projection of the sight line starting at the human subject's head position, and the distant region is obtained by searching along the sight line in 3D sphere space.

Paper
Code

Domain-Aware SE Network for Sketch-based Image Retrieval with Multiplicative Euclidean Margin Softmax

1 code implementation • 11 Dec 2018 • Peng Lu, Gao Huang, Hangyu Lin, Wenming Yang, Guodong Guo, Yanwei Fu

This paper proposes a novel approach for Sketch-Based Image Retrieval (SBIR), for which the key is to bridge the gap between sketches and photos in terms of the data representation.

Retrieval Sketch-Based Image Retrieval

Paper
Code

Defending Black-box Skeleton-based Human Activity Classifiers

2 code implementations • 9 Mar 2022 • He Wang, Yunfeng Diao, Zichang Tan, Guodong Guo

Our method is featured by full Bayesian treatments of the clean data, the adversaries and the classifier, leading to (1) a new Bayesian Energy-based formulation of robust discriminative classifiers, (2) a new adversary sampling scheme based on natural motion manifolds, and (3) a new post-train Bayesian strategy for black-box defense.

Human Activity Recognition Time Series Analysis

Paper
Code

Anti-Retroactive Interference for Lifelong Learning

1 code implementation • 27 Aug 2022 • Runqi Wang, Yuxiang Bao, Baochang Zhang, Jianzhuang Liu, Wentao Zhu, Guodong Guo

Second, according to the similarity between incremental knowledge and base knowledge, we design an adaptive fusion of incremental knowledge, which helps the model allocate capacity to the knowledge of different difficulties.

Meta-Learning

Paper
Code

Attributes in Multiple Facial Images

no code implementations • 23 May 2018 • Xudong Liu, Guodong Guo

To address this question, we deploy deep training for facial attributes prediction, and we explore the inconsistency issue among the attributes computed from each single image.

Attribute Face Recognition

Paper
Add Code

Learning Channel Inter-dependencies at Multiple Scales on Dense Networks for Face Recognition

no code implementations • 28 Nov 2017 • Qiangchang Wang, Guodong Guo, Mohammad Iqbal Nouyed

We propose a new deep network structure for unconstrained face recognition.

Face Recognition

Paper
Add Code

Unconstrained Face Detection and Open-Set Face Recognition Challenge

no code implementations • 8 Aug 2017 • Manuel Günther, Peiyun Hu, Christian Herrmann, Chi Ho Chan, Min Jiang, Shufan Yang, Akshay Raj Dhamija, Deva Ramanan, Jürgen Beyerer, Josef Kittler, Mohamad Al Jazaery, Mohammad Iqbal Nouyed, Guodong Guo, Cezary Stankiewicz, Terrance E. Boult

Face detection and recognition benchmarks have shifted toward more difficult environments.

Face Detection Face Identification +3

Paper
Add Code

A Study on Cross-Population Age Estimation

no code implementations • CVPR 2014 • Guodong Guo, Chao Zhang

Further, we study the amount of data needed in the target population to learn a cross-population age estimator.

Age Estimation Human Aging +1

Paper
Add Code

Adversarial Attacks against Deep Saliency Models

no code implementations • 2 Apr 2019 • Zhaohui Che, Ali Borji, Guangtao Zhai, Suiyi Ling, Guodong Guo, Patrick Le Callet

The proposed attack only requires a part of the model information, and is able to generate a sparser and more insidious adversarial perturbation, compared to traditional image-space attacks.

Adversarial Attack object-detection +1

Paper
Add Code

Supervised Online Hashing via Similarity Distribution Learning

no code implementations • 31 May 2019 • Mingbao Lin, Rongrong Ji, Shen Chen, Feng Zheng, Xiaoshuai Sun, Baochang Zhang, Liujuan Cao, Guodong Guo, Feiyue Huang

In this paper, we propose to model the similarity distributions between the input data and the hashing codes, upon which a novel supervised online hashing method, dubbed as Similarity Distribution based Online Hashing (SDOH), is proposed, to keep the intrinsic semantic relationship in the produced Hamming space.

Retrieval

Paper
Add Code

A database for face presentation attack using wax figure faces

no code implementations • 6 Jun 2019 • Shan Jia, Chuanbo Hu, Guodong Guo, Zhengquan Xu

Compared to 2D face presentation attacks (e.g., printed photos and video replays), 3D type attacks are more challenging to face recognition systems (FRS) by presenting 3D characteristics or materials similar to real faces.

Face Presentation Attack Detection Face Recognition +1

Paper
Add Code

UGAN: Untraceable GAN for Multi-Domain Face Translation

no code implementations • 26 Jul 2019 • Defa Zhu, Si Liu, Wentao Jiang, Chen Gao, Tianyi Wu, Qiangchang Wang, Guodong Guo

To address this issue, we propose a method called Untraceable GAN, which has a novel source classifier to differentiate which domain an image is translated from, and determines whether the translated image still retains the characteristics of the source domain.

Image-to-Image Translation Translation

Paper
Add Code

Consensus Feature Network for Scene Parsing

no code implementations • 29 Jul 2019 • Tianyi Wu, Sheng Tang, Rui Zhang, Guodong Guo, Yongdong Zhang

However, classification networks are dominated by the discriminative portion, so directly applying classification networks to scene parsing will result in inconsistent parsing predictions within one instance and among instances of the same category.

General Classification Scene Parsing

Paper
Add Code

ChaLearn Looking at People: IsoGD and ConGD Large-scale RGB-D Gesture Recognition

no code implementations • 29 Jul 2019 • Jun Wan, Chi Lin, Longyin Wen, Yunan Li, Qiguang Miao, Sergio Escalera, Gholamreza Anbarjafari, Isabelle Guyon, Guodong Guo, Stan Z. Li

The ChaLearn large-scale gesture recognition challenge has been run twice in two workshops in conjunction with the International Conference on Pattern Recognition (ICPR) 2016 and International Conference on Computer Vision (ICCV) 2017, attracting more than 200 teams around the world.

Gesture Recognition

Paper
Add Code

Bayesian Optimized 1-Bit CNNs

no code implementations • ICCV 2019 • Jiaxin Gu, Junhe Zhao, Xiao-Long Jiang, Baochang Zhang, Jianzhuang Liu, Guodong Guo, Rongrong Ji

Deep convolutional neural networks (DCNNs) have dominated the recent developments in computer vision through making various record-breaking models.

Paper
Add Code

RBCN: Rectified Binary Convolutional Networks for Enhancing the Performance of 1-bit DCNNs

no code implementations • 21 Aug 2019 • Chunlei Liu, Wenrui Ding, Xin Xia, Yuan Hu, Baochang Zhang, Jianzhuang Liu, Bohan Zhuang, Guodong Guo

Binarized convolutional neural networks (BCNNs) are widely used to improve memory and computation efficiency of deep convolutional neural networks (DCNNs) for mobile and AI chip-based applications.

Binarization Object Tracking

Paper
Add Code

Fusing multiple features for depth-based action recognition

no code implementations • ACM Transactions on Intelligent Systems and Technology (TIST) - Special Section on Visual Understanding with RGB-D Sensors 2015 • Yu Zhu, Wenbin Chen, Guodong Guo

The experiments are conducted on four challenging depth action databases, in order to evaluate and find the best fusion methods generally.

Ranked #5 on Multimodal Activity Recognition on MSR Daily Activity3D dataset

3D Action Recognition Multimodal Activity Recognition

Paper
Add Code

WiderPerson: A Diverse Dataset for Dense Pedestrian Detection in the Wild

no code implementations • 25 Sep 2019 • Shifeng Zhang, Yiliang Xie, Jun Wan, Hansheng Xia, Stan Z. Li, Guodong Guo

To narrow this gap and facilitate future pedestrian detection research, we introduce a large and diverse dataset named WiderPerson for dense pedestrian detection in the wild.

Ranked #3 on Object Detection on WiderPerson (mMR metric)

Object Detection Pedestrian Detection

Paper
Add Code

Aggregation Signature for Small Object Tracking

no code implementations • 24 Oct 2019 • Chunlei Liu, Wenrui Ding, Jinyu Yang, Vittorio Murino, Baochang Zhang, Jungong Han, Guodong Guo

In this paper, we propose a novel aggregation signature suitable for small object tracking, especially aiming for the challenge of sudden and large drift.

Object Object Tracking

Paper
Add Code

Face Detection on Surveillance Images

no code implementations • 22 Oct 2019 • Mohammad Iqbal Nouyed, Guodong Guo

In this paper, we perform a comparative performance analysis of some of the well-known face detection methods, including the few used in that competition, and compare them to our proposed body pose-based face detection method.

Benchmarking Face Detection +1

Paper
Add Code

GBCNs: Genetic Binary Convolutional Networks for Enhancing the Performance of 1-bit DCNNs

no code implementations • 25 Nov 2019 • Chunlei Liu, Wenrui Ding, Yuan Hu, Baochang Zhang, Jianzhuang Liu, Guodong Guo

The BGA method is proposed to modify the binary process of GBCNs to alleviate the local minima problem, which can significantly improve the performance of 1-bit DCNNs.

Face Recognition Object Recognition +1

Paper
Add Code

Robust Invisible Hyperlinks in Physical Photographs Based on 3D Rendering Attacks

no code implementations • 3 Dec 2019 • Jun Jia, Zhongpai Gao, Kang Chen, Menghan Hu, Guangtao Zhai, Guodong Guo, Xiaokang Yang

To train a robust decoder against the physical distortion from the real world, a distortion network based on 3D rendering is inserted between the encoder and the decoder to simulate the camera imaging process.

Paper
Add Code

Static and Dynamic Fusion for Multi-modal Cross-ethnicity Face Anti-spoofing

no code implementations • 5 Dec 2019 • Ajian Liu, Zichang Tan, Xuan Li, Jun Wan, Sergio Escalera, Guodong Guo, Stan Z. Li

Regardless of the usage of deep learning and handcrafted methods, the dynamic information from videos and the effect of cross-ethnicity are rarely considered in face anti-spoofing.

Face Anti-Spoofing

Paper
Add Code

CASIA-SURF CeFA: A Benchmark for Multi-modal Cross-ethnicity Face Anti-spoofing

no code implementations • 11 Mar 2020 • Ajian Liu, Zichang Tan, Xuan Li, Jun Wan, Sergio Escalera, Guodong Guo, Stan Z. Li

Ethnic bias has proven to negatively affect the performance of face recognition systems, and it remains an open research problem in face anti-spoofing.

Face Anti-Spoofing Face Recognition

Paper
Add Code

Cross-ethnicity Face Anti-spoofing Recognition Challenge: A Review

no code implementations • 23 Apr 2020 • Ajian Liu, Xuan Li, Jun Wan, Sergio Escalera, Hugo Jair Escalante, Meysam Madadi, Yi Jin, Zhuoyuan Wu, Xiaogang Yu, Zichang Tan, Qi Yuan, Ruikun Yang, Benjia Zhou, Guodong Guo, Stan Z. Li

Although ethnic bias has been verified to severely affect the performance of face recognition systems, it still remains an open research problem in face anti-spoofing.

Face Anti-Spoofing Face Recognition

Paper
Add Code

3D Face Anti-spoofing with Factorized Bilinear Coding

no code implementations • 12 May 2020 • Shan Jia, Xin Li, Chuanbo Hu, Guodong Guo, Zhengquan Xu

We have witnessed rapid advances in both face presentation attack models and presentation attack detection (PAD) in recent years.

Face Anti-Spoofing Face Presentation Attack Detection +1

Paper
Add Code

Cogradient Descent for Bilinear Optimization

no code implementations • CVPR 2020 • Li'an Zhuo, Baochang Zhang, Linlin Yang, Hanlin Chen, Qixiang Ye, David Doermann, Guodong Guo, Rongrong Ji

Conventional learning methods simplify the bilinear model by regarding two intrinsically coupled factors independently, which degrades the optimization procedure.

Image Reconstruction Network Pruning

Paper
Add Code

Self-supervised Video Object Segmentation

no code implementations • 22 Jun 2020 • Fangrui Zhu, Li Zhang, Yanwei Fu, Guodong Guo, Weidi Xie

The objective of this paper is self-supervised representation learning, with the goal of solving semi-supervised video object segmentation (a.k.a.

Object One-shot visual object segmentation +4

Paper
Add Code

Binarized Neural Architecture Search for Efficient Object Recognition

no code implementations • 8 Sep 2020 • Hanlin Chen, Li'an Zhuo, Baochang Zhang, Xiawu Zheng, Jianzhuang Liu, Rongrong Ji, David Doermann, Guodong Guo

In this paper, binarized neural architecture search (BNAS), with a search space of binarized convolutions, is introduced to produce extremely compressed models to reduce huge computational cost on embedded devices for edge computing.

Edge-computing Face Recognition +3

Paper
Add Code

Contrastive Context-Aware Learning for 3D High-Fidelity Mask Face Presentation Attack Detection

no code implementations • 13 Apr 2021 • Ajian Liu, Chenxu Zhao, Zitong Yu, Jun Wan, Anyang Su, Xing Liu, Zichang Tan, Sergio Escalera, Junliang Xing, Yanyan Liang, Guodong Guo, Zhen Lei, Stan Z. Li, Du Zhang

To bridge the gap to real-world applications, we introduce a large-scale High-Fidelity Mask dataset, namely CASIA-SURF HiFiMask (briefly HiFiMask).

Face Presentation Attack Detection Face Recognition

Paper
Add Code

Joint Face Image Restoration and Frontalization for Recognition

no code implementations • 12 May 2021 • Xiaoguang Tu, Jian Zhao, Qiankun Liu, Wenjie Ai, Guodong Guo, Zhifeng Li, Wei Liu, Jiashi Feng

First, MDFR is a well-designed encoder-decoder architecture which extracts feature representation from an input face image with arbitrary low-quality factors and restores it to a high-quality counterpart.

Face Recognition Image Restoration

Paper
Add Code

Image-to-Video Generation via 3D Facial Dynamics

no code implementations • 31 May 2021 • Xiaoguang Tu, Yingtian Zou, Jian Zhao, Wenjie Ai, Jian Dong, Yuan Yao, Zhikang Wang, Guodong Guo, Zhifeng Li, Wei Liu, Jiashi Feng

Video generation from a single face image is an interesting problem and usually tackled by utilizing Generative Adversarial Networks (GANs) to integrate information from the input face image and a sequence of sparse facial landmarks.

Image to Video Generation Video Prediction

Paper
Add Code

SAR-Net: Shape Alignment and Recovery Network for Category-level 6D Object Pose and Size Estimation

no code implementations • CVPR 2022 • Haitao Lin, Zichang Liu, Chilam Cheang, Yanwei Fu, Guodong Guo, Xiangyang Xue

The concatenation of the observed point cloud and symmetric one reconstructs a coarse object shape, thus facilitating object center (3D translation) and 3D size estimation.

Object Optical Character Recognition (OCR)

Paper
Add Code

3D High-Fidelity Mask Face Presentation Attack Detection Challenge

no code implementations • 16 Aug 2021 • Ajian Liu, Chenxu Zhao, Zitong Yu, Anyang Su, Xing Liu, Zijian Kong, Jun Wan, Sergio Escalera, Hugo Jair Escalante, Zhen Lei, Guodong Guo

The threat of 3D masks to face recognition systems is increasingly serious and has drawn wide concern from researchers.

Face Presentation Attack Detection Face Recognition +1

Paper
Add Code

The 2nd Anti-UAV Workshop & Challenge: Methods and Results

no code implementations • 23 Aug 2021 • Jian Zhao, Gang Wang, Jianan Li, Lei Jin, Nana Fan, Min Wang, Xiaojuan Wang, Ting Yong, Yafeng Deng, Yandong Guo, Shiming Ge, Guodong Guo

The 2nd Anti-UAV Workshop & Challenge aims to encourage research in developing novel and accurate methods for multi-scale object tracking.

Object Tracking

Paper
Add Code

TransFER: Learning Relation-aware Facial Expression Representations with Transformers

no code implementations • ICCV 2021 • Fanglei Xue, Qiangchang Wang, Guodong Guo

Second, to build rich relations between different local patches, the Vision Transformers (ViT) are used in FER, called ViT-FER.

Facial Expression Recognition Facial Expression Recognition (FER) +2

Paper
Add Code

Sparse to Dense Motion Transfer for Face Image Animation

no code implementations • 1 Sep 2021 • Ruiqi Zhao, Tianyi Wu, Guodong Guo

Given a source face image and a sequence of sparse face landmarks, our goal is to generate a video of the face imitating the motion of landmarks.

Image Animation Motion Estimation +1

Paper
Add Code

IDARTS: Interactive Differentiable Architecture Search

no code implementations • ICCV 2021 • Song Xue, Runqi Wang, Baochang Zhang, Tian Wang, Guodong Guo, David Doermann

Differentiable Architecture Search (DARTS) improves the efficiency of architecture search by learning the architecture and network parameters end-to-end.

Paper
Add Code

LAE : Long-tailed Age Estimation

no code implementations • 25 Oct 2021 • Zenghao Bao, Zichang Tan, Yu Zhu, Jun Wan, Xibo Ma, Zhen Lei, Guodong Guo

To improve the performance of facial age estimation, we first formulate a simple standard baseline and build a much stronger one by collecting the tricks in pre-training, data augmentation, model architecture, and so on.

Age Estimation Data Augmentation +1

Paper
Add Code

Learning to Recognize the Unseen Visual Predicates

no code implementations • 25 Sep 2019 • Defa Zhu, Si Liu, Wentao Jiang, Guanbin Li, Tianyi Wu, Guodong Guo

Visual relationship recognition models are limited in the ability to generalize from finite seen predicates to unseen ones.

Question Answering Visual Question Answering +1

Paper
Add Code

POEM: 1-bit Point-wise Operations based on Expectation-Maximization for Efficient Point Cloud Processing

no code implementations • 26 Nov 2021 • Sheng Xu, Yanjing Li, Junhe Zhao, Baochang Zhang, Guodong Guo

Real-time point cloud processing is fundamental for lots of computer vision tasks, while still challenged by the computational problem on resource-limited edge devices.

Paper
Add Code

Associative Adversarial Learning Based on Selective Attack

no code implementations • 28 Dec 2021 • Runqi Wang, Xiaoyue Duan, Baochang Zhang, Song Xue, Wentao Zhu, David Doermann, Guodong Guo

We show that our method improves the recognition accuracy of adversarial training on ImageNet by 8.32% compared with the baseline.

Adversarial Robustness Few-Shot Learning +2

Paper
Add Code

Dynamic Group Transformer: A General Vision Transformer Backbone with Dynamic Group Attention

no code implementations • 8 Mar 2022 • Kai Liu, Tianyi Wu, Cong Liu, Guodong Guo

To reduce the quadratic computation complexity caused by each query attending to all keys/values, various methods have constrained the range of attention within local regions, where each query only attends to keys/values within a hand-crafted window.

Image Classification Instance Segmentation +3

Paper
Add Code

Confidence Dimension for Deep Learning based on Hoeffding Inequality and Relative Evaluation

no code implementations • 17 Mar 2022 • Runqi Wang, Linlin Yang, Baochang Zhang, Wentao Zhu, David Doermann, Guodong Guo

Research on the generalization ability of deep neural networks (DNNs) has recently attracted a great deal of attention.

Image Classification object-detection +1

Paper
Add Code

Iwin: Human-Object Interaction Detection via Transformer with Irregular Windows

no code implementations • 20 Mar 2022 • Danyang Tu, Xiongkuo Min, Huiyu Duan, Guodong Guo, Guangtao Zhai, Wei Shen

Iwin Transformer is a hierarchical Transformer which progressively performs token representation learning and token agglomeration within irregular windows.

Human-Object Interaction Detection Object +4

Paper
Add Code

End-to-End Human-Gaze-Target Detection with Transformers

no code implementations • CVPR 2022 • Danyang Tu, Xiongkuo Min, Huiyu Duan, Guodong Guo, Guangtao Zhai, Wei Shen

In contrast, we redefine the HGT detection task as detecting human head locations and their gaze targets, simultaneously.

Gaze Prediction object-detection +2

Paper
Add Code

Feature Selective Transformer for Semantic Image Segmentation

no code implementations • 26 Mar 2022 • Fangjian Lin, Tianyi Wu, Sitong Wu, Shengwei Tian, Guodong Guo

In this work, we focus on fusing multi-scale features from Transformer-based backbones for semantic segmentation, and propose a Feature Selective Transformer (FeSeFormer), which aggregates features from all scales (or levels) for each query feature.

feature selection Image Segmentation +2

Paper
Add Code

Bi-level Doubly Variational Learning for Energy-based Latent Variable Models

no code implementations • CVPR 2022 • Ge Kan, Jinhu Lü, Tian Wang, Baochang Zhang, Aichun Zhu, Lei Huang, Guodong Guo, Hichem Snoussi

In this paper, we propose Bi-level doubly variational learning (BiDVL), which is based on a new bi-level optimization framework and two tractable variational distributions to facilitate learning EBLVMs.

Image Generation Image Reconstruction +1

Paper
Add Code

CATrans: Context and Affinity Transformer for Few-Shot Segmentation

no code implementations • 27 Apr 2022 • Shan Zhang, Tianyi Wu, Sitong Wu, Guodong Guo

In this work, we effectively integrate the context and affinity information via the proposed novel Context and Affinity Transformer (CATrans) in a hierarchical architecture.

Relation Transfer Learning

Paper
Add Code

Region-level Contrastive and Consistency Learning for Semi-Supervised Semantic Segmentation

no code implementations • 28 Apr 2022 • Jianrong Zhang, Tianyi Wu, Chuanghao Ding, Hongwei Zhao, Guodong Guo

Specifically, we first propose a Region Mask Contrastive (RMC) loss and a Region Feature Contrastive (RFC) loss to accomplish region-level contrastive property.

Segmentation Semi-Supervised Semantic Segmentation

Paper
Add Code

FM-ViT: Flexible Modal Vision Transformers for Face Anti-Spoofing

no code implementations • 5 May 2023 • Ajian Liu, Zichang Tan, Zitong Yu, Chenxu Zhao, Jun Wan, Yanyan Liang, Zhen Lei, Du Zhang, Stan Z. Li, Guodong Guo

The availability of handy multi-modal (i.e., RGB-D) sensors has brought about a surge of face anti-spoofing research.

Face Anti-Spoofing Face Presentation Attack Detection

Paper
Add Code

DCP-NAS: Discrepant Child-Parent Neural Architecture Search for 1-bit CNNs

no code implementations • 27 Jun 2023 • Yanjing Li, Sheng Xu, Xianbin Cao, Li'an Zhuo, Baochang Zhang, Tian Wang, Guodong Guo

One natural approach is to use 1-bit CNNs to reduce the computation and memory cost of NAS by taking advantage of the strengths of each in a unified framework, while searching the 1-bit CNNs is more challenging due to the more complicated processes involved.

Neural Architecture Search object-detection +2

Paper
Add Code

NCL++: Nested Collaborative Learning for Long-Tailed Visual Recognition

no code implementations • 29 Jun 2023 • Zichang Tan, Jun Li, Jinhao Du, Jun Wan, Zhen Lei, Guodong Guo

To achieve the collaborative learning in long-tailed learning, the balanced online distillation is proposed to force the consistent predictions among different experts and augmented copies, which reduces the learning uncertainties.

Paper
Add Code

Cr-net: A deep classification-regression network for multimodal apparent personality analysis

no code implementations • International Journal of Computer Vision 2020 • Yunan Li, Jun Wan, Qiguang Miao, Sergio Escalera, Huijuan Fang, Huizhou Chen, Xiangda Qi, Guodong Guo

First impressions strongly influence social interactions, having a high impact in the personal and professional life.

Ranked #2 on Personality Trait Recognition by Face on First Impressions v2

Personality Trait Recognition by Face regression

Paper
Add Code

On visual BMI analysis from facial images

no code implementations • Image and Vision Computing 2019 • Min Jiang, Yuanyuan Shang, Guodong Guo

Various facial representations, including geometry-based and deep learning-based representations, are comprehensively evaluated and analyzed from three perspectives: the overall performance on visual BMI prediction, the redundancy in facial representations, and the sensitivity to head pose changes.

MORPH

Paper
Add Code

Fusion-Mamba for Cross-modality Object Detection

no code implementations • 14 Apr 2024 • Wenhao Dong, Haodong Zhu, Shaohui Lin, Xiaoyan Luo, Yunhang Shen, Xuhui Liu, Juan Zhang, Guodong Guo, Baochang Zhang

In this paper, we investigate cross-modality fusion by associating cross-modal features in a hidden state space based on an improved Mamba with a gating mechanism.

Object object-detection +1

Paper
Add Code
