no code implementations • 16 Dec 2024 • Delong Zhang, Qiwei Huang, Yuanliu Liu, Yang Sun, Wei-Shi Zheng, Pengfei Xiong, Wei Zhang
Image-based virtual try-on is challenging since the generated image should fit the garment to model images in various poses and keep the characteristics and details of the garment simultaneously.
no code implementations • 18 Feb 2024 • Longhuang Wu, Shangxuan Tian, Youxin Wang, Pengfei Xiong
Existing methods for scene text detection can be divided into two paradigms: segmentation-based and anchor-based.
no code implementations • 1 Aug 2023 • Bolun Cai, Pengfei Xiong, Shangxuan Tian
In this paper, we propose a novel metric learning function called Center Contrastive Loss, which maintains a class-wise center bank and compares the category centers with the query data points using a contrastive loss.
Ranked #5 on Metric Learning on CUB-200-2011
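The center-bank idea in the abstract above can be illustrated with a minimal sketch. The snippet does not give the paper's exact formulation, so everything here is an assumption for illustration: the function names, the cosine-similarity logits, the `temperature` value, the softmax cross-entropy form of the contrastive loss, and the momentum update of the bank.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def center_contrastive_loss(query, label, centers, temperature=0.1):
    """Contrast one query embedding against every class center.

    Assumed form: softmax cross-entropy over cosine similarities, with the
    query's own class center as the positive and all other centers as negatives.
    """
    q = l2_normalize(query)
    logits = [
        sum(a * b for a, b in zip(q, l2_normalize(c))) / temperature
        for c in centers
    ]
    # Numerically stable log-sum-exp.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def update_center(centers, label, query, momentum=0.9):
    """Hypothetical bank update: move the class center toward the query."""
    centers[label] = [
        momentum * c + (1.0 - momentum) * x
        for c, x in zip(centers[label], query)
    ]


if __name__ == "__main__":
    bank = [[1.0, 0.0], [0.0, 1.0]]  # one center per class
    print(center_contrastive_loss([0.9, 0.1], 0, bank))  # small: query is near center 0
    print(center_contrastive_loss([0.9, 0.1], 1, bank))  # large: query is far from center 1
```

Under these assumptions the loss is small when a query lies close to its own class center and large otherwise, which is the behaviour the abstract describes at a high level.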
no code implementations • ICCV 2023 • Junpeng Jing, Jiankun Li, Pengfei Xiong, Jiangyu Liu, Shuaicheng Liu, Yichen Guo, Xin Deng, Mai Xu, Lai Jiang, Leonid Sigal
A novel Uncertainty Guided Adaptive Correlation (UGAC) module is introduced to robustly adapt the same model for different scenarios.
no code implementations • 15 Jul 2023 • Ze Lu, Yalei Lv, Wenqi Wang, Pengfei Xiong
Specifically, we introduce an extra Frequency Branch and Frequency Loss on the spatial-based network to impose direct supervision on the frequency information, and propose a Frequency-Spatial Cross-Attention Block (FSCAB) to fuse multi-domain features and combine the corresponding characteristics.
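To make the idea of direct supervision on frequency information concrete, here is a minimal sketch of a frequency-domain loss. It is not the paper's Frequency Loss, whose definition is not given in the snippet above: the 1-D signal, the naive discrete Fourier transform, and the mean absolute difference of Fourier coefficients are all assumptions chosen to keep the example self-contained.

```python
import cmath


def dft(signal):
    """Naive O(n^2) discrete Fourier transform of a real 1-D sequence."""
    n = len(signal)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(signal))
        for k in range(n)
    ]


def frequency_l1_loss(pred, target):
    """Mean absolute difference between the Fourier coefficients of two signals."""
    fp, ft = dft(pred), dft(target)
    return sum(abs(a - b) for a, b in zip(fp, ft)) / len(fp)


if __name__ == "__main__":
    target = [0.0, 1.0, 2.0, 3.0]
    shifted = [x + 1.0 for x in target]  # differs only in the zero-frequency term
    print(frequency_l1_loss(target, target))
    print(frequency_l1_loss(shifted, target))
```

In a real network such a term would typically be computed on 2-D feature maps with a fast Fourier transform and added to the spatial loss; that combination is likewise an assumption here.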
4 code implementations • CVPR 2023 • Peng Jin, Jinfa Huang, Pengfei Xiong, Shangxuan Tian, Chang Liu, Xiangyang Ji, Li Yuan, Jie Chen
Contrastive learning-based video-language representation learning approaches, e.g., CLIP, have achieved outstanding performance by pursuing semantic interaction over pre-defined video-text pairs.
Ranked #8 on Video Question Answering on MSRVTT-QA
2 code implementations • 11 Nov 2022 • Chengpeng Chen, Zichao Guo, Haien Zeng, Pengfei Xiong, Jian Dong
Experiments on ImageNet and COCO benchmarks demonstrate that the proposed RepGhostNet is much more effective and efficient than GhostNet and MobileNetV3 on mobile devices.
1 code implementation • 16 Jul 2022 • Yuqi Liu, Pengfei Xiong, Luhui Xu, Shengming Cao, Qin Jin
In this paper, we propose Token Shift and Selection Network (TS2-Net), a novel token shift and selection transformer architecture, which dynamically adjusts the token sequence and selects informative tokens in both temporal and spatial dimensions from input video samples.
Ranked #8 on Video Retrieval on MSR-VTT-1kA
1 code implementation • CVPR 2022 • Yizhi Wang, Guo Pu, Wenhan Luo, Yexin Wang, Pengfei Xiong, Hongwen Kang, Zhouhui Lian
To train and evaluate our approach, we construct a dataset named TextLogo3K, consisting of about 3,500 text logo images and their pixel-level annotations.
1 code implementation • CVPR 2022 • Zihua Zheng, Ni Nie, Zhi Ling, Pengfei Xiong, Jiangyu Liu, Hao Wang, Jiankun Li
Recently, the dense correlation volume method has achieved state-of-the-art performance in optical flow.
3 code implementations • CVPR 2022 • Jiankun Li, Peisen Wang, Pengfei Xiong, Tao Cai, Ziwei Yan, Lei Yang, Jiangyu Liu, Haoqiang Fan, Shuaicheng Liu
With the advent of convolutional neural networks, stereo matching algorithms have recently gained tremendous progress.
1 code implementation • 21 Jun 2021 • Han Fang, Pengfei Xiong, Luhui Xu, Yu Chen
We present the CLIP2Video network, which transfers the image-language pre-training model to video-text retrieval in an end-to-end manner.
Ranked #13 on Video Retrieval on VATEX (using extra training data)
1 code implementation • CVPR 2021 • Jing Tan, Shan Zhao, Pengfei Xiong, Jiangyu Liu, Haoqiang Fan, Shuaicheng Liu
Wide-angle portraits often enjoy expanded views.
no code implementations • 24 Sep 2020 • Jing Tan, Pengfei Xiong, Yuwen He, Kuntao Xiao, Zhengyi Lv
Based on this prior, we propose a novel Local Context Attention Network (LCANet) to generate locally reinforced feature maps in a uniform representational architecture.
2 code implementations • ECCV 2020 • Siyu Huang, Fangbo Qin, Pengfei Xiong, Ning Ding, Yijia He, Xiao Liu
To realize one-step detection with a faster and more compact model, we introduce the tri-points representation, converting the line segment detection to the end-to-end prediction of a root-point and two endpoints for each line segment.
Ranked #2 on Line Segment Detection on York Urban Dataset
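The tri-points representation described above can be sketched as a simple encode/decode pair. The snippet does not say how the root-point is chosen, so taking it as the segment midpoint and storing the endpoints as displacements from it are assumptions made for illustration; the function names are hypothetical as well.

```python
def encode_tri_points(p1, p2):
    """Encode a segment as (root, d1, d2).

    Assumption: the root-point is the midpoint, and d1/d2 are the
    displacements from the root to each endpoint.
    """
    root = ((p1[0] + p2[0]) / 2.0, (p1[1] + p2[1]) / 2.0)
    d1 = (p1[0] - root[0], p1[1] - root[1])
    d2 = (p2[0] - root[0], p2[1] - root[1])
    return root, d1, d2


def decode_tri_points(root, d1, d2):
    """Recover the two endpoints from a root-point and two displacements."""
    return (
        (root[0] + d1[0], root[1] + d1[1]),
        (root[0] + d2[0], root[1] + d2[1]),
    )


if __name__ == "__main__":
    root, d1, d2 = encode_tri_points((0.0, 0.0), (4.0, 2.0))
    print(root, d1, d2)
    print(decode_tri_points(root, d1, d2))
```

With this encoding a detector only has to locate one point per segment and regress two offsets from it, which is one way to read the "one-step" prediction the abstract mentions.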
no code implementations • 24 Aug 2020 • Xinyan Zhang, Yunfeng Wang, Pengfei Xiong
As a fine-grained segmentation task, human parsing is still faced with two challenges: inter-part indistinction and intra-part inconsistency, due to the ambiguous definitions and confusing relationships between similar human parts.
5 code implementations • 17 Apr 2019 • Xin Hong, Pengfei Xiong, Renhe Ji, Haoqiang Fan
The fusion block not only provides a smooth fusion between restored and existing content, but also provides an attention map to make the network focus more on the unknown pixels.
2 code implementations • CVPR 2019 • Hanchao Li, Pengfei Xiong, Haoqiang Fan, Jian Sun
This paper introduces an extremely efficient CNN architecture named DFANet for semantic segmentation under resource constraints.
Ranked #7 on SMAC+ on Def_Infantry_parallel
no code implementations • 25 May 2018 • Hanchao Li, Pengfei Xiong, Jie An, Lingxue Wang
A Pyramid Attention Network (PAN) is proposed to exploit the impact of global contextual information in semantic segmentation.