no code implementations • 19 Aug 2024 • Hui Xue, Yuexuan An, Yongchun Qin, Wenqian Li, Yixin Wu, Yongjuan Che, Pengfei Fang, MinLing Zhang
Human intelligence is characterized by our ability to absorb and apply knowledge from the world around us, especially in rapidly acquiring new concepts from minimal examples, underpinned by prior knowledge.
2 code implementations • 26 Jan 2024 • Shipeng Zhu, Pengfei Fang, Chenjie Zhu, Zuoyan Zhao, Qiang Xu, Hui Xue
Leveraging the global structure of the text as a prior, the proposed GSDM develops an efficient diffusion model to recover clean texts.
1 code implementation • 31 Dec 2023 • Weijian Mai, Jian Zhang, Pengfei Fang, Zhijun Zhang
This survey comprehensively examines the emerging field of AIGC-based Brain-conditional Multimodal Synthesis, termed AIGC-Brain, to delineate the current landscape and future directions.
2 code implementations • 29 Nov 2023 • Zuoyan Zhao, Hui Xue, Pengfei Fang, Shipeng Zhu
Scene text image super-resolution (STISR) aims at simultaneously increasing the resolution and readability of low-resolution scene text images, thus boosting the performance of the downstream recognition task.
no code implementations • ICCV 2023 • Jie Hong, Zeeshan Hayder, Junlin Han, Pengfei Fang, Mehrtash Harandi, Lars Petersson
Audio-visual zero-shot learning aims to classify samples consisting of a pair of corresponding audio and video sequences from classes that are not present during training.
Ranked #2 on GZSL Video Classification on ActivityNet-GZSL (cls)
no code implementations • 21 Apr 2023 • Pengfei Fang, Mehrtash Harandi, Trung Le, Dinh Phung
Hyperbolic geometry, a Riemannian manifold endowed with constant negative sectional curvature, has been considered an alternative embedding space in many learning scenarios, e.g., natural language processing, graph learning, etc., as a result of its intriguing property of encoding the data's hierarchical structure (such as irregular graphs or tree-like data).
2 code implementations • 21 Feb 2023 • Shipeng Zhu, Zuoyan Zhao, Pengfei Fang, Hui Xue
Scene text image super-resolution (STISR) aims to simultaneously increase the resolution and legibility of the text images, and the resulting images will significantly affect the performance of downstream tasks.
1 code implementation • 14 Nov 2022 • Junlin Han, Huangying Zhan, Jie Hong, Pengfei Fang, Hongdong Li, Lars Petersson, Ian Reid
This paper studies the problem of measuring and predicting how memorable an image is to pattern recognition machines, as a path to explore machine intelligence.
no code implementations • 2 Aug 2022 • Jie Hong, Pengfei Fang, Weihao Li, Junlin Han, Lars Petersson, Mehrtash Harandi
Learning a latent embedding to understand the underlying nature of data distribution is often formulated in Euclidean spaces with zero curvature.
1 code implementation • 14 Jun 2022 • Yuan Feng, Yaojun Hu, Pengfei Fang, Yanhong Yang, Sheng Liu, ShengYong Chen
However, jointly removing rain and haze from scene images is ill-posed and challenging, since the presence of haze and rain and changes in atmospheric light can all degrade the scene information.
no code implementations • 23 Mar 2022 • Jie Hong, Weihao Li, Junlin Han, Jiyang Zheng, Pengfei Fang, Mehrtash Harandi, Lars Petersson
In this paper, we present and study a new image segmentation task, called Generalized Open-set Semantic Segmentation (GOSS).
no code implementations • 7 Mar 2022 • Anqi Li, Jingsong Ma, Lizhi Ma, Pengfei Fang, Hongliang He, Zhenzhong Lan
However, these methods often demand large scale and high quality counseling data, which are difficult to collect.
1 code implementation • 28 Jan 2022 • Junlin Han, Pengfei Fang, Weihao Li, Jie Hong, Mohammad Ali Armin, Ian Reid, Lars Petersson, Hongdong Li
We present You Only Cut Once (YOCO) for performing data augmentations.
1 code implementation • 7 Dec 2021 • Rongkai Ma, Pengfei Fang, Gil Avraham, Yan Zuo, Tianyu Zhu, Tom Drummond, Mehrtash Harandi
A principled way of achieving few-shot learning is to realize a model that can rapidly adapt to the context of a given task.
no code implementations • 3 Dec 2021 • Rongkai Ma, Pengfei Fang, Tom Drummond, Mehrtash Harandi
To this end, we formulate the metric as a weighted sum on the tangent bundle of the hyperbolic space and develop a mechanism to obtain the weights adaptively, based on the constellation of the points.
1 code implementation • 11 Oct 2021 • Lin Cheng, Pengfei Fang, Yanjie Liang, Liao Zhang, Chunhua Shen, Hanzi Wang
Inspired by those observations, we propose a novel visual saliency method, termed Target-Selective Gradient Backprop (TSGB), which leverages rectification operations to effectively emphasize target classes and further efficiently propagate the saliency to the image space, thereby generating target-selective and fine-grained saliency maps.
no code implementations • 20 Sep 2021 • Jieming Zhou, Tong Zhang, Pengfei Fang, Lars Petersson, Mehrtash Harandi
The core concept of GNNs is to find a representation by recursively aggregating the representations of a central node and those of its neighbors.
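The recursive aggregation described above can be illustrated with a minimal sketch. It is not the method of this paper: it assumes plain mean aggregation with no learned weights, and the names `aggregate_once`, `aggregate`, `features`, and `neighbors` are hypothetical, chosen only for illustration.

```python
def aggregate_once(features, neighbors):
    """One round: each node averages its own vector with its neighbors' vectors."""
    updated = {}
    for node, own in features.items():
        # The central node's representation plus those of its direct neighbors.
        group = [own] + [features[n] for n in neighbors.get(node, [])]
        updated[node] = [
            sum(vec[i] for vec in group) / len(group) for i in range(len(own))
        ]
    return updated


def aggregate(features, neighbors, rounds):
    """Apply the aggregation repeatedly; k rounds reach k-hop neighborhoods."""
    for _ in range(rounds):
        features = aggregate_once(features, neighbors)
    return features


# Toy path graph a - b - c with 2-dimensional node features (assumed data).
features = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
neighbors = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}

one_hop = aggregate(features, neighbors, rounds=1)
two_hop = aggregate(features, neighbors, rounds=2)
```

After one round, node `a` only reflects `b`; after two rounds it also carries information from `c`, which is what "recursively aggregating" buys. Practical GNN layers typically add learned transformations and nonlinearities on top of this step.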
1 code implementation • 25 Aug 2021 • Junlin Han, Weihao Li, Pengfei Fang, Chunyi Sun, Jie Hong, Mohammad Ali Armin, Lars Petersson, Hongdong Li
We propose and study a novel task named Blind Image Decomposition (BID), which requires separating a superimposed image into constituent underlying images in a blind setting, that is, both the source components involved in mixing as well as the mixing mechanism are unknown.
1 code implementation • 2 Jun 2021 • Chiyu Song, Hongliang He, Haofei Yu, Pengfei Fang, Leyang Cui, Zhenzhong Lan
The current state-of-the-art ranking methods mainly use an encoding paradigm called Cross-Encoder, which separately encodes each context-candidate pair and ranks the candidates according to their fitness scores.
Ranked #1 on Conversational Response Selection on Persona-Chat
no code implementations • CVPR 2021 • Jie Hong, Pengfei Fang, Weihao Li, Tong Zhang, Christian Simon, Mehrtash Harandi, Lars Petersson
Few-shot learning aims to correctly recognize query samples from unseen classes given a limited number of support samples, often by relying on global embeddings of images.
no code implementations • CVPR 2021 • Ali Cheraghian, Shafin Rahman, Pengfei Fang, Soumava Kumar Roy, Lars Petersson, Mehrtash Harandi
Few-shot class incremental learning (FSCIL) portrays the problem of learning new concepts gradually, where only a few examples per concept are available to the learner.
no code implementations • ICCV 2021 • Ali Cheraghian, Shafin Rahman, Sameera Ramasinghe, Pengfei Fang, Christian Simon, Lars Petersson, Mehrtash Harandi
In this paper, we propose addressing this problem using a mixture of subspaces.
no code implementations • ICCV 2021 • Pengfei Fang, Mehrtash Harandi, Lars Petersson
However, working in hyperbolic spaces is not without difficulties as a result of their curved geometry (e.g., computing the Fréchet mean of a set of points requires an iterative algorithm).
no code implementations • 2 Nov 2020 • Pengfei Fang, Pan Ji, Lars Petersson, Mehrtash Harandi
Modern video person re-identification (re-ID) machines are often trained using a metric learning approach, supervised by a triplet loss.
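For readers unfamiliar with the triplet loss mentioned above, here is a minimal sketch of its standard formulation with Euclidean distance. The margin value and the toy embeddings are assumptions for illustration, and the function names are hypothetical; the paper's own training objective may differ.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=0.3):
    """Standard triplet loss: max(0, d(a, p) - d(a, n) + margin).

    The loss is zero once the negative is farther from the anchor than
    the positive by at least `margin` (an assumed default here).
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


# Violated triplet: the positive (same identity) is farther than the negative.
violated = triplet_loss([0.0, 0.0], [3.0, 4.0], [0.0, 1.0], margin=0.5)

# Satisfied triplet: the negative is already well separated, so the loss is zero.
satisfied = triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0], margin=0.5)
```

In a re-ID setting, the anchor and positive would be embeddings of the same person and the negative an embedding of a different person.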
no code implementations • 7 Oct 2020 • Pengfei Fang, Pan Ji, Jieming Zhou, Lars Petersson, Mehrtash Harandi
Full attention, which generates an attention value per element of the input feature maps, has been successfully demonstrated to be beneficial in visual tasks.
no code implementations • 17 Jun 2020 • Jieming Zhou, Soumava Kumar Roy, Pengfei Fang, Mehrtash Harandi, Lars Petersson
Deep neural networks need to make robust inferences in the presence of occlusion, background clutter, and pose and viewpoint variations, to name a few, when the task of person re-identification is considered.
no code implementations • ICCV 2019 • Pengfei Fang, Jieming Zhou, Soumava Kumar Roy, Lars Petersson, Mehrtash Harandi
This paper investigates a novel Bilinear attention (Bi-attention) block, which discovers and uses second order statistical information in an input feature map, for the purpose of person retrieval.