Search Results for author: Qilong Wang

Found 36 papers, 25 papers with code

Generative Inbetweening through Frame-wise Conditions-Driven Video Generation

1 code implementation 16 Dec 2024 Tianyi Zhu, Dongwei Ren, Qilong Wang, Xiaohe Wu, Wangmeng Zuo

Generative inbetweening aims to generate intermediate frame sequences by utilizing two key frames as input.

Video Generation

$S^3$: Synonymous Semantic Space for Improving Zero-Shot Generalization of Vision-Language Models

no code implementations 6 Dec 2024 Xiaojie Yin, Qilong Wang, Bing Cao, Qinghua Hu

Recently, many studies have been conducted to enhance the zero-shot generalization ability of vision-language models (e.g., CLIP) by addressing the semantic misalignment between image and text embeddings in downstream tasks.

Zero-shot Generalization Zero-Shot Learning

TAMT: Temporal-Aware Model Tuning for Cross-Domain Few-Shot Action Recognition

no code implementations 28 Nov 2024 Yilong Wang, Zilin Gao, Qilong Wang, Zhaofeng Chen, Peihua Li, Qinghua Hu

To effectively and efficiently explore the potential of pre-trained models in transferring to target domain, our TAMT proposes a Hierarchical Temporal Tuning Network (HTTN), whose core involves local temporal-aware adapters (TAA) and a global temporal-aware moment tuning (GTMT).

Cross-Domain Few-Shot Few-Shot action recognition +2

Efficient Masked AutoEncoder for Video Object Counting and A Large-Scale Benchmark

no code implementations 20 Nov 2024 Bing Cao, Quanhao Lu, Jiekang Feng, Pengfei Zhu, Qinghua Hu, Qilong Wang

The dynamic imbalance of the fore-background is a major challenge in video object counting, which is usually caused by the sparsity of foreground objects.

Object Counting Optical Flow Estimation +1

Conditional Controllable Image Fusion

1 code implementation 3 Nov 2024 Bing Cao, Xingxin Xu, Pengfei Zhu, Qilong Wang, Qinghua Hu

To address this issue, we propose a conditional controllable fusion (CCF) framework for general image fusion tasks without specific training.

Denoising

Not All Samples Should Be Utilized Equally: Towards Understanding and Improving Dataset Distillation

no code implementations 22 Aug 2024 Shaobo Wang, Yantai Yang, Qilong Wang, Kaixin Li, Linfeng Zhang, Junchi Yan

Our findings suggest that prioritizing the synthesis of easier samples from the original dataset can enhance the quality of distilled datasets, especially in low IPC (image-per-class) settings.

Dataset Distillation
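
As an illustration of the finding above, easier samples can be identified with a reference model's per-sample loss and kept as the candidate pool for distillation. A minimal PyTorch sketch, assuming loss-based difficulty scoring; both the scoring model and the keep ratio are illustrative choices, not the paper's procedure.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def select_easy_samples(model, images, labels, keep_ratio=0.5):
    """Rank samples by per-sample cross-entropy under a reference model and
    keep the easiest fraction as candidates for distillation (hypothetical helper)."""
    logits = model(images)
    losses = F.cross_entropy(logits, labels, reduction="none")
    k = int(keep_ratio * len(images))
    easy_idx = losses.argsort()[:k]          # lowest loss = easiest samples
    return images[easy_idx], labels[easy_idx]

# usage with a toy model and random data
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 10))
imgs, lbls = torch.randn(64, 3, 32, 32), torch.randint(0, 10, (64,))
easy_imgs, easy_lbls = select_easy_samples(model, imgs, lbls, keep_ratio=0.25)
print(easy_imgs.shape)   # torch.Size([16, 3, 32, 32])
```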

SelfDRSC++: Self-Supervised Learning for Dual Reversed Rolling Shutter Correction

1 code implementation 21 Aug 2024 Wei Shang, Dongwei Ren, Wanying Zhang, Qilong Wang, Pengfei Zhu, Wangmeng Zuo

Subsequently, to effectively train the DRSC network, we propose a self-supervised learning strategy that ensures cycle consistency between input and reconstructed dual reversed RS images.

Rolling Shutter Correction Self-Supervised Learning +1

Fine-Grained Domain Generalization with Feature Structuralization

no code implementations 13 Jun 2024 Wenlong Yu, Dongyue Chen, Qilong Wang, Qinghua Hu

Likewise, we propose a Feature Structuralized Domain Generalization (FSDG) model, wherein features experience structuralization into common, specific, and confounding segments, harmoniously aligned with their relevant semantic concepts, to elevate performance in FGDG.

Domain Generalization Specificity

AMU-Tuning: Effective Logit Bias for CLIP-based Few-shot Learning

1 code implementation CVPR 2024 Yuwei Tang, Zhenyi Lin, Qilong Wang, Pengfei Zhu, Qinghua Hu

To this end, we disassemble three key components involved in computation of logit bias (i.e., logit features, logit predictor, and logit fusion) and empirically analyze the effect on performance of few-shot classification.

Few-Shot Learning
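
The logit-bias view described above can be illustrated by adding a bias, predicted from auxiliary features, to the zero-shot CLIP logits and fusing the two with a scalar weight. A hedged PyTorch sketch; the auxiliary feature source, the linear predictor, and the fusion weight `alpha` are assumptions rather than the paper's exact AMU components.

```python
import torch
import torch.nn as nn

class LogitBiasFusion(nn.Module):
    """Fuse zero-shot logits with a learned logit bias from auxiliary features."""
    def __init__(self, aux_dim: int, num_classes: int, alpha: float = 0.5):
        super().__init__()
        self.logit_predictor = nn.Linear(aux_dim, num_classes)  # logit-bias branch
        self.alpha = alpha                                       # fusion weight

    def forward(self, zero_shot_logits, aux_features):
        bias = self.logit_predictor(aux_features)
        return zero_shot_logits + self.alpha * bias

# usage with hypothetical shapes
fusion = LogitBiasFusion(aux_dim=1024, num_classes=100)
out = fusion(torch.randn(8, 100), torch.randn(8, 1024))
print(out.shape)   # torch.Size([8, 100])
```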

A Decoupled Spatio-Temporal Framework for Skeleton-based Action Segmentation

1 code implementation 10 Dec 2023 Yunheng Li, Zhongyu Li, Shang-Hua Gao, Qilong Wang, Qibin Hou, Ming-Ming Cheng

Effectively modeling discriminative spatio-temporal information is essential for segmenting activities in long action sequences.

Action Segmentation

Tuning Pre-trained Model via Moment Probing

1 code implementation ICCV 2023 Mingze Gao, Qilong Wang, Zhenyi Lin, Pengfei Zhu, Qinghua Hu, Jingbo Zhou

Distinguished from LP, which builds a linear classification head based on the mean of final features (e.g., word tokens for ViT) or classification tokens, our MP performs a linear classifier on the feature distribution, which provides stronger representation ability by exploiting richer statistical information inherent in features.

Image Classification
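
A rough sketch of probing a feature distribution rather than only its mean: the classifier below sees the per-sample mean plus a diagonal second moment of the ViT word tokens. This simplified moment representation (diagonal variance, single linear head) is an assumption standing in for the paper's moment probing design.

```python
import torch
import torch.nn as nn

class MomentProbe(nn.Module):
    """Linear probe on first- and (diagonal) second-order moments of token features."""
    def __init__(self, feat_dim: int, num_classes: int):
        super().__init__()
        # classifier sees [mean, variance] of the word tokens
        self.head = nn.Linear(2 * feat_dim, num_classes)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (batch, num_tokens, feat_dim), e.g. ViT word tokens
        mean = tokens.mean(dim=1)
        var = tokens.var(dim=1, unbiased=False)   # diagonal second moment
        return self.head(torch.cat([mean, var], dim=-1))

# usage
tokens = torch.randn(4, 196, 768)    # hypothetical ViT-B token features
logits = MomentProbe(768, 1000)(tokens)
print(logits.shape)                   # torch.Size([4, 1000])
```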

DropCov: A Simple yet Effective Method for Improving Deep Architectures

1 code implementation NeurIPS 2022 Qilong Wang, Mingze Gao, Zhaolin Zhang, Jiangtao Xie, Peihua Li, Qinghua Hu

Particularly, we for the first time show that effective post-normalization can make a good trade-off between representation decorrelation and information preservation for GCP, which are crucial to alleviate over-fitting and increase representation ability of deep GCP networks, respectively.

Reliable and Interpretable Personalized Federated Learning

no code implementations CVPR 2023 Zixuan Qin, Liu Yang, Qilong Wang, Yahong Han, Qinghua Hu

When there are large differences in data distribution among clients, it is crucial for federated learning to design a reliable client selection strategy and an interpretable client communication framework to better utilize group knowledge.

Personalized Federated Learning

Temporal-attentive Covariance Pooling Networks for Video Recognition

1 code implementation NeurIPS 2021 Zilin Gao, Qilong Wang, Bingbing Zhang, Qinghua Hu, Peihua Li

Then, a temporal covariance pooling performs temporal pooling of the attentive covariance representations to characterize both intra-frame correlations and inter-frame cross-correlations of the calibrated features.

Video Recognition
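
The sentence above can be illustrated by computing a covariance per frame (intra-frame correlations) and pooling those covariances over time with attention weights. The sketch below is a simplification under that assumption; it omits the inter-frame cross-correlation term and is not the paper's TCP module.

```python
import torch

def temporal_covariance_pool(feats: torch.Tensor, attn: torch.Tensor) -> torch.Tensor:
    """feats: (b, t, c, n) per-frame features, attn: (b, t) frame attention scores.
    Returns a temporally pooled covariance representation of shape (b, c, c)."""
    b, t, c, n = feats.shape
    x = feats - feats.mean(dim=3, keepdim=True)
    per_frame_cov = x @ x.transpose(2, 3) / n          # (b, t, c, c) intra-frame correlations
    w = torch.softmax(attn, dim=1).view(b, t, 1, 1)    # attentive temporal weights
    return (w * per_frame_cov).sum(dim=1)              # temporal pooling

# usage
out = temporal_covariance_pool(torch.randn(2, 8, 64, 49), torch.randn(2, 8))
print(out.shape)   # torch.Size([2, 64, 64])
```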

Boosting Weakly Supervised Object Detection via Learning Bounding Box Adjusters

1 code implementation ICCV 2021 Bowen Dong, Zitong Huang, Yuelin Guo, Qilong Wang, Zhenxing Niu, Wangmeng Zuo

In this paper, we defend the problem setting for improving localization performance by leveraging the bounding box regression knowledge from a well-annotated auxiliary dataset.

Object object-detection +3

Detection, Tracking, and Counting Meets Drones in Crowds: A Benchmark

1 code implementation CVPR 2021 Longyin Wen, Dawei Du, Pengfei Zhu, Qinghua Hu, Qilong Wang, Liefeng Bo, Siwei Lyu

To promote the developments of object detection, tracking and counting algorithms in drone-captured videos, we construct a benchmark with a new drone-captured large-scale dataset, named as DroneCrowd, formed by 112 video clips with 33,600 HD frames in various scenarios.

object-detection Object Detection +1

So-ViT: Mind Visual Tokens for Vision Transformer

1 code implementation22 Apr 2021 Jiangtao Xie, Ruiren Zeng, Qilong Wang, Ziqi Zhou, Peihua Li

Therefore, we propose a new classification paradigm, where the second-order, cross-covariance pooling of visual tokens is combined with class token for final classification.

Classification General Classification
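
A minimal sketch of the paradigm described above: visual tokens are projected to a low dimension, pooled into a second-order (covariance) representation, and the resulting score is combined with a class-token classifier. Using a plain covariance in place of the paper's cross-covariance, and simply summing the two logits, are simplifying assumptions.

```python
import torch
import torch.nn as nn

class SecondOrderTokenHead(nn.Module):
    """Combine a class-token classifier with a second-order representation of visual tokens."""
    def __init__(self, feat_dim=768, proj_dim=48, num_classes=1000):
        super().__init__()
        self.proj = nn.Linear(feat_dim, proj_dim)          # reduce token dimension
        self.cls_head = nn.Linear(feat_dim, num_classes)   # class-token branch
        self.so_head = nn.Linear(proj_dim * proj_dim, num_classes)

    def forward(self, cls_token, visual_tokens):
        # visual_tokens: (b, n, feat_dim)
        z = self.proj(visual_tokens)                        # (b, n, proj_dim)
        z = z - z.mean(dim=1, keepdim=True)
        cov = z.transpose(1, 2) @ z / z.shape[1]            # (b, d, d) token covariance
        so_logits = self.so_head(cov.flatten(1))
        return self.cls_head(cls_token) + so_logits         # fuse both branches

# usage
head = SecondOrderTokenHead()
out = head(torch.randn(2, 768), torch.randn(2, 196, 768))
print(out.shape)   # torch.Size([2, 1000])
```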

What Deep CNNs Benefit from Global Covariance Pooling: An Optimization Perspective

1 code implementation CVPR 2020 Qilong Wang, Li Zhang, Banggu Wu, Dongwei Ren, Peihua Li, Wangmeng Zuo, Qinghua Hu

Recent works have demonstrated that global covariance pooling (GCP) has the ability to improve performance of deep convolutional neural networks (CNNs) on visual classification task.

Instance Segmentation object-detection +2

Drone-based Joint Density Map Estimation, Localization and Tracking with Space-Time Multi-Scale Attention Network

1 code implementation 4 Dec 2019 Longyin Wen, Dawei Du, Pengfei Zhu, Qinghua Hu, Qilong Wang, Liefeng Bo, Siwei Lyu

This paper proposes a space-time multi-scale attention network (STANet) to solve density map estimation, localization and tracking in dense crowds of video clips captured by drones with arbitrary crowd density, perspective, and flight altitude.

Crowd Counting

ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks

12 code implementations CVPR 2020 Qilong Wang, Banggu Wu, Pengfei Zhu, Peihua Li, Wangmeng Zuo, Qinghua Hu

By dissecting the channel attention module in SENet, we empirically show avoiding dimensionality reduction is important for learning channel attention, and appropriate cross-channel interaction can preserve performance while significantly decreasing model complexity.

Dimensionality Reduction Image Classification +4
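
The mechanism described above, channel attention without dimensionality reduction and with local cross-channel interaction, can be sketched as a 1D convolution over globally pooled channel descriptors. A minimal PyTorch sketch; the fixed kernel size `k_size=3` is a placeholder for the paper's adaptively chosen kernel size.

```python
import torch
import torch.nn as nn

class ECALayer(nn.Module):
    """Channel attention via a 1D convolution over pooled channel descriptors,
    avoiding the dimensionality reduction used in SE-style blocks."""
    def __init__(self, k_size: int = 3):
        super().__init__()
        self.avg_pool = nn.AdaptiveAvgPool2d(1)
        # 1D conv captures local cross-channel interaction without reduction
        self.conv = nn.Conv1d(1, 1, kernel_size=k_size,
                              padding=(k_size - 1) // 2, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        y = self.avg_pool(x)                               # (b, c, 1, 1)
        y = self.conv(y.squeeze(-1).transpose(1, 2))       # conv over the channel axis
        y = self.sigmoid(y.transpose(1, 2).unsqueeze(-1))  # back to (b, c, 1, 1)
        return x * y.expand_as(x)

# usage
feat = torch.randn(2, 64, 32, 32)
print(ECALayer(k_size=3)(feat).shape)   # torch.Size([2, 64, 32, 32])
```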

Neural Blind Deconvolution Using Deep Priors

1 code implementation CVPR 2020 Dongwei Ren, Kai Zhang, Qilong Wang, Qinghua Hu, Wangmeng Zuo

To connect MAP and deep models, we in this paper present two generative networks for respectively modeling the deep priors of clean image and blur kernel, and propose an unconstrained neural optimization solution to blind deconvolution.

Deblurring Self-Supervised Learning
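
A hedged sketch of the joint optimization described above: one small network generates the latent sharp image, another generates the blur kernel (kept non-negative and sum-to-one via softmax), and both are fitted so that their convolution reproduces the observed blurry image. The toy architectures, input noise, and optimizer settings are placeholders, not the paper's networks.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

blurry = torch.rand(1, 1, 64, 64)          # observed blurry image (placeholder data)
ksize = 15

# generator for the latent sharp image (deep-image-prior style, heavily simplified)
img_net = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
                        nn.Conv2d(16, 1, 3, padding=1), nn.Sigmoid())
# generator for the blur kernel; softmax keeps it non-negative and sum-to-one
kernel_net = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, ksize * ksize))

z_img = torch.rand_like(blurry)            # fixed random inputs to the two priors
z_k = torch.rand(1, 64)
opt = torch.optim.Adam(list(img_net.parameters()) + list(kernel_net.parameters()), lr=1e-3)

for step in range(200):                    # unconstrained joint optimization
    sharp = img_net(z_img)
    kernel = F.softmax(kernel_net(z_k), dim=1).view(1, 1, ksize, ksize)
    reblurred = F.conv2d(sharp, kernel, padding=ksize // 2)
    loss = F.mse_loss(reblurred, blurry)
    opt.zero_grad(); loss.backward(); opt.step()
```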

Data Augmentation for Object Detection via Progressive and Selective Instance-Switching

1 code implementation 2 Jun 2019 Hao Wang, Qilong Wang, Fan Yang, Weiqi Zhang, Wangmeng Zuo

For guiding our IS to obtain better object performance, we explore issues of instance imbalance and class importance in datasets, which frequently occur and bring adverse effect on detection performance.

Data Augmentation Instance Segmentation +2

Deep Global Generalized Gaussian Networks

1 code implementation CVPR 2019 Qilong Wang, Peihua Li, Qinghua Hu, Pengfei Zhu, Wangmeng Zuo

To handle this issue, this paper proposes a novel deep global generalized Gaussian network (3G-Net), whose core is to estimate a global covariance of generalized Gaussian for modeling the last convolutional activations.

Rolling Shutter Correction

Global Gated Mixture of Second-order Pooling for Improving Deep Convolutional Neural Networks

1 code implementation NeurIPS 2018 Qilong Wang, Zilin Gao, Jiangtao Xie, Wangmeng Zuo, Peihua Li

However, both GAP and existing HOP methods assume unimodal distributions, which cannot fully capture statistics of convolutional activations, limiting representation ability of deep CNNs, especially for samples with complex contents.
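
One way to move beyond the unimodal assumption mentioned above is to maintain several second-order pooling components and mix them with a per-sample gate. The sketch below follows that idea under simplifying assumptions (a handful of 1x1-projected covariance components and a softmax gate); it is not the paper's exact module.

```python
import torch
import torch.nn as nn

class GatedMixtureSOP(nn.Module):
    """Mixture of second-order pooling components selected by a per-sample gate."""
    def __init__(self, in_channels=256, proj_dim=32, num_components=4):
        super().__init__()
        # each component projects channels differently before covariance pooling
        self.projs = nn.ModuleList(
            [nn.Conv2d(in_channels, proj_dim, 1) for _ in range(num_components)])
        self.gate = nn.Linear(in_channels, num_components)

    def forward(self, x):                       # x: (b, c, h, w)
        g = torch.softmax(self.gate(x.mean(dim=(2, 3))), dim=-1)   # (b, k) gate weights
        comps = []
        for proj in self.projs:
            z = proj(x).flatten(2)              # (b, d, h*w)
            z = z - z.mean(dim=2, keepdim=True)
            cov = z @ z.transpose(1, 2) / z.shape[2]
            comps.append(cov.flatten(1))        # (b, d*d) per component
        comps = torch.stack(comps, dim=1)       # (b, k, d*d)
        return (g.unsqueeze(-1) * comps).sum(dim=1)   # gated mixture representation

# usage
print(GatedMixtureSOP()(torch.randn(2, 256, 14, 14)).shape)   # torch.Size([2, 1024])
```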

Global Second-order Pooling Convolutional Networks

1 code implementation CVPR 2019 Zilin Gao, Jiangtao Xie, Qilong Wang, Peihua Li

Deep Convolutional Networks (ConvNets) are fundamental to, besides large-scale visual recognition, a lot of vision tasks.

Object Recognition

Multi-scale Location-aware Kernel Representation for Object Detection

2 code implementations CVPR 2018 Hao Wang, Qilong Wang, Mingqi Gao, Peihua Li, Wangmeng Zuo

Our MLKP can be efficiently computed on a modified multi-scale feature map using a low-dimensional polynomial kernel approximation. Moreover, different from existing orderless global representations based on high-order statistics, our proposed MLKP is location retentive and sensitive so that it can be flexibly adopted to object detection.

General Classification Object +2
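
A rough illustration of a location-retentive polynomial kernel approximation: features are projected to a low dimension and element-wise products of the projected maps approximate second-order kernel terms while preserving the spatial layout. The dimensions and the restriction to second order are assumptions, not the paper's full MLKP.

```python
import torch
import torch.nn as nn

class LowDimPolyKernel2nd(nn.Module):
    """Location-retentive approximation of second-order polynomial kernel features."""
    def __init__(self, in_channels=512, low_dim=128):
        super().__init__()
        self.reduce_a = nn.Conv2d(in_channels, low_dim, 1)
        self.reduce_b = nn.Conv2d(in_channels, low_dim, 1)

    def forward(self, x):                       # x: (b, c, h, w) feature map
        first_order = self.reduce_a(x)
        # element-wise product of two projections approximates 2nd-order kernel terms
        second_order = self.reduce_a(x) * self.reduce_b(x)
        return torch.cat([first_order, second_order], dim=1)   # still spatially indexed

# usage
print(LowDimPolyKernel2nd()(torch.randn(2, 512, 38, 38)).shape)  # torch.Size([2, 256, 38, 38])
```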

G2DeNet: Global Gaussian Distribution Embedding Network and Its Application to Visual Recognition

no code implementations CVPR 2017 Qilong Wang, Peihua Li, Lei Zhang

Recently, plugging trainable structural layers into deep convolutional neural networks (CNNs) as image representations has made promising progress.

Mind the Class Weight Bias: Weighted Maximum Mean Discrepancy for Unsupervised Domain Adaptation

3 code implementations CVPR 2017 Hongliang Yan, Yukang Ding, Peihua Li, Qilong Wang, Yong Xu, Wangmeng Zuo

Specifically, we introduce class-specific auxiliary weights into the original MMD for exploiting the class prior probability on source and target domains, whose challenge lies in the fact that the class label in target domain is unavailable.

Unsupervised Domain Adaptation
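
The class-specific auxiliary weights can be illustrated by re-weighting source samples so their class distribution matches estimated target priors before computing a kernel MMD. A hedged sketch, assuming a Gaussian kernel and that target priors are available (e.g., from pseudo-labels); this is not the paper's exact weighted-MMD formulation.

```python
import torch

def gaussian_kernel(x: torch.Tensor, y: torch.Tensor, sigma: float = 1.0) -> torch.Tensor:
    # pairwise RBF kernel between rows of x and y
    d2 = torch.cdist(x, y) ** 2
    return torch.exp(-d2 / (2 * sigma ** 2))

def weighted_mmd(src_feat, src_labels, tgt_feat, tgt_priors, sigma=1.0):
    """MMD^2 with class-specific auxiliary weights on source samples.
    src_labels: integer class ids; tgt_priors: estimated target class priors."""
    # weight each source sample so the source class distribution matches target priors
    src_priors = torch.bincount(src_labels, minlength=len(tgt_priors)).float()
    src_priors = src_priors / src_priors.sum()
    w = tgt_priors[src_labels] / src_priors[src_labels].clamp(min=1e-8)
    w = w / w.sum()

    k_ss = gaussian_kernel(src_feat, src_feat, sigma)
    k_tt = gaussian_kernel(tgt_feat, tgt_feat, sigma)
    k_st = gaussian_kernel(src_feat, tgt_feat, sigma)
    return w @ k_ss @ w + k_tt.mean() - 2 * (w @ k_st).mean()

# usage with random data and uniform target priors as a stand-in for pseudo-label estimates
src, tgt = torch.randn(100, 64), torch.randn(80, 64)
labels = torch.randint(0, 10, (100,))
priors = torch.full((10,), 0.1)
print(weighted_mmd(src, labels, tgt, priors))
```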

Is Second-order Information Helpful for Large-scale Visual Recognition?

1 code implementation ICCV 2017 Peihua Li, Jiangtao Xie, Qilong Wang, Wangmeng Zuo

The main challenges involved are robust covariance estimation given a small sample of large-dimensional features and usage of the manifold structure of covariance matrices.

Object Recognition
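
A common way to address the two challenges mentioned above is diagonal loading for robust covariance estimation plus eigenvalue power normalization to respect the matrix manifold. The sketch below assumes a power of 0.5 and a small loading constant; both values are illustrative, not the paper's exact settings.

```python
import torch

def matrix_power_normalized_cov(feat_map: torch.Tensor, power: float = 0.5,
                                eps: float = 1e-5) -> torch.Tensor:
    """Global covariance pooling of a conv feature map followed by
    eigenvalue power normalization: (b, c, h, w) -> (b, c, c)."""
    b, c, h, w = feat_map.shape
    x = feat_map.flatten(2)                          # (b, c, h*w)
    x = x - x.mean(dim=2, keepdim=True)
    cov = x @ x.transpose(1, 2) / (h * w - 1)        # sample covariance
    cov = cov + eps * torch.eye(c).expand(b, c, c)   # diagonal loading for robustness
    evals, evecs = torch.linalg.eigh(cov)            # symmetric eigendecomposition
    evals = evals.clamp(min=0).pow(power)            # power-normalize the spectrum
    return evecs @ torch.diag_embed(evals) @ evecs.transpose(1, 2)

# usage
print(matrix_power_normalized_cov(torch.randn(2, 64, 7, 7)).shape)  # torch.Size([2, 64, 64])
```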

RAID-G: Robust Estimation of Approximate Infinite Dimensional Gaussian With Application to Material Recognition

no code implementations CVPR 2016 Qilong Wang, Peihua Li, Wangmeng Zuo, Lei Zhang

Infinite dimensional covariance descriptors can provide richer and more discriminative information than their low dimensional counterparts.

Material Recognition

Towards Effective Codebookless Model for Image Classification

no code implementations 9 Jul 2015 Qilong Wang, Peihua Li, Lei Zhang, Wangmeng Zuo

The bag-of-features (BoF) model for image classification has been thoroughly studied over the last decade.

Classification General Classification +2
