Search Results for author: Jianlong Wu

Found 42 papers, 27 papers with code

Local Correlation Consistency for Knowledge Distillation

no code implementations ECCV 2020 Xiaojie Li, Jianlong Wu, Hongyu Fang, Yue Liao, Fei Wang, Chen Qian

Sufficient knowledge extraction from the teacher network plays a critical role in the knowledge distillation task to improve the performance of the student network.

Knowledge Distillation

AdaReTaKe: Adaptive Redundancy Reduction to Perceive Longer for Video-language Understanding

1 code implementation 16 Mar 2025 Xiao Wang, Qingyi Si, Jianlong Wu, Shiyu Zhu, Li Cao, Liqiang Nie

Multimodal Large Language Models (MLLMs) have revolutionized video understanding, yet are still limited by context length when processing long videos.

Video Understanding

MegaSR: Mining Customized Semantics and Expressive Guidance for Image Super-Resolution

1 code implementation 11 Mar 2025 Xinrui Li, Jianlong Wu, Xinchuan Huang, Chong Chen, Weili Guan, Xian-Sheng Hua, Liqiang Nie

Pioneering text-to-image (T2I) diffusion models have ushered in a new era of real-world image super-resolution (Real-ISR), significantly enhancing the visual perception of reconstructed images.

Image Super-Resolution

Continuous Knowledge-Preserving Decomposition for Few-Shot Continual Learning

2 code implementations 9 Jan 2025 Xiaojie Li, Yibo Yang, Jianlong Wu, David A. Clifton, Yue Yu, Bernard Ghanem, Min Zhang

To this end, we propose Continuous Knowledge-Preserving Decomposition for FSCIL (CKPD-FSCIL), a framework that decomposes a model's weights into two parts: one that compacts existing knowledge (knowledge-sensitive components) and another that carries redundant capacity to accommodate new abilities (redundant-capacity components).

class-incremental learning Few-Shot Class-Incremental Learning +1

LipGen: Viseme-Guided Lip Video Generation for Enhancing Visual Speech Recognition

no code implementations 8 Jan 2025 Bowen Hao, Dongliang Zhou, Xiaojie Li, Xingyu Zhang, Liang Xie, Jianlong Wu, Erwei Yin

The advent of deep learning techniques and advancements in hardware capabilities have significantly enhanced the performance of lip reading models.

Lip Reading speech-recognition +2

ReTaKe: Reducing Temporal and Knowledge Redundancy for Long Video Understanding

1 code implementation 29 Dec 2024 Xiao Wang, Qingyi Si, Jianlong Wu, Shiyu Zhu, Li Cao, Liqiang Nie

Video Large Language Models (VideoLLMs) have made significant strides in video understanding but struggle with long videos due to the limitations of their backbone LLMs.

Video Compression Video Understanding

Efficient Dataset Distillation via Diffusion-Driven Patch Selection for Improved Generalization

no code implementations 13 Dec 2024 Xinhao Zhong, Shuoyang Sun, Xulin Gu, Zhaoyang Xu, YaoWei Wang, Jianlong Wu, Bin Chen

Dataset distillation offers an efficient way to reduce memory and computational costs by optimizing a smaller dataset with performance comparable to the full-scale original.

Dataset Distillation

RA-BLIP: Multimodal Adaptive Retrieval-Augmented Bootstrapping Language-Image Pre-training

no code implementations 18 Oct 2024 Muhe Ding, Yang Ma, Pengda Qin, Jianlong Wu, Yuhong Li, Liqiang Nie

Multimodal Large Language Models (MLLMs) have recently received substantial interest, which shows their emerging potential as general-purpose models for various vision-language tasks.

Denoising Question Answering +1

Preview-based Category Contrastive Learning for Knowledge Distillation

no code implementations 18 Oct 2024 Muhe Ding, Jianlong Wu, Xue Dong, Xiaojie Li, Pengda Qin, Tian Gan, Liqiang Nie

It first distills the structural knowledge of both instance-level feature correspondence and the relation between instance features and category centers in a contrastive learning fashion, which can explicitly optimize the category representation and explore the distinct correlation between representations of instances and categories, contributing to discriminative category centers and better classification results.

Contrastive Learning Knowledge Distillation +1

Video DataFlywheel: Resolving the Impossible Data Trinity in Video-Language Understanding

no code implementations 29 Sep 2024 Xiao Wang, Jianlong Wu, Zijia Lin, Fuzheng Zhang, Di Zhang, Liqiang Nie

For iterative refinement, we first leverage a video-language model to generate synthetic annotations, resulting in a refined dataset.

Diversity Question Answering +2

DKDM: Data-Free Knowledge Distillation for Diffusion Models with Any Architecture

1 code implementation 5 Sep 2024 Qianlong Xiang, Miao Zhang, Yuzhang Shang, Jianlong Wu, Yan Yan, Liqiang Nie

Furthermore, considering that the source data is either inaccessible or too enormous to store for current generative models, we introduce a new paradigm for their distillation without source data, termed Data-Free Knowledge Distillation for Diffusion Models (DKDM).

Data-free Knowledge Distillation Denoising

Mamba-FSCIL: Dynamic Adaptation with Selective State Space Model for Few-Shot Class-Incremental Learning

1 code implementation 8 Jul 2024 Xiaojie Li, Yibo Yang, Jianlong Wu, Bernard Ghanem, Liqiang Nie, Min Zhang

The dual design enables the model to maintain the robust features of base classes, while adaptively learning distinctive feature shifts for novel classes.

class-incremental learning Few-Shot Class-Incremental Learning +3

CorDA: Context-Oriented Decomposition Adaptation of Large Language Models for Task-Aware Parameter-Efficient Fine-tuning

1 code implementation 7 Jun 2024 Yibo Yang, Xiaojie Li, Zhongzhu Zhou, Shuaiwen Leon Song, Jianlong Wu, Liqiang Nie, Bernard Ghanem

For the latter, we use the instruction data from the fine-tuning task, such as math or coding, to orientate the decomposition and train the largest $r$ components that most correspond to the task to learn.

Instruction Following Math +3

GenView: Enhancing View Quality with Pretrained Generative Model for Self-Supervised Learning

1 code implementation 18 Mar 2024 Xiaojie Li, Yibo Yang, Xiangtai Li, Jianlong Wu, Yue Yu, Bernard Ghanem, Min Zhang

To tackle these challenges, we present GenView, a controllable framework that augments the diversity of positive views leveraging the power of pretrained generative models while preserving semantics.

Contrastive Learning Data Augmentation +2

WKVQuant: Quantizing Weight and Key/Value Cache for Large Language Models Gains More

no code implementations 19 Feb 2024 Yuxuan Yue, Zhihang Yuan, Haojie Duanmu, Sifan Zhou, Jianlong Wu, Liqiang Nie

Large Language Models (LLMs) face significant deployment challenges due to their substantial memory requirements and the computational demands of the auto-regressive text generation process.

Quantization Text Generation

SNP-S3: Shared Network Pre-training and Significant Semantic Strengthening for Various Video-Text Tasks

1 code implementation 31 Jan 2024 Xingning Dong, Qingpei Guo, Tian Gan, Qing Wang, Jianlong Wu, Xiangyuan Ren, Yuan Cheng, Wei Chu

By employing one shared BERT-type network to refine textual and cross-modal features simultaneously, SNP is lightweight and could support various downstream applications.

Sentence

Detecting and Grounding Multi-Modal Media Manipulation and Beyond

1 code implementation 25 Sep 2023 Rui Shao, Tianxing Wu, Jianlong Wu, Liqiang Nie, Ziwei Liu

HAMMER performs 1) manipulation-aware contrastive learning between two uni-modal encoders as shallow manipulation reasoning, and 2) modality-aware cross-attention by a multi-modal aggregator as deep manipulation reasoning.

Binary Classification Contrastive Learning +4

Temporal Sentence Grounding in Streaming Videos

1 code implementation 14 Aug 2023 Tian Gan, Xiao Wang, Yan Sun, Jianlong Wu, Qingpei Guo, Liqiang Nie

The goal of TSGSV is to evaluate the relevance between a video stream and a given sentence query.

Sentence Temporal Sentence Grounding

Neural Collapse Terminus: A Unified Solution for Class Incremental Learning and Its Variants

2 code implementations 3 Aug 2023 Yibo Yang, Haobo Yuan, Xiangtai Li, Jianlong Wu, Lefei Zhang, Zhouchen Lin, Philip Torr, DaCheng Tao, Bernard Ghanem

Beyond the normal case, long-tail class incremental learning and few-shot class incremental learning are also proposed to consider the data imbalance and data scarcity, respectively, which are common in real-world implementations and further exacerbate the well-known problem of catastrophic forgetting.

class-incremental learning Few-Shot Class-Incremental Learning +1

Micro-video Tagging via Jointly Modeling Social Influence and Tag Relation

1 code implementation 15 Mar 2023 Xiao Wang, Tian Gan, Yinwei Wei, Jianlong Wu, Dai Meng, Liqiang Nie

Existing methods mostly focus on analyzing video content, neglecting users' social influence and tag relation.

Link Prediction Relation +3

CHMATCH: Contrastive Hierarchical Matching and Robust Adaptive Threshold Boosted Semi-Supervised Learning

1 code implementation CVPR 2023 Jianlong Wu, Haozhe Yang, Tian Gan, Ning Ding, Feijun Jiang, Liqiang Nie

In the meantime, we make full use of the structured information in the hierarchical labels to learn an accurate affinity graph for contrastive learning.

Contrastive Learning

Visual Perturbation-aware Collaborative Learning for Overcoming the Language Prior Problem

no code implementations 24 Jul 2022 Yudong Han, Liqiang Nie, Jianhua Yin, Jianlong Wu, Yan Yan

Several studies have recently pointed out that existing Visual Question Answering (VQA) models suffer heavily from the language prior problem, which refers to capturing superficial statistical correlations between the question type and the answer while ignoring the image contents.

Diagnostic Question Answering +1

Semantic-aware Modular Capsule Routing for Visual Question Answering

no code implementations 21 Jul 2022 Yudong Han, Jianhua Yin, Jianlong Wu, Yinwei Wei, Liqiang Nie

Visual Question Answering (VQA) is fundamentally compositional in nature, and many questions are simply answered by decomposing them into modular sub-problems.

Question Answering Visual Question Answering

HEAD: HEtero-Assists Distillation for Heterogeneous Object Detectors

1 code implementation 12 Jul 2022 Luting Wang, Xiaojie Li, Yue Liao, Zeren Jiang, Jianlong Wu, Fei Wang, Chen Qian, Si Liu

We observe that the core difficulty for heterogeneous KD (hetero-KD) is the significant semantic gap between the backbone features of heterogeneous detectors due to the different optimization manners.

Knowledge Distillation Object +3

Stacked Hybrid-Attention and Group Collaborative Learning for Unbiased Scene Graph Generation

1 code implementation CVPR 2022 Xingning Dong, Tian Gan, Xuemeng Song, Jianlong Wu, Yuan Cheng, Liqiang Nie

Scene Graph Generation, which generally follows a regular encoder-decoder pipeline, aims to first encode the visual contents within the given image and then parse them into a compact summary graph.

Decoder Graph Generation +1

DualGNN: Dual Graph Neural Network for Multimedia Recommendation

1 code implementation IEEE Transactions on Multimedia (TMM) 2021 Qifan Wang, Yinwei Wei, Jianhua Yin, Jianlong Wu, Xuemeng Song, Liqiang Nie

Specifically, we first introduce a single-modal representation learning module, which performs graph operations on the user-microvideo graph in each modality to capture single-modal user preferences on different modalities.

Graph Neural Network Multimedia recommendation +2

Dynamic Modality Interaction Modeling for Image-Text Retrieval

1 code implementation ACM Special Interest Group on Information Retrieval 2021 Leigang Qu, Meng Liu, Jianlong Wu, Zan Gao, Liqiang Nie

To address these issues, we develop a novel modality interaction modeling network based upon the routing mechanism, which is the first unified and dynamic multimodal interaction framework towards image-text retrieval.

cross-modal alignment Cross-Modal Retrieval +4

Graph Contrastive Clustering

1 code implementation ICCV 2021 Huasong Zhong, Jianlong Wu, Chong Chen, Jianqiang Huang, Minghua Deng, Liqiang Nie, Zhouchen Lin, Xian-Sheng Hua

On the other hand, a novel graph-based contrastive learning strategy is proposed to learn more compact clustering assignments.

Clustering Contrastive Learning

Fast and Differentiable Matrix Inverse and Its Extension to SVD

no code implementations 1 Jan 2021 Xingyu Xie, Hao Kong, Jianlong Wu, Guangcan Liu, Zhouchen Lin

First of all, to perform matrix inverse, we provide a differentiable yet efficient way, named LD-Minv, which is a learnable deep neural network (DNN) with each layer being an $L$-th order matrix polynomial.

Agree to Disagree: Adaptive Ensemble Knowledge Distillation in Gradient Space

1 code implementation NeurIPS 2020 Shangchen Du, Shan You, Xiaojie Li, Jianlong Wu, Fei Wang, Chen Qian, ChangShui Zhang

In this paper, we examine the diversity of teacher models in the gradient space and regard the ensemble knowledge distillation as a multi-objective optimization problem so that we can determine a better optimization direction for the training of student network.

Diversity Knowledge Distillation

Multi-modal Cooking Workflow Construction for Food Recipes

no code implementations 20 Aug 2020 Liangming Pan, Jingjing Chen, Jianlong Wu, Shaoteng Liu, Chong-Wah Ngo, Min-Yen Kan, Yu-Gang Jiang, Tat-Seng Chua

Understanding food recipes requires anticipating the implicit causal effects of cooking actions, such that the recipe can be converted into a graph describing the temporal workflow of the recipe.

Common Sense Reasoning Decoder

Maximum-and-Concatenation Networks

1 code implementation ICML 2020 Xingyu Xie, Hao Kong, Jianlong Wu, Wayne Zhang, Guangcan Liu, Zhouchen Lin

While successful in many fields, deep neural networks (DNNs) still suffer from some open problems such as bad local minima and unsatisfactory generalization performance.

Dynamical System Inspired Adaptive Time Stepping Controller for Residual Network Families

no code implementations 23 Nov 2019 Yibo Yang, Jianlong Wu, Hongyang Li, Xia Li, Tiancheng Shen, Zhouchen Lin

We establish a stability condition for ResNets with step sizes and weight parameters, and point out the effects of step sizes on the stability and performance.

SOGNet: Scene Overlap Graph Network for Panoptic Segmentation

1 code implementation 18 Nov 2019 Yibo Yang, Hongyang Li, Xia Li, Qijie Zhao, Jianlong Wu, Zhouchen Lin

In order to overcome the lack of supervision, we introduce a differentiable module to resolve the overlap between any pair of instances.

Instance Segmentation Panoptic Segmentation +1

Differentiable Linearized ADMM

1 code implementation 15 May 2019 Xingyu Xie, Jianlong Wu, Zhisheng Zhong, Guangcan Liu, Zhouchen Lin

Recently, a number of learning-based optimization methods that combine data-driven architectures with the classical optimization algorithms have been proposed and explored, showing superior empirical performance in solving various ill-posed inverse problems, but there is still a scarcity of rigorous analysis about the convergence behaviors of learning-based optimization.

Matrix Recovery with Implicitly Low-Rank Data

1 code implementation 9 Nov 2018 Xingyu Xie, Jianlong Wu, Guangcan Liu, Jun Wang

To tackle this issue, we propose a novel method for matrix recovery in this paper, which could well handle the case where the target matrix is low-rank in an implicit feature space but high-rank or even full-rank in its original form.

Recurrent Squeeze-and-Excitation Context Aggregation Net for Single Image Deraining

no code implementations ECCV 2018 Xia Li, Jianlong Wu, Zhouchen Lin, Hong Liu, Hongbin Zha

In heavy rain, rain streaks have various directions and shapes, which can be regarded as the accumulation of multiple rain streak layers.

Single Image Deraining

Essential Tensor Learning for Multi-view Spectral Clustering

no code implementations 10 Jul 2018 Jianlong Wu, Zhouchen Lin, Hongbin Zha

In this paper, we focus on the Markov chain based spectral clustering method and propose a novel essential tensor learning method to explore the high order correlations for multi-view representation.

Clustering
