Search Results for author: Yonghong Tian

Found 168 papers, 90 papers with code

Language-Inspired Relation Transfer for Few-shot Class-Incremental Learning

no code implementations 10 Jan 2025 Yifan Zhao, Jia Li, Zeyin Song, Yonghong Tian

Depicting novel classes with language descriptions by observing few-shot samples is inherent in human-learning systems.

class-incremental learning Contrastive Learning +3

Solving the Catastrophic Forgetting Problem in Generalized Category Discovery

1 code implementation CVPR 2024 Xinzi Cao, Xiawu Zheng, Guanhong Wang, Weijiang Yu, Yunhang Shen, Ke Li, Yutong Lu, Yonghong Tian

The LER optimizes the distribution of potential known class samples in unlabeled data, thus ensuring the preservation of knowledge related to known categories while learning novel classes.

Activating Associative Disease-Aware Vision Token Memory for LLM-Based X-ray Report Generation

1 code implementation 7 Jan 2025 Xiao Wang, Fuling Wang, Haowen Wang, Bo Jiang, Chuanfu Li, YaoWei Wang, Yonghong Tian, Jin Tang

X-ray image based medical report generation has achieved significant progress in recent years with the help of large language models; however, these models have not fully exploited the effective information in visual image regions, resulting in reports that are linguistically sound but insufficient in describing key diseases.

Language Modeling Language Modelling +2

AE-NeRF: Augmenting Event-Based Neural Radiance Fields for Non-ideal Conditions and Larger Scene

no code implementations 6 Jan 2025 Chaoran Feng, Wangbo Yu, Xinhua Cheng, Zhenyu Tang, Junwu Zhang, Li Yuan, Yonghong Tian

Compared to frame-based methods, computational neuromorphic imaging using event cameras offers significant advantages, such as minimal motion blur, enhanced temporal resolution, and high dynamic range.

3D Reconstruction

VELoRA: A Low-Rank Adaptation Approach for Efficient RGB-Event based Recognition

1 code implementation 28 Dec 2024 Lan Chen, Haoxiang Yang, Pengpeng Shao, Haoyu Song, Xiao Wang, Zhicheng Zhao, YaoWei Wang, Yonghong Tian

Inspired by the successful application of large models, the introduction of such large models can also be considered to further enhance the performance of multi-modal tasks.

parameter-efficient fine-tuning

RoomPainter: View-Integrated Diffusion for Consistent Indoor Scene Texturing

no code implementations 21 Dec 2024 Zhipeng Huang, Wangbo Yu, Xinhua Cheng, ChengShu Zhao, Yunyang Ge, Mingyi Guo, Li Yuan, Yonghong Tian

The core of RoomPainter features a zero-shot technique that effectively adapts a 2D diffusion model for 3D-consistent texture synthesis, along with a two-stage generation strategy that ensures both global and local consistency.

Texture Synthesis

High-speed and High-quality Vision Reconstruction of Spike Camera with Spike Stability Theorem

no code implementations 16 Dec 2024 Wei Zhang, Weiquan Yan, Yun Zhao, Wenxiang Cheng, Gang Chen, Huihui Zhou, Yonghong Tian

To realize high-speed and high-quality vision reconstruction of the spike camera, we propose a new spike stability theorem that reveals the relationship between spike stream characteristics and stable light intensity.

Flexible and Scalable Deep Dendritic Spiking Neural Networks with Multiple Nonlinear Branching

no code implementations 9 Dec 2024 Yifan Huang, Wei Fang, Zhengyu Ma, Guoqi Li, Yonghong Tian

Our work firstly demonstrates the possibility of training bio-plausible dendritic SNNs with depths and scales comparable to traditional point SNNs, and reveals superior expressivity and robustness of reduced dendritic neuron models in deep learning, thereby offering a fresh perspective on advancing neural network design.

Few-Shot Learning

Object Detection using Event Camera: A MoE Heat Conduction based Detector and A New Benchmark Dataset

1 code implementation 9 Dec 2024 Xiao Wang, Yu Jin, Wentao Wu, Wei Zhang, Lin Zhu, Bo Jiang, Yonghong Tian

Object detection in event streams has emerged as a cutting-edge research area, demonstrating superior performance in low-light conditions, scenarios with motion blur, and rapid movements.

Computational Efficiency Object +2

Core Placement Optimization of Many-core Brain-Inspired Near-Storage Systems for Spiking Neural Network Training

no code implementations 29 Nov 2024 Xueke Zhu, Wenjie Lin, Yanyu Lin, Wenxiang Cheng, Zhengyu Ma, Yonghong Tian, Huihui Zhou

To improve the computing parallelism and system throughput of the many-core near-memory computing system, and to reduce power consumption, we propose an SNN training many-core deployment optimization method based on Off-policy Deterministic Actor-Critic.

Deep Reinforcement Learning

Open-Sora Plan: Open-Source Large Video Generation Model

6 code implementations 28 Nov 2024 Bin Lin, Yunyang Ge, Xinhua Cheng, Zongjian Li, Bin Zhu, Shaodong Wang, Xianyi He, Yang Ye, Shenghai Yuan, Liuhan Chen, Tanghui Jia, Junwu Zhang, Zhenyu Tang, Yatian Pang, Bin She, Cen Yan, Zhiheng Hu, Xiaoyi Dong, Lin Chen, Zhang Pan, Xing Zhou, Shaoling Dong, Yonghong Tian, Li Yuan

We introduce Open-Sora Plan, an open-source project that aims to contribute a large generation model for generating desired high-resolution videos with long durations based on various user inputs.

Video Generation

HDI-Former: Hybrid Dynamic Interaction ANN-SNN Transformer for Object Detection Using Frames and Events

no code implementations 27 Nov 2024 Dianze Li, Jianing Li, Xu Liu, Zhaokun Zhou, Xiaopeng Fan, Yonghong Tian

To address these challenges, we propose HDI-Former, a Hybrid Dynamic Interaction ANN-SNN Transformer, marking the first trial to design a directly trained hybrid ANN-SNN architecture for high-accuracy and energy-efficient object detection using frames and events.

object-detection Object Detection

Is this Generated Person Existed in Real-world? Fine-grained Detecting and Calibrating Abnormal Human-body

no code implementations 21 Nov 2024 Zeqing Wang, Qingyang Ma, Wentao Wan, Haojie Li, Keze Wang, Yonghong Tian

Intuitively, Visual Language Models (VLMs) that have obtained remarkable performance on various visual tasks are quite suitable for this task.

Anomaly Detection

ETTFS: An Efficient Training Framework for Time-to-First-Spike Neuron

no code implementations 31 Oct 2024 Kaiwei Che, Wei Fang, Zhengyu Ma, Li Yuan, Timothée Masquelier, Yonghong Tian

Spiking Neural Networks (SNNs) have attracted considerable attention due to their biologically inspired, event-driven nature, making them highly suitable for neuromorphic hardware.

Spatial-Temporal Search for Spiking Neural Networks

no code implementations 24 Oct 2024 Kaiwei Che, Zhaokun Zhou, Li Yuan, JianGuo Zhang, Yonghong Tian, Luziwei Leng

Drawing inspiration from the heterogeneity of biological neural networks, we propose a differentiable approach to optimize SNN on both spatial and temporal dimensions.

Image Classification Neural Architecture Search

Adverse Weather Optical Flow: Cumulative Homogeneous-Heterogeneous Adaptation

no code implementations 25 Sep 2024 Hanyu Zhou, Yi Chang, Zhiwei Shi, Wending Yan, Gang Chen, Yonghong Tian, Luxin Yan

Under this unified framework, the proposed method can progressively and explicitly transfer knowledge from clean scenes to real adverse weather.

Domain Adaptation Knowledge Distillation +1

ViewCrafter: Taming Video Diffusion Models for High-fidelity Novel View Synthesis

1 code implementation 3 Sep 2024 Wangbo Yu, Jinbo Xing, Li Yuan, WenBo Hu, Xiaoyu Li, Zhipeng Huang, Xiangjun Gao, Tien-Tsin Wong, Ying Shan, Yonghong Tian

Our method takes advantage of the powerful generation capabilities of video diffusion model and the coarse 3D clues offered by point-based representation to generate high-quality video frames with precise camera pose control.

3D Generation 3D Reconstruction +3

Event Stream based Human Action Recognition: A High-Definition Benchmark Dataset and Algorithms

1 code implementation 19 Aug 2024 Xiao Wang, Shiao Wang, Pengpeng Shao, Bo Jiang, Lin Zhu, Yonghong Tian

In this paper, we propose a large-scale, high-definition ($1280 \times 800$) human action recognition dataset based on the CeleX-V event camera, termed CeleX-HAR.

Action Recognition Mamba +1

Time-Dependent VAE for Building Latent Representations from Visual Neural Activity with Complex Dynamics

no code implementations 15 Aug 2024 Liwei Huang, Zhengyu Ma, Liutao Yu, Huihui Zhou, Yonghong Tian

Seeking high-quality representations with latent variable models (LVMs) to reveal the intrinsic correlation between neural activity and behavior or sensory stimuli has attracted much interest.

Contrastive Learning

HoloDreamer: Holistic 3D Panoramic World Generation from Text Descriptions

no code implementations 21 Jul 2024 Haiyang Zhou, Xinhua Cheng, Wangbo Yu, Yonghong Tian, Li Yuan

3D scene generation is in high demand across various domains, including virtual reality, gaming, and the film industry.

Scene Generation

Efficient Event Stream Super-Resolution with Recursive Multi-Branch Fusion

1 code implementation 28 Jun 2024 Quanmin Liang, Zhilin Huang, Xiawu Zheng, Feidiao Yang, Jun Peng, Kai Huang, Yonghong Tian

FFM is designed for the fusion of contextual information within neighboring event streams, leveraging the coupling relationship between positive and negative events to alleviate the misleading of noises in the respective branches.

Object Recognition Super-Resolution +1

Retain, Blend, and Exchange: A Quality-aware Spatial-Stereo Fusion Approach for Event Stream Recognition

1 code implementation 27 Jun 2024 Lan Chen, Dong Li, Xiao Wang, Pengpeng Shao, Wei Zhang, YaoWei Wang, Yonghong Tian, Jin Tang

In this paper, we propose a novel dual-stream framework for event stream-based pattern recognition via differentiated fusion, termed EFV++.

Graph Neural Network

SVFormer: A Direct Training Spiking Transformer for Efficient Video Action Recognition

no code implementations 21 Jun 2024 Liutao Yu, Liwei Huang, Chenlin Zhou, Han Zhang, Zhengyu Ma, Huihui Zhou, Yonghong Tian

To address this challenge, some researchers have turned to brain-inspired spiking neural networks (SNNs), such as recurrent SNNs and ANN-converted SNNs, leveraging their inherent temporal dynamics and energy efficiency.

Action Recognition Temporal Action Localization

EvaGaussians: Event Stream Assisted Gaussian Splatting from Blurry Images

no code implementations 29 May 2024 Wangbo Yu, Chaoran Feng, Jiye Tang, Jiashu Yang, Zhenyu Tang, Xu Jia, Yuchao Yang, Li Yuan, Yonghong Tian

Capitalizing on the high temporal resolution and dynamic range offered by the event camera, we leverage the event streams to explicitly model the formation process of motion-blurred images and guide the deblurring reconstruction of 3D-GS.

3D Scene Reconstruction Deblurring +1

High-Performance Temporal Reversible Spiking Neural Networks with $O(L)$ Training Memory and $O(1)$ Inference Cost

1 code implementation 26 May 2024 Jiakui Hu, Man Yao, Xuerui Qiu, Yuhong Chou, Yuxuan Cai, Ning Qiao, Yonghong Tian, Bo Xu, Guoqi Li

This work is expected to break the technical bottleneck of significantly increasing memory cost and training time for large-scale SNNs while maintaining high performance and low inference energy cost.

Sensitivity Decouple Learning for Image Compression Artifacts Reduction

no code implementations 15 May 2024 Li Ma, Yifan Zhao, Peixi Peng, Yonghong Tian

Different from these methods, we propose to decouple the intrinsic attributes into two complementary features for artifacts reduction, i.e., the compression-insensitive features to regularize the high-level semantic representations during training and the compression-sensitive features to be aware of the compression degree.

Image Compression

Direct Training High-Performance Deep Spiking Neural Networks: A Review of Theories and Methods

1 code implementation 6 May 2024 Chenlin Zhou, Han Zhang, Liutao Yu, Yumin Ye, Zhaokun Zhou, Liwei Huang, Zhengyu Ma, Xiaopeng Fan, Huihui Zhou, Yonghong Tian

In this paper, we provide a new perspective to summarize the theories and methods for training deep SNNs with high performance in a systematic and comprehensive way, including theory fundamentals, spiking neuron models, advanced SNN models and residual architectures, software frameworks and neuromorphic hardware, applications, and future trends.

Spatio-Temporal Side Tuning Pre-trained Foundation Models for Video-based Pedestrian Attribute Recognition

3 code implementations 27 Apr 2024 Xiao Wang, Qian Zhu, Jiandong Jin, Jun Zhu, Futian Wang, Bo Jiang, YaoWei Wang, Yonghong Tian

Specifically, we formulate the video-based PAR as a vision-language fusion problem and adopt a pre-trained foundation model CLIP to extract the visual features.

Attribute Pedestrian Attribute Recognition +1

State Space Model for New-Generation Network Alternative to Transformers: A Survey

1 code implementation 15 Apr 2024 Xiao Wang, Shiao Wang, Yuhe Ding, Yuehang Li, Wentao Wu, Yao Rong, Weizhe Kong, Ju Huang, Shihao Li, Haoxiang Yang, Ziwen Wang, Bo Jiang, Chenglong Li, YaoWei Wang, Yonghong Tian, Jin Tang

In this paper, we give the first comprehensive review of these works and also provide experimental comparisons and analysis to better demonstrate the features and advantages of SSM.

QKFormer: Hierarchical Spiking Transformer using Q-K Attention

2 code implementations 25 Mar 2024 Chenlin Zhou, Han Zhang, Zhaokun Zhou, Liutao Yu, Liwei Huang, Xiaopeng Fan, Li Yuan, Zhengyu Ma, Huihui Zhou, Yonghong Tian

ii) We incorporate the hierarchical structure, which significantly benefits the performance of both the brain and artificial neural networks, into spiking transformers to obtain multi-scale spiking representation.

Long-term Frame-Event Visual Tracking: Benchmark Dataset and Baseline

4 code implementations 9 Mar 2024 Xiao Wang, Ju Huang, Shiao Wang, Chuanming Tang, Bo Jiang, Yonghong Tian, Jin Tang, Bin Luo

Current event-/frame-event based trackers are evaluated on short-term tracking datasets; however, tracking in real-world scenarios involves long-term tracking, and the performance of existing tracking algorithms in these scenarios remains unclear.

Object Tracking Rgb-T Tracking

Noisy Spiking Actor Network for Exploration

no code implementations 7 Mar 2024 Ding Chen, Peixi Peng, Tiejun Huang, Yonghong Tian

As a general method for exploration in deep reinforcement learning (RL), NoisyNet can produce problem-specific exploration strategies.

continuous-control Continuous Control +4

Optimal ANN-SNN Conversion with Group Neurons

1 code implementation 29 Feb 2024 Liuzhenghao Lv, Wei Fang, Li Yuan, Yonghong Tian

For instance, while converting artificial neural networks (ANNs) to SNNs circumvents the need for direct training of SNNs, it encounters issues related to conversion errors and high inference time delays.

Fully Spiking Actor Network with Intra-layer Connections for Reinforcement Learning

no code implementations 9 Jan 2024 Ding Chen, Peixi Peng, Tiejun Huang, Yonghong Tian

Recently, the surrogate gradient method has been utilized for training multi-layer SNNs, which allows SNNs to achieve comparable performance with the corresponding deep networks in this task.

Deep Reinforcement Learning reinforcement-learning

CRSOT: Cross-Resolution Object Tracking using Unaligned Frame and Event Cameras

1 code implementation 5 Jan 2024 Yabin Zhu, Xiao Wang, Chenglong Li, Bo Jiang, Lin Zhu, Zhixiang Huang, Yonghong Tian, Jin Tang

In this work, we formally propose the task of object tracking using unaligned neuromorphic and visible cameras.

Object Tracking

DMR: Decomposed Multi-Modality Representations for Frames and Events Fusion in Visual Reinforcement Learning

1 code implementation CVPR 2024 Haoran Xu, Peixi Peng, Guang Tan, Yuan Li, Xinhai Xu, Yonghong Tian

We explore visual reinforcement learning (RL) using two complementary visual modalities: frame-based RGB camera and event-based Dynamic Vision Sensor (DVS).

Reinforcement Learning (RL)

Event-based Visible and Infrared Fusion via Multi-task Collaboration

no code implementations CVPR 2024 Mengyue Geng, Lin Zhu, Lizhi Wang, Wei Zhang, Ruiqin Xiong, Yonghong Tian

Visible and Infrared image Fusion (VIF) offers a comprehensive scene description by combining thermal infrared images with the rich textures from visible cameras.

Deblurring Image Deblurring

Machine Mindset: An MBTI Exploration of Large Language Models

1 code implementation 20 Dec 2023 Jiaxi Cui, Liuzhenghao Lv, Jing Wen, Rongsheng Wang, Jing Tang, Yonghong Tian, Li Yuan

We present a novel approach for integrating Myers-Briggs Type Indicator (MBTI) personality traits into large language models (LLMs), addressing the challenges of personality consistency in personalized AI.

Large Language Model Personality Alignment +2

Unleashing the Power of CNN and Transformer for Balanced RGB-Event Video Recognition

1 code implementation 18 Dec 2023 Xiao Wang, Yao Rong, Shiao Wang, Yuan Chen, Zhe Wu, Bo Jiang, Yonghong Tian, Jin Tang

It is intuitive to combine them for high-performance RGB-Event based video recognition; however, existing works fail to achieve a good balance between accuracy and model parameters.

Video Recognition

SpikingJelly: An open-source machine learning infrastructure platform for spike-based intelligence

1 code implementation 25 Oct 2023 Wei Fang, Yanqi Chen, Jianhao Ding, Zhaofei Yu, Timothée Masquelier, Ding Chen, Liwei Huang, Huihui Zhou, Guoqi Li, Yonghong Tian

Spiking neural networks (SNNs) aim to realize brain-inspired intelligence on neuromorphic chips with high energy efficiency by introducing neural dynamics and spike properties.

Code Generation

HiFi-123: Towards High-fidelity One Image to 3D Content Generation

no code implementations 10 Oct 2023 Wangbo Yu, Li Yuan, Yan-Pei Cao, Xiangjun Gao, Xiaoyu Li, WenBo Hu, Long Quan, Ying Shan, Yonghong Tian

Our contributions are twofold: First, we propose a Reference-Guided Novel View Enhancement (RGNV) technique that significantly improves the fidelity of diffusion-based zero-shot novel view synthesis methods.

3D Generation Image to 3D +1

Knowledge Prompt-tuning for Sequential Recommendation

1 code implementation 14 Aug 2023 Jianyang Zhai, Xiawu Zheng, Chang-Dong Wang, Hui Li, Yonghong Tian

Pre-trained language models (PLMs) have demonstrated strong performance in sequential recommendation (SR), which are utilized to extract general knowledge.

General Knowledge Sequential Recommendation

SODFormer: Streaming Object Detection with Transformer Using Events and Frames

1 code implementation 8 Aug 2023 Dianze Li, Jianing Li, Yonghong Tian

Then, we design a spatiotemporal Transformer architecture to detect objects via an end-to-end sequence prediction problem, where the novel temporal Transformer module leverages rich temporal cues from two visual streams to improve the detection performance.

object-detection Object Detection

SSTFormer: Bridging Spiking Neural Network and Memory Support Transformer for Frame-Event based Recognition

1 code implementation 8 Aug 2023 Xiao Wang, Zongzhen Wu, Yao Rong, Lin Zhu, Bo Jiang, Jin Tang, Yonghong Tian

Secondly, they adopt either Spiking Neural Networks (SNN) for energy-efficient recognition with suboptimal results, or Artificial Neural Networks (ANN) for energy-intensive, high-performance recognition.

Learning Sparse Neural Networks with Identity Layers

no code implementations 14 Jul 2023 Mingjian Ni, Guangyao Chen, Xiawu Zheng, Peixi Peng, Li Yuan, Yonghong Tian

Applying such theory, we propose a plug-and-play CKA-based Sparsity Regularization for sparse network training, dubbed CKA-SR, which utilizes CKA to reduce feature similarity between layers and increase network sparsity.

Spike-driven Transformer

1 code implementation NeurIPS 2023 Man Yao, Jiakui Hu, Zhaokun Zhou, Li Yuan, Yonghong Tian, Bo Xu, Guoqi Li

In this paper, we incorporate the spike-driven paradigm into Transformer by the proposed Spike-driven Transformer with four unique properties: 1) Event-driven, no calculation is triggered when the input of Transformer is zero; 2) Binary spike communication, all matrix multiplications associated with the spike matrix can be transformed into sparse additions; 3) Self-attention with linear complexity at both token and channel dimensions; 4) The operations between spike-form Query, Key, and Value are mask and addition.
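
Property 2 in the snippet above can be illustrated with a minimal sketch, which is not the authors' code: when the input vector contains only 0/1 spikes, multiplying it by a weight matrix reduces to adding up the weight rows selected by the spikes. The function names and toy values below are hypothetical.

```python
def dense_matvec(spikes, weights):
    """Reference: ordinary vector-matrix product (spikes of length n, weights n x m)."""
    n_out = len(weights[0])
    return [
        sum(spikes[i] * weights[i][j] for i in range(len(spikes)))
        for j in range(n_out)
    ]


def spike_matvec(spikes, weights):
    """Same product using additions only: accumulate the rows where a spike occurred."""
    out = [0.0] * len(weights[0])
    for i, s in enumerate(spikes):
        if s:  # rows without a spike are skipped entirely
            for j, w in enumerate(weights[i]):
                out[j] += w
    return out


# Toy example (assumed values): 4 input neurons, 2 output neurons.
spikes = [1, 0, 1, 0]
weights = [[0.5, -1.0], [2.0, 3.0], [1.5, 0.25], [-4.0, 1.0]]
result = spike_matvec(spikes, weights)
```

The sparser the spike vector, the fewer rows are touched, which is the intuition behind the "sparse additions" wording.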

Chatlaw: A Multi-Agent Collaborative Legal Assistant with Knowledge Graph Enhanced Mixture-of-Experts Large Language Model

1 code implementation 28 Jun 2023 Jiaxi Cui, Munan Ning, Zongjian Li, Bohua Chen, Yang Yan, Hao Li, Bin Ling, Yonghong Tian, Li Yuan

AI legal assistants based on Large Language Models (LLMs) can provide accessible legal consulting services, but the hallucination problem poses potential legal risks.

Hallucination Knowledge Graphs +3

Dual Adaptive Representation Alignment for Cross-domain Few-shot Learning

1 code implementation 18 Jun 2023 Yifan Zhao, Tong Zhang, Jia Li, Yonghong Tian

Recent progress in this setting assumes that the base knowledge and novel query samples are distributed in the same domains, which is usually infeasible for realistic applications.

cross-domain few-shot learning

Spatial Re-parameterization for N:M Sparsity

1 code implementation 9 Jun 2023 Yuxin Zhang, Mingliang Xu, Yonghong Tian, Rongrong Ji

This paper presents a Spatial Re-parameterization (SpRe) method for the N:M sparsity in CNNs.
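
For readers unfamiliar with the term, N:M sparsity keeps at most N nonzero weights in every group of M consecutive weights. The sketch below shows a plain magnitude-based N:M pruning of a flat weight list; it is not the SpRe method from the paper, and the function name and default values are assumptions chosen for illustration.

```python
def nm_prune(weights, n=2, m=4):
    """Keep the n largest-magnitude values in each consecutive group of m; zero the rest."""
    if len(weights) % m != 0:
        raise ValueError("length of weights must be a multiple of m")
    pruned = []
    for start in range(0, len(weights), m):
        group = weights[start:start + m]
        # Indices of the n largest |w| within this group (ties keep the earlier index).
        order = sorted(range(m), key=lambda i: abs(group[i]), reverse=True)
        keep = set(order[:n])
        pruned.extend(w if i in keep else 0.0 for i, w in enumerate(group))
    return pruned


# Toy example (assumed values): a 2:4 pattern over eight weights.
sparse = nm_prune([0.1, -0.9, 0.5, 0.2, 1.0, 0.0, -0.3, 0.05])
```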

Population-Based Evolutionary Gaming for Unsupervised Person Re-identification

no code implementations 8 Jun 2023 Yunpeng Zhai, Peixi Peng, Mengxi Jia, Shiyong Li, Weiqiang Chen, Xuesong Gao, Yonghong Tian

Extensive experiments demonstrate that (1) CRS approximately measures the performance of models without labeled samples, and (2) PEG produces new state-of-the-art accuracy for person re-identification, indicating the great potential of population-based network cooperative training for unsupervised learning.

Diversity Knowledge Distillation +1

Point-Voxel Absorbing Graph Representation Learning for Event Stream based Recognition

1 code implementation 8 Jun 2023 Bo Jiang, Chengguo Yuan, Xiao Wang, Zhimin Bao, Lin Zhu, Yonghong Tian, Jin Tang

To address these issues, we propose a novel dual point-voxel absorbing graph representation learning for event stream data representation.

Event data classification Graph Representation Learning

Long-Range Feedback Spiking Network Captures Dynamic and Static Representations of the Visual Cortex under Movie Stimuli

1 code implementation 2 Jun 2023 Liwei Huang, Zhengyu Ma, Liutao Yu, Huihui Zhou, Yonghong Tian

We further conduct experiments to quantify how temporal structures (dynamic information) and static textures (static information) of the movie stimuli influence representational similarity, suggesting that our model benefits from long-range feedback to encode context-dependent representations just like the brain.

Action Recognition Image Classification +2

Auto-Spikformer: Spikformer Architecture Search

no code implementations 1 Jun 2023 Kaiwei Che, Zhaokun Zhou, Zhengyu Ma, Wei Fang, Yanqi Chen, Shuaijie Shen, Li Yuan, Yonghong Tian

The integration of self-attention mechanisms into Spiking Neural Networks (SNNs) has garnered considerable interest in the realm of advanced deep learning, primarily due to their biological properties.

Temporal Contrastive Learning for Spiking Neural Networks

no code implementations 23 May 2023 Haonan Qiu, Zeyin Song, Yanqi Chen, Munan Ning, Wei Fang, Tao Sun, Zhengyu Ma, Li Yuan, Yonghong Tian

However, in this work, we find the method above is not ideal for the SNNs training as it omits the temporal dynamics of SNNs and degrades the performance quickly with the decrease of inference time steps.

Contrastive Learning

Album Storytelling with Iterative Story-aware Captioning and Large Language Models

no code implementations 22 May 2023 Munan Ning, Yujia Xie, Dongdong Chen, Zeyin Song, Lu Yuan, Yonghong Tian, Qixiang Ye, Li Yuan

One natural approach is to use caption models to describe each photo in the album, and then use LLMs to summarize and rewrite the generated captions into an engaging story.

Parallel Spiking Neurons with High Efficiency and Ability to Learn Long-term Dependencies

1 code implementation NeurIPS 2023 Wei Fang, Zhaofei Yu, Zhaokun Zhou, Ding Chen, Yanqi Chen, Zhengyu Ma, Timothée Masquelier, Yonghong Tian

Vanilla spiking neurons in Spiking Neural Networks (SNNs) use charge-fire-reset neuronal dynamics, which can only be simulated serially and can hardly learn long-time dependencies.
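
The serial nature of charge-fire-reset dynamics mentioned above can be seen in a minimal sketch of a leaky integrate-and-fire neuron with hard reset. This is one common discretization and not the parallel neuron proposed in the paper; the parameter names and default values (`tau`, `v_threshold`, `v_reset`) are assumptions.

```python
def lif_simulate(inputs, tau=2.0, v_threshold=1.0, v_reset=0.0):
    """Simulate one neuron over time; returns a list of 0/1 spikes."""
    v = v_reset
    spikes = []
    for x in inputs:
        # Charge: leaky integration of the input current.
        v = v + (x - (v - v_reset)) / tau
        # Fire: emit a spike when the membrane potential reaches the threshold.
        s = 1 if v >= v_threshold else 0
        # Reset: hard reset after a spike.
        if s:
            v = v_reset
        spikes.append(s)
    return spikes


# Toy input current (assumed values).
out = lif_simulate([1.5, 1.5, 1.5, 0.0, 0.0])
```

Each step reads the membrane potential left by the previous step, including the effect of a possible reset, so the loop over time cannot be trivially vectorized.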

Spikingformer: Spike-driven Residual Learning for Transformer-based Spiking Neural Network

1 code implementation 24 Apr 2023 Chenlin Zhou, Liutao Yu, Zhaokun Zhou, Zhengyu Ma, Han Zhang, Huihui Zhou, Yonghong Tian

Based on this residual design, we develop Spikingformer, a pure transformer-based spiking neural network.

Picking Up Quantization Steps for Compressed Image Classification

1 code implementation 21 Apr 2023 Li Ma, Peixi Peng, Guangyao Chen, Yifan Zhao, Siwei Dong, Yonghong Tian

The sensitivity of deep neural networks to compressed images hinders their usage in many real applications, which means classification networks may fail just after taking a screenshot and saving it as a compressed file.

Classification Image Classification +1

Learning with Fantasy: Semantic-Aware Virtual Contrastive Constraint for Few-Shot Class-Incremental Learning

1 code implementation CVPR 2023 Zeyin Song, Yifan Zhao, Yujun Shi, Peixi Peng, Li Yuan, Yonghong Tian

However, in this work, we find that the CE loss is not ideal for the base session training as it suffers poor class separation in terms of representations, which further degrades generalization to novel classes.

class-incremental learning Contrastive Learning +2

Deep Spiking Neural Networks with High Representation Similarity Model Visual Pathways of Macaque and Mouse

1 code implementation 9 Mar 2023 Liwei Huang, Zhengyu Ma, Liutao Yu, Huihui Zhou, Yonghong Tian

However, they highly simplify the computational properties of neurons compared to their biological counterparts.

A Unified Framework for Soft Threshold Pruning

1 code implementation 25 Feb 2023 Yanqi Chen, Zhengyu Ma, Wei Fang, Xiawu Zheng, Zhaofei Yu, Yonghong Tian

In this work, we reformulate soft threshold pruning as an implicit optimization problem solved using the Iterative Shrinkage-Thresholding Algorithm (ISTA), a classic method from the fields of sparse recovery and compressed sensing.
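
As background for the snippet above, the core of ISTA is a gradient step followed by the soft-thresholding operator, which shrinks each value toward zero and zeroes out small ones. The sketch below shows that generic step for an L1-regularized objective; it is not the paper's pruning framework, and the function names, learning rate, and penalty values are hypothetical.

```python
def soft_threshold(x, lam):
    """Soft-thresholding: sign(x) * max(|x| - lam, 0)."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0


def ista_step(weights, grads, lr, lam):
    """One ISTA iteration: gradient descent step, then shrink by lr * lam."""
    return [soft_threshold(w - lr * g, lr * lam) for w, g in zip(weights, grads)]


# Toy example (assumed values): the second weight falls below the threshold and is pruned.
updated = ista_step([1.0, -0.25], [0.5, 0.0], lr=0.5, lam=1.0)
```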


Large-scale Multi-Modal Pre-trained Models: A Comprehensive Survey

1 code implementation 20 Feb 2023 Xiao Wang, Guangyao Chen, Guangwu Qian, Pengcheng Gao, Xiao-Yong Wei, YaoWei Wang, Yonghong Tian, Wen Gao

We also give visualization and analysis of the model parameters and results on representative downstream tasks.


Training Full Spike Neural Networks via Auxiliary Accumulation Pathway

2 code implementations 27 Jan 2023 Guangyao Chen, Peixi Peng, Guoqi Li, Yonghong Tian

The accumulation in AAP could compensate for the information loss during the forward and backward of full spike propagation, and facilitate the training of the FSNN.

Part-guided Relational Transformers for Fine-grained Visual Recognition

1 code implementation 28 Dec 2022 Yifan Zhao, Jia Li, Xiaowu Chen, Yonghong Tian

This framework, namely PArt-guided Relational Transformers (PART), is proposed to learn the discriminative part features with an automatic part discovery module, and to explore the intrinsic correlations with a feature transformation module by adapting the Transformer models from the field of natural language processing.

Fine-Grained Image Classification Fine-Grained Visual Recognition +1

Parsing Objects at a Finer Granularity: A Survey

no code implementations 28 Dec 2022 Yifan Zhao, Jia Li, Yonghong Tian

Fine-grained visual parsing, including fine-grained part segmentation and fine-grained object recognition, has attracted considerable critical attention due to its importance in many real-world applications, e.g., agriculture, remote sensing, and space technologies.

Fine-Grained Visual Recognition Human Part Segmentation +3

Universal Object Detection with Large Vision Model

1 code implementation 19 Dec 2022 Feng Lin, Wenze Hu, YaoWei Wang, Yonghong Tian, Guangming Lu, Fanglin Chen, Yong Xu, Xiaoyu Wang

In this study, our focus is on a specific challenge: the large-scale, multi-domain universal object detection problem, which contributes to the broader goal of achieving a universal vision system.

Object object-detection +1

Shadow Removal by High-Quality Shadow Synthesis

1 code implementation 8 Dec 2022 Yunshan Zhong, Lizhou You, Yuxin Zhang, Fei Chao, Yonghong Tian, Rongrong Ji

Specifically, the encoder extracts the shadow feature of a region identity which is then paired with another region identity to serve as the generator input to synthesize a pseudo image.

Image Generation Shadow Removal +1

Event-based Monocular Dense Depth Estimation with Recurrent Transformers

no code implementations 6 Dec 2022 Xu Liu, Jianing Li, Xiaopeng Fan, Yonghong Tian

Event cameras, offering high temporal resolutions and high dynamic ranges, have brought a new perspective to address common challenges (e.g., motion blur and low light) in monocular depth estimation.

Decoder Event-based vision +1

Meta Architecture for Point Cloud Analysis

1 code implementation CVPR 2023 Haojia Lin, Xiawu Zheng, Lijiang Li, Fei Chao, Shanshan Wang, Yan Wang, Yonghong Tian, Rongrong Ji

However, the lack of a unified framework to interpret those networks makes any systematic comparison, contrast, or analysis challenging, and practically limits healthy development of the field.

3D Semantic Segmentation

Revisiting Color-Event based Tracking: A Unified Network, Dataset, and Metric

2 code implementations 20 Nov 2022 Chuanming Tang, Xiao Wang, Ju Huang, Bo Jiang, Lin Zhu, Jianlin Zhang, YaoWei Wang, Yonghong Tian

In this paper, we propose a single-stage backbone network for Color-Event Unified Tracking (CEUTrack), which achieves the above functions simultaneously.

Object Localization Object Tracking

HARDVS: Revisiting Human Activity Recognition with Dynamic Vision Sensors

3 code implementations 17 Nov 2022 Xiao Wang, Zongzhen Wu, Bo Jiang, Zhimin Bao, Lin Zhu, Guoqi Li, YaoWei Wang, Yonghong Tian

Mainstream human activity recognition (HAR) algorithms are developed based on RGB cameras, which suffer from illumination, fast motion, privacy, and high energy consumption issues.

Activity Prediction Human Activity Recognition +1

Unsupervised Deraining: Where Asymmetric Contrastive Learning Meets Self-similarity

no code implementations 2 Nov 2022 Yi Chang, Yun Guo, Yuntong Ye, Changfeng Yu, Lin Zhu, XiLe Zhao, Luxin Yan, Yonghong Tian

In addition, considering that the existing real rain datasets are of low quality, either small scale or downloaded from the internet, we collect a real large-scale dataset under various rainy kinds of weather that contains high-resolution rainy images.

Contrastive Learning Rain Removal

Spikformer: When Spiking Neural Network Meets Transformer

2 code implementations 29 Sep 2022 Zhaokun Zhou, Yuesheng Zhu, Chao He, YaoWei Wang, Shuicheng Yan, Yonghong Tian, Li Yuan

Spikformer (66.3M parameters) with comparable size to SEW-ResNet-152 (60.2M, 69.26%) can achieve 74.81% top-1 accuracy on ImageNet using 4 time steps, which is the state-of-the-art in directly trained SNN models.

Image Classification

Attention Spiking Neural Networks

no code implementations 28 Sep 2022 Man Yao, Guangshe Zhao, Hengyu Zhang, Yifan Hu, Lei Deng, Yonghong Tian, Bo Xu, Guoqi Li

On ImageNet-1K, we achieve top-1 accuracy of 75.92% and 77.08% on single/4-step Res-SNN-104, which are state-of-the-art results in SNNs.

Action Recognition Image Classification

Fine-Grained Object Classification via Self-Supervised Pose Alignment

2 code implementations CVPR 2022 Xuhui Yang, YaoWei Wang, Ke Chen, Yong Xu, Yonghong Tian

Semantic patterns of fine-grained objects are determined by subtle appearance differences of local parts, which thus inspires a number of part-based methods.

Classification Object +1

Training-free Transformer Architecture Search

1 code implementation CVPR 2022 Qinqin Zhou, Kekai Sheng, Xiawu Zheng, Ke Li, Xing Sun, Yonghong Tian, Jie Chen, Rongrong Ji

Recently, Vision Transformer (ViT) has achieved remarkable success in several computer vision tasks.


Masked Autoencoders for Point Cloud Self-supervised Learning

4 code implementations13 Mar 2022 Yatian Pang, Wenxiao Wang, Francis E. H. Tay, Wei Liu, Yonghong Tian, Li Yuan

Then, a standard Transformer based autoencoder, with an asymmetric design and a shifting mask tokens operation, learns high-level latent features from unmasked point patches, aiming to reconstruct the masked point patches.

3D Part Segmentation Few-Shot 3D Point Cloud Classification +2

Annotation Efficient Person Re-Identification with Diverse Cluster-Based Pair Selection

no code implementations10 Mar 2022 Lantian Xue, Yixiong Zou, Peixi Peng, Yonghong Tian, Tiejun Huang

To solve this problem, we propose the Annotation Efficient Person Re-Identification method to select image pairs from an alternative pair set according to the fallibility and diversity of pairs, and train the Re-ID model based on the annotation.

Clustering Diversity +1

Event-based Video Reconstruction via Potential-assisted Spiking Neural Network

1 code implementation CVPR 2022 Lin Zhu, Xiao Wang, Yi Chang, Jianing Li, Tiejun Huang, Yonghong Tian

We propose a novel Event-based Video reconstruction framework based on a fully Spiking Neural Network (EVSNN), which utilizes Leaky-Integrate-and-Fire (LIF) neurons and Membrane Potential (MP) neurons.

Computational Efficiency Event-Based Video Reconstruction +2

PowerGear: Early-Stage Power Estimation in FPGA HLS via Heterogeneous Edge-Centric GNNs

1 code implementation25 Jan 2022 Zhe Lin, Zike Yuan, Jieru Zhao, Wei Zhang, Hui Wang, Yonghong Tian

Specifically, in the graph construction flow, we introduce buffer insertion, datapath merging, graph trimming and feature annotation techniques to transform HLS designs into graph-structured data, which encode both intra-operation micro-architectures and inter-operation interconnects annotated with switching activities.

graph construction Graph Learning +2

1000x Faster Camera and Machine Vision with Ordinary Devices

no code implementations23 Jan 2022 Tiejun Huang, Yajing Zheng, Zhaofei Yu, Rui Chen, Yuan Li, Ruiqin Xiong, Lei Ma, Junwei Zhao, Siwei Dong, Lin Zhu, Jianing Li, Shanshan Jia, Yihua Fu, Boxin Shi, Si Wu, Yonghong Tian

By treating vidar as spike trains in biological vision, we have further developed a spiking neural network-based machine vision system that combines the speed of the machine and the mechanism of biological vision, achieving high-speed object detection and tracking 1,000x faster than human vision.

object-detection Object Detection

Deep Reinforcement Learning with Spiking Q-learning

no code implementations21 Jan 2022 Ding Chen, Peixi Peng, Tiejun Huang, Yonghong Tian

With the help of special neuromorphic hardware, spiking neural networks (SNNs) are expected to realize artificial intelligence (AI) with less energy consumption.

Atari Games Deep Reinforcement Learning +3

Neural Architecture Search With Representation Mutual Information

1 code implementation CVPR 2022 Xiawu Zheng, Xiang Fei, Lei Zhang, Chenglin Wu, Fei Chao, Jianzhuang Liu, Wei Zeng, Yonghong Tian, Rongrong Ji

Building upon RMI, we further propose a new search algorithm termed RMI-NAS, accompanied by a theorem that guarantees the global optimum of the searched architecture.

Neural Architecture Search

IntraQ: Learning Synthetic Images with Intra-Class Heterogeneity for Zero-Shot Network Quantization

1 code implementation CVPR 2022 Yunshan Zhong, Mingbao Lin, Gongrui Nan, Jianzhuang Liu, Baochang Zhang, Yonghong Tian, Rongrong Ji

In this paper, we observe an interesting phenomenon of intra-class heterogeneity in real data and show that existing methods fail to retain this property in their synthetic images, which causes a limited performance increase.


Optimized Separable Convolution: Yet Another Efficient Convolution Operator

no code implementations29 Sep 2021 Tao Wei, Yonghong Tian, YaoWei Wang, Yun Liang, Chang Wen Chen

In this research, we propose a novel and principled operator called optimized separable convolution, which, by optimal design for the internal number of groups and kernel sizes for general separable convolutions, can achieve the complexity of $O(C^{\frac{3}{2}}K)$.

Heterogeneous Relational Complement for Vehicle Re-identification

1 code implementation ICCV 2021 Jiajian Zhao, Yifan Zhao, Jia Li, Ke Yan, Yonghong Tian

The crucial problem in vehicle re-identification is to find the same vehicle identity when reviewing this object from cross-view cameras, which sets a higher demand for learning viewpoint-invariant representations.

Vehicle Re-Identification

An Information Theory-inspired Strategy for Automatic Network Pruning

1 code implementation19 Aug 2021 Xiawu Zheng, Yuexiao Ma, Teng Xi, Gang Zhang, Errui Ding, Yuchao Li, Jie Chen, Yonghong Tian, Rongrong Ji

This practically limits the application of model compression when the model needs to be deployed on a wide range of devices.

AutoML Model Compression +1

Amplitude-Phase Recombination: Rethinking Robustness of Convolutional Neural Networks in Frequency Domain

1 code implementation ICCV 2021 Guangyao Chen, Peixi Peng, Li Ma, Jia Li, Lin Du, Yonghong Tian

This observation leads to more explanations of the CNN's generalization behaviors in both robustness to common perturbations and out-of-distribution detection, and motivates a new perspective on data augmentation designed by recombining the phase spectrum of the current image and the amplitude spectrum of the distracter image.

Adversarial Attack Data Augmentation +2

VisEvent: Reliable Object Tracking via Collaboration of Frame and Event Flows

2 code implementations11 Aug 2021 Xiao Wang, Jianing Li, Lin Zhu, Zhipeng Zhang, Zhe Chen, Xin Li, YaoWei Wang, Yonghong Tian, Feng Wu

Different from visible cameras which record intensity images frame by frame, the biologically inspired event camera produces a stream of asynchronous and sparse events with much lower latency.

Object Tracking

MFGNet: Dynamic Modality-Aware Filter Generation for RGB-T Tracking

2 code implementations22 Jul 2021 Xiao Wang, Xiujun Shu, Shiliang Zhang, Bo Jiang, YaoWei Wang, Yonghong Tian, Feng Wu

The visible and thermal filters will be used to conduct a dynamic convolutional operation on their corresponding input feature maps respectively.

Rgb-T Tracking

High-Speed Image Reconstruction Through Short-Term Plasticity for Spiking Cameras

no code implementations CVPR 2021 Yajing Zheng, Lingxiao Zheng, Zhaofei Yu, Boxin Shi, Yonghong Tian, Tiejun Huang

Mimicking the sampling mechanism of the fovea, a retina-inspired camera, named spiking camera, is developed to record the external information with a sampling rate of 40,000 Hz, and outputs asynchronous binary spike streams.

Image Reconstruction Vocal Bursts Intensity Prediction

Tracking by Joint Local and Global Search: A Target-aware Attention based Approach

1 code implementation9 Jun 2021 Xiao Wang, Jin Tang, Bin Luo, YaoWei Wang, Yonghong Tian, Feng Wu

In this paper, we propose a novel and general target-aware attention mechanism (termed TANet) and integrate it with a tracking-by-detection framework to conduct joint local and global search for robust tracking.

Decoder Object +1

1xN Pattern for Pruning Convolutional Neural Networks

1 code implementation31 May 2021 Mingbao Lin, Yuxin Zhang, Yuchao Li, Bohong Chen, Fei Chao, Mengdi Wang, Shen Li, Yonghong Tian, Rongrong Ji

We also provide a workflow of filter rearrangement that first rearranges the weight matrix in the output channel dimension to derive more influential blocks for accuracy improvements and then applies similar rearrangement to the next-layer weights in the input channel dimension to ensure correct convolutional operations.

Network Pruning

Optimal ANN-SNN Conversion for Fast and Accurate Inference in Deep Spiking Neural Networks

1 code implementation25 May 2021 Jianhao Ding, Zhaofei Yu, Yonghong Tian, Tiejun Huang

We show that the inference time can be reduced by optimizing the upper bound of the fit curve in the revised ANN to achieve fast inference.

Pruning of Deep Spiking Neural Networks through Gradient Rewiring

1 code implementation11 May 2021 Yanqi Chen, Zhaofei Yu, Wei Fang, Tiejun Huang, Yonghong Tian

Our key innovation is to redefine the gradient to a new synaptic parameter, allowing better exploration of network structures by taking full advantage of the competition between pruning and regrowth of connections.

Carrying out CNN Channel Pruning in a White Box

1 code implementation24 Apr 2021 Yuxin Zhang, Mingbao Lin, Chia-Wen Lin, Jie Chen, Feiyue Huang, Yongjian Wu, Yonghong Tian, Rongrong Ji

Specifically, to model the contribution of each channel to differentiating categories, we develop a class-wise mask for each channel, implemented in a dynamic training manner w.r.t.

Image Classification

Dynamic Attention guided Multi-Trajectory Analysis for Single Object Tracking

1 code implementation30 Mar 2021 Xiao Wang, Zhe Chen, Jin Tang, Bin Luo, YaoWei Wang, Yonghong Tian, Feng Wu

In this paper, we propose to introduce more dynamics by devising a dynamic attention-guided multi-trajectory tracking strategy.

Object Tracking

Distilling a Powerful Student Model via Online Knowledge Distillation

1 code implementation26 Mar 2021 Shaojie Li, Mingbao Lin, Yan Wang, Yongjian Wu, Yonghong Tian, Ling Shao, Rongrong Ji

Besides, a self-distillation module is adopted to convert the feature map of deeper layers into a shallower one.

Knowledge Distillation

Adversarial Reciprocal Points Learning for Open Set Recognition

1 code implementation1 Mar 2021 Guangyao Chen, Peixi Peng, Xiangqian Wang, Yonghong Tian

Then, an adversarial margin constraint is proposed to reduce the open space risk by limiting the latent open space constructed by reciprocal points.

General Classification Open Set Learning

Collaborative Intelligence: Challenges and Opportunities

no code implementations13 Feb 2021 Ivan V. Bajić, Weisi Lin, Yonghong Tian

This paper presents an overview of the emerging area of collaborative intelligence (CI).

Feature Compression

Deep Residual Learning in Spiking Neural Networks

1 code implementation NeurIPS 2021 Wei Fang, Zhaofei Yu, Yanqi Chen, Tiejun Huang, Timothée Masquelier, Yonghong Tian

Previous Spiking ResNet mimics the standard residual block in ANNs and simply replaces ReLU activation layers with spiking neurons, which suffers from the degradation problem and can hardly implement residual learning.

MetaVIM: Meta Variationally Intrinsic Motivated Reinforcement Learning for Decentralized Traffic Signal Control

3 code implementations4 Jan 2021 Liwen Zhu, Peixi Peng, Zongqing Lu, Xiangqian Wang, Yonghong Tian

To make the policy learned from a training scenario generalizable to new unseen scenarios, a novel Meta Variationally Intrinsic Motivated (MetaVIM) RL method is proposed to learn the decentralized policy for each intersection that considers neighbor information in a latent way.

Deep Reinforcement Learning Meta-Learning +4

Rethinking Convolution: Towards an Optimal Efficiency

no code implementations1 Jan 2021 Tao Wei, Yonghong Tian, Chang Wen Chen

In this research, we propose a novel operator called optimal separable convolution which can be calculated at $O(C^{\frac{3}{2}}KHW)$ by optimal design for the internal number of groups and kernel sizes for general separable convolutions.

Computational Efficiency

NeuSpike-Net: High Speed Video Reconstruction via Bio-Inspired Neuromorphic Cameras

no code implementations ICCV 2021 Lin Zhu, Jianing Li, Xiao Wang, Tiejun Huang, Yonghong Tian

In this paper, we propose a NeuSpike-Net to learn both the high dynamic range and high motion sensitivity of DVS and the full texture sampling of spike camera to achieve high-speed and high dynamic image reconstruction.

Image Reconstruction Video Reconstruction +1

Fast Class-wise Updating for Online Hashing

no code implementations1 Dec 2020 Mingbao Lin, Rongrong Ji, Xiaoshuai Sun, Baochang Zhang, Feiyue Huang, Yonghong Tian, DaCheng Tao

To achieve fast online adaptivity, a class-wise updating method is developed to decompose the binary code learning and alternatively renew the hash functions in a class-wise fashion, which well addresses the burden on large amounts of training batches.

Annotation-Efficient Untrimmed Video Action Recognition

no code implementations30 Nov 2020 Yixiong Zou, Shanghang Zhang, Guangyao Chen, Yonghong Tian, Kurt Keutzer, José M. F. Moura

In this paper, we target a new problem, Annotation-Efficient Video Recognition, to reduce the requirement of annotations for both the large number of samples and the action locations.

Action Recognition Contrastive Learning +3

Learning Open Set Network with Discriminative Reciprocal Points

1 code implementation ECCV 2020 Guangyao Chen, Limeng Qiao, Yemin Shi, Peixi Peng, Jia Li, Tiejun Huang, ShiLiang Pu, Yonghong Tian

In this process, one of the key challenges is to reduce the risk of generalizing the inherent characteristics of numerous unknown samples learned from a small amount of known data.

Open Set Learning

Intrinsic Relationship Reasoning for Small Object Detection

no code implementations2 Sep 2020 Kui Fu, Jia Li, Lin Ma, Kai Mu, Yonghong Tian

In this paper, we propose a novel context reasoning approach for small object detection which models and infers the intrinsic semantic and spatial layout relationships between objects.

Object object-detection +1

Cooperative Bi-path Metric for Few-shot Learning

1 code implementation10 Aug 2020 Zeyuan Wang, Yifan Zhao, Jia Li, Yonghong Tian

Given base classes with sufficient labeled samples, the target of few-shot classification is to recognize unlabeled samples of novel classes with only a few labeled samples.

Classification Few-Shot Learning +1

Revisiting Mid-Level Patterns for Cross-Domain Few-Shot Recognition

no code implementations7 Aug 2020 Yixiong Zou, Shanghang Zhang, JianPeng Yu, Yonghong Tian, José M. F. Moura

To solve this problem, cross-domain FSL (CDFSL) was proposed very recently to transfer knowledge from general-domain base classes to special-domain novel classes.

cross-domain few-shot learning

Incorporating Learnable Membrane Time Constant to Enhance Learning of Spiking Neural Networks

1 code implementation ICCV 2021 Wei Fang, Zhaofei Yu, Yanqi Chen, Timothee Masquelier, Tiejun Huang, Yonghong Tian

In this paper, we take inspiration from the observation that membrane-related parameters are different across brain regions, and propose a training algorithm that is capable of learning not only the synaptic weights but also the membrane time constants of SNNs.

Image Classification

Multiple Expert Brainstorming for Domain Adaptive Person Re-identification

2 code implementations ECCV 2020 Yunpeng Zhai, Qixiang Ye, Shijian Lu, Mengxi Jia, Rongrong Ji, Yonghong Tian

Often the best performing deep neural models are ensembles of multiple base-level networks; nevertheless, ensemble learning with respect to domain adaptive person re-ID remains unexplored.

Domain Adaptive Person Re-Identification Ensemble Learning +1

SEKD: Self-Evolving Keypoint Detection and Description

1 code implementation9 Jun 2020 Yafei Song, Ling Cai, Jia Li, Yonghong Tian, Mingyang Li

Researchers have attempted to utilize deep neural networks (DNNs) to learn novel local features from images, inspired by their recent successes on a variety of vision tasks.

Homography Estimation Keypoint Detection

Rethinking Performance Estimation in Neural Architecture Search

1 code implementation CVPR 2020 Xiawu Zheng, Rongrong Ji, Qiang Wang, Qixiang Ye, Zhenguo Li, Yonghong Tian, Qi Tian

In this paper, we provide a novel yet systematic rethinking of PE in a resource constrained regime, termed budgeted PE (BPE), which precisely and effectively estimates the performance of an architecture sampled from an architecture space.

Neural Architecture Search

Compositional Few-Shot Recognition with Primitive Discovery and Enhancing

no code implementations12 May 2020 Yixiong Zou, Shanghang Zhang, Ke Chen, Yonghong Tian, Yao-Wei Wang, José M. F. Moura

Inspired by such capability of humans, to imitate humans' ability of learning visual primitives and composing primitives to recognize novel classes, we propose an approach to FSL to learn a feature representation composed of important primitives, which is jointly trained with two parts, i.e., primitive discovery and primitive enhancing.

Few-Shot Image Classification Few-Shot Learning +1

Self-Guided Adaptation: Progressive Representation Alignment for Domain Adaptive Object Detection

no code implementations19 Mar 2020 Zongxian Li, Qixiang Ye, Chong Zhang, Jingjing Liu, Shijian Lu, Yonghong Tian

In this work, we propose a Self-Guided Adaptation (SGA) model, which targets aligning feature representations and transferring object detection models across domains while considering the instantaneous alignment difficulty.

object-detection Object Detection +1

HRank: Filter Pruning using High-Rank Feature Map

2 code implementations CVPR 2020 Mingbao Lin, Rongrong Ji, Yan Wang, Yichen Zhang, Baochang Zhang, Yonghong Tian, Ling Shao

The principle behind our pruning is that low-rank feature maps contain less information, and thus pruned results can be easily reproduced.

Network Pruning Vocal Bursts Intensity Prediction

Filter Sketch for Network Pruning

1 code implementation23 Jan 2020 Mingbao Lin, Liujuan Cao, Shaojie Li, Qixiang Ye, Yonghong Tian, Jianzhuang Liu, Qi Tian, Rongrong Ji

Our approach, referred to as FilterSketch, encodes the second-order information of pre-trained weights, which enables the representation capacity of pruned networks to be recovered with a simple fine-tuning procedure.

Network Pruning

Channel Pruning via Automatic Structure Search

1 code implementation23 Jan 2020 Mingbao Lin, Rongrong Ji, Yuxin Zhang, Baochang Zhang, Yongjian Wu, Yonghong Tian

In this paper, we propose a new channel pruning method based on the artificial bee colony algorithm (ABC), dubbed ABCPruner, which aims to efficiently find the optimal pruned structure, i.e., the channel number in each layer, rather than selecting "important" channels as previous works did.

Salient Object Detection with Purificatory Mechanism and Structural Similarity Loss

1 code implementation18 Dec 2019 Jia Li, Jinming Su, Changqun Xia, Mingcan Ma, Yonghong Tian

Through these two attentions, we use the Purificatory Mechanism to impose strict weights with different regions of the whole salient objects and purify results from hard-to-distinguish regions, thus accurately predicting the locations and details of salient objects.

object-detection RGB Salient Object Detection +1