Search Results for author: Zhichao Lu

Found 38 papers, 18 papers with code

VLM-KD: Knowledge Distillation from VLM for Long-Tail Visual Recognition

no code implementations 29 Aug 2024 Zaiwei Zhang, Gregory P. Meyer, Zhichao Lu, Ashish Shrivastava, Avinash Ravichandran, Eric M. Wolff

To our knowledge, this work is the first to utilize knowledge distillation with text supervision generated by an off-the-shelf VLM and apply it to vanilla randomly initialized vision encoders.

Knowledge Distillation Language Modelling

SpikingSSMs: Learning Long Sequences with Sparse and Parallel Spiking State Space Models

no code implementations 27 Aug 2024 Shuaijie Shen, Chao Wang, Renzhuo Huang, Yan Zhong, Qinghai Guo, Zhichao Lu, JianGuo Zhang, Luziwei Leng

Known as low energy consumption networks, spiking neural networks (SNNs) have gained a lot of attention within the past decades.

Language Modelling

Design Principle Transfer in Neural Architecture Search via Large Language Models

no code implementations 21 Aug 2024 Xun Zhou, Liang Feng, Xingyu Wu, Zhichao Lu, Kay Chen Tan

In LAPT, LLM is applied to automatically reason the design principles from a set of given architectures, and then a principle adaptation method is applied to refine these principles progressively based on the new search results.

Language Modelling Large Language Model +1

Understanding the Importance of Evolutionary Search in Automated Heuristic Design with Large Language Models

no code implementations 15 Jul 2024 Rui Zhang, Fei Liu, Xi Lin, Zhenkun Wang, Zhichao Lu, Qingfu Zhang

Automated heuristic design (AHD) has gained considerable attention for its potential to automate the development of effective heuristics.

Rethinking Unsupervised Outlier Detection via Multiple Thresholding

2 code implementations 7 Jul 2024 Zhonghang Liu, Panzhong Lu, Guoyang Xie, Zhichao Lu, Wen-Yan Lin

In the realm of unsupervised image outlier detection, assigning outlier scores holds greater significance than its subsequent task: thresholding for predicting labels.

Outlier Detection

Evolutionary Spiking Neural Networks: A Survey

no code implementations 18 Jun 2024 Shuaijie Shen, Rui Zhang, Chao Wang, Renzhuo Huang, Aiersi Tuerhong, Qinghai Guo, Zhichao Lu, JianGuo Zhang, Luziwei Leng

Spiking neural networks (SNNs) are gaining increasing attention as potential computationally efficient alternatives to traditional artificial neural networks (ANNs).

AnomalyXFusion: Multi-modal Anomaly Synthesis with Diffusion

1 code implementation 30 Apr 2024 Jie Hu, Yawen Huang, Yilin Lu, Guoyang Xie, Guannan Jiang, Yefeng Zheng, Zhichao Lu

The AnomalyXFusion framework comprises two distinct yet synergistic modules: the Multi-modal In-Fusion (MIF) module and the Dynamic Dif-Fusion (DDF) module.

ShadowMaskFormer: Mask Augmented Patch Embeddings for Shadow Removal

1 code implementation 29 Apr 2024 Zhuohao Li, Guoyang Xie, Guannan Jiang, Zhichao Lu

Transformer recently emerged as the de facto model for computer vision tasks and has also been successfully applied to shadow removal.

Shadow Removal

A Multi-objective Optimization Benchmark Test Suite for Real-time Semantic Segmentation

1 code implementation 25 Apr 2024 Yifan Zhao, Zhenyu Liang, Zhichao Lu, Ran Cheng

To bridge the gap, we introduce a tailored streamline to transform the task of HW-NAS for real-time semantic segmentation into standard MOPs.

Autonomous Driving Evolutionary Algorithms +4

Seed Feature Maps-based CNN Models for LEO Satellite Remote Sensing Services

no code implementations 12 Aug 2023 Zhichao Lu, Chuntao Ding, Shangguang Wang, Ran Cheng, Felix Juefei-Xu, Vishnu Naresh Boddeti

However, the limited resources available on LEO satellites contrast with the demands of resource-intensive CNN models, necessitating the adoption of ground-station server assistance for training and updating these models.

Semantic Segmentation

Mitigating Task Interference in Multi-Task Learning via Explicit Task Routing with Non-Learnable Primitives

no code implementations CVPR 2023 Chuntao Ding, Zhichao Lu, Shangguang Wang, Ran Cheng, Vishnu Naresh Boddeti

Our key idea is to employ non-learnable primitives to extract a diverse set of task-agnostic features and recombine them into a shared branch common to all tasks and explicit task-specific branches reserved for each task.

Multi-Task Learning

TFormer: A Transmission-Friendly ViT Model for IoT Devices

no code implementations 15 Feb 2023 Zhichao Lu, Chuntao Ding, Felix Juefei-Xu, Vishnu Naresh Boddeti, Shangguang Wang, Yun Yang

The high performance and small number of model parameters and FLOPs of TFormer are attributed to the proposed hybrid layer and the proposed partially connected feed-forward network (PCS-FFN).

Image Classification object-detection +2

Revisiting Residual Networks for Adversarial Robustness

1 code implementation CVPR 2023 Shihua Huang, Zhichao Lu, Kalyanmoy Deb, Vishnu Naresh Boddeti

Then we design a robust residual block, dubbed RobustResBlock, and a compound scaling rule, dubbed RobustScaling, to distribute depth and width at the desired FLOP count.

Adversarial Robustness

Revisiting Residual Networks for Adversarial Robustness: An Architectural Perspective

1 code implementation 21 Dec 2022 Shihua Huang, Zhichao Lu, Kalyanmoy Deb, Vishnu Naresh Boddeti

In contrast, little attention was devoted to analyzing the role of architectural elements (such as topology, depth, and width) on adversarial robustness.

Adversarial Robustness

Surrogate-assisted Multi-objective Neural Architecture Search for Real-time Semantic Segmentation

no code implementations 14 Aug 2022 Zhichao Lu, Ran Cheng, Shihua Huang, Haoming Zhang, Changxiao Qiu, Fan Yang

The main challenges of applying NAS to semantic segmentation arise from two aspects: (i) high-resolution images to be processed; (ii) additional requirement of real-time inference speed (i.e., real-time semantic segmentation) for applications such as autonomous driving.

Autonomous Driving Image Classification +3

Neural Architecture Search as Multiobjective Optimization Benchmarks: Problem Formulation and Performance Assessment

2 code implementations 8 Aug 2022 Zhichao Lu, Ran Cheng, Yaochu Jin, Kay Chen Tan, Kalyanmoy Deb

From an optimization point of view, the NAS tasks involving multiple design criteria are intrinsically multiobjective optimization problems; hence, it is reasonable to adopt evolutionary multiobjective optimization (EMO) algorithms for tackling them.

Multiobjective Optimization Neural Architecture Search

VLMixer: Unpaired Vision-Language Pre-training via Cross-Modal CutMix

no code implementations 17 Jun 2022 Teng Wang, Wenhao Jiang, Zhichao Lu, Feng Zheng, Ran Cheng, Chengguo Yin, Ping Luo

Existing vision-language pre-training (VLP) methods primarily rely on paired image-text datasets, which are either annotated by enormous human labors, or crawled from the internet followed by elaborate data cleaning techniques.

Contrastive Learning cross-modal alignment +3

Semantic-Aware Pretraining for Dense Video Captioning

no code implementations 13 Apr 2022 Teng Wang, Zhu Liu, Feng Zheng, Zhichao Lu, Ran Cheng, Ping Luo

This report describes the details of our approach for the event dense-captioning task in ActivityNet Challenge 2021.

Dense Captioning Dense Video Captioning

GAMMA Challenge: Glaucoma grAding from Multi-Modality imAges

no code implementations 14 Feb 2022 Junde Wu, Huihui Fang, Fei Li, Huazhu Fu, Fengbin Lin, Jiongcheng Li, Lexing Huang, Qinji Yu, Sifan Song, Xinxing Xu, Yanyu Xu, Wensai Wang, Lingxiao Wang, Shuai Lu, Huiqi Li, Shihua Huang, Zhichao Lu, Chubin Ou, Xifei Wei, Bingyuan Liu, Riadh Kobbi, Xiaoying Tang, Li Lin, Qiang Zhou, Qiang Hu, Hrvoje Bogunovic, José Ignacio Orlando, Xiulan Zhang, Yanwu Xu

However, although numerous algorithms are proposed based on fundus images or OCT volumes in computer-aided diagnosis, there are still few methods leveraging both of the modalities for the glaucoma assessment.

Multiview Transformers for Video Recognition

1 code implementation CVPR 2022 Shen Yan, Xuehan Xiong, Anurag Arnab, Zhichao Lu, Mi Zhang, Chen Sun, Cordelia Schmid

Video understanding requires reasoning at multiple spatiotemporal resolutions -- from short fine-grained motions to events taking place over longer durations.

Ranked #5 on Action Recognition on EPIC-KITCHENS-100 (using extra training data)

Action Classification Action Recognition +1

Accelerating Multi-Objective Neural Architecture Search by Random-Weight Evaluation

no code implementations 8 Oct 2021 Shengran Hu, Ran Cheng, Cheng He, Zhichao Lu, Jing Wang, Miao Zhang

For the goal of automated design of high-performance deep convolutional neural networks (CNNs), Neural Architecture Search (NAS) methodology is becoming increasingly important for both academia and industries. Due to the costly stochastic gradient descent (SGD) training of CNNs for performance evaluation, most existing NAS methods are computationally expensive for real-world deployments.

Neural Architecture Search

The surprising impact of mask-head architecture on novel class segmentation

3 code implementations ICCV 2021 Vighnesh Birodkar, Zhichao Lu, Siyang Li, Vivek Rathod, Jonathan Huang

Under this family, we study Mask R-CNN and discover that instead of its default strategy of training the mask-head with a combination of proposals and groundtruth boxes, training the mask-head with only groundtruth boxes dramatically improves its performance on novel classes.

Instance Segmentation Segmentation +1

Learning from Weakly-labeled Web Videos via Exploring Sub-Concepts

no code implementations 11 Jan 2021 Kunpeng Li, Zizhao Zhang, Guanhang Wu, Xuehan Xiong, Chen-Yu Lee, Zhichao Lu, Yun Fu, Tomas Pfister

To address this issue, we introduce a new method for pre-training video action recognition models using queried web videos.

Action Recognition Pseudo Label +1

Multi-objective Neural Architecture Search with Almost No Training

no code implementations 27 Nov 2020 Shengran Hu, Ran Cheng, Cheng He, Zhichao Lu

In the recent past, neural architecture search (NAS) has attracted increasing attention from both academia and industries.

Neural Architecture Search Transfer Learning

PERF-Net: Pose Empowered RGB-Flow Net

no code implementations 28 Sep 2020 Yinxiao Li, Zhichao Lu, Xuehan Xiong, Jonathan Huang

In recent years, many works in the video action recognition literature have shown that two stream models (combining spatial and temporal input streams) are necessary for achieving state of the art performance.

Action Classification Action Recognition +1

MUXConv: Information Multiplexing in Convolutional Neural Networks

1 code implementation CVPR 2020 Zhichao Lu, Kalyanmoy Deb, Vishnu Naresh Boddeti

To overcome this limitation, we present MUXConv, a layer that is designed to increase the flow of information by progressively multiplexing channel and spatial information in the network, while mitigating computational complexity.

Computational Efficiency Image Classification +6

RetinaTrack: Online Single Stage Joint Detection and Tracking

1 code implementation CVPR 2020 Zhichao Lu, Vivek Rathod, Ronny Votel, Jonathan Huang

Traditionally multi-object tracking and object detection are performed using separate systems with most prior works focusing exclusively on one of these aspects over the other.

Autonomous Driving Multi-Object Tracking +3

Multi-Objective Evolutionary Design of Deep Convolutional Neural Networks for Image Classification

1 code implementation 3 Dec 2019 Zhichao Lu, Ian Whalen, Yashesh Dhebar, Kalyanmoy Deb, Erik Goodman, Wolfgang Banzhaf, Vishnu Naresh Boddeti

While existing approaches have achieved competitive performance in image classification, they are not well suited to problems where the computational budget is limited for two reasons: (1) the obtained architectures are either solely optimized for classification performance, or only for one deployment scenario; (2) the search process requires vast computational resources in most approaches.

Classification Computational Efficiency +4
