no code implementations • 29 Aug 2024 • Zaiwei Zhang, Gregory P. Meyer, Zhichao Lu, Ashish Shrivastava, Avinash Ravichandran, Eric M. Wolff
To our knowledge, this work is the first to utilize knowledge distillation with text supervision generated by an off-the-shelf VLM and apply it to vanilla randomly initialized vision encoders.
no code implementations • 27 Aug 2024 • Shuaijie Shen, Chao Wang, Renzhuo Huang, Yan Zhong, Qinghai Guo, Zhichao Lu, JianGuo Zhang, Luziwei Leng
Known for their low energy consumption, spiking neural networks (SNNs) have gained a lot of attention over the past decades.
no code implementations • 21 Aug 2024 • Xun Zhou, Liang Feng, Xingyu Wu, Zhichao Lu, Kay Chen Tan
In LAPT, an LLM is applied to automatically infer design principles from a set of given architectures, and a principle adaptation method then refines these principles progressively based on new search results.
no code implementations • 15 Jul 2024 • Rui Zhang, Fei Liu, Xi Lin, Zhenkun Wang, Zhichao Lu, Qingfu Zhang
Automated heuristic design (AHD) has gained considerable attention for its potential to automate the development of effective heuristics.
2 code implementations • 7 Jul 2024 • Zhonghang Liu, Panzhong Lu, Guoyang Xie, Zhichao Lu, Wen-Yan Lin
In the realm of unsupervised image outlier detection, assigning outlier scores holds greater significance than its subsequent task: thresholding for predicting labels.
no code implementations • 18 Jun 2024 • Shuaijie Shen, Rui Zhang, Chao Wang, Renzhuo Huang, Aiersi Tuerhong, Qinghai Guo, Zhichao Lu, JianGuo Zhang, Luziwei Leng
Spiking neural networks (SNNs) are gaining increasing attention as potential computationally efficient alternatives to traditional artificial neural networks (ANNs).
1 code implementation • 30 Apr 2024 • Jie Hu, Yawen Huang, Yilin Lu, Guoyang Xie, Guannan Jiang, Yefeng Zheng, Zhichao Lu
The AnomalyXFusion framework comprises two distinct yet synergistic modules: the Multi-modal In-Fusion (MIF) module and the Dynamic Dif-Fusion (DDF) module.
1 code implementation • 29 Apr 2024 • Zhuohao Li, Guoyang Xie, Guannan Jiang, Zhichao Lu
The Transformer has recently emerged as the de facto model for computer vision tasks and has also been successfully applied to shadow removal.
Ranked #3 on Shadow Removal on ISTD+
1 code implementation • 25 Apr 2024 • Yifan Zhao, Zhenyu Liang, Zhichao Lu, Ran Cheng
To bridge the gap, we introduce a tailored streamline to transform the task of HW-NAS for real-time semantic segmentation into standard MOPs.
3 code implementations • 4 Jan 2024 • Fei Liu, Xialiang Tong, Mingxuan Yuan, Xi Lin, Fu Luo, Zhenkun Wang, Zhichao Lu, Qingfu Zhang
EoH represents the ideas of heuristics in natural language, termed thoughts.
no code implementations • 12 Aug 2023 • Zhichao Lu, Chuntao Ding, Shangguang Wang, Ran Cheng, Felix Juefei-Xu, Vishnu Naresh Boddeti
However, the limited resources available on LEO satellites contrast with the demands of resource-intensive CNN models, necessitating the adoption of ground-station server assistance for training and updating these models.
no code implementations • CVPR 2023 • Chuntao Ding, Zhichao Lu, Shangguang Wang, Ran Cheng, Vishnu Naresh Boddeti
Our key idea is to employ non-learnable primitives to extract a diverse set of task-agnostic features and recombine them into a shared branch common to all tasks and explicit task-specific branches reserved for each task.
no code implementations • 15 Feb 2023 • Zhichao Lu, Chuntao Ding, Felix Juefei-Xu, Vishnu Naresh Boddeti, Shangguang Wang, Yun Yang
The high performance and small number of model parameters and FLOPs of TFormer are attributed to the proposed hybrid layer and the proposed partially connected feed-forward network (PCS-FFN).
1 code implementation • CVPR 2023 • Shihua Huang, Zhichao Lu, Kalyanmoy Deb, Vishnu Naresh Boddeti
Then we design a robust residual block, dubbed RobustResBlock, and a compound scaling rule, dubbed RobustScaling, to distribute depth and width at the desired FLOP count.
1 code implementation • 21 Dec 2022 • Shihua Huang, Zhichao Lu, Kalyanmoy Deb, Vishnu Naresh Boddeti
In contrast, little attention was devoted to analyzing the role of architectural elements (such as topology, depth, and width) on adversarial robustness.
no code implementations • 14 Aug 2022 • Zhichao Lu, Ran Cheng, Shihua Huang, Haoming Zhang, Changxiao Qiu, Fan Yang
The main challenges of applying NAS to semantic segmentation arise from two aspects: (i) the high-resolution images to be processed; (ii) the additional requirement of real-time inference speed (i.e., real-time semantic segmentation) for applications such as autonomous driving.
2 code implementations • 8 Aug 2022 • Zhichao Lu, Ran Cheng, Yaochu Jin, Kay Chen Tan, Kalyanmoy Deb
From an optimization point of view, the NAS tasks involving multiple design criteria are intrinsically multiobjective optimization problems; hence, it is reasonable to adopt evolutionary multiobjective optimization (EMO) algorithms for tackling them.
no code implementations • 17 Jun 2022 • Teng Wang, Wenhao Jiang, Zhichao Lu, Feng Zheng, Ran Cheng, Chengguo Yin, Ping Luo
Existing vision-language pre-training (VLP) methods primarily rely on paired image-text datasets, which are either annotated with enormous human labor or crawled from the internet and then put through elaborate data cleaning.
no code implementations • 13 Apr 2022 • Teng Wang, Zhu Liu, Feng Zheng, Zhichao Lu, Ran Cheng, Ping Luo
This report describes the details of our approach for the event dense-captioning task in ActivityNet Challenge 2021.
no code implementations • 14 Feb 2022 • Junde Wu, Huihui Fang, Fei Li, Huazhu Fu, Fengbin Lin, Jiongcheng Li, Lexing Huang, Qinji Yu, Sifan Song, Xinxing Xu, Yanyu Xu, Wensai Wang, Lingxiao Wang, Shuai Lu, Huiqi Li, Shihua Huang, Zhichao Lu, Chubin Ou, Xifei Wei, Bingyuan Liu, Riadh Kobbi, Xiaoying Tang, Li Lin, Qiang Zhou, Qiang Hu, Hrvoje Bogunovic, José Ignacio Orlando, Xiulan Zhang, Yanwu Xu
However, although numerous algorithms based on fundus images or OCT volumes have been proposed for computer-aided diagnosis, there are still few methods that leverage both modalities for glaucoma assessment.
1 code implementation • CVPR 2022 • Shen Yan, Xuehan Xiong, Anurag Arnab, Zhichao Lu, Mi Zhang, Chen Sun, Cordelia Schmid
Video understanding requires reasoning at multiple spatiotemporal resolutions -- from short fine-grained motions to events taking place over longer durations.
Ranked #5 on Action Recognition on EPIC-KITCHENS-100 (using extra training data)
no code implementations • 8 Oct 2021 • Shengran Hu, Ran Cheng, Cheng He, Zhichao Lu, Jing Wang, Miao Zhang
For the automated design of high-performance deep convolutional neural networks (CNNs), Neural Architecture Search (NAS) is becoming increasingly important in both academia and industry. Because evaluating performance requires costly stochastic gradient descent (SGD) training of CNNs, most existing NAS methods are computationally expensive for real-world deployments.
2 code implementations • ICCV 2021 • Teng Wang, Ruimao Zhang, Zhichao Lu, Feng Zheng, Ran Cheng, Ping Luo
Dense video captioning aims to generate multiple associated captions with their temporal locations from the video.
Ranked #5 on Dense Video Captioning on YouCook2
3 code implementations • ICCV 2021 • Shihua Huang, Zhichao Lu, Ran Cheng, Cheng He
Recent advancements in deep neural networks have produced remarkable leaps forward in dense image prediction.
Ranked #22 on Semantic Segmentation on ADE20K val
3 code implementations • ICCV 2021 • Vighnesh Birodkar, Zhichao Lu, Siyang Li, Vivek Rathod, Jonathan Huang
Under this family, we study Mask R-CNN and discover that instead of its default strategy of training the mask-head with a combination of proposals and groundtruth boxes, training the mask-head with only groundtruth boxes dramatically improves its performance on novel classes.
no code implementations • 11 Jan 2021 • Kunpeng Li, Zizhao Zhang, Guanhang Wu, Xuehan Xiong, Chen-Yu Lee, Zhichao Lu, Yun Fu, Tomas Pfister
To address this issue, we introduce a new method for pre-training video action recognition models using queried web videos.
no code implementations • 27 Nov 2020 • Shengran Hu, Ran Cheng, Cheng He, Zhichao Lu
In the recent past, neural architecture search (NAS) has attracted increasing attention from both academia and industry.
no code implementations • 28 Sep 2020 • Yinxiao Li, Zhichao Lu, Xuehan Xiong, Jonathan Huang
In recent years, many works in the video action recognition literature have shown that two-stream models (combining spatial and temporal input streams) are necessary for achieving state-of-the-art performance.
Ranked #5 on Action Recognition on UCF101
no code implementations • 29 Jul 2020 • Jonathan C. Stroud, Zhichao Lu, Chen Sun, Jia Deng, Rahul Sukthankar, Cordelia Schmid, David A. Ross
Based on this observation, we propose to use text as a method for learning video representations.
1 code implementation • ECCV 2020 • Zhichao Lu, Kalyanmoy Deb, Erik Goodman, Wolfgang Banzhaf, Vishnu Naresh Boddeti
In this paper, we propose an efficient NAS algorithm for generating task-specific models that are competitive under multiple competing objectives.
Ranked #17 on Neural Architecture Search on ImageNet
2 code implementations • 12 May 2020 • Zhichao Lu, Gautam Sreekumar, Erik Goodman, Wolfgang Banzhaf, Kalyanmoy Deb, Vishnu Naresh Boddeti
At the same time, the architecture search and transfer is orders of magnitude more efficient than existing NAS methods.
Ranked #1 on Neural Architecture Search on STL-10
no code implementations • CVPR 2020 • Mahyar Najibi, Guangda Lai, Abhijit Kundu, Zhichao Lu, Vivek Rathod, Thomas Funkhouser, Caroline Pantofaru, David Ross, Larry S. Davis, Alireza Fathi
In contrast, we propose a general-purpose method that works on both indoor and outdoor scenes.
1 code implementation • CVPR 2020 • Zhichao Lu, Kalyanmoy Deb, Vishnu Naresh Boddeti
To overcome this limitation, we present MUXConv, a layer that is designed to increase the flow of information by progressively multiplexing channel and spatial information in the network, while mitigating computational complexity.
Ranked #4 on Pneumonia Detection on ChestX-ray14
1 code implementation • CVPR 2020 • Zhichao Lu, Vivek Rathod, Ronny Votel, Jonathan Huang
Traditionally, multi-object tracking and object detection are performed using separate systems, with most prior works focusing exclusively on one of these aspects over the other.
Ranked #1 on Multiple Object Tracking on Waymo Open Dataset
1 code implementation • 3 Dec 2019 • Zhichao Lu, Ian Whalen, Yashesh Dhebar, Kalyanmoy Deb, Erik Goodman, Wolfgang Banzhaf, Vishnu Naresh Boddeti
While existing approaches have achieved competitive performance in image classification, they are not well suited to problems where the computational budget is limited for two reasons: (1) the obtained architectures are either solely optimized for classification performance, or only for one deployment scenario; (2) the search process requires vast computational resources in most approaches.
Ranked #1 on Pneumonia Detection on ChestX-ray14
2 code implementations • 8 Oct 2018 • Zhichao Lu, Ian Whalen, Vishnu Boddeti, Yashesh Dhebar, Kalyanmoy Deb, Erik Goodman, Wolfgang Banzhaf
This paper introduces NSGA-Net -- an evolutionary approach for neural architecture search (NAS).
no code implementations • 27 Sep 2018 • Zhichao Lu, Ian Whalen, Vishnu Boddeti, Yashesh Dhebar, Kalyanmoy Deb, Erik Goodman, Wolfgang Banzhaf
This paper introduces NSGA-Net, an evolutionary approach for neural architecture search (NAS).
no code implementations • CVPR 2014 • Jing Li, Zhichao Lu, Gang Zeng, Rui Gan, Hongbin Zha
This paper describes a patchwork assembly algorithm for depth image super-resolution.