no code implementations • 15 Feb 2023 • Zhichao Lu, Chuntao Ding, Felix Juefei-Xu, Vishnu Naresh Boddeti, Shangguang Wang, Yun Yang
The high performance and small number of model parameters and FLOPs of TFormer are attributed to the proposed hybrid layer and the proposed partially connected feed-forward network (PCS-FFN).
1 code implementation • CVPR 2023 • Shihua Huang, Zhichao Lu, Kalyanmoy Deb, Vishnu Naresh Boddeti
Then we design a robust residual block, dubbed RobustResBlock, and a compound scaling rule, dubbed RobustScaling, to distribute depth and width at the desired FLOP count.
no code implementations • CVPR 2023 • Chuntao Ding, Zhichao Lu, Shangguang Wang, Ran Cheng, Vishnu Naresh Boddeti
Our key idea is to employ non-learnable primitives to extract a diverse set of task-agnostic features and recombine them into a shared branch common to all tasks and explicit task-specific branches reserved for each task.
1 code implementation • 21 Dec 2022 • Shihua Huang, Zhichao Lu, Kalyanmoy Deb, Vishnu Naresh Boddeti
In contrast, little attention was devoted to analyzing the role of architectural elements (such as topology, depth, and width) on adversarial robustness.
no code implementations • 14 Aug 2022 • Zhichao Lu, Ran Cheng, Shihua Huang, Haoming Zhang, Changxiao Qiu, Fan Yang
The main challenges of applying NAS to semantic segmentation arise from two aspects: (i) the high-resolution images to be processed and (ii) the additional requirement of real-time inference speed (i.e., real-time semantic segmentation) for applications such as autonomous driving.
1 code implementation • 8 Aug 2022 • Zhichao Lu, Ran Cheng, Yaochu Jin, Kay Chen Tan, Kalyanmoy Deb
From an optimization point of view, the NAS tasks involving multiple design criteria are intrinsically multiobjective optimization problems; hence, it is reasonable to adopt evolutionary multiobjective optimization (EMO) algorithms for tackling them.
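As a rough illustration of why multiple design criteria lead to a multiobjective formulation, the sketch below filters a set of candidate architectures down to its non-dominated (Pareto-optimal) subset. It is a generic dominance check, not the algorithm from the paper; the function names `dominates` and `pareto_front` and the candidate values (error in percent, FLOPs in millions) are hypothetical and chosen only for the example. Both objectives are assumed to be minimized.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if `a` is no worse than `b` in every objective and
    strictly better in at least one (all objectives are minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: List[Sequence[float]]) -> List[int]:
    """Return the indices of the points that no other point dominates."""
    return [
        i
        for i, p in enumerate(points)
        if not any(dominates(q, p) for j, q in enumerate(points) if j != i)
    ]


# Hypothetical candidates: (classification error in %, FLOPs in millions).
candidates = [
    (7.0, 600.0),
    (8.5, 300.0),
    (9.0, 350.0),   # dominated by (8.5, 300.0)
    (12.0, 150.0),
    (7.0, 650.0),   # dominated by (7.0, 600.0)
]

front = pareto_front(candidates)
print(front)
```

No single candidate is best on both objectives, so the search returns a set of trade-offs rather than one architecture. EMO algorithms build on this dominance relation, adding selection and variation operators that this sketch omits.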
1 code implementation • 17 Jun 2022 • Teng Wang, Wenhao Jiang, Zhichao Lu, Feng Zheng, Ran Cheng, Chengguo Yin, Ping Luo
Existing vision-language pre-training (VLP) methods primarily rely on paired image-text datasets, which are either annotated with enormous human labor or crawled from the internet and then cleaned with elaborate data-cleaning techniques.
no code implementations • 13 Apr 2022 • Teng Wang, Zhu Liu, Feng Zheng, Zhichao Lu, Ran Cheng, Ping Luo
This report describes the details of our approach for the event dense-captioning task in the ActivityNet Challenge 2021.
no code implementations • 14 Feb 2022 • Junde Wu, Huihui Fang, Fei Li, Huazhu Fu, Fengbin Lin, Jiongcheng Li, Lexing Huang, Qinji Yu, Sifan Song, Xinxing Xu, Yanyu Xu, Wensai Wang, Lingxiao Wang, Shuai Lu, Huiqi Li, Shihua Huang, Zhichao Lu, Chubin Ou, Xifei Wei, Bingyuan Liu, Riadh Kobbi, Xiaoying Tang, Li Lin, Qiang Zhou, Qiang Hu, Hrvoje Bogunovic, José Ignacio Orlando, Xiulan Zhang, Yanwu Xu
However, although numerous algorithms based on fundus images or OCT volumes have been proposed for computer-aided diagnosis, there are still few methods that leverage both modalities for glaucoma assessment.
1 code implementation • CVPR 2022 • Shen Yan, Xuehan Xiong, Anurag Arnab, Zhichao Lu, Mi Zhang, Chen Sun, Cordelia Schmid
Video understanding requires reasoning at multiple spatiotemporal resolutions -- from short fine-grained motions to events taking place over longer durations.
Ranked #2 on Action Recognition on EPIC-KITCHENS-100 (using extra training data)
no code implementations • 8 Oct 2021 • Shengran Hu, Ran Cheng, Cheng He, Zhichao Lu, Jing Wang, Miao Zhang
For the automated design of high-performance deep convolutional neural networks (CNNs), Neural Architecture Search (NAS) is becoming increasingly important for both academia and industry. Because evaluating each candidate CNN requires costly stochastic gradient descent (SGD) training, most existing NAS methods are too computationally expensive for real-world deployments.
1 code implementation • ICCV 2021 • Teng Wang, Ruimao Zhang, Zhichao Lu, Feng Zheng, Ran Cheng, Ping Luo
Dense video captioning aims to generate multiple captions, each associated with its temporal location, from a video.
Ranked #2 on Dense Video Captioning on YouCook2
3 code implementations • ICCV 2021 • Shihua Huang, Zhichao Lu, Ran Cheng, Cheng He
Recent advancements in deep neural networks have led to remarkable leaps forward in dense image prediction.
Ranked #21 on Semantic Segmentation on ADE20K val
3 code implementations • ICCV 2021 • Vighnesh Birodkar, Zhichao Lu, Siyang Li, Vivek Rathod, Jonathan Huang
Under this family, we study Mask R-CNN and discover that instead of its default strategy of training the mask-head with a combination of proposals and groundtruth boxes, training the mask-head with only groundtruth boxes dramatically improves its performance on novel classes.
no code implementations • 11 Jan 2021 • Kunpeng Li, Zizhao Zhang, Guanhang Wu, Xuehan Xiong, Chen-Yu Lee, Zhichao Lu, Yun Fu, Tomas Pfister
To address this issue, we introduce a new method for pre-training video action recognition models using queried web videos.
no code implementations • 27 Nov 2020 • Shengran Hu, Ran Cheng, Cheng He, Zhichao Lu
In the recent past, neural architecture search (NAS) has attracted increasing attention from both academia and industry.
no code implementations • 28 Sep 2020 • Yinxiao Li, Zhichao Lu, Xuehan Xiong, Jonathan Huang
In recent years, many works in the video action recognition literature have shown that two-stream models (combining spatial and temporal input streams) are necessary for achieving state-of-the-art performance.
Ranked #4 on Action Recognition on UCF101
no code implementations • 29 Jul 2020 • Jonathan C. Stroud, Zhichao Lu, Chen Sun, Jia Deng, Rahul Sukthankar, Cordelia Schmid, David A. Ross
Based on this observation, we propose to use text as a method for learning video representations.
1 code implementation • ECCV 2020 • Zhichao Lu, Kalyanmoy Deb, Erik Goodman, Wolfgang Banzhaf, Vishnu Naresh Boddeti
In this paper, we propose an efficient NAS algorithm for generating task-specific models that are competitive under multiple competing objectives.
Ranked #17 on Neural Architecture Search on ImageNet
2 code implementations • 12 May 2020 • Zhichao Lu, Gautam Sreekumar, Erik Goodman, Wolfgang Banzhaf, Kalyanmoy Deb, Vishnu Naresh Boddeti
At the same time, the architecture search and transfer are orders of magnitude more efficient than existing NAS methods.
Ranked #1 on Neural Architecture Search on DTD
no code implementations • CVPR 2020 • Mahyar Najibi, Guangda Lai, Abhijit Kundu, Zhichao Lu, Vivek Rathod, Thomas Funkhouser, Caroline Pantofaru, David Ross, Larry S. Davis, Alireza Fathi
In contrast, we propose a general-purpose method that works on both indoor and outdoor scenes.
1 code implementation • CVPR 2020 • Zhichao Lu, Kalyanmoy Deb, Vishnu Naresh Boddeti
To overcome this limitation, we present MUXConv, a layer that is designed to increase the flow of information by progressively multiplexing channel and spatial information in the network, while mitigating computational complexity.
Ranked #4 on Pneumonia Detection on ChestX-ray14
1 code implementation • CVPR 2020 • Zhichao Lu, Vivek Rathod, Ronny Votel, Jonathan Huang
Traditionally, multi-object tracking and object detection are performed using separate systems, with most prior works focusing exclusively on one of these aspects over the other.
Ranked #1 on Multiple Object Tracking on Waymo Open Dataset
1 code implementation • 3 Dec 2019 • Zhichao Lu, Ian Whalen, Yashesh Dhebar, Kalyanmoy Deb, Erik Goodman, Wolfgang Banzhaf, Vishnu Naresh Boddeti
While existing approaches have achieved competitive performance in image classification, they are not well suited to problems with a limited computational budget, for two reasons: (1) the obtained architectures are optimized either solely for classification performance or for only one deployment scenario; (2) in most approaches, the search process requires vast computational resources.
Ranked #1 on Pneumonia Detection on ChestX-ray14
2 code implementations • 8 Oct 2018 • Zhichao Lu, Ian Whalen, Vishnu Boddeti, Yashesh Dhebar, Kalyanmoy Deb, Erik Goodman, Wolfgang Banzhaf
This paper introduces NSGA-Net -- an evolutionary approach for neural architecture search (NAS).
no code implementations • 27 Sep 2018 • Zhichao Lu, Ian Whalen, Vishnu Boddeti, Yashesh Dhebar, Kalyanmoy Deb, Erik Goodman, Wolfgang Banzhaf
This paper introduces NSGA-Net, an evolutionary approach for neural architecture search (NAS).
no code implementations • CVPR 2014 • Jing Li, Zhichao Lu, Gang Zeng, Rui Gan, Hongbin Zha
This paper describes a patchwork assembly algorithm for depth image super-resolution.