2 code implementations • 24 Jun 2020 • Ming Lin, Hesen Chen, Xiuyu Sun, Qi Qian, Hao Li, Rong Jin
To address this issue, we propose a general principle for designing GPU-efficient networks based on extensive empirical studies.
2 code implementations • 1 Feb 2021 • Ming Lin, Pichao Wang, Zhenhong Sun, Hesen Chen, Xiuyu Sun, Qi Qian, Hao Li, Rong Jin
Compared with previous NAS methods, the proposed Zen-NAS is orders of magnitude faster on multiple server-side and mobile-side GPU platforms, with state-of-the-art accuracy on ImageNet.
Ranked #2 on Neural Architecture Search on ImageNet
2 code implementations • ICCV 2021 • Ming Lin, Pichao Wang, Zhenhong Sun, Hesen Chen, Xiuyu Sun, Qi Qian, Hao Li, Rong Jin
To address this issue, instead of using an accuracy predictor, we propose a novel zero-shot index dubbed Zen-Score to rank the architectures.
1 code implementation • 26 Nov 2021 • Zhenhong Sun, Ming Lin, Xiuyu Sun, Zhiyu Tan, Hao Li, Rong Jin
Recent research attempts to reduce this cost by optimizing the backbone architecture with the help of Neural Architecture Search (NAS).
Ranked #88 on Object Detection on COCO minival
1 code implementation • NeurIPS 2022 • Zhenhong Sun, Ce Ge, Junyan Wang, Ming Lin, Hesen Chen, Hao Li, Xiuyu Sun
Deploying deep convolutional neural networks on Internet-of-Things (IoT) devices is challenging due to the limited computational resources, such as limited SRAM memory and Flash storage.
1 code implementation • CVPR 2023 • Xuan Shen, Yaohua Wang, Ming Lin, Yilun Huang, Hao Tang, Xiuyu Sun, Yanzhi Wang
To this end, a novel framework termed Mathematical Architecture Design for Deep CNN (DeepMAD) is proposed to design high-performance CNN models in a principled way.
Ranked #1 on Neural Architecture Search on ImageNet
1 code implementation • 5 Mar 2023 • Junyan Wang, Zhenhong Sun, Yichen Qian, Dong Gong, Xiuyu Sun, Ming Lin, Maurice Pagnucco, Yang Song
In this work, we propose to automatically design efficient 3D CNN architectures via a novel training-free neural architecture search approach tailored for 3D CNNs considering the model complexity.
Ranked #84 on Action Recognition on Something-Something V2
1 code implementation • NeurIPS 2019 • Junbang Liang, Ming Lin, Vladlen Koltun
We propose a differentiable cloth simulator that can be embedded as a layer in deep neural networks.
2 code implementations • ICLR 2022 • Yichen Qian, Ming Lin, Xiuyu Sun, Zhiyu Tan, Rong Jin
One critical component in lossy deep image compression is the entropy model, which predicts the probability distribution of the quantized latent representation in the encoding and decoding modules.
2 code implementations • ICLR 2022 • Yiqi Jiang, Zhiyu Tan, Junyan Wang, Xiuyu Sun, Ming Lin, Hao Li
This heavy-backbone design paradigm is mostly due to the historical legacy when transferring image recognition models to object detection rather than an end-to-end optimized design for object detection.
2 code implementations • ICLR 2022 • Yaohua Wang, Yaobin Zhang, Fangyi Zhang, Ming Lin, Yuqi Zhang, Senzhang Wang, Xiuyu Sun
In Ada-NETS, each face is transformed to a new structure space, obtaining robust features by considering face features of the neighbour images.
1 code implementation • CVPR 2023 • Shuning Chang, Pichao Wang, Ming Lin, Fan Wang, David Junhao Zhang, Rong Jin, Mike Zheng Shou
In this work, we propose a novel Semantic Token ViT (STViT) for efficient global and local vision transformers, which can also be revised to serve as a backbone for downstream tasks.
1 code implementation • 19 Feb 2020 • Yonathan Aflalo, Asaf Noy, Ming Lin, Itamar Friedman, Lihi Zelnik
Through this we produce compact architectures with the same FLOPs as EfficientNet-B0 and MobileNetV3 but with higher accuracy, by $1\%$ and $0.3\%$ respectively on ImageNet, and faster runtime on GPU.
Ranked #3 on Network Pruning on ImageNet
2 code implementations • ICLR 2021 • Yichen Qian, Zhiyu Tan, Xiuyu Sun, Ming Lin, Dongyang Li, Zhenhong Sun, Hao Li, Rong Jin
In this work, we propose a novel Global Reference Model for image compression to effectively leverage both the local and the global context information, leading to an enhanced compression rate.
1 code implementation • 28 May 2021 • Pichao Wang, Xue Wang, Fan Wang, Ming Lin, Shuning Chang, Hao Li, Rong Jin
A key component in vision transformers is the fully-connected self-attention, which is more powerful than CNNs in modelling long-range dependencies.
1 code implementation • 28 Oct 2022 • Jiaqi Leng, Yuxiang Peng, Yi-Ling Qiao, Ming Lin, Xiaodi Wu
We formulate the first differentiable analog quantum computing framework with a specific parameterization design at the analog signal (pulse) level to better exploit near-term quantum devices via variational methods.
1 code implementation • 5 Jul 2023 • Guihong Li, Duc Hoang, Kartikeya Bhardwaj, Ming Lin, Zhangyang Wang, Radu Marculescu
Recently, zero-shot (or training-free) Neural Architecture Search (NAS) approaches have been proposed to liberate NAS from the expensive training process.
3 code implementations • 15 Mar 2023 • Divya Kothandaraman, Tianyi Zhou, Ming Lin, Dinesh Manocha
Aerial Diffusion leverages a pretrained text-image diffusion model for prior knowledge.
1 code implementation • 21 Mar 2022 • Divya Kothandaraman, Tianrui Guan, Xijun Wang, Sean Hu, Ming Lin, Dinesh Manocha
Our formulation uses a novel Fourier object disentanglement method to innately separate out the human agent (which is typically small) from the background.
Ranked #1 on Action Recognition on UAV Human
1 code implementation • 8 Oct 2022 • Yaohua Wang, Fangyi Zhang, Ming Lin, Senzhang Wang, Xiuyu Sun, Rong Jin
A natural way to construct a graph among images is to treat each image as a node and assign pairwise image similarities as weights to corresponding edges.
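The graph construction described in this abstract can be sketched in a few lines. This is only an illustration under stated assumptions, not the paper's method: cosine similarity is assumed as the pairwise measure, and the function names and toy feature vectors are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # assumption: a zero vector is treated as similar to nothing
    return dot / (norm_u * norm_v)


def build_similarity_graph(features):
    """Build a dense weighted adjacency matrix.

    Node i stands for image i; the weight of edge (i, j) is the pairwise
    similarity of the two images' feature vectors. Self-loops are left at 0.
    """
    n = len(features)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            w = cosine_similarity(features[i], features[j])
            weights[i][j] = w
            weights[j][i] = w  # similarity is symmetric, so the graph is undirected
    return weights


# Hypothetical 2-D features for three images.
features = [[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
graph = build_similarity_graph(features)
```

In practice such a graph is often sparsified, for example by keeping only each node's strongest edges, but that choice is outside what the sentence above states.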
2 code implementations • 27 Nov 2023 • Divya Kothandaraman, Tianyi Zhou, Ming Lin, Dinesh Manocha
It seamlessly blends the visual features from the input image within a pretrained text-to-2D-image stable diffusion model with a test-time optimization process for a careful bias-variance trade-off, which uses an Inverse Perspective Mapping (IPM) homography transformation to provide subtle cues for aerial-view synthesis.
no code implementations • 19 Jul 2017 • Xiang Li, Aoxiao Zhong, Ming Lin, Ning Guo, Mu Sun, Arkadiusz Sitek, Jieping Ye, James Thrall, Quanzheng Li
However, the development of a robust and reliable deep learning model for computer-aided diagnosis is still highly challenging due to the combination of the high heterogeneity in the medical images and the relative lack of training samples.
no code implementations • 2 Mar 2017 • Ming Lin, Shuang Qiu, Bin Hong, Jieping Ye
We show that the conventional gradient descent heuristic is biased by the skewness of the distribution and is therefore no longer the best practice for learning the SLM.
no code implementations • 17 Mar 2017 • Shuang Qiu, Tingjin Luo, Jieping Ye, Ming Lin
We study an extreme scenario in multi-label learning where each training instance is endowed with a single one-bit label out of multiple labels.
no code implementations • NeurIPS 2016 • Ming Lin, Jieping Ye
We develop an efficient alternating framework for learning a generalized version of Factorization Machine (gFM) on streaming data with provable guarantees.
no code implementations • 17 Jun 2016 • Shoou-I Yu, Yi Yang, Zhongwen Xu, Shicheng Xu, Deyu Meng, Zexi Mao, Zhigang Ma, Ming Lin, Xuanchong Li, Huan Li, Zhenzhong Lan, Lu Jiang, Alexander G. Hauptmann, Chuang Gan, Xingzhong Du, Xiaojun Chang
The large number of user-generated videos uploaded to the Internet every day has led to many commercial video search engines, which mainly rely on text metadata for search.
no code implementations • 16 Nov 2015 • Zhenzhong Lan, Shoou-I Yu, Ming Lin, Bhiksha Raj, Alexander G. Hauptmann
We approach this problem by first showing that local handcrafted features and Convolutional Neural Networks (CNNs) share the same convolution-pooling network structure.
no code implementations • 17 May 2015 • Zhenzhong Lan, Dezhong Yao, Ming Lin, Shoou-I Yu, Alexander Hauptmann
First, we propose a two-stream Stacked Convolutional Independent Subspace Analysis (ConvISA) architecture to show that unsupervised learning methods can significantly boost the performance of traditional local features extracted from data-independent models.
no code implementations • CVPR 2015 • Zhenzhong Lan, Ming Lin, Xuanchong Li, Alexander G. Hauptmann, Bhiksha Raj
MIFS compensates for information lost from using differential operators by recapturing information at coarse scales.
no code implementations • 13 Feb 2015 • Zhenzhong Lan, Xuanchong Li, Ming Lin, Alexander G. Hauptmann
Therefore, they need to occur frequently enough in the videos and to be able to tell the difference among different types of motions.
no code implementations • 30 Jan 2019 • Ming Lin, Shuang Qiu, Jieping Ye, Xiaomin Song, Qi Qian, Liang Sun, Shenghuo Zhu, Rong Jin
This bound is sub-optimal compared to the information-theoretic lower bound $\mathcal{O}(kd)$.
no code implementations • 3 Jun 2019 • Ming Lin, Xiaomin Song, Qi Qian, Hao Li, Liang Sun, Shenghuo Zhu, Rong Jin
We validate the superiority of the proposed method in our real-time high precision positioning system against several popular state-of-the-art robust regression methods.
no code implementations • 23 Jul 2020 • Shivam Akhauri, Laura Zheng, Ming Lin
Simulation data can be utilized to extend real-world driving data in order to cover edge cases, such as vehicle accidents.
no code implementations • 1 Jan 2021 • Yu Shen, Laura Yu Zheng, Manli Shu, Weizi Li, Tom Goldstein, Ming Lin
To ensure the wide adoption and safety of autonomous driving, the vehicles need to be able to drive under various lighting, weather, and visibility conditions in different environments.
no code implementations • 3 Oct 2020 • Yi Xu, Asaf Noy, Ming Lin, Qi Qian, Hao Li, Rong Jin
To this end, we develop two novel algorithms, termed "AugDrop" and "MixLoss", to correct the data bias in the data augmentation.
no code implementations • 12 Oct 2020 • Jian Liang, Kun Chen, Ming Lin, ChangShui Zhang, Fei Wang
FMR is an effective scheme for handling sample heterogeneity, where a single regression model is not enough for capturing the complexities of the conditional distribution of the observed samples given the features.
no code implementations • 15 Mar 2021 • Shivam Akhauri, Laura Zheng, Tom Goldstein, Ming Lin
Practical learning-based autonomous driving models must be capable of generalizing learned behaviors from simulated to real domains, and from training data to unseen domains with unusual image properties.
no code implementations • 12 Jul 2021 • Ya Wang, Hesen Chen, Fangyi Zhang, Yaohua Wang, Xiuyu Sun, Ming Lin, Hao Li
Data augmentation is a commonly used approach to improving the generalization of deep learning models.
no code implementations • 29 Sep 2021 • Zhenhong Sun, Ming Lin, Zhiyu Tan, Xiuyu Sun, Rong Jin
Recent research attempts to reduce this cost by optimizing the backbone architecture with the help of Neural Architecture Search (NAS).
no code implementations • 29 Sep 2021 • Hanlin Chen, Ming Lin, Xiuyu Sun, Hao Li
Based on these new discoveries, we propose i) a novel hybrid zero-shot proxy which outperforms existing ones by a large margin and is transferable among popular search spaces; ii) a new index for better measuring the true performance of ZS-NAS proxies in constrained NAS.
no code implementations • 29 Sep 2021 • Hesen Chen, Ming Lin, Xiuyu Sun, Rong Jin
In this work, we propose a novel approach termed Hierarchical Cross Contrastive Learning (HCCL) to further distill the information mismatched by the conventional contrastive loss.
no code implementations • NeurIPS 2021 • Yu Shen, Laura Zheng, Manli Shu, Weizi Li, Tom Goldstein, Ming Lin
We introduce a simple yet effective framework for improving the robustness of learning algorithms against image corruptions for autonomous driving.
no code implementations • 15 Sep 2022 • Divya Kothandaraman, Ming Lin, Dinesh Manocha
We build a differentiable static-dynamic frequency mask prior to model the salient static and dynamic pixels in the video, crucial for the underlying task of action recognition.
1 code implementation • 8 Jan 2023 • Jidong Ge, Yuxiang Liu, Jie Gui, Lanting Fang, Ming Lin, James Tin-Yau Kwok, LiGuo Huang, Bin Luo
However, the relation between these two losses is not clear.
no code implementations • 9 Mar 2023 • Xuan Li, Yi-Ling Qiao, Peter Yichen Chen, Krishna Murthy Jatavallabhula, Ming Lin, Chenfanfu Jiang, Chuang Gan
In this work, we aim to identify parameters characterizing a physical system from a set of multi-view videos without any assumption on object geometry or topology.
no code implementations • 17 Aug 2023 • Xijun Wang, Anqi Liang, Junbang Liang, Ming Lin, Yu Lou, Shan Yang
Based on this notion, we propose a compatibility learning framework, a category-aware Flexible Bidirectional Transformer (FBT), for visual "scene-based set compatibility reasoning" with the cross-domain visual similarity input and auto-regressive complementary item generation.
no code implementations • 28 Nov 2023 • Shutong Zhang, Yi-Ling Qiao, Guanglei Zhu, Eric Heiden, Dylan Turpin, Jingzhou Liu, Ming Lin, Miles Macklin, Animesh Garg
We demonstrate that HandyPriors attains comparable or superior results in the pose estimation task, and that the differentiable physics module can predict contact information for pose refinement.
no code implementations • 9 Dec 2023 • Xuan Shen, Peiyan Dong, Lei Lu, Zhenglun Kong, Zhengang Li, Ming Lin, Chao Wu, Yanzhi Wang
Recent works show that 8-bit or lower weight quantization is feasible with minimal impact on end-to-end task performance, while the activation is still not quantized.
no code implementations • 13 Dec 2023 • Xijun Wang, Junbang Liang, Chun-Kai Wang, Kenan Deng, Yu Lou, Ming Lin, Shan Yang
Our VLAP model addresses both efficient frame sampling and effective cross-modal alignment in a unified way.
Ranked #1 on Video Question Answering on STAR Benchmark
no code implementations • 30 Dec 2023 • Shreelekha Revankar, Shijia Liao, Yu Shen, Junbang Liang, Huaishu Peng, Ming Lin
We perform a comprehensive analysis on the impact of camera poses on HPS reconstruction outcomes.
no code implementations • 28 Feb 2024 • Youpeng Zhao, Ming Lin, Huadong Tang, Qiang Wu, Jun Wang
Generative Large Language Models (LLMs) stand as a revolutionary advancement in the modern era of artificial intelligence (AI).