Search Results for author: Ji Lin

Found 34 papers, 20 papers with code

Tiny Machine Learning: Progress and Futures

1 code implementation · 28 Mar 2024 · Ji Lin, Ligeng Zhu, Wei-Ming Chen, Wei-Chen Wang, Song Han

By squeezing deep learning models into billions of IoT devices and microcontrollers (MCUs), we expand the scope of AI applications and enable ubiquitous intelligence.

Rule-Guided Joint Embedding Learning over Knowledge Graphs

no code implementations · 1 Dec 2023 · Qisong Li, Ji Lin, Sijia Wei, Neng Liu

Recent studies focus on embedding learning over knowledge graphs, which map entities and relations in knowledge graphs into low-dimensional vector spaces.

Knowledge Graph Embedding Knowledge Graphs

PockEngine: Sparse and Efficient Fine-tuning in a Pocket

no code implementations · 26 Oct 2023 · Ligeng Zhu, Lanxiang Hu, Ji Lin, Wei-Chen Wang, Wei-Ming Chen, Chuang Gan, Song Han

On-device learning and efficient fine-tuning enable continuous and privacy-preserving customization (e.g., locally fine-tuning large language models on personalized data).

Privacy Preserving

AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

5 code implementations · 1 Jun 2023 · Ji Lin, Jiaming Tang, Haotian Tang, Shang Yang, Xingyu Dang, Chuang Gan, Song Han

Large language models (LLMs) have shown excellent performance on various tasks, but the astronomical model size raises the hardware barrier for serving (memory size) and slows down token generation (memory bandwidth).

Common Sense Reasoning Language Modelling +1
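
The core trick in AWQ is to protect the small fraction of weight channels that see large activations: scale those weights up before round-to-nearest quantization and fold the inverse scale into the activations, which leaves the float product unchanged. A toy per-tensor INT4 sketch of that idea (not the paper's grouped quantization or released kernels; the 1% saliency cutoff and s=2 are illustrative defaults):

```python
import numpy as np

def rtn_quantize(W, n_bits=4):
    """Naive round-to-nearest quantization with one scale per tensor."""
    qmax = 2 ** (n_bits - 1) - 1
    step = np.abs(W).max() / qmax
    return np.round(W / step) * step

def awq_sketch(X, W, s=2.0, n_bits=4):
    """Scale the ~1% most activation-salient input channels up by s before
    quantizing, then fold 1/s into the activations; the float product
    X @ W is unchanged, but the salient weights keep more precision."""
    act_mag = np.abs(X).mean(axis=0)                  # per-input-channel saliency
    salient = act_mag >= np.quantile(act_mag, 0.99)
    scales = np.where(salient, s, 1.0)
    Wq = rtn_quantize(W * scales[:, None], n_bits)
    return (X / scales) @ Wq

rng = np.random.default_rng(0)
X = rng.normal(size=(32, 100))
X[:, 7] *= 100                                        # channel 7 dominates activations
W = rng.normal(size=(100, 16))
Y = X @ W                                             # float reference output
err_awq = np.abs(Y - awq_sketch(X, W)).mean()
err_rtn = np.abs(Y - X @ rtn_quantize(W)).mean()
assert err_awq < err_rtn  # protecting the salient channel lowers output error
```

Because the scales are folded into the preceding activations, no extra runtime multiply is needed in the deployed model.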

Offsite-Tuning: Transfer Learning without Full Model

1 code implementation · 9 Feb 2023 · Guangxuan Xiao, Ji Lin, Song Han

In this paper, we propose Offsite-Tuning, a privacy-preserving and efficient transfer learning framework that can adapt billion-parameter foundation models to downstream data without access to the full model.

Privacy Preserving Transfer Learning

SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models

3 code implementations · 18 Nov 2022 · Guangxuan Xiao, Ji Lin, Mickael Seznec, Hao Wu, Julien Demouth, Song Han

We propose SmoothQuant, a training-free, accuracy-preserving, and general-purpose post-training quantization (PTQ) solution to enable 8-bit weight, 8-bit activation (W8A8) quantization for LLMs.

Quantization
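
The smoothing step that makes W8A8 tractable is easy to state: migrate quantization difficulty from activation outlier channels into the weights via a per-channel scale controlled by a migration strength alpha. A minimal numpy paraphrase of that equation (not the released CUDA kernels; shapes and alpha=0.5 are illustrative):

```python
import numpy as np

def smooth(X, W, alpha=0.5):
    """Per-channel smoothing: s_j = max|X_j|^alpha / max|W_j|^(1-alpha).
    Dividing activations and multiplying weights by s leaves X @ W exactly
    unchanged, but flattens the activation outliers that break W8A8 PTQ."""
    act_max = np.abs(X).max(axis=0)        # per input channel, over tokens
    w_max = np.abs(W).max(axis=1)          # per input channel, over outputs
    s = act_max ** alpha / w_max ** (1 - alpha)
    return X / s, W * s[:, None]

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 8))
X[:, 3] *= 50                              # one outlier channel, LLM-style
W = rng.normal(size=(8, 4))
Xs, Ws = smooth(X, W)
assert np.allclose(X @ W, Xs @ Ws)         # mathematically equivalent
assert np.abs(Xs).max() < np.abs(X).max()  # activation range is much flatter
```

In practice the scales are folded offline into the previous layer's parameters, so smoothing adds no inference overhead.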

Efficient Spatially Sparse Inference for Conditional GANs and Diffusion Models

1 code implementation · 3 Nov 2022 · Muyang Li, Ji Lin, Chenlin Meng, Stefano Ermon, Song Han, Jun-Yan Zhu

With about $1\%$-area edits, SIGE accelerates DDPM by $3.0\times$ on NVIDIA RTX 3090 and $4.6\times$ on Apple M1 Pro GPU, Stable Diffusion by $7.2\times$ on 3090, and GauGAN by $5.6\times$ on 3090 and $5.2\times$ on M1 Pro GPU.
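
The engine exploits the fact that an interactive edit touches only a small region, so cached outputs can be reused everywhere else. A much-simplified, pointwise stand-in for that idea (the real SIGE handles convolutional receptive fields and GPU gather/scatter; here `f` must be elementwise for tile reuse to be exact):

```python
import numpy as np

def sparse_edit_inference(prev_out, x, edit_mask, f, tile=4):
    """Reuse cached outputs and re-run f only on tiles the edit touched.
    Valid as-is only for pointwise f; a conv would need dilated tiles."""
    out = prev_out.copy()
    H, W = x.shape
    for i in range(0, H, tile):
        for j in range(0, W, tile):
            if edit_mask[i:i + tile, j:j + tile].any():
                out[i:i + tile, j:j + tile] = f(x[i:i + tile, j:j + tile])
    return out

f = lambda t: t ** 2                     # stand-in pointwise "model"
rng = np.random.default_rng(0)
x0 = rng.normal(size=(16, 16))
x1 = x0.copy()
x1[2:5, 3:6] += 1.0                      # a small user edit
mask = x1 != x0
y = sparse_edit_inference(f(x0), x1, mask, f)
assert np.array_equal(y, f(x1))          # matches a full recompute
```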

On-Device Training Under 256KB Memory

1 code implementation · 30 Jun 2022 · Ji Lin, Ligeng Zhu, Wei-Ming Chen, Wei-Chen Wang, Chuang Gan, Song Han

To reduce the memory footprint, we propose Sparse Update to skip the gradient computation of less important layers and sub-tensors.

Quantization Transfer Learning
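
Sparse Update can be illustrated with a toy two-layer network in which only the head is on the update plan: no gradient flows into the frozen layer, so its backward pass and activation storage can be skipped. This is a hand-made illustration, not the paper's contribution-analysis-driven update plan or its compiled MCU kernels:

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(scale=0.5, size=(4, 8))     # frozen backbone layer
W2 = rng.normal(scale=0.5, size=(8, 1))     # layer selected for update
X = rng.normal(size=(32, 4))
y = X @ rng.normal(size=(4, 1))             # synthetic regression target

def loss():
    return float(np.mean((np.maximum(X @ W1, 0.0) @ W2 - y) ** 2))

loss_before = loss()
for _ in range(200):
    h = np.maximum(X @ W1, 0.0)             # forward through frozen layer
    grad_out = 2 * (h @ W2 - y) / len(X)    # dMSE/dpred
    W2 -= 0.1 * (h.T @ grad_out)            # update only the selected layer
    # backward stops here: no dL/dW1, so W1's activations need not be kept

assert loss() < loss_before                 # training still makes progress
```

The memory saving is the point: gradients and saved activations exist only for the sub-tensors on the update plan.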

Enable Deep Learning on Mobile Devices: Methods, Systems, and Applications

no code implementations · 25 Apr 2022 · Han Cai, Ji Lin, Yujun Lin, Zhijian Liu, Haotian Tang, Hanrui Wang, Ligeng Zhu, Song Han

Deep neural networks (DNNs) have achieved unprecedented success in the field of artificial intelligence (AI), including computer vision, natural language processing and speech recognition.

Model Compression Neural Architecture Search +3

Memory-efficient Patch-based Inference for Tiny Deep Learning

no code implementations NeurIPS 2021 Ji Lin, Wei-Ming Chen, Han Cai, Chuang Gan, Song Han

We further propose receptive field redistribution to shift the receptive field and FLOPs to the later stage and reduce the computation overhead.

Image Classification Neural Architecture Search +3

MCUNetV2: Memory-Efficient Patch-based Inference for Tiny Deep Learning

1 code implementation · 28 Oct 2021 · Ji Lin, Wei-Ming Chen, Han Cai, Chuang Gan, Song Han

We further propose network redistribution to shift the receptive field and FLOPs to the later stage and reduce the computation overhead.

Image Classification Neural Architecture Search +3

Network Augmentation for Tiny Deep Learning

no code implementations ICLR 2022 Han Cai, Chuang Gan, Ji Lin, Song Han

We introduce Network Augmentation (NetAug), a new training method for improving the performance of tiny neural networks.

Data Augmentation Image Classification +2

TSM: Temporal Shift Module for Efficient and Scalable Video Understanding on Edge Device

4 code implementations · 27 Sep 2021 · Ji Lin, Chuang Gan, Kuan Wang, Song Han

Secondly, TSM has high efficiency: it achieves high frame rates of 74 fps on Jetson Nano and 29 fps on Galaxy Note8 for online video recognition.

Video Recognition Video Understanding

Anycost GANs for Interactive Image Synthesis and Editing

1 code implementation CVPR 2021 Ji Lin, Richard Zhang, Frieder Ganz, Song Han, Jun-Yan Zhu

Generative adversarial networks (GANs) have enabled photorealistic image synthesis and editing.

Image Generation

Hardware-Centric AutoML for Mixed-Precision Quantization

no code implementations · 11 Aug 2020 · Kuan Wang, Zhijian Liu, Yujun Lin, Ji Lin, Song Han

Compared with conventional methods, our framework is fully automated and can specialize the quantization policy for different neural network architectures and hardware architectures.

AutoML Quantization

MCUNet: Tiny Deep Learning on IoT Devices

1 code implementation NeurIPS 2020 Ji Lin, Wei-Ming Chen, Yujun Lin, John Cohn, Chuang Gan, Song Han

Machine learning on tiny IoT devices based on microcontroller units (MCUs) is appealing but challenging: the memory of microcontrollers is 2-3 orders of magnitude smaller than even that of mobile phones.

BIG-bench Machine Learning Neural Architecture Search +1

APQ: Joint Search for Network Architecture, Pruning and Quantization Policy

1 code implementation CVPR 2020 Tianzhe Wang, Kuan Wang, Han Cai, Ji Lin, Zhijian Liu, Song Han

However, training this quantization-aware accuracy predictor requires collecting a large number of quantized <model, accuracy> pairs, which involves quantization-aware finetuning and thus is highly time-consuming.

Quantization

GAN Compression: Efficient Architectures for Interactive Conditional GANs

1 code implementation CVPR 2020 Muyang Li, Ji Lin, Yaoyao Ding, Zhijian Liu, Jun-Yan Zhu, Song Han

Directly applying existing compression methods yields poor performance due to the difficulty of GAN training and the differences in generator architectures.

Image Generation Neural Architecture Search

Training Kinetics in 15 Minutes: Large-scale Distributed Training on Videos

no code implementations · 1 Oct 2019 · Ji Lin, Chuang Gan, Song Han

With such hardware-aware model design, we are able to scale up the training on Summit supercomputer and reduce the training time on Kinetics dataset from 49 hours 55 minutes to 14 minutes 13 seconds, achieving a top-1 accuracy of 74.0%, which is 1.6x and 2.9x faster than previous 3D video models with higher accuracy.

Video Recognition

Design Automation for Efficient Deep Learning Computing

no code implementations · 24 Apr 2019 · Song Han, Han Cai, Ligeng Zhu, Ji Lin, Kuan Wang, Zhijian Liu, Yujun Lin

Moreover, we shorten the design cycle by 200x compared with previous work, so that we can afford to design specialized neural network models for different hardware platforms.

Quantization

Defensive Quantization: When Efficiency Meets Robustness

no code implementations ICLR 2019 Ji Lin, Chuang Gan, Song Han

This paper aims to raise awareness of the security of quantized models, and we design a novel quantization methodology to jointly optimize the efficiency and robustness of deep learning models.

Adversarial Attack Quantization

Joint Monocular 3D Vehicle Detection and Tracking

1 code implementation ICCV 2019 Hou-Ning Hu, Qi-Zhi Cai, Dequan Wang, Ji Lin, Min Sun, Philipp Krähenbühl, Trevor Darrell, Fisher Yu

The framework can not only associate detections of vehicles in motion over time, but also estimate their complete 3D bounding box information from a sequence of 2D images captured on a moving platform.

3D Object Detection 3D Pose Estimation +4

HAQ: Hardware-Aware Automated Quantization with Mixed Precision

11 code implementations CVPR 2019 Kuan Wang, Zhijian Liu, Yujun Lin, Ji Lin, Song Han

Compared with conventional methods, our framework is fully automated and can specialize the quantization policy for different neural network architectures and hardware architectures.

Quantization

TSM: Temporal Shift Module for Efficient Video Understanding

13 code implementations ICCV 2019 Ji Lin, Chuang Gan, Song Han

The explosive growth in video streaming gives rise to challenges on performing video understanding at high accuracy and low computation cost.

3D Action Recognition Action Classification +6
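
The module itself is nothing more than shifting a fraction of the channels one step along the temporal axis, so neighboring frames exchange information at zero extra FLOPs. A minimal numpy rendering (the paper inserts this inside the residual blocks of a 2D CNN; the 1/8 fold fraction follows the paper's default):

```python
import numpy as np

def temporal_shift(x, fold_div=8):
    """Shift a fraction of channels along the temporal axis.

    x: (batch, time, channels) features. The first 1/fold_div of the
    channels move one step back in time, the next 1/fold_div move forward,
    and the rest stay put; out-of-range positions are zero-padded."""
    out = np.zeros_like(x)
    fold = x.shape[2] // fold_div
    out[:, :-1, :fold] = x[:, 1:, :fold]                # pull from the future
    out[:, 1:, fold:2 * fold] = x[:, :-1, fold:2 * fold]  # push from the past
    out[:, :, 2 * fold:] = x[:, :, 2 * fold:]           # untouched channels
    return out

x = np.arange(2 * 4 * 8, dtype=float).reshape(2, 4, 8)
y = temporal_shift(x)
assert np.array_equal(y[:, :, 2:], x[:, :, 2:])   # 6/8 of channels unchanged
assert y[0, 0, 0] == x[0, 1, 0]                   # channel 0 pulled from t+1
assert y[0, 1, 1] == x[0, 0, 1]                   # channel 1 pushed from t-1
```

Because the shift is a pure memory movement, temporal modeling is added to a 2D backbone without increasing its compute.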

Adaptive Mixture of Low-Rank Factorizations for Compact Neural Modeling

no code implementations NIPS Workshop CDNNRIA 2018 Ting Chen, Ji Lin, Tian Lin, Song Han, Chong Wang, Denny Zhou

Modern deep neural networks have a large amount of weights, which make them difficult to deploy on computation constrained devices such as mobile phones.

Image Classification Language Modelling

Reinforcement Learning from Imperfect Demonstrations

no code implementations ICLR 2018 Yang Gao, Huazhe Xu, Ji Lin, Fisher Yu, Sergey Levine, Trevor Darrell

We propose a unified reinforcement learning algorithm, Normalized Actor-Critic (NAC), that effectively normalizes the Q-function, reducing the Q-values of actions unseen in the demonstration data.

reinforcement-learning Reinforcement Learning (RL)

AMC: AutoML for Model Compression and Acceleration on Mobile Devices

12 code implementations ECCV 2018 Yihui He, Ji Lin, Zhijian Liu, Hanrui Wang, Li-Jia Li, Song Han

Model compression is a critical technique to efficiently deploy neural network models on mobile devices which have limited computation resources and tight power budgets.

Model Compression Neural Architecture Search

Runtime Neural Pruning

no code implementations NeurIPS 2017 Ji Lin, Yongming Rao, Jiwen Lu, Jie Zhou

In this paper, we propose a Runtime Neural Pruning (RNP) framework which prunes the deep neural network dynamically at the runtime.
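
In spirit, the framework computes only the channels a per-input decision keeps and skips the rest at runtime. A deliberately simplified sketch with a hand-made importance proxy standing in for the paper's RL-trained decision agent (the full dry run used for scoring is purely for illustration and would forfeit the savings):

```python
import numpy as np

def runtime_pruned_layer(x, W, budget=0.5):
    """Per-input dynamic pruning: keep the top-k output channels ranked by
    a cheap saliency proxy and skip computing the rest for this input."""
    k = max(1, int(W.shape[1] * budget))
    scores = np.abs(x @ W).mean(axis=0)      # hypothetical importance proxy
    keep = np.argsort(scores)[-k:]
    out = np.zeros((x.shape[0], W.shape[1]))
    out[:, keep] = x @ W[:, keep]            # compute only the kept channels
    return out, keep

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 16))
W = rng.normal(size=(16, 32))
out, keep = runtime_pruned_layer(x, W)
assert len(keep) == 16                       # half the channels computed
pruned = np.setdiff1d(np.arange(32), keep)
assert np.all(out[:, pruned] == 0)           # skipped channels stay zero
```

Unlike static pruning, the kept set changes per input, so easy inputs can use less compute than hard ones.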

Learning Discriminative Aggregation Network for Video-Based Face Recognition

no code implementations ICCV 2017 Yongming Rao, Ji Lin, Jiwen Lu, Jie Zhou

In this paper, we propose a discriminative aggregation network (DAN) for video face recognition, which aims to integrate information from video frames effectively and efficiently.

Face Recognition Metric Learning

Consistent-Aware Deep Learning for Person Re-Identification in a Camera Network

no code implementations CVPR 2017 Ji Lin, Liangliang Ren, Jiwen Lu, Jianjiang Feng, Jie Zhou

In this paper, we propose a consistent-aware deep learning (CADL) framework for person re-identification in a camera network.

Person Re-Identification
