no code implementations • ECCV 2020 • Zhijian Liu, Zhanghao Wu, Chuang Gan, Ligeng Zhu, Song Han
Third, our solution is efficient on the edge since the majority of the workload is delegated to the cloud, and our mixing and de-mixing processes introduce very few extra computations.
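A minimal numpy sketch of the mix/de-mix identity this entry alludes to, restricted to a linear cloud-side operator for illustration; the variable names and this simplification are mine, not the paper's full method:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the cloud-side workload: any LINEAR operator
# (e.g., a fully-connected layer without bias).
W = rng.standard_normal((8, 8))
f = lambda X: X @ W.T

# Edge side: mix a batch of private samples with random coefficients M
# (kept on the edge); only the mixed data leaves the device.
X = rng.standard_normal((4, 8))
M = rng.standard_normal((4, 4))
X_mixed = M @ X

# Cloud computes on the mixed batch only.
Y_mixed = f(X_mixed)

# Edge de-mixes: for a linear f, f(M @ X) == M @ f(X), so de-mixing is
# one small matrix solve -- very few extra computations.
Y = np.linalg.solve(M, Y_mixed)
assert np.allclose(Y, f(X))
```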
2 code implementations • 20 Feb 2025 • Shang Yang, Junxian Guo, Haotian Tang, Qinghao Hu, Guangxuan Xiao, Jiaming Tang, Yujun Lin, Zhijian Liu, Yao Lu, Song Han
On average, LServe accelerates LLM prefilling by up to 2.9x and decoding by 1.3-2.1x over vLLM, maintaining long-context accuracy.
2 code implementations • 5 Dec 2024 • Zhijian Liu, Ligeng Zhu, Baifeng Shi, Zhuoyang Zhang, Yuming Lou, Shang Yang, Haocheng Xi, Shiyi Cao, Yuxian Gu, Dacheng Li, Xiuyu Li, Yunhao Fang, Yukang Chen, Cheng-Yu Hsieh, De-An Huang, An-Chieh Cheng, Vishwesh Nath, Jinyi Hu, Sifei Liu, Ranjay Krishna, Daguang Xu, Xiaolong Wang, Pavlo Molchanov, Jan Kautz, Hongxu Yin, Song Han, Yao Lu
This paper introduces NVILA, a family of open VLMs designed to optimize both efficiency and accuracy.
Ranked #5 on Video Question Answering on NExT-QA
no code implementations • 19 Nov 2024 • Vishwesh Nath, Wenqi Li, Dong Yang, Andriy Myronenko, Mingxin Zheng, Yao Lu, Zhijian Liu, Hongxu Yin, Yee Man Law, Yucheng Tang, Pengfei Guo, Can Zhao, Ziyue Xu, Yufan He, Greg Heinrich, Stephen Aylward, Marc Edgar, Michael Zephyr, Pavlo Molchanov, Baris Turkbey, Holger Roth, Daguang Xu
In contrast, we propose that for medical VLMs, a fourth stage of specialized IFT is necessary, which focuses on medical data and includes information from domain expert models.
1 code implementation • 19 Aug 2024 • Yukang Chen, Fuzhao Xue, Dacheng Li, Qinghao Hu, Ligeng Zhu, Xiuyu Li, Yunhao Fang, Haotian Tang, Shang Yang, Zhijian Liu, Ethan He, Hongxu Yin, Pavlo Molchanov, Jan Kautz, Linxi Fan, Yuke Zhu, Yao Lu, Song Han
We introduce the long-context Multi-Modal Sequence Parallelism (MM-SP) system that efficiently parallelizes long video training and inference, enabling 2M context length training on 256 GPUs without any gradient checkpointing.
Ranked #8 on Video Question Answering on NExT-QA
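As a toy, single-process illustration of the sequence-parallel layout behind MM-SP, here is a hedged numpy sketch: each "rank" owns one contiguous slice of the sequence and computes attention only for its local queries. The real system shards across GPUs with actual communication and handles multi-modal tokens; this only shows the data layout:

```python
import numpy as np

def attention(q, k, v):
    s = q @ k.T / np.sqrt(q.shape[-1])
    p = np.exp(s - s.max(axis=-1, keepdims=True))
    return (p / p.sum(axis=-1, keepdims=True)) @ v

rng = np.random.default_rng(0)
T, D, world = 16, 8, 4                      # sequence length, head dim, "GPUs"
Q, K, V = (rng.standard_normal((T, D)) for _ in range(3))

# Each "rank" owns one contiguous chunk of the sequence, so activation
# memory per rank scales with T / world rather than T.
chunks = [slice(r * T // world, (r + 1) * T // world) for r in range(world)]

# A rank computes attention only for its local queries; in a real system
# the K/V it reads here would arrive via ring-style communication.
out = np.concatenate([attention(Q[c], K, V) for c in chunks])
assert np.allclose(out, attention(Q, K, V))
```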
no code implementations • 26 Jul 2024 • Zhijian Liu, Zhuoyang Zhang, Samir Khaki, Shang Yang, Haotian Tang, Chenfeng Xu, Kurt Keutzer, Song Han
Finally, it leverages a gated ensembler to apply these sparse refinements to the initial coarse predictions.
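The "gated ensembler" is described only briefly above, so the following numpy sketch shows one plausible reading: refined predictions are blended into the coarse ones at a sparse set of sites via a sigmoid gate. The low-margin selection rule and the gating form here are assumptions, not the paper's exact design:

```python
import numpy as np

rng = np.random.default_rng(0)
H, W, C = 4, 4, 3
coarse = rng.standard_normal((H, W, C))     # cheap dense predictions (logits)

# Hypothetical selection rule: refine only low-margin (uncertain) pixels.
top2 = np.sort(coarse, axis=-1)[..., -2:]
uncertain = (top2[..., 1] - top2[..., 0]) < 0.5

refined = rng.standard_normal((uncertain.sum(), C))   # stand-in refinement branch
gate = 1 / (1 + np.exp(-rng.standard_normal((uncertain.sum(), 1))))  # sigmoid gate

# Gated ensemble: blend refined into coarse predictions at sparse sites only.
out = coarse.copy()
out[uncertain] = (1 - gate) * coarse[uncertain] + gate * refined
```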
1 code implementation • 3 Apr 2024 • Vlas Zyrianov, Henry Che, Zhijian Liu, Shenlong Wang
We present LidarDM, a novel LiDAR generative model capable of producing realistic, layout-aware, physically plausible, and temporally coherent LiDAR videos.
1 code implementation • CVPR 2024 • Xiaoyang Wu, Li Jiang, Peng-Shuai Wang, Zhijian Liu, Xihui Liu, Yu Qiao, Wanli Ouyang, Tong He, Hengshuang Zhao
This paper is not motivated to seek innovation within the attention mechanism.
1 code implementation • 19 Dec 2023 • Akio Kodaira, Chenfeng Xu, Toshiki Hazama, Takanori Yoshimoto, Kohei Ohno, Shogo Mitsuhori, Soichi Sugano, Hanying Cho, Zhijian Liu, Kurt Keutzer
We introduce StreamDiffusion, a real-time diffusion pipeline designed for interactive image generation.
3 code implementations • 15 Dec 2023 • Xiaoyang Wu, Li Jiang, Peng-Shuai Wang, Zhijian Liu, Xihui Liu, Yu Qiao, Wanli Ouyang, Tong He, Hengshuang Zhao
This paper is not motivated to seek innovation within the attention mechanism.
Ranked #1 on 3D Semantic Segmentation on ScanNet++ (using extra training data)
1 code implementation • 25 Oct 2023 • Haotian Tang, Shang Yang, Zhijian Liu, Ke Hong, Zhongming Yu, Xiuyu Li, Guohao Dai, Yu Wang, Song Han
On top of this, we design the Sparse Autotuner, which extends the design space of existing sparse convolution libraries and searches for the best dataflow configurations for training and inference workloads.
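For intuition, an autotuner of this kind reduces to a benchmark-and-pick-best loop over a design space of dataflow configurations. The sketch below shows that generic pattern with a hypothetical tile/group design space; it is not TorchSparse's actual implementation:

```python
import itertools, time

def autotune(kernel, configs, args, reps=5):
    """Benchmark every candidate dataflow configuration, keep the fastest.
    This is the generic search pattern, not TorchSparse's implementation."""
    best_cfg, best_t = None, float("inf")
    for cfg in configs:
        start = time.perf_counter()
        for _ in range(reps):
            kernel(*args, **cfg)
        elapsed = (time.perf_counter() - start) / reps
        if elapsed < best_t:
            best_cfg, best_t = cfg, elapsed
    return best_cfg

# Hypothetical design space (tile sizes x matmul group sizes) and a dummy
# kernel standing in for a sparse convolution implementation.
space = [dict(tile=t, group=g) for t, g in itertools.product([16, 32, 64], [1, 2, 4])]
dummy = lambda x, tile, group: [x[i:i + tile] for i in range(0, len(x), tile)]
best = autotune(dummy, space, ([0.0] * 4096,))
```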
4 code implementations • 21 Sep 2023 • Yukang Chen, Shengju Qian, Haotian Tang, Xin Lai, Zhijian Liu, Song Han, Jiaya Jia
For example, training with a context length of 8192 requires 16x the computational cost in self-attention layers compared with a context length of 2048.
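That 16x figure follows directly from the quadratic cost of self-attention in the sequence length $n$:

$$\text{cost} \propto n^2 \quad\Rightarrow\quad \left(\frac{8192}{2048}\right)^2 = 4^2 = 16.$$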
no code implementations • ICCV 2023 • Xiyue Zhu, Vlas Zyrianov, Zhijian Liu, Shenlong Wang
Despite tremendous advancements in bird's-eye view (BEV) perception, existing models fall short in generating realistic and coherent semantic map layouts, and they fail to account for uncertainties arising from partial sensor information (such as occlusion or limited coverage).
no code implementations • 9 Jul 2023 • Zhijian Liu, Nian Cai, Wensheng Ouyang, Chengbin Zhang, Nili Tian, Han Wang
Automatic hardhat-wearing detection can strengthen safety management on construction sites, but it remains challenging due to complicated video surveillance scenes.
1 code implementation • CVPR 2023 • Xuanyao Chen, Zhijian Liu, Haotian Tang, Li Yi, Hang Zhao, Song Han
High-resolution images enable neural networks to learn richer visual representations.
no code implementations • CVPR 2023 • Zhijian Liu, Xinyu Yang, Haotian Tang, Shang Yang, Song Han
Transformer, as an alternative to CNN, has been proven effective in many modalities (e.g., texts and images).
1 code implementation • 26 May 2022 • Zhijian Liu, Haotian Tang, Alexander Amini, Xinyu Yang, Huizi Mao, Daniela Rus, Song Han
Multi-sensor fusion is essential for an accurate and reliable autonomous driving system.
Ranked #4 on 3D Object Detection on nuScenes
no code implementations • 25 Apr 2022 • Han Cai, Ji Lin, Yujun Lin, Zhijian Liu, Haotian Tang, Hanrui Wang, Ligeng Zhu, Song Han
Deep neural networks (DNNs) have achieved unprecedented success in the field of artificial intelligence (AI), including computer vision, natural language processing and speech recognition.
no code implementations • 25 Apr 2022 • Zhijian Liu, Haotian Tang, Shengyu Zhao, Kevin Shao, Song Han
3D neural networks are widely used in real-world applications (e.g., AR/VR headsets, self-driving cars).
1 code implementation • 21 Apr 2022 • Haotian Tang, Zhijian Liu, Xiuyu Li, Yujun Lin, Song Han
TorchSparse directly optimizes the two bottlenecks of sparse convolution: irregular computation and data movement.
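Those two bottlenecks come from the gather-matmul-scatter structure of sparse convolution. The toy numpy rendering below (a "submanifold"-style 3x1 kernel, with illustrative coordinate bookkeeping) shows where the irregular computation and data movement arise; it is not TorchSparse's kernel code:

```python
import numpy as np

rng = np.random.default_rng(0)
coords = np.array([[0, 0], [1, 0], [2, 2], [3, 2]])    # active sites on a 2D grid
feats = rng.standard_normal((len(coords), 4))           # C_in = 4
kernel = {off: rng.standard_normal((4, 8))              # 3x1 kernel, C_out = 8
          for off in [(-1, 0), (0, 0), (1, 0)]}

lut = {tuple(c): i for i, c in enumerate(coords)}       # coordinate -> row index
out = np.zeros((len(coords), 8))

# Sparse convolution as gather -> dense matmul -> scatter, per kernel offset.
# The irregular part is exactly this neighbor lookup and data movement.
for off, W in kernel.items():
    pairs = [(lut[tuple(c + off)], i) for i, c in enumerate(coords)
             if tuple(c + off) in lut]
    if pairs:
        src, dst = (np.array(p) for p in zip(*pairs))
        out[dst] += feats[src] @ W                      # gather, matmul, scatter-add
```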
1 code implementation • 23 Nov 2021 • Alexander Amini, Tsun-Hsuan Wang, Igor Gilitschenski, Wilko Schwarting, Zhijian Liu, Song Han, Sertac Karaman, Daniela Rus
Simulation has the potential to transform the development of robust algorithms for mobile agents deployed in safety-critical scenarios.
no code implementations • ICCV 2021 • Zhijian Liu, Simon Stent, Jie Li, John Gideon, Song Han
Computer vision tasks such as object detection and semantic/instance segmentation rely on the painstaking annotation of large training datasets.
no code implementations • 20 May 2021 • Zhijian Liu, Alexander Amini, Sibo Zhu, Sertac Karaman, Song Han, Daniela Rus
On the other hand, increasing the robustness of these systems is also critical; however, even estimating the model's uncertainty is very challenging due to the cost of sampling-based methods.
no code implementations • 11 Aug 2020 • Kuan Wang, Zhijian Liu, Yujun Lin, Ji Lin, Song Han
Compared with conventional methods, our framework is fully automated and can specialize the quantization policy for different neural network architectures and hardware architectures.
6 code implementations • ECCV 2020 • Haotian Tang, Zhijian Liu, Shengyu Zhao, Yujun Lin, Ji Lin, Hanrui Wang, Song Han
Self-driving cars need to understand 3D scenes efficiently and accurately in order to drive safely.
Ranked #1 on Robust 3D Semantic Segmentation on SemanticKITTI-C
13 code implementations • NeurIPS 2020 • Shengyu Zhao, Zhijian Liu, Ji Lin, Jun-Yan Zhu, Song Han
Furthermore, with only 20% training data, we can match the top performance on CIFAR-10 and CIFAR-100.
Ranked #1 on Image Generation on CIFAR-10 (20% data)
1 code implementation • CVPR 2020 • Tianzhe Wang, Kuan Wang, Han Cai, Ji Lin, Zhijian Liu, Song Han
However, training this quantization-aware accuracy predictor requires collecting a large number of quantized <model, accuracy> pairs, which involves quantization-aware finetuning and thus is highly time-consuming.
4 code implementations • ACL 2020 • Hanrui Wang, Zhanghao Wu, Zhijian Liu, Han Cai, Ligeng Zhu, Chuang Gan, Song Han
To enable low-latency inference on resource-constrained hardware platforms, we propose to design Hardware-Aware Transformers (HAT) with neural architecture search.
Ranked #21 on Machine Translation on WMT2014 English-French
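Hardware-aware NAS of this kind can be pictured as a search constrained by a latency predictor for the target device. The sketch below uses a hypothetical design space and stand-in predictor/quality functions; it illustrates the pattern, not HAT's actual search:

```python
import random

random.seed(0)

def sample_arch():
    # Hypothetical design space; the real space covers per-layer dims,
    # head counts, and more.
    return dict(layers=random.choice([2, 4, 6]),
                dim=random.choice([256, 384, 512]),
                heads=random.choice([4, 8]))

def latency_predictor(arch):
    # Stand-in for a predictor trained on (architecture, measured latency)
    # pairs collected from the target hardware.
    return arch["layers"] * arch["dim"] / 1024

def quality(arch):
    # Stand-in for the validation score of a weight-shared subnetwork.
    return 0.3 * arch["layers"] + arch["dim"] / 512

# Hardware-aware search: keep only candidates under the latency budget.
pool = [a for a in (sample_arch() for _ in range(200)) if latency_predictor(a) < 2.0]
best = max(pool, key=quality)
```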
2 code implementations • ICLR 2020 • Zhanghao Wu, Zhijian Liu, Ji Lin, Yujun Lin, Song Han
For language modeling, Lite Transformer achieves 1.8 lower perplexity than the transformer at around 500M MACs.
Ranked #37 on Machine Translation on WMT2014 English-French
1 code implementation • CVPR 2020 • Muyang Li, Ji Lin, Yaoyao Ding, Zhijian Liu, Jun-Yan Zhu, Song Han
Directly applying existing compression methods yields poor performance due to the difficulty of GAN training and the differences in generator architectures.
4 code implementations • NeurIPS 2019 • Zhijian Liu, Haotian Tang, Yujun Lin, Song Han
The computation cost and memory footprints of the voxel-based models grow cubically with the input resolution, making it memory-prohibitive to scale up the resolution.
Ranked #1 on 3D Object Detection on KITTI Pedestrian Hard val
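The cubic growth noted in this entry is easy to see: a voxel grid at resolution $r$ has $r^3$ cells, so merely doubling the resolution multiplies cost and memory by eight:

$$\text{cost} \propto r^3 \quad\Rightarrow\quad \frac{(2r)^3}{r^3} = 2^3 = 8.$$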
7 code implementations • NeurIPS 2019 • Ligeng Zhu, Zhijian Liu, Song Han
Exchanging gradients is a widely used method in modern multi-node machine learning systems (e.g., distributed training, collaborative learning).
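This entry studies what those exchanged gradients reveal. As a toy illustration of why shared gradients can leak training data, the PyTorch sketch below recovers a private input by optimizing dummy data until its gradient matches the shared one (the label is assumed known here, for simplicity):

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# A toy linear classifier and one private training example. In distributed
# training, only the gradient g_true would be shared with other nodes.
w = torch.randn(4, 2, requires_grad=True)
x_true, y_true = torch.randn(1, 4), torch.tensor([1])
g_true = torch.autograd.grad(F.cross_entropy(x_true @ w, y_true), w)[0]

# Gradient matching: optimize dummy data until its gradient matches the
# shared one.
x_hat = torch.randn(1, 4, requires_grad=True)
opt = torch.optim.Adam([x_hat], lr=0.1)
for _ in range(300):
    opt.zero_grad()
    g_hat = torch.autograd.grad(F.cross_entropy(x_hat @ w, y_true), w,
                                create_graph=True)[0]
    ((g_hat - g_true) ** 2).sum().backward()
    opt.step()
# x_hat now approximates the private x_true -- gradients leak data.
```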
no code implementations • ICLR 2019 • Zhenjia Xu, Zhijian Liu, Chen Sun, Kevin Murphy, William T. Freeman, Joshua B. Tenenbaum, Jiajun Wu
Humans easily recognize object parts and their hierarchical structure by watching how they move; they can then predict how each part moves in the future.
no code implementations • 24 Apr 2019 • Song Han, Han Cai, Ligeng Zhu, Ji Lin, Kuan Wang, Zhijian Liu, Yujun Lin
Moreover, we shorten the design cycle by 200x compared with previous work, so that we can afford to design specialized neural network models for different hardware platforms.
no code implementations • 12 Mar 2019 • Zhenjia Xu, Zhijian Liu, Chen Sun, Kevin Murphy, William T. Freeman, Joshua B. Tenenbaum, Jiajun Wu
Humans easily recognize object parts and their hierarchical structure by watching how they move; they can then predict how each part moves in the future.
no code implementations • NeurIPS 2018 • Yilun Du, Zhijian Liu, Hector Basevi, Ales Leonardis, Bill Freeman, Josh Tenenbaum, Jiajun Wu
We first show that applying physics supervision to an existing scene understanding model increases performance, produces more stable predictions, and allows training to an equivalent performance level with fewer annotated training examples.
11 code implementations • CVPR 2019 • Kuan Wang, Zhijian Liu, Yujun Lin, Ji Lin, Song Han
Compared with conventional methods, our framework is fully automated and can specialize the quantization policy for different neural network architectures and hardware architectures.
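At its core, specializing a quantization policy means searching over per-layer bitwidths under a hardware cost budget. The toy random search below stands in for the learned agent; the evaluate() proxies are invented for illustration, and a real system would measure accuracy on the quantized model and query the hardware for latency/energy:

```python
import numpy as np

rng = np.random.default_rng(0)
bit_choices, layers, budget = [2, 4, 6, 8], 10, 50

def evaluate(policy):
    # Stand-ins: a real system quantizes the model to measure accuracy and
    # reads latency/energy cost from a simulator or the device itself.
    acc = policy.mean() / 8 - policy.var() / 64
    cost = policy.sum()
    return acc, cost

# Toy random search standing in for the learned policy search: specialize
# per-layer bitwidths under the given hardware cost budget.
best, best_acc = None, -np.inf
for _ in range(500):
    policy = rng.choice(bit_choices, size=layers)
    acc, cost = evaluate(policy)
    if cost <= budget and acc > best_acc:
        best, best_acc = policy, acc
```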
no code implementations • ECCV 2018 • Zhijian Liu, William T. Freeman, Joshua B. Tenenbaum, Jiajun Wu
As annotated data for object parts and physics are rare, we propose a novel formulation that learns physical primitives by explaining both an object's appearance and its behaviors in physical events.
12 code implementations • ECCV 2018 • Yihui He, Ji Lin, Zhijian Liu, Hanrui Wang, Li-Jia Li, Song Han
Model compression is a critical technique to efficiently deploy neural network models on mobile devices which have limited computation resources and tight power budgets.
no code implementations • 31 Dec 2017 • Zhijian Liu, Di Wu, Hongyu Wei, Guoqing Cao
The theories and applications of machine learning methods in the fields of energy conservation and indoor environment are not yet mature, largely due to the difficulty of determining a model structure with good predictive performance.
no code implementations • 6 Oct 2017 • Hao Li, Zhijian Liu
This chapter consists of: i) comparative studies on a variety of machine learning models (artificial neural networks (ANNs), support vector machines (SVMs), and extreme learning machines (ELMs)) for predicting the performance of SWHs; ii) the development of an ANN-based software tool to assist quick prediction; and iii) an introduction to a computational HTS method for designing a high-performance SWH system.
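A hedged scikit-learn sketch of the kind of comparative study item i) describes, on hypothetical tabular data (ELM is omitted since scikit-learn has no built-in implementation):

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPRegressor
from sklearn.svm import SVR

# Hypothetical tabular data: operating conditions -> SWH thermal performance.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))
y = X @ rng.standard_normal(5) + 0.1 * rng.standard_normal(200)

# Compare model families with cross-validated R^2, as in the chapter's study.
for name, model in [("ANN", MLPRegressor(max_iter=2000, random_state=0)),
                    ("SVM", SVR())]:
    r2 = cross_val_score(model, X, y, cv=5, scoring="r2").mean()
    print(f"{name}: mean R^2 = {r2:.3f}")
```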