2 code implementations • 13 Jun 2023 • Sehoon Kim, Coleman Hooper, Amir Gholami, Zhen Dong, Xiuyu Li, Sheng Shen, Michael W. Mahoney, Kurt Keutzer
When applied to the LLaMA models, our 3-bit quantization significantly reduces the perplexity gap from the FP16 baseline by up to 2.1x as compared to the state-of-the-art methods with the same memory requirement.
2 code implementations • NeurIPS 2020 • Zhen Dong, Zhewei Yao, Yaohui Cai, Daiyaan Arfeen, Amir Gholami, Michael W. Mahoney, Kurt Keutzer
However, the search space for a mixed-precision quantization is exponential in the number of layers.
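To make the scale concrete, here is a minimal sketch, assuming each layer independently picks one bit-width from a fixed candidate set. The function name and the example numbers are illustrative assumptions, not taken from the paper.

```python
def mixed_precision_configs(num_layers: int, num_bit_choices: int) -> int:
    """Count per-layer bit-width assignments (one independent choice per layer)."""
    return num_bit_choices ** num_layers


# Hypothetical example: 50 layers, candidate bit-widths {2, 4, 8}.
print(mixed_precision_configs(50, 3))
```

Each additional layer multiplies the count by the number of candidate bit-widths, which is why exhaustive search becomes impractical for deep networks.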
1 code implementation • ICCV 2019 • Zhen Dong, Zhewei Yao, Amir Gholami, Michael Mahoney, Kurt Keutzer
Another challenge is a similar factorial complexity for determining block-wise fine-tuning order when quantizing the model to a target precision.
1 code implementation • 20 Nov 2020 • Zhewei Yao, Zhen Dong, Zhangcheng Zheng, Amir Gholami, Jiali Yu, Eric Tan, Leyuan Wang, Qijing Huang, Yida Wang, Michael W. Mahoney, Kurt Keutzer
Current low-precision quantization algorithms often have the hidden cost of conversion back and forth from floating point to quantized integer values.
1 code implementation • 14 Feb 2024 • Ze Ma, Daquan Zhou, Chun-Hsiao Yeh, Xue-She Wang, Xiuyu Li, Huanrui Yang, Zhen Dong, Kurt Keutzer, Jiashi Feng
To achieve this, we propose three novel components that are essential for high-quality identity preservation and stable video generation: 1) a noise initialization method with 3D Gaussian Noise Prior for better inter-frame stability; 2) an ID module based on extended Textual Inversion trained with the cropped identity to disentangle the ID information from the background; and 3) Face VCD and Tiled VCD modules to reinforce faces and upscale the video to higher resolution while preserving the identity's features.
1 code implementation • 12 Aug 2018 • Yue Pan, Bisheng Yang, Fuxun Liang, Zhen Dong
Then, we formulate the correspondence matching task as an energy function, which models the global similarity of keypoints on the hybrid spaces of BSC feature and Euclidean geometry.
3 code implementations • CVPR 2020 • Yaohui Cai, Zhewei Yao, Zhen Dong, Amir Gholami, Michael W. Mahoney, Kurt Keutzer
Importantly, ZeroQ has a very low computational overhead, and it can finish the entire quantization process in less than 30s (0.5% of one epoch training time of ResNet50 on ImageNet).
Ranked #1 on Data Free Quantization on CIFAR10 (CIFAR-10 W8A8 Top-1 Accuracy metric)
1 code implementation • ICCV 2023 • Xiuyu Li, Yijiang Liu, Long Lian, Huanrui Yang, Zhen Dong, Daniel Kang, Shanghang Zhang, Kurt Keutzer
We propose a novel PTQ method specifically tailored towards the unique multi-timestep pipeline and model architecture of the diffusion models, which compresses the noise estimation network to accelerate the generation process.
1 code implementation • 5 Oct 2023 • Haiping Wang, YuAn Liu, Bing Wang, Yujing Sun, Zhen Dong, Wenping Wang, Bisheng Yang
Matching cross-modality features between images and point clouds is a fundamental problem for image-to-point cloud registration.
1 code implementation • CVPR 2023 • Haiping Wang, YuAn Liu, Zhen Dong, Yulan Guo, Yu-Shen Liu, Wenping Wang, Bisheng Yang
Previous multiview registration methods rely on exhaustive pairwise registration to construct a densely-connected pose graph and apply Iteratively Reweighted Least Square (IRLS) on the pose graph to compute the scan poses.
2 code implementations • 26 Feb 2024 • Zhihang Yuan, Yuzhang Shang, Yang Zhou, Zhen Dong, Zhe Zhou, Chenhao Xue, Bingzhe Wu, Zhikai Li, Qingyi Gu, Yong Jae Lee, Yan Yan, Beidi Chen, Guangyu Sun, Kurt Keutzer
Our survey stands out from traditional literature reviews by not only summarizing the current state of research but also by introducing a framework based on roofline model for systematic analysis of LLM inference techniques.
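As a rough illustration of the roofline model mentioned here, the sketch below computes the standard roofline bound: attainable throughput is the smaller of peak compute and memory bandwidth times arithmetic intensity. The function name and the hardware numbers are hypothetical and are not drawn from the survey.

```python
def attainable_flops(peak_flops: float, mem_bandwidth: float,
                     arithmetic_intensity: float) -> float:
    """Roofline bound: min(peak compute, bandwidth * FLOPs-per-byte)."""
    return min(peak_flops, mem_bandwidth * arithmetic_intensity)


# Hypothetical accelerator: 100 TFLOP/s peak compute, 1 TB/s memory bandwidth.
PEAK = 100e12
BANDWIDTH = 1e12
for intensity in (1.0, 10.0, 1000.0):  # FLOPs per byte moved
    bound = attainable_flops(PEAK, BANDWIDTH, intensity)
    regime = "memory-bound" if bound < PEAK else "compute-bound"
    print(intensity, bound, regime)
```

Operations with low arithmetic intensity land on the sloped, bandwidth-limited part of the roofline, while high-intensity operations are capped by peak compute.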
2 code implementations • 29 Sep 2023 • Yuzhang Shang, Zhihang Yuan, Qiang Wu, Zhen Dong
This paper explores network binarization, a radical form of quantization, compressing model weights to a single bit, specifically for Large Language Models (LLMs) compression.
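For readers unfamiliar with binarization, the following is a minimal sketch of the generic sign-and-scale scheme, in which each weight keeps only its sign and the group shares one floating-point scale. This is an assumed textbook formulation for illustration, not the specific method of this paper, and the function names are hypothetical.

```python
def binarize(weights):
    """Return (signs, alpha): 1-bit signs plus one shared scale.

    alpha is the mean absolute value of the weights, a common choice
    for the scale in sign-based binarization.
    """
    alpha = sum(abs(w) for w in weights) / len(weights)
    signs = [1 if w >= 0 else -1 for w in weights]
    return signs, alpha


def reconstruct(signs, alpha):
    """Approximate the original weights as alpha * sign."""
    return [alpha * s for s in signs]


signs, alpha = binarize([0.5, -1.5, 2.0, -1.0])
print(signs, alpha, reconstruct(signs, alpha))
```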
1 code implementation • 1 Sep 2021 • Haiping Wang, YuAn Liu, Zhen Dong, Wenping Wang
In this paper, we propose a novel local descriptor-based framework, called You Only Hypothesize Once (YOHO), for the registration of two unaligned point clouds.
Ranked #5 on Point Cloud Registration on ETH (trained on 3DMatch) (Recall (30cm, 5 degrees) metric)
1 code implementation • 21 Nov 2023 • Youqi Liao, Shuhao Kang, Jianping Li, Yang Liu, Yun Liu, Zhen Dong, Bisheng Yang, Xieyuanli Chen
Our framework features a two-stream encoder, an active fusion decoder (AFD) and a dual-task regularization approach.
1 code implementation • 30 Nov 2023 • Chen Long, Wenxiao Zhang, Zhe Chen, Haiping Wang, YuAn Liu, Zhen Cao, Zhen Dong, Bisheng Yang
The key contributions of SparseDC are two-fold.
1 code implementation • 24 Oct 2023 • Jay Zhangjie Wu, Xiuyu Li, Difei Gao, Zhen Dong, Jinbin Bai, Aishani Singh, Xiaoyu Xiang, Youzeng Li, Zuwei Huang, Yuanxi Sun, Rui He, Feng Hu, Junhua Hu, Hai Huang, Hanyu Zhu, Xu Cheng, Jie Tang, Mike Zheng Shou, Kurt Keutzer, Forrest Iandola
In this paper we present a retrospective on the competition and describe the winning method.
1 code implementation • CVPR 2022 • Xin Wen, Junsheng Zhou, Yu-Shen Liu, Zhen Dong, Zhizhong Han
Reconstructing a 3D shape from a single 2D image is a challenging task that requires estimating detailed 3D structures from the semantic attributes of the 2D image.
1 code implementation • ICCV 2021 • Runsong Zhu, YuAn Liu, Zhen Dong, Tengping Jiang, YuAn Wang, Wenping Wang, Bisheng Yang
Existing works use a network to learn point-wise weights for weighted least squares surface fitting to estimate the normals, but this approach has difficulty finding accurate normals in complex regions or in regions containing noisy points.
Ranked #5 on Surface Normals Estimation on PCPNet
1 code implementation • ICCV 2021 • Bing Wang, Changhao Chen, Zhaopeng Cui, Jie Qin, Chris Xiaoxuan Lu, Zhengdi Yu, Peijun Zhao, Zhen Dong, Fan Zhu, Niki Trigoni, Andrew Markham
Accurately describing and detecting 2D and 3D keypoints is crucial to establishing correspondences across images and point clouds.
1 code implementation • 22 Jan 2021 • Shixing Yu, Zhewei Yao, Amir Gholami, Zhen Dong, Sehoon Kim, Michael W Mahoney, Kurt Keutzer
To address this problem, we introduce a new Hessian Aware Pruning (HAP) method coupled with a Neural Implant approach that uses second-order sensitivity as a metric for structured pruning.
1 code implementation • 2 Feb 2024 • Zequan Chen, Jianping Li, Qusheng Li, Bisheng Yang, Zhen Dong
The experimental results demonstrate DeepAAT's substantial improvements over conventional AAT methods, highlighting its potential in the efficiency and accuracy of UAV-based 3D reconstruction tasks.
1 code implementation • 14 Nov 2023 • Lin Xu, Zhiyuan Hu, Daquan Zhou, Hongyu Ren, Zhen Dong, Kurt Keutzer, See Kiong Ng, Jiashi Feng
Large Language Models (LLMs) have marked a significant advancement in the field of natural language processing, demonstrating exceptional capabilities in reasoning, tool usage, and memory.
1 code implementation • 20 Sep 2021 • Chen Long, Wenxiao Zhang, Ruihui Li, Hao Wang, Zhen Dong, Bisheng Yang
Point cloud upsampling densifies a sparse point set acquired from 3D sensors, providing a denser representation of the underlying surface.
2 code implementations • 19 Feb 2020 • Qijing Huang, Dequan Wang, Yizhao Gao, Yaohui Cai, Zhen Dong, Bichen Wu, Kurt Keutzer, John Wawrzynek
In this work, we first investigate the overhead of the deformable convolution on embedded FPGA SoCs, and then show the accuracy-latency tradeoffs for a set of algorithm modifications including full versus depthwise, fixed-shape, and limited-range.
3 code implementations • 12 Jun 2020 • Zhen Dong, Dequan Wang, Qijing Huang, Yizhao Gao, Yaohui Cai, Tian Li, Bichen Wu, Kurt Keutzer, John Wawrzynek
Deploying deep learning models on embedded systems has been challenging due to limited computing resources.
1 code implementation • 30 Oct 2020 • Tian Li, Xiang Chen, Shanghang Zhang, Zhen Dong, Kurt Keutzer
Due to scarcity of labels on the target domain, we introduce mutual information maximization (MIM) apart from CL to exploit the features that best support the final prediction.
1 code implementation • 5 Dec 2020 • Tian Li, Xiang Chen, Shanghang Zhang, Zhen Dong, Kurt Keutzer
In this paper, we propose a contrastive learning framework for cross-domain sentiment classification.
1 code implementation • 18 Mar 2023 • Jingyi Hou, Zhen Dong, Jiayu Zhou, Zhijie Liu
Many real-world data mining tasks, however, lack sufficient variables for relation reasoning, and therefore these methods may not properly handle such forecasting problems.
1 code implementation • 7 Mar 2024 • Aosong Feng, Weikang Qiu, Jinbin Bai, Kaicheng Zhou, Zhen Dong, Xiao Zhang, Rex Ying, Leandros Tassiulas
Building on the success of text-to-image diffusion models (DPMs), image editing is an important application to enable human interaction with AI-generated content.
1 code implementation • 20 Jun 2022 • Tian Li, Xiang Chen, Zhen Dong, Weijiang Yu, Yijun Yan, Kurt Keutzer, Shanghang Zhang
Then during training, DASK injects pivot-related knowledge graph information into source domain texts.
1 code implementation • 24 Oct 2023 • Jinbin Bai, Zhen Dong, Aosong Feng, Xiao Zhang, Tian Ye, Kaicheng Zhou, Mike Zheng Shou
In the field of image processing, applying intricate semantic modifications within existing images remains an enduring challenge.
no code implementations • 22 Mar 2016 • Zhen Dong, Su Jia, Chi Zhang, Mingtao Pei
To sufficiently discover the useful information contained in face videos, we present a novel network architecture called input aggregated network which is able to learn fixed-length representations for variable-length face videos.
1 code implementation • 5 Nov 2018 • Manash Pratim Das, Zhen Dong, Sebastian Scherer
While external pose estimation and fiducial marker based localization would require setup, maintenance, and manual calibration; marker-free self-localization can be achieved using the onboard depth sensor and camera.
no code implementations • 12 Sep 2019 • Sheng Shen, Zhen Dong, Jiayu Ye, Linjian Ma, Zhewei Yao, Amir Gholami, Michael W. Mahoney, Kurt Keutzer
In particular, we propose a new group-wise quantization scheme, and we use a Hessian-based mixed-precision method to compress the model further.
Ranked #13 on Semantic Textual Similarity on STS Benchmark
no code implementations • IJCNLP 2019 • Zhen Dong, Shizhao Sun, Hongzhi Liu, Jian-Guang Lou, Dongmei Zhang
In text-to-SQL generation, the input utterance usually contains many tokens that are related to column names or cells in the table, called table-related tokens.
no code implementations • CVPR 2021 • YuAn Liu, Lingjie Liu, Cheng Lin, Zhen Dong, Wenping Wang
We propose a novel formulation of fitting coherent motions with a smooth function on a graph of correspondences and show that this formulation allows a closed-form solution by graph Laplacian.
no code implementations • 25 Mar 2021 • Amir Gholami, Sehoon Kim, Zhen Dong, Zhewei Yao, Michael W. Mahoney, Kurt Keutzer
Thus, it is not surprising that quantization has emerged recently as an important and very active sub-area of research in the efficient implementation of computations associated with Neural Networks.
no code implementations • 26 Apr 2021 • Zhen Dong, Yizhao Gao, Qijing Huang, John Wawrzynek, Hayden K. H. So, Kurt Keutzer
Automatic algorithm-hardware co-design for DNN has shown great success in improving the performance of DNNs on FPGAs.
no code implementations • 25 Oct 2021 • Allison McCarn Deiana, Nhan Tran, Joshua Agar, Michaela Blott, Giuseppe Di Guglielmo, Javier Duarte, Philip Harris, Scott Hauck, Mia Liu, Mark S. Neubauer, Jennifer Ngadiuba, Seda Ogrenci-Memik, Maurizio Pierini, Thea Aarrestad, Steffen Bahr, Jurgen Becker, Anne-Sophie Berthold, Richard J. Bonventre, Tomas E. Muller Bravo, Markus Diefenthaler, Zhen Dong, Nick Fritzsche, Amir Gholami, Ekaterina Govorkova, Kyle J Hazelwood, Christian Herwig, Babar Khan, Sehoon Kim, Thomas Klijnsma, Yaling Liu, Kin Ho Lo, Tri Nguyen, Gianantonio Pezzullo, Seyedramin Rasoulinezhad, Ryan A. Rivera, Kate Scholberg, Justin Selig, Sougata Sen, Dmitri Strukov, William Tang, Savannah Thais, Kai Lukas Unger, Ricardo Vilalta, Belina von Krosigk, Thomas K. Warburton, Maria Acosta Flechas, Anthony Aportela, Thomas Calvet, Leonardo Cristella, Daniel Diaz, Caterina Doglioni, Maria Domenica Galati, Elham E Khoda, Farah Fahim, Davide Giri, Benjamin Hawks, Duc Hoang, Burt Holzman, Shih-Chieh Hsu, Sergo Jindariani, Iris Johnson, Raghav Kansal, Ryan Kastner, Erik Katsavounidis, Jeffrey Krupa, Pan Li, Sandeep Madireddy, Ethan Marx, Patrick McCormack, Andres Meza, Jovan Mitrevski, Mohammed Attia Mohammed, Farouk Mokhtar, Eric Moreno, Srishti Nagu, Rohin Narayan, Noah Palladino, Zhiqiang Que, Sang Eon Park, Subramanian Ramamoorthy, Dylan Rankin, Simon Rothman, Ashish Sharma, Sioni Summers, Pietro Vischia, Jean-Roch Vlimant, Olivia Weng
In this community review report, we discuss applications and techniques for fast machine learning (ML) in science: the concept of integrating powerful ML methods into the real-time experimental data processing loop to accelerate scientific discovery.
2 code implementations • 23 Nov 2021 • Zhen Cao, Wenxiao Zhang, Xin Wen, Zhen Dong, Yu-Shen Liu, Xiongwu Xiao, Bisheng Yang
The student network takes the incomplete one as input and restores the corresponding complete shape.
no code implementations • 24 Apr 2022 • Yiqiao Xu, Alessandra Parisio, Zhongguo Li, Zhen Dong, Zhengtao Ding
This paper presents a novel scheme termed Optimization-based Ramping Reserve Allocation (ORRA) for addressing an ongoing challenge in Automatic Generation Control (AGC) enhancement, i.e., the optimal coordination of multiple Battery Energy Storage Systems (BESSs).
no code implementations • 4 May 2022 • Zhen Dong, Kaicheng Zhou, Guohao Li, Qiang Zhou, Mingfei Guo, Bernard Ghanem, Kurt Keutzer, Shanghang Zhang
Neural architecture search (NAS) has shown great success in the automatic design of deep neural networks (DNNs).
no code implementations • 14 Jun 2022 • Runsong Zhu, Di Kang, Ka-Hei Hui, Yue Qian, Xuefei Zhe, Zhen Dong, Linchao Bao, Pheng-Ann Heng, Chi-Wing Fu
To guide the network to quickly fit the coarse shape, we propose to utilize signed supervision in regions that are obviously outside the object and can be easily determined, resulting in our semi-signed supervision.
no code implementations • 14 Sep 2022 • Lingran Zhao, Zhen Dong, Kurt Keutzer
Quantization is widely adopted as a model compression technique, which obtains efficient models by converting floating-point weights and activations in the neural network into lower-bit integers.
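The float-to-integer conversion described here can be sketched with a simple symmetric uniform quantizer. This is a minimal illustration under assumed choices (per-tensor scale from the maximum absolute value, round-to-nearest, signed integer range); the function names are hypothetical and the code is not the method proposed in this paper.

```python
def quantize(values, num_bits=8):
    """Map floats to signed integers in [-qmax, qmax] with one shared scale."""
    qmax = 2 ** (num_bits - 1) - 1
    max_abs = max(abs(v) for v in values)
    # Guard against an all-zero tensor, where any scale works.
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return q, scale


def dequantize(q, scale):
    """Recover approximate floats from the integer codes."""
    return [x * scale for x in q]


weights = [0.3, -0.7, 1.0]
codes, scale = quantize(weights, num_bits=4)
print(codes, dequantize(codes, scale))
```

With round-to-nearest, the reconstruction error of each in-range value is at most half the scale, so fewer bits (a larger scale) trade accuracy for memory.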
no code implementations • CVPR 2023 • Yijiang Liu, Huanrui Yang, Zhen Dong, Kurt Keutzer, Li Du, Shanghang Zhang
Building on the theoretical insight, NoisyQuant achieves the first success on actively altering the heavy-tailed activation distribution with additive noisy bias to fit a given quantizer.
no code implementations • 6 Dec 2022 • Lirui Xiao, Huanrui Yang, Zhen Dong, Kurt Keutzer, Li Du, Shanghang Zhang
CSQ stabilizes the bit-level mixed-precision training process with a bi-level gradual continuous sparsification on both the bit values of the quantized weights and the bit selection in determining the quantization precision of each layer.
no code implementations • 29 Jan 2023 • Yiqiao Xu, Xiaoyu Guo, Zhen Dong, Zhengtao Ding, Alessandra Parisio
The Multi-stack Fuel Cell System (MFCS), which is an assembly of FC stacks, can be a remedy for obstacles in high-power applications.
no code implementations • 13 Apr 2023 • Javier Campos, Zhen Dong, Javier Duarte, Amir Gholami, Michael W. Mahoney, Jovan Mitrevski, Nhan Tran
We develop an end-to-end workflow for the training and implementation of co-designed neural networks (NNs) for efficient field-programmable gate array (FPGA) and application-specific integrated circuit (ASIC) hardware.
no code implementations • ICCV 2023 • Yifan Zhang, Zhen Dong, Huanrui Yang, Ming Lu, Cheng-Ching Tseng, Yuan Du, Kurt Keutzer, Li Du, Shanghang Zhang
Multi-view 3D detection based on BEV (bird-eye-view) has recently achieved significant improvements.
no code implementations • 26 Sep 2023 • Shuhao Kang, Youqi Liao, Jianping Li, Fuxun Liang, Yuhao Li, Fangning Li, Zhen Dong, Bisheng Yang
Specifically, in the coarse matching phase, a novel I2P transformer module is employed to capture both homogeneous and heterogeneous global information from the image and point cloud data.
no code implementations • 11 Oct 2023 • Zhikai Li, Xiaoxuan Liu, Banghua Zhu, Zhen Dong, Qingyi Gu, Kurt Keutzer
Large Language Models (LLMs) have showcased remarkable impacts across a wide spectrum of natural language processing tasks.
no code implementations • 22 Oct 2023 • Zhanyuan Tian, Tianrui Zhu, Zerui Tian, Zhen Dong
In the process of constructing a road network with point cloud information, we summarize several major features of the point cloud collected by laser scanners and analyze the potential problems of constructing the network, such as misjudging the feature points as ground points and grid voids.
no code implementations • 12 Nov 2023 • Chenyu Wang, Zhen Dong, Daquan Zhou, Zhenhua Zhu, Yu Wang, Jiashi Feng, Kurt Keutzer
On the hardware side, we modify the datapath of current PIM accelerators to accommodate epitomes and implement a feature map reuse technique to reduce computation cost.
no code implementations • 14 Dec 2023 • Anthony Chen, Huanrui Yang, Yulu Gan, Denis A Gudovskiy, Zhen Dong, Haofan Wang, Tomoyuki Okuno, Yohei Nakata, Shanghang Zhang, Kurt Keutzer
In particular, we build a tree-like Split-Ensemble architecture by performing iterative splitting and pruning from a shared backbone model, where each branch serves as a submodel corresponding to a subtask.
no code implementations • 27 Dec 2023 • Rongyu Zhang, Yulin Luo, Jiaming Liu, Huanrui Yang, Zhen Dong, Denis Gudovskiy, Tomoyuki Okuno, Yohei Nakata, Kurt Keutzer, Yuan Du, Shanghang Zhang
In this work, we propose an efficient MoE architecture with weight sharing across the experts.
no code implementations • 29 Feb 2024 • Jiahao Zhou, Chen Long, Yue Xie, Jialiang Wang, Boheng Li, Haiping Wang, Zhe Chen, Zhen Dong
Therefore, such a unique attribute can assist in exploring the potential for the multi-task model and even the foundation model without separate training methods.
no code implementations • 4 Jan 2024 • Rui Ma, Qiang Zhou, Bangjun Xiao, Yizhu Jin, Daquan Zhou, Xiuyu Li, Aishani Singh, Yi Qu, Kurt Keutzer, Xiaodong Xie, Jingtong Hu, Zhen Dong, Shanghang Zhang
Copyright is a legal right that grants creators the exclusive authority to reproduce, distribute, and profit from their creative works.