no code implementations • 18 Mar 2025 • Cong Guo, Changqin Huang, Wenhua Zhou, Xiaodi Huang
To overcome these limitations, this study introduces a novel multi-label feature selection method called Binary Hashing and Dynamic Graph Constraint (BHDG), the first method to integrate binary hashing into multi-label learning.
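The abstract does not spell out the hashing step, so the sketch below is only a loose illustration of the general idea: hash the label matrix into short binary codes (here via the sign of a random projection, an assumed stand-in for the paper's learned hashing) and rank features by how strongly they correlate with those codes.

```python
import numpy as np

def select_features_binary_hash(X, Y, n_bits=8, k=10, seed=0):
    """Illustrative only: score features against binary hash codes of the labels.

    X: (n_samples, n_features) feature matrix
    Y: (n_samples, n_labels) binary label matrix
    """
    rng = np.random.default_rng(seed)
    # Hash the label matrix into n_bits-dimensional binary codes via a
    # random projection (a stand-in for the paper's hashing scheme).
    P = rng.standard_normal((Y.shape[1], n_bits))
    B = np.sign(Y @ P)                      # (n_samples, n_bits)
    B[B == 0] = 1
    # Score each feature by its summed absolute correlation with the hash bits.
    Xc = (X - X.mean(0)) / (X.std(0) + 1e-12)
    scores = np.abs(Xc.T @ B).sum(axis=1)   # (n_features,)
    return np.argsort(scores)[::-1][:k]     # indices of the k top-scoring features
```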
no code implementations • 3 Feb 2025 • Mark Horton, Tergel Molom-Ochir, Peter Liu, Bhavna Gopal, Chiyue Wei, Cong Guo, Brady Taylor, Deliang Fan, Shan X. Wang, Hai Li, Yiran Chen
HAD achieves a performance loss of just $\mathbf{1.78}\%$ on GLUE, compared to $9.08\%$ for state-of-the-art binarization work, and $\mathbf{2.5}\%$ on ImageNet, compared to $12.14\%$, all while targeting custom hardware with a $\mathbf{79}\%$ area reduction and an $\mathbf{87}\%$ power reduction relative to its standard attention counterpart.
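As a rough illustration of pairing a Hadamard transform with binarization (the general idea suggested by the name HAD; the paper's hardware-oriented scheme is not reproduced here), one can rotate activations with an orthonormal Hadamard matrix before taking signs, which spreads outlier magnitude across dimensions and tends to reduce binarization error:

```python
import numpy as np
from scipy.linalg import hadamard

def hadamard_binarize(x):
    """Sketch: rotate activations with an orthonormal Hadamard matrix,
    then binarize with sign() plus a per-row scale.

    x: (..., d) array with d a power of two.
    """
    d = x.shape[-1]
    H = hadamard(d) / np.sqrt(d)                        # orthonormal transform
    x_rot = x @ H
    scale = np.abs(x_rot).mean(axis=-1, keepdims=True)  # shared scaling factor
    return np.sign(x_rot) * scale                       # 1-bit values + scale
```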
no code implementations • 8 Oct 2024 • Cong Guo, Feng Cheng, Zhixu Du, James Kiessling, Jonathan Ku, Shiyu Li, Ziru Li, Mingyuan Ma, Tergel Molom-Ochir, Benjamin Morris, Haoxuan Shan, Jingwei Sun, Yitu Wang, Chiyue Wei, Xueying Wu, Yuhao Wu, Hao Frank Yang, Jingyang Zhang, Junyao Zhang, Qilin Zheng, Guanglei Zhou, Hai Li, Yiran Chen
The rapid development of large language models (LLMs) has significantly transformed the field of artificial intelligence, demonstrating remarkable capabilities in natural language processing and moving towards multi-modal functionality.
1 code implementation • 22 Jul 2024 • Jiale Xu, Rui Zhang, Cong Guo, Weiming Hu, Zihan Liu, Feiyang Wu, Yu Feng, Shixuan Sun, Changxu Shao, Yuhong Guo, Junping Zhao, Ke Zhang, Minyi Guo, Jingwen Leng
This study introduces vTensor, an innovative tensor structure for LLM inference based on GPU virtual memory management (VMM).
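The real system builds on CUDA's low-level virtual memory APIs (such as cuMemAddressReserve and cuMemMap); the plain-Python sketch below only mimics the core idea of a logically contiguous tensor whose physical pages are allocated and mapped on demand as the KV cache grows.

```python
class VTensorSketch:
    """Toy model of the vTensor idea: a contiguous virtual extent whose
    physical pages are mapped lazily. Illustrative only; the real system
    uses CUDA VMM driver APIs, not Python lists."""

    def __init__(self, virtual_pages, page_elems):
        self.page_elems = page_elems
        self.page_table = [None] * virtual_pages   # virtual page -> physical page

    def _ensure_mapped(self, vpage):
        if self.page_table[vpage] is None:
            # Stand-in for cuMemCreate + cuMemMap: back this region of the
            # virtual extent with physical memory only when first touched.
            self.page_table[vpage] = [0.0] * self.page_elems

    def write(self, index, value):
        vpage, offset = divmod(index, self.page_elems)
        self._ensure_mapped(vpage)
        self.page_table[vpage][offset] = value

    def read(self, index):
        vpage, offset = divmod(index, self.page_elems)
        self._ensure_mapped(vpage)
        return self.page_table[vpage][offset]
```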
no code implementations • 7 Dec 2023 • Cong Guo
Existing methods address this challenge by first completing the incomplete data with an imputation method and then performing feature selection on the imputed data.
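This two-stage baseline is easy to reproduce with standard scikit-learn components; the sketch below uses mean imputation followed by univariate selection (both generic choices, not the paper's method):

```python
from sklearn.impute import SimpleImputer
from sklearn.feature_selection import SelectKBest, f_classif

def impute_then_select(X, y, k=10):
    """Baseline two-stage pipeline: complete the data, then select features."""
    X_imputed = SimpleImputer(strategy="mean").fit_transform(X)  # step 1: impute
    selector = SelectKBest(score_func=f_classif, k=k)            # step 2: select
    selector.fit(X_imputed, y)
    return selector.get_support(indices=True)
```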
no code implementations • 14 Nov 2023 • Cong Guo, Chun Liu, Wei Yang
Existing imputation methods estimate the missing entries from the observed values in the original feature space and treat all features as equally important during completion, when in fact different features carry different importance.
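A minimal sketch of the underlying idea: weight features by an importance score inside a kNN imputer instead of treating them equally. The importance vector and the kNN scheme are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

def weighted_knn_impute(X, importance, k=5):
    """Impute missing entries (NaN) with a kNN mean, where neighbor
    distances weight each feature by `importance` (assumed given,
    e.g., from a downstream model) rather than uniformly."""
    X = X.copy()
    missing = np.isnan(X)
    for i in np.where(missing.any(axis=1))[0]:
        obs = ~missing[i]                        # features observed in row i
        diff = X[:, obs] - X[i, obs]
        valid = ~np.isnan(diff).any(axis=1)      # rows observing the same features
        valid[i] = False
        # Importance-weighted Euclidean distance over the observed features.
        d = np.sqrt((importance[obs] * diff[valid] ** 2).sum(axis=1))
        neighbors = np.where(valid)[0][np.argsort(d)[:k]]
        for j in np.where(missing[i])[0]:
            vals = X[neighbors, j]
            vals = vals[~np.isnan(vals)]
            if vals.size:
                X[i, j] = vals.mean()
    return X
```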
no code implementations • 16 Aug 2023 • Shuwen Lu, Zhihui Zhang, Cong Guo, Jingwen Leng, Yangjie Zhou, Minyi Guo
However, designing GNN accelerators faces two fundamental challenges: the high bandwidth requirement of GNN models and the diversity of GNN models.
no code implementations • 27 May 2023 • Yangjie Zhou, Yaoxu Song, Jingwen Leng, Zihan Liu, Weihao Cui, Zhendong Zhang, Cong Guo, Quan Chen, Li Li, Minyi Guo
Graph neural networks (GNNs) are powerful tools for exploring and learning from graph structures and features.
1 code implementation • 23 May 2023 • Wenxiao Cai, Ke Jin, Jinyan Hou, Cong Guo, Letian Wu, Wankou Yang
Semantic segmentation of drone images is critical for various aerial vision tasks as it provides essential semantic details to understand scenes on the ground.
Ranked #1 on Semantic Segmentation on VDD
no code implementations • 22 Sep 2022 • Cong Guo, Yuxian Qiu, Jingwen Leng, Chen Zhang, Ying Cao, Quanlu Zhang, Yunxin Liu, Fan Yang, Minyi Guo
An activation function is an element-wise mathematical function that plays a crucial role in deep neural networks (DNNs).
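For concreteness, here is one common activation, GELU (tanh approximation), applied element-wise; this is a generic example, not a construction specific to this paper:

```python
import numpy as np

def gelu(x):
    """GELU (tanh approximation), applied element-wise like any activation."""
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(gelu(x))  # same shape as the input: each element is mapped independently
```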
1 code implementation • 30 Aug 2022 • Cong Guo, Chen Zhang, Jingwen Leng, Zihan Liu, Fan Yang, Yunxin Liu, Minyi Guo, Yuhao Zhu
In this work, we propose a fixed-length adaptive numerical data type called ANT to achieve low-bit quantization with tiny hardware overheads.
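ANT's actual encoding is not reproduced here; the sketch below captures the spirit of a fixed-length adaptive type by picking, per tensor, whichever of two 4-bit encodings (uniform integer or power-of-two) reconstructs the values with lower error:

```python
import numpy as np

def quantize_adaptive(w, bits=4):
    """Sketch of an 'adaptive type': per tensor, pick the fixed-length
    encoding (uniform int vs. power-of-two, float-like dynamic range)
    with lower reconstruction error. Mimics the spirit of ANT only."""
    scale = np.abs(w).max() + 1e-12
    # Candidate 1: uniform integer grid.
    q = 2 ** (bits - 1) - 1
    w_int = np.round(w / scale * q) / q * scale
    # Candidate 2: power-of-two (log-domain) grid.
    sign = np.sign(w)
    exp = np.clip(np.round(np.log2(np.abs(w) / scale + 1e-12)), -(2 ** bits - 1), 0)
    w_pot = sign * scale * 2.0 ** exp
    # Adaptively select the encoding with lower mean squared error.
    if np.mean((w - w_int) ** 2) <= np.mean((w - w_pot) ** 2):
        return w_int, "int"
    return w_pot, "pot"
```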
no code implementations • 25 Aug 2022 • Zhengyi Li, Cong Guo, Zhanda Zhu, Yangjie Zhou, Yuxian Qiu, Xiaotian Gao, Jingwen Leng, Minyi Guo
To deal with the runtime overhead, we use a coarse-grained version of the border function.
1 code implementation • ICLR 2022 • Cong Guo, Yuxian Qiu, Jingwen Leng, Xiaotian Gao, Chen Zhang, Yunxin Liu, Fan Yang, Yuhao Zhu, Minyi Guo
This paper proposes an on-the-fly DFQ framework with sub-second quantization time, called SQuant, which can quantize networks on inference-only devices with low computation and memory requirements.
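SQuant's exact element-, kernel-, and channel-wise objective is more involved; the sketch below shows a simplified, data-free flavor of its rounding-flip idea: start from round-to-nearest, then flip the largest residuals so that each output channel's accumulated quantization error is driven toward zero.

```python
import numpy as np

def squant_like_round(w, bits=8):
    """Data-free rounding sketch (simplified, not the paper's algorithm).

    w: (out_channels, in_features) weight matrix.
    """
    q = 2 ** (bits - 1) - 1
    scale = np.abs(w).max(axis=1, keepdims=True) / q + 1e-12  # per-channel scale
    x = w / scale
    x_int = np.round(x)
    for c in range(w.shape[0]):
        err = x[c] - x_int[c]            # signed rounding residuals in [-0.5, 0.5]
        total = err.sum()
        n_flip = int(round(abs(total)))  # flips needed to cancel the summed error
        if n_flip:
            direction = np.sign(total)
            # Flip the roundings whose residuals most align with the error.
            idx = np.argsort(-direction * err)[:n_flip]
            x_int[c, idx] += direction
    return np.clip(x_int, -q, q) * scale
```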
no code implementations • 20 May 2021 • Yang Wang, Chen Zhang, Zhiqiang Xie, Cong Guo, Yunxin Liu, Jingwen Leng
We demonstrate the feasibility of our design with minimal changes to the existing production-scale inner-product-based Tensor Core.
1 code implementation • 29 Aug 2020 • Cong Guo, Bo Yang Hsueh, Jingwen Leng, Yuxian Qiu, Yue Guan, Zehuan Wang, Xiaoying Jia, Xipeng Li, Minyi Guo, Yuhao Zhu
Network pruning can reduce the high computation cost of deep neural network (DNN) models.
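As background for what pruning does, here is classic magnitude pruning (a generic baseline for context, not this paper's specific method):

```python
import numpy as np

def magnitude_prune(w, sparsity=0.9):
    """Zero out the smallest-magnitude weights until `sparsity` is reached."""
    k = int(w.size * sparsity)
    if k == 0:
        return w.copy()
    threshold = np.partition(np.abs(w).ravel(), k - 1)[k - 1]
    return np.where(np.abs(w) <= threshold, 0.0, w)
```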
no code implementations • 18 Feb 2020 • Cong Guo, Yangjie Zhou, Jingwen Leng, Yuhao Zhu, Zidong Du, Quan Chen, Chao Li, Bin Yao, Minyi Guo
We propose Simultaneous Multi-mode Architecture (SMA), a novel architecture design and execution model that offers general-purpose programmability on DNN accelerators in order to accelerate end-to-end applications.
no code implementations • CVPR 2019 • Yuxian Qiu, Jingwen Leng, Cong Guo, Quan Chen, Chao Li, Minyi Guo, Yuhao Zhu
Recently, researchers have started decomposing deep neural network models according to their semantics or functions.