ZipCache: Accurate and Efficient KV Cache Quantization with Salient Token Identification

no code implementations • 23 May 2024 • Yefei He, Luoming Zhang, Weijia Wu, Jing Liu, Hong Zhou, Bohan Zhuang

In terms of efficiency, ZipCache also showcases a $37.3\%$ reduction in prefill-phase latency, a $56.9\%$ reduction in decoding-phase latency, and a $19.8\%$ reduction in GPU memory usage when evaluating the LLaMA3-8B model with an input length of $4096$.

GSM8K Quantization

Multimodal Sense-Informed Prediction of 3D Human Motions

no code implementations • 5 May 2024 • Zhenyu Lou, Qiongjie Cui, Haofan Wang, Xu Tang, Hong Zhou

Predicting future human pose is a fundamental application for machine intelligence, which drives robots to plan their behavior and paths ahead of time to seamlessly accomplish human-robot collaboration in real-world 3D scenarios.

motion prediction Trajectory Prediction

The intelligent prediction and assessment of financial information risk in the cloud computing model

no code implementations • 14 Apr 2024 • Yufu Wang, Mingwei Zhu, Jiaqiang Yuan, Guanghui Wang, Hong Zhou

Cloud computing is a form of distributed computing in which the network "cloud" splits a large data-processing program into many small programs, which a system of multiple servers then processes and analyzes before returning the results to the user.

Cloud Computing Distributed Computing +1

Implementation of an AI-based MRD evaluation and prediction model for multiple myeloma

no code implementations • 29 Feb 2024 • Jianfeng Chen, Jize Xiong, Yixu Wang, Qi Xin, Hong Zhou

With the application of hematopoietic stem cell transplantation and new drugs, the progression-free and overall survival rates of multiple myeloma have greatly improved, but it is still considered a disease that cannot be completely cured.

Towards Accurate Post-training Quantization for Reparameterized Models

1 code implementation • 25 Feb 2024 • Luoming Zhang, Yefei He, Wen Fei, Zhenyu Lou, Weijia Wu, YangWei Ying, Hong Zhou

Our framework outperforms previous methods by approximately 1% for 8-bit PTQ and 2% for 6-bit PTQ, showcasing its superior performance.


EL-VIT: Probing Vision Transformer with Interactive Visualization

no code implementations • 23 Jan 2024 • Hong Zhou, Rui Zhang, Peifeng Lai, Chaoran Guo, Yong Wang, Zhida Sun, Junjie Li

Therefore, a visualization system is needed to assist ViT users in understanding its functionality.

Multimodal Sense-Informed Forecasting of 3D Human Motions

no code implementations • CVPR 2024 • Zhenyu Lou, Qiongjie Cui, Haofan Wang, Xu Tang, Hong Zhou

To address this limitation, this work introduces a novel multi-modal sense-informed motion prediction approach that conditions high-fidelity generation on two modalities, the external 3D scene and internal human gaze, and is able to recognize their salience for future human activity.

motion prediction Trajectory Prediction

Continual Learning for Image Segmentation with Dynamic Query

1 code implementation • 29 Nov 2023 • Weijia Wu, Yuzhong Zhao, Zhuang Li, Lianlei Shan, Hong Zhou, Mike Zheng Shou

Image segmentation based on continual learning exhibits a critical drop in performance, mainly due to catastrophic forgetting and background shift, as models are required to incorporate new classes continually.

Continual Learning Image Segmentation +5

Improving Vision-and-Language Reasoning via Spatial Relations Modeling

no code implementations • 9 Nov 2023 • Cheng Yang, Rui Xu, Ye Guo, Peixiang Huang, Yiru Chen, Wenkui Ding, Zhongyuan Wang, Hong Zhou

Further, we design two pre-training tasks named object position regression (OPR) and spatial relation classification (SRC) to learn to reconstruct the spatial relation graph respectively.

Position regression Relation +3

Dual Grained Quantization: Efficient Fine-Grained Quantization for LLM

no code implementations • 7 Oct 2023 • Luoming Zhang, Wen Fei, Weijia Wu, Yefei He, Zhenyu Lou, Hong Zhou

Fine-grained quantization has smaller quantization loss, consequently achieving superior performance.


EfficientDM: Efficient Quantization-Aware Fine-Tuning of Low-Bit Diffusion Models

1 code implementation • 5 Oct 2023 • Yefei He, Jing Liu, Weijia Wu, Hong Zhou, Bohan Zhuang

In this paper, we introduce a data-free and parameter-efficient fine-tuning framework for low-bit diffusion models, dubbed EfficientDM, to achieve QAT-level performance with PTQ-like efficiency.

Denoising Image Generation +1

DatasetDM: Synthesizing Data with Perception Annotations Using Diffusion Models

1 code implementation • NeurIPS 2023 • Weijia Wu, Yuzhong Zhao, Hao Chen, YuChao Gu, Rui Zhao, Yefei He, Hong Zhou, Mike Zheng Shou, Chunhua Shen

To showcase the power of the proposed approach, we generate datasets with rich dense pixel-wise labels for a wide range of downstream tasks, including semantic segmentation, instance segmentation, and depth estimation.

Decoder Depth Estimation +6

A Surrogate Data Assimilation Model for the Estimation of Dynamical System in a Limited Area

no code implementations • 14 Jul 2023 • Wei Kang, Liang Xu, Hong Zhou

We propose a novel learning-based surrogate data assimilation (DA) model for efficient state estimation in a limited area.

A Large Cross-Modal Video Retrieval Dataset with Reading Comprehension

1 code implementation • 5 May 2023 • Weijia Wu, Yuzhong Zhao, Zhuang Li, Jiahong Li, Hong Zhou, Mike Zheng Shou, Xiang Bai

Most existing cross-modal language-to-video retrieval (VR) research focuses on single-modal input from video, i.e., visual representation, while text is omnipresent in human environments and frequently critical to understanding video.

Reading Comprehension Retrieval +2

Experimental Design for Any $p$-Norm

no code implementations • 3 May 2023 • Lap Chi Lau, Robert Wang, Hong Zhou

We prove that a randomized local search approach provides a unified algorithm to solve this problem for all $p$.

Experimental Design

BiViT: Extremely Compressed Binary Vision Transformers

no code implementations • ICCV 2023 • Yefei He, Zhenyu Lou, Luoming Zhang, Jing Liu, Weijia Wu, Hong Zhou, Bohan Zhuang

To solve this, we propose Softmax-aware Binarization, which dynamically adapts to the data distribution and reduces the error caused by binarization.

Binarization object-detection +1

BiViT: Extremely Compressed Binary Vision Transformer

no code implementations • 14 Nov 2022 • Yefei He, Zhenyu Lou, Luoming Zhang, Jing Liu, Weijia Wu, Hong Zhou, Bohan Zhuang

To solve this, we propose Softmax-aware Binarization, which dynamically adapts to the data distribution and reduces the error caused by binarization.

Binarization object-detection +1

Real-time End-to-End Video Text Spotter with Contrastive Representation Learning

1 code implementation • 18 Jul 2022 • Wejia Wu, Zhuang Li, Jiahong Li, Chunhua Shen, Hong Zhou, Size Li, Zhongyuan Wang, Ping Luo

Our contributions are three-fold: 1) CoText simultaneously addresses the three tasks (e.g., text detection, tracking, recognition) in a real-time end-to-end trainable framework.

Contrastive Learning Representation Learning +2

Binarizing by Classification: Is soft function really necessary?

no code implementations • 16 May 2022 • Yefei He, Luoming Zhang, Weijia Wu, Hong Zhou

Extensive experiments demonstrate that the proposed method yields surprising performance both in image classification and human pose estimation tasks.

 Ranked #1 on Binarization on ImageNet (Top 1 Accuracy metric)

Binarization Binary Classification +3

Data-Free Quantization with Accurate Activation Clipping and Adaptive Batch Normalization

no code implementations • 8 Apr 2022 • Yefei He, Luoming Zhang, Weijia Wu, Hong Zhou

In this paper, we present a simple yet effective data-free quantization method with accurate activation clipping and adaptive batch normalization.

Data Free Quantization

End-to-End Video Text Spotting with Transformer

1 code implementation • 20 Mar 2022 • Weijia Wu, Yuanqiang Cai, Chunhua Shen, Debing Zhang, Ying Fu, Hong Zhou, Ping Luo

Recent video text spotting methods usually require a three-stage pipeline, i.e., detecting text in individual images, recognizing localized text, and tracking text streams with post-processing to generate final results.

Text Detection Text Spotting

The Observability in Unobservable Systems

no code implementations • 11 Jan 2022 • Wei Kang, Liang Xu, Hong Zhou

In this paper, we introduce the concept of observability of targeted state variables for systems that may not be fully observable.

Contrastive Learning of Semantic and Visual Representations for Text Tracking

1 code implementation • 30 Dec 2021 • Zhuang Li, Weijia Wu, Mike Zheng Shou, Jiahong Li, Size Li, Zhongyuan Wang, Hong Zhou

Semantic representation is of great benefit to the video text tracking (VTT) task that requires simultaneously classifying, detecting, and tracking texts in the video.

Contrastive Learning

Divide-and-Assemble: Learning Block-wise Memory for Unsupervised Anomaly Detection

no code implementations • ICCV 2021 • Jinlei Hou, Yingying Zhang, Qiaoyong Zhong, Di Xie, ShiLiang Pu, Hong Zhou

Surprisingly, by varying the granularity of division on feature maps, we are able to modulate the reconstruction capability of the model for both normal and abnormal samples.

Unsupervised Anomaly Detection

Polygon-free: Unconstrained Scene Text Detection with Box Annotations

1 code implementation • 26 Nov 2020 • Weijia Wu, Enze Xie, Ruimao Zhang, Wenhai Wang, Hong Zhou, Ping Luo

For example, without using polygon annotations, PSENet achieves an 80.5% F-score on TotalText [3] (vs. 80.9% of fully supervised counterpart), 31.1% better than training directly with upright bounding box annotations, and saves 80%+ labeling costs.

Scene Text Detection Text Detection

A Local Search Framework for Experimental Design

no code implementations • 29 Oct 2020 • Lap Chi Lau, Hong Zhou

We present a local search framework to design and analyze both combinatorial algorithms and rounding algorithms for experimental design problems.

Experimental Design Fairness

TextCohesion: Detecting Text for Arbitrary Shapes

no code implementations • 22 Apr 2019 • Weijia Wu, Jici Xing, Hong Zhou

In this paper, we propose a pixel-wise method named TextCohesion for scene text detection, which splits a text instance into five key components: a Text Skeleton and four Directional Pixel Regions.

Curved Text Detection Text Detection

Brain Tumor Segmentation Based on Refined Fully Convolutional Neural Networks with A Hierarchical Dice Loss

1 code implementation • 25 Dec 2017 • Jiachi Zhang, Xiaolei Shen, Tianqi Zhuo, Hong Zhou

Since its proposal, the fully convolutional neural network (FCNN) has been widely used in semantic segmentation because of its high pixel-wise classification accuracy and high localization precision.

Binary Classification Brain Tumor Segmentation +6

An Automatic Diagnosis Method of Facial Acne Vulgaris Based on Convolutional Neural Network

no code implementations • 13 Nov 2017 • Xiaolei Shen, Jiachi Zhang, Chenjun Yan, Hong Zhou

The core of our method is to extract image features with a convolutional neural network and then classify them with a classifier.

Classification General Classification

