Search Results for author: Ru Huang

Found 21 papers, 5 papers with code

GAIA: Rethinking Action Quality Assessment for AI-Generated Videos

1 code implementation · 10 Jun 2024 · Zijian Chen, Wei Sun, Yuan Tian, Jun Jia, ZiCheng Zhang, Jiarui Wang, Ru Huang, Xiongkuo Min, Guangtao Zhai, Wenjun Zhang

Assessing action quality is both imperative and challenging due to its significant impact on the quality of AI-generated videos, further complicated by the inherently ambiguous nature of actions within AI-generated video (AIGV).

Action Quality Assessment

FastQuery: Communication-efficient Embedding Table Query for Private LLM Inference

no code implementations · 25 May 2024 · Chenqi Lin, Tianshi Xu, Zebin Yang, Runsheng Wang, Ru Huang, Meng Li

We observe the overhead mainly comes from the neglect of 1) the one-hot nature of user queries and 2) the robustness of the embedding table to low bit-width quantization noise.

Quantization

EasyACIM: An End-to-End Automated Analog CIM with Synthesizable Architecture and Agile Design Space Exploration

no code implementations · 12 Apr 2024 · Haoyi Zhang, Jiahao Song, Xiaohan Gao, Xiyuan Tang, Yibo Lin, Runsheng Wang, Ru Huang

Leveraging the multi-objective genetic algorithm (MOGA)-based design space explorer, EasyACIM can obtain high-quality ACIM solutions based on the proposed synthesizable architecture, targeting versatile application scenarios.

Edge-computing

PDNNet: PDN-Aware GNN-CNN Heterogeneous Network for Dynamic IR Drop Prediction

no code implementations · 27 Mar 2024 · Yuxiang Zhao, Zhuomin Chai, Xun Jiang, Yibo Lin, Runsheng Wang, Ru Huang

This is the first work to apply graph structure to deep-learning-based dynamic IR drop prediction.

ProPD: Dynamic Token Tree Pruning and Generation for LLM Parallel Decoding

no code implementations · 21 Feb 2024 · Shuzhang Zhong, Zebin Yang, Meng Li, Ruihao Gong, Runsheng Wang, Ru Huang

Additionally, it introduces a dynamic token tree generation algorithm to balance the computation and parallelism of the verification phase in real time and maximize the overall efficiency across different batch sizes, sequence lengths, and tasks.

ASCEND: Accurate yet Efficient End-to-End Stochastic Computing Acceleration of Vision Transformer

no code implementations · 20 Feb 2024 · Tong Xie, Yixuan Hu, Renjie Wei, Meng Li, YuAn Wang, Runsheng Wang, Ru Huang

To overcome the compatibility challenges, ASCEND proposes a novel deterministic SC block for GELU and leverages an SC-friendly iterative approximate algorithm to design an accurate and efficient softmax circuit.

AttentionLego: An Open-Source Building Block For Spatially-Scalable Large Language Model Accelerator With Processing-In-Memory Technology

no code implementations · 21 Jan 2024 · Rongqing Cong, Wenyang He, Mingxuan Li, Bangning Luo, Zebin Yang, Yuchao Yang, Ru Huang, Bonan Yan

Large language models (LLMs) with Transformer architectures have become phenomenal in natural language processing, multimodal generative artificial intelligence, and agent-oriented artificial intelligence.

Language Modelling, Large Language Model

FS-BAND: A Frequency-Sensitive Banding Detector

no code implementations · 30 Nov 2023 · Zijian Chen, Wei Sun, ZiCheng Zhang, Ru Huang, Fangfang Lu, Xiongkuo Min, Guangtao Zhai, Wenjun Zhang

The banding artifact, also known as the staircase-like contour, is a common quality annoyance that arises in compression, transmission, etc.

Image Quality Assessment

Memory-aware Scheduling for Complex Wired Networks with Iterative Graph Optimization

no code implementations · 26 Aug 2023 · Shuzhang Zhong, Meng Li, Yun Liang, Runsheng Wang, Ru Huang

Memory-aware network scheduling is becoming increasingly important for deep neural network (DNN) inference on resource-constrained devices.

Scheduling

Falcon: Accelerating Homomorphically Encrypted Convolutions for Efficient Private Mobile Network Inference

no code implementations · 25 Aug 2023 · Tianshi Xu, Meng Li, Runsheng Wang, Ru Huang

Efficient networks, e.g., MobileNetV2 and EfficientNet, achieve state-of-the-art (SOTA) accuracy with lightweight computation.

Reducing operator complexity in Algebraic Multigrid with Machine Learning Approaches

no code implementations · 15 Jul 2023 · Ru Huang, Kai Chang, Huan He, Ruipeng Li, Yuanzhe Xi

We propose a data-driven and machine-learning-based approach to compute non-Galerkin coarse-grid operators in algebraic multigrid (AMG) methods, addressing the well-known issue of increasing operator complexity.

HybridNet: Dual-Branch Fusion of Geometrical and Topological Views for VLSI Congestion Prediction

no code implementations · 7 May 2023 · Yuxiang Zhao, Zhuomin Chai, Yibo Lin, Runsheng Wang, Ru Huang

Accurate early congestion prediction can prevent unpleasant surprises at the routing stage, playing a crucial role in helping designers iterate faster in VLSI design cycles.

EBSR: Enhanced Binary Neural Network for Image Super-Resolution

no code implementations · 22 Mar 2023 · Renjie Wei, Shuwen Zhang, Zechun Liu, Meng Li, Yuchen Fan, Runsheng Wang, Ru Huang

While the performance of deep convolutional neural networks for image super-resolution (SR) has improved significantly, the rapid increase of memory and computation requirements hinders their deployment on resource-constrained devices.

Binarization, Image Super-Resolution, +1

Extremely-Fast, Energy-Efficient Massive MIMO Precoding with Analog RRAM Matrix Computing

no code implementations · 7 Nov 2022 · Pushen Zuo, Zhong Sun, Ru Huang

Signal processing in wireless communications, such as precoding, detection, and channel estimation, is basically about solving inverse matrix problems, which, however, are slow and inefficient on conventional digital computers, thus requiring a radical paradigm shift to achieve fast, real-time solutions.

CircuitNet: An Open-Source Dataset for Machine Learning Applications in Electronic Design Automation (EDA)

no code implementations · 1 Aug 2022 · Zhuomin Chai, Yuxiang Zhao, Yibo Lin, Wei Liu, Runsheng Wang, Ru Huang

The electronic design automation (EDA) community has been actively exploring machine learning (ML) for very large-scale integrated computer-aided design (VLSI CAD).

BIG-bench Machine Learning

Learning optimal multigrid smoothers via neural networks

no code implementations · 24 Feb 2021 · Ru Huang, Ruipeng Li, Yuanzhe Xi

Multigrid methods are one of the most efficient techniques for solving linear systems arising from Partial Differential Equations (PDEs) and graph Laplacians from machine learning applications.

Generating a Doppelganger Graph: Resembling but Distinct

1 code implementation · 23 Jan 2021 · Yuliang Ji, Ru Huang, Jie Chen, Yuanzhe Xi

Deep generative models, since their inception, have become increasingly capable of generating novel and perceptually realistic signals (e.g., images and sound waves).

Benchmarking, Graph Representation Learning, +1

DaSGD: Squeezing SGD Parallelization Performance in Distributed Training Using Delayed Averaging

no code implementations · 31 May 2020 · Qinggang Zhou, Yawen Zhang, Pengcheng Li, Xiaoyong Liu, Jun Yang, Runsheng Wang, Ru Huang

The state-of-the-art deep learning algorithms rely on distributed training systems to tackle the increasing sizes of models and training data sets.
