Search Results for author: Yuhong Li

Found 25 papers, 16 papers with code

SnapKV: LLM Knows What You are Looking for Before Generation

1 code implementation 22 Apr 2024 Yuhong Li, Yingbing Huang, Bowen Yang, Bharat Venkitesh, Acyr Locatelli, Hanchen Ye, Tianle Cai, Patrick Lewis, Deming Chen

Specifically, SnapKV achieves a consistent decoding speed with a 3.6x increase in generation speed and an 8.2x enhancement in memory efficiency compared to baseline when processing inputs of 16K tokens.


Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads

1 code implementation 19 Jan 2024 Tianle Cai, Yuhong Li, Zhengyang Geng, Hongwu Peng, Jason D. Lee, Deming Chen, Tri Dao

We present two levels of fine-tuning procedures for Medusa to meet the needs of different use cases: Medusa-1: Medusa is directly fine-tuned on top of a frozen backbone LLM, enabling lossless inference acceleration.

Decouple Before Interact: Multi-Modal Prompt Learning for Continual Visual Question Answering

no code implementations ICCV 2023 Zi Qian, Xin Wang, Xuguang Duan, Pengda Qin, Yuhong Li, Wenwu Zhu

Based on our formulation, we further propose MulTi-Modal PRompt LearnIng with DecouPLing bEfore InTeraction (TRIPLET), a novel approach that builds on a pre-trained vision-language model and consists of decoupled prompts and prompt interaction strategies to capture the complex interactions between modalities.

Continual Learning Language Modelling +2

Extensible and Efficient Proxy for Neural Architecture Search

no code implementations ICCV 2023 Yuhong Li, Jiajie Li, Cong Hao, Pan Li, JinJun Xiong, Deming Chen

We further propose a Discrete Proxy Search (DPS) method to find the optimized training settings for Eproxy with only a handful of benchmarked architectures on the target tasks.

Neural Architecture Search

Extensible Proxy for Efficient NAS

1 code implementation 17 Oct 2022 Yuhong Li, Jiajie Li, Cong Han, Pan Li, JinJun Xiong, Deming Chen

(2) Efficient proxies are not extensible to multi-modality downstream tasks.

Neural Architecture Search

What Makes Convolutional Models Great on Long Sequence Modeling?

1 code implementation 17 Oct 2022 Yuhong Li, Tianle Cai, Yi Zhang, Deming Chen, Debadeepta Dey

We focus on the structure of the convolution kernel and identify two critical but intuitive principles enjoyed by S4 that are sufficient to make up an effective global convolutional model: 1) The parameterization of the convolutional kernel needs to be efficient in the sense that the number of parameters should scale sub-linearly with sequence length.

Long-range modeling

Compilation and Optimizations for Efficient Machine Learning on Embedded Systems

no code implementations 6 Jun 2022 Xiaofan Zhang, Yao Chen, Cong Hao, Sitao Huang, Yuhong Li, Deming Chen

Deep Neural Networks (DNNs) have achieved great success in a variety of machine learning (ML) applications, delivering high-quality inferencing solutions in computer vision, natural language processing, and virtual reality, etc.

BIG-bench Machine Learning

Generic Neural Architecture Search via Regression

2 code implementations NeurIPS 2021 Yuhong Li, Cong Hao, Pan Li, JinJun Xiong, Deming Chen

Such a self-supervised regression task can effectively evaluate the intrinsic power of an architecture to capture and transform the input signal patterns, and allow more sufficient usage of training samples.

Ranked #1 on Neural Architecture Search on NAS-Bench-101 (Spearman Correlation metric)

Image Classification Neural Architecture Search +1

TVDIM: Enhancing Image Self-Supervised Pretraining via Noisy Text Data

no code implementations 3 Jun 2021 Pengda Qin, Yuhong Li, Kefeng Deng, Qiang Wu

Among the ubiquitous multimodal data in the real world, text is the modality generated by humans, while images reflect the physical world honestly.

Contrastive Learning Image Classification +1

Improving Random-Sampling Neural Architecture Search by Evolving the Proxy Search Space

1 code implementation 1 Jan 2021 Yuhong Li, Cong Hao, Xiaofan Zhang, JinJun Xiong, Wen-mei Hwu, Deming Chen

This raises the question of whether we can find an effective proxy search space (PS) that is only a small subset of GS to dramatically improve RandomNAS’s search efficiency while at the same time keeping a good correlation for the top-performing architectures.

Image Classification Neural Architecture Search

Effective Algorithm-Accelerator Co-design for AI Solutions on Edge Devices

no code implementations 14 Oct 2020 Cong Hao, Yao Chen, Xiaofan Zhang, Yuhong Li, JinJun Xiong, Wen-mei Hwu, Deming Chen

High quality AI solutions require joint optimization of AI algorithms, such as deep neural networks (DNNs), and their hardware accelerators.

GAP++: Learning to generate target-conditioned adversarial examples

no code implementations 9 Jun 2020 Xiaofeng Mao, Yuefeng Chen, Yuhong Li, Yuan He, Hui Xue

Different from previous single-target attack models, our model can conduct target-conditioned attacks by learning the relations of attack target and the semantics in image.

Computational Efficiency

EDD: Efficient Differentiable DNN Architecture and Implementation Co-search for Embedded AI Solutions

no code implementations 6 May 2020 Yuhong Li, Cong Hao, Xiaofan Zhang, Xinheng Liu, Yao Chen, JinJun Xiong, Wen-mei Hwu, Deming Chen

We formulate the co-search problem by fusing DNN search variables and hardware implementation variables into one solution space, and maximize both algorithm accuracy and hardware implementation quality.

Neural Architecture Search

Self-supervised Adversarial Training

1 code implementation 15 Nov 2019 Kejiang Chen, Hang Zhou, Yuefeng Chen, Xiaofeng Mao, Yuhong Li, Yuan He, Hui Xue, Weiming Zhang, Nenghai Yu

Recent work has demonstrated that neural networks are vulnerable to adversarial examples.

Self-Supervised Learning

Learning To Characterize Adversarial Subspaces

no code implementations 15 Nov 2019 Xiaofeng Mao, Yuefeng Chen, Yuhong Li, Yuan He, Hui Xue

To detect these adversarial examples, previous methods use artificially designed metrics to characterize the properties of adversarial subspaces where adversarial examples lie.

Self-Supervised Learning For Few-Shot Image Classification

2 code implementations 14 Nov 2019 Da Chen, Yuefeng Chen, Yuhong Li, Feng Mao, Yuan He, Hui Xue

In this paper, we proposed to train a more generalized embedding network with self-supervised learning (SSL) which can provide robust representation for downstream tasks by learning from the data itself.

Classification cross-domain few-shot learning +3

SkyNet: A Champion Model for DAC-SDC on Low Power Object Detection

1 code implementation 25 Jun 2019 Xiaofan Zhang, Cong Hao, Haoming Lu, Jiachen Li, Yuhong Li, Yuchen Fan, Kyle Rupnow, JinJun Xiong, Thomas Huang, Honghui Shi, Wen-mei Hwu, Deming Chen

Developing artificial intelligence (AI) at the edge is always challenging, since edge devices have limited computation capability and memory resources but need to meet demanding requirements, such as real-time processing, high throughput performance, and high inference accuracy.

Object Detection

A Bi-Directional Co-Design Approach to Enable Deep Learning on IoT Devices

2 code implementations 20 May 2019 Xiaofan Zhang, Cong Hao, Yuhong Li, Yao Chen, JinJun Xiong, Wen-mei Hwu, Deming Chen

Developing deep learning models for resource-constrained Internet-of-Things (IoT) devices is challenging, as it is difficult to achieve both good quality of results (QoR), such as DNN model inference accuracy, and quality of service (QoS), such as inference latency, throughput, and power consumption.

Object Detection

FPGA/DNN Co-Design: An Efficient Design Methodology for IoT Intelligence on the Edge

2 code implementations 9 Apr 2019 Cong Hao, Xiaofan Zhang, Yuhong Li, Sitao Huang, JinJun Xiong, Kyle Rupnow, Wen-mei Hwu, Deming Chen

While embedded FPGAs are attractive platforms for DNN acceleration on edge-devices due to their low latency and high energy efficiency, the scarcity of resources of edge-scale FPGA devices also makes it challenging for DNN deployment.

C++ code object-detection +1

Bilinear Representation for Language-based Image Editing Using Conditional Generative Adversarial Networks

1 code implementation 18 Mar 2019 Xiaofeng Mao, Yuefeng Chen, Yuhong Li, Tao Xiong, Yuan He, Hui Xue

The task of Language-Based Image Editing (LBIE) aims at generating a target image by editing the source image based on the given language description.

Generative Adversarial Network

SiamVGG: Visual Tracking using Deeper Siamese Networks

4 code implementations 7 Feb 2019 Yuhong Li, Xiaofan Zhang, Deming Chen

It combines a Convolutional Neural Network (CNN) backbone and a cross-correlation operator, and takes advantage of the features from exemplary images for more accurate object tracking.

Visual Object Tracking Visual Tracking

CSRNet: Dilated Convolutional Neural Networks for Understanding the Highly Congested Scenes

13 code implementations CVPR 2018 Yuhong Li, Xiaofan Zhang, Deming Chen

We demonstrate CSRNet on four datasets (ShanghaiTech dataset, the UCF_CC_50 dataset, the WorldEXPO'10 dataset, and the UCSD dataset) and we deliver the state-of-the-art performance.

Crowd Counting Scene Recognition
