Search Results for author: Yunxin Liu

Found 38 papers, 15 papers with code

Generalized Robot Learning Framework

no code implementations 18 Sep 2024 Jiahuan Yan, Zhouyang Hong, Yu Zhao, Yu Tian, Yunxin Liu, Travis Davies, Luhui Hu

Imitation based robot learning has recently gained significant attention in the robotics field due to its theoretical potential for transferability and generalizability.

Imitation Learning

A First Look At Efficient And Secure On-Device LLM Inference Against KV Leakage

no code implementations 6 Sep 2024 Huan Yang, Deyu Zhang, Yudong Zhao, Yuanchun Li, Yunxin Liu

With the advent of lightweight LLM models and specially designed GPUs, on-device LLM inference has achieved the necessary accuracy and performance metrics.

NeedleBench: Can LLMs Do Retrieval and Reasoning in 1 Million Context Window?

2 code implementations 16 Jul 2024 Mo Li, Songyang Zhang, Yunxin Liu, Kai Chen

In evaluating the long-context capabilities of large language models (LLMs), identifying content relevant to a user's query from original long documents is a crucial prerequisite for any LLM to answer questions based on long text.

4k 8k +2

LoRA-Switch: Boosting the Efficiency of Dynamic LLM Adapters via System-Algorithm Co-design

no code implementations 28 May 2024 Rui Kong, Qiyang Li, Xinyu Fang, Qingtian Feng, Qingfeng He, Yazhu Dong, Weijun Wang, Yuanchun Li, Linghe Kong, Yunxin Liu

Recent literature has found that an effective method to customize or further improve large language models (LLMs) is to add dynamic adapters, such as low-rank adapters (LoRA) with Mixture-of-Experts (MoE) structures.

A Survey of Resource-efficient LLM and Multimodal Foundation Models

1 code implementation 16 Jan 2024 Mengwei Xu, Wangsong Yin, Dongqi Cai, Rongjie Yi, Daliang Xu, QiPeng Wang, Bingyang Wu, Yihao Zhao, Chen Yang, Shihe Wang, Qiyang Zhang, Zhenyan Lu, Li Zhang, Shangguang Wang, Yuanchun Li, Yunxin Liu, Xin Jin, Xuanzhe Liu

Large foundation models, including large language models (LLMs), vision transformers (ViTs), diffusion, and LLM-based multimodal models, are revolutionizing the entire machine learning lifecycle, from training to deployment.

Personal LLM Agents: Insights and Survey about the Capability, Efficiency and Security

2 code implementations 10 Jan 2024 Yuanchun Li, Hao Wen, Weijun Wang, Xiangyu Li, Yizhen Yuan, Guohong Liu, Jiacheng Liu, Wenxing Xu, Xiang Wang, Yi Sun, Rui Kong, Yile Wang, Hanfei Geng, Jian Luan, Xuefeng Jin, Zilong Ye, Guanjing Xiong, Fan Zhang, Xiang Li, Mengwei Xu, Zhijun Li, Peng Li, Yang Liu, Ya-Qin Zhang, Yunxin Liu

Next, we discuss several key challenges to achieve intelligent, efficient and secure Personal LLM Agents, followed by a comprehensive survey of representative solutions to address these challenges.

BiSwift: Bandwidth Orchestrator for Multi-Stream Video Analytics on Edge

no code implementations 25 Dec 2023 Lin Sun, Weijun Wang, Tingting Yuan, Liang Mi, Haipeng Dai, Yunxin Liu, XiaoMing Fu

To achieve this goal, we propose BiSwift, a bi-level framework that scales the concurrent real-time video analytics by a novel adaptive hybrid codec integrated with multi-level pipelines, and a global bandwidth controller for multiple video streams.

Fairness Management +3

Empowering In-Browser Deep Learning Inference on Edge Devices with Just-in-Time Kernel Optimizations

no code implementations 16 Sep 2023 Fucheng Jia, Shiqi Jiang, Ting Cao, Wei Cui, Tianrui Xia, Xu Cao, Yuanchun Li, Deyu Zhang, Ju Ren, Yunxin Liu, Lili Qiu, Mao Yang

Web is increasingly becoming the primary platform to deliver AI services onto edge devices, making in-browser deep learning (DL) inference more prominent.

Generative Model for Models: Rapid DNN Customization for Diverse Tasks and Resource Constraints

no code implementations 29 Aug 2023 Wenxing Xu, Yuanchun Li, Jiacheng Liu, Yi Sun, Zhengyang Cao, Yixuan Li, Hao Wen, Yunxin Liu

Unlike cloud-based deep learning models that are often large and uniform, edge-deployed models usually demand customization for domain-specific tasks and resource-limited environments.

Image Classification Object Detection +1

AutoDroid: LLM-powered Task Automation in Android

1 code implementation 29 Aug 2023 Hao Wen, Yuanchun Li, Guohong Liu, Shanhui Zhao, Tao Yu, Toby Jia-Jun Li, Shiqi Jiang, Yunhao Liu, Yaqin Zhang, Yunxin Liu

Mobile task automation is an attractive technique that aims to enable voice-based hands-free user interaction with smartphones.

Language Modelling

SwapMoE: Serving Off-the-shelf MoE-based Large Language Models with Tunable Memory Budget

no code implementations 29 Aug 2023 Rui Kong, Yuanchun Li, Qingtian Feng, Weijun Wang, Xiaozhou Ye, Ye Ouyang, Linghe Kong, Yunxin Liu

Mixture of experts (MoE) is a popular technique to improve capacity of Large Language Models (LLMs) with conditionally-activated parallel experts.

Object Detection +1

PatchBackdoor: Backdoor Attack against Deep Neural Networks without Model Modification

1 code implementation 22 Aug 2023 Yizhen Yuan, Rui Kong, Shenghao Xie, Yuanchun Li, Yunxin Liu

However, most backdoor attacks have to modify the neural network models through training with poisoned data and/or direct model editing, which leads to a common but false belief that backdoor attack can be easily avoided by properly protecting the model.

Backdoor Attack Real-World Adversarial Attack

AIGC Empowering Telecom Sector White Paper_chinese

no code implementations 21 Jul 2023 Ye Ouyang, Yaqin Zhang, Xiaozhou Ye, Yunxin Liu, Yong Song, Yang Liu, Sen Bian, Zhiyong Liu

Through the study of GPT, a typical representative of AIGC, the authors have analyzed how GPT empowers the telecom sector in the form of scenarios, discussed the gap between the current GPT general model and telecom services, proposed for the first time a Telco Augmented Cognition capability system, provided answers to how to construct a telecom service GPT in the telecom sector, and carried out various practices.

6G Network Business Support System

no code implementations 19 Jul 2023 Ye Ouyang, Yaqin Zhang, Peng Wang, Yunxin Liu, Wen Qiao, Jun Zhu, Yang Liu, Feng Zhang, Shuling Wang, Xidong Wang

6G is the next-generation intelligent and integrated digital information infrastructure, characterized by ubiquitous interconnection, native intelligence, multi-dimensional perception, global coverage, green and low-carbon, native network security, etc.

AdaptiveNet: Post-deployment Neural Architecture Adaptation for Diverse Edge Environments

no code implementations 13 Mar 2023 Hao Wen, Yuanchun Li, Zunshuai Zhang, Shiqi Jiang, Xiaozhou Ye, Ye Ouyang, Ya-Qin Zhang, Yunxin Liu

Model elastification generates a high-quality search space of model architectures with the guidance of a developer-specified oracle model.

StrokeGAN+: Few-Shot Semi-Supervised Chinese Font Generation with Stroke Encoding

no code implementations 11 Nov 2022 Jinshan Zeng, Yefei Wang, Qi Chen, Yunxin Liu, Mingwen Wang, Yuan YAO

The effectiveness of the proposed model for the zero-shot traditional Chinese font generation is also evaluated in this paper.

Font Generation

Nesting Forward Automatic Differentiation for Memory-Efficient Deep Neural Network Training

no code implementations 22 Sep 2022 Cong Guo, Yuxian Qiu, Jingwen Leng, Chen Zhang, Ying Cao, Quanlu Zhang, Yunxin Liu, Fan Yang, Minyi Guo

An activation function is an element-wise mathematical function and plays a crucial role in deep neural networks (DNN).

ANT: Exploiting Adaptive Numerical Data Type for Low-bit Deep Neural Network Quantization

1 code implementation 30 Aug 2022 Cong Guo, Chen Zhang, Jingwen Leng, Zihan Liu, Fan Yang, Yunxin Liu, Minyi Guo, Yuhao Zhu

In this work, we propose a fixed-length adaptive numerical data type called ANT to achieve low-bit quantization with tiny hardware overheads.

Quantization

Reducing Capacity Gap in Knowledge Distillation with Review Mechanism for Crowd Counting

1 code implementation 11 Jun 2022 Yunxin Liu, Qiaosi Yi, Jinshan Zeng

Besides the lightweight models, we also show that the suggested review mechanism can be used as a plug-and-play module to further boost the performance of a kind of heavy crowd counting models without modifying the neural network architecture and introducing any additional model parameter.

Computational Efficiency Crowd Counting +1

SQuant: On-the-Fly Data-Free Quantization via Diagonal Hessian Approximation

1 code implementation ICLR 2022 Cong Guo, Yuxian Qiu, Jingwen Leng, Xiaotian Gao, Chen Zhang, Yunxin Liu, Fan Yang, Yuhao Zhu, Minyi Guo

This paper proposes an on-the-fly DFQ framework with sub-second quantization time, called SQuant, which can quantize networks on inference-only devices with low computation and memory requirements.

Data Free Quantization

FedBalancer: Data and Pace Control for Efficient Federated Learning on Heterogeneous Clients

1 code implementation 5 Jan 2022 Jaemin Shin, Yuanchun Li, Yunxin Liu, Sung-Ju Lee

Federated Learning (FL) trains a machine learning model on distributed clients without exposing individual data.

Federated Learning

Boosting Mobile CNN Inference through Semantic Memory

no code implementations 5 Dec 2021 Yun Li, Chen Zhang, Shihao Han, Li Lyna Zhang, Baoqun Yin, Yunxin Liu, Mengwei Xu

Human brains are known to be capable of speeding up visual recognition of repeatedly presented objects through faster memory encoding and accessing procedures on activated neurons.

DAPPER: Label-Free Performance Estimation after Personalization for Heterogeneous Mobile Sensing

no code implementations 22 Nov 2021 Taesik Gong, Yewon Kim, Adiba Orzikulova, Yunxin Liu, Sung Ju Hwang, Jinwoo Shin, Sung-Ju Lee

However, various factors such as different users, devices, and environments impact the performance of such applications, thus making the domain shift (i.e., distributional shift between the training domain and the target domain) a critical issue in mobile sensing.

Domain Adaptation

Representational Continuity for Unsupervised Continual Learning

2 code implementations ICLR 2022 Divyam Madaan, Jaehong Yoon, Yuanchun Li, Yunxin Liu, Sung Ju Hwang

Continual learning (CL) aims to learn a sequence of tasks without forgetting the previously acquired knowledge.

Continual Learning

ModelDiff: Testing-Based DNN Similarity Comparison for Model Reuse Detection

1 code implementation 11 Jun 2021 Yuanchun Li, Ziqi Zhang, Bingyan Liu, Ziyue Yang, Yunxin Liu

The knowledge of a deep learning model may be transferred to a student model, leading to intellectual property infringement or vulnerability propagation.

Model Compression Transfer Learning

Dual-side Sparse Tensor Core

no code implementations 20 May 2021 Yang Wang, Chen Zhang, Zhiqiang Xie, Cong Guo, Yunxin Liu, Jingwen Leng

We demonstrate the feasibility of our design with minimal changes to the existing production-scale inner-product-based Tensor Core.

LEAP: TrustZone Based Developer-Friendly TEE for Intelligent Mobile Apps

no code implementations 4 Feb 2021 Lizhi Sun, Shuocheng Wang, Hao Wu, Yuhang Gong, Fengyuan Xu, Yunxin Liu, Hao Han, Sheng Zhong

ARM TrustZone is widely deployed on commercial-off-the-shelf mobile devices for secure execution.

Cryptography and Security

DeepPayload: Black-box Backdoor Attack on Deep Learning Models through Neural Payload Injection

no code implementations 18 Jan 2021 Yuanchun Li, Jiayi Hua, Haoyu Wang, Chunyang Chen, Yunxin Liu

The core of the attack is a neural conditional branch constructed with a trigger detector and several operators and injected into the victim model as a malicious payload.

Backdoor Attack

StrokeGAN: Reducing Mode Collapse in Chinese Font Generation via Stroke Encoding

1 code implementation 16 Dec 2020 Jinshan Zeng, Qi Chen, Yunxin Liu, Mingwen Wang, Yuan YAO

However, these deep generative models may suffer from the mode collapse issue, which significantly degrades the diversity and quality of generated results.

Diversity Font Generation

PaGraph: Scaling GNN Training on Large Graphs via Computation-aware Caching and Partitioning

no code implementations Proceedings of the 11th ACM Symposium on Cloud Computing 2020 Zhiqi Lin, Cheng Li, Youshan Miao, Yunxin Liu, Yinlong Xu

Emerging graph neural networks (GNNs) have extended the successes of deep learning techniques against datasets like images and texts to more complex graph-structured data.

Characterizing Impacts of Heterogeneity in Federated Learning upon Large-Scale Smartphone Data

no code implementations 12 Jun 2020 Chengxu Yang, Qipeng Wang, Mengwei Xu, Zhenpeng Chen, Kaigui Bian, Yunxin Liu, Xuanzhe Liu

Based on the data and the platform, we conduct extensive experiments to compare the performance of state-of-the-art FL algorithms under heterogeneity-aware and heterogeneity-unaware settings.

Fairness Federated Learning +1

Fast Hardware-Aware Neural Architecture Search

1 code implementation 25 Oct 2019 Li Lyna Zhang, Yuqing Yang, Yuhang Jiang, Wenwu Zhu, Yunxin Liu

Unlike previous approaches that apply search algorithms on a small, human-designed search space without considering hardware diversity, we propose HURRICANE that explores the automatic hardware-aware search over a much larger search space and a two-stage search algorithm, to efficiently generate tailored models for different types of hardware.

Diversity Hardware Aware Neural Architecture Search +1

Approximate Query Service on Autonomous IoT Cameras

no code implementations 2 Sep 2019 Mengwei Xu, Xiwen Zhang, Yunxin Liu, Gang Huang, Xuanzhe Liu, Felix Xiaozhu Lin

Elf is a runtime for an energy-constrained camera to continuously summarize video scenes as approximate object counts.

Databases

Video Analytics with Zero-streaming Cameras

no code implementations 28 Apr 2019 Mengwei Xu, Tiantu Xu, Yunxin Liu, Felix Xiaozhu Lin

For efficiency, we advocate for these cameras to be zero streaming: capturing videos to local storage and communicating with the cloud only when analytics is requested.

DeepCache: Principled Cache for Mobile Deep Vision

1 code implementation 1 Dec 2017 Mengwei Xu, Mengze Zhu, Yunxin Liu, Felix Xiaozhu Lin, Xuanzhe Liu

We present DeepCache, a principled cache design for deep learning inference in continuous mobile vision.

Video Compression
