Search Results for author: Mu Yuan

Found 5 papers, 4 papers with code

Secure Transformer Inference

1 code implementation 14 Nov 2023 Mu Yuan, Lan Zhang, Xiang-Yang Li

Our protocol, Secure Transformer Inference Protocol (STIP), can be applied to real-world services like ChatGPT.

PacketGame: Multi-Stream Packet Gating for Concurrent Video Inference at Scale

1 code implementation journal 2023 Mu Yuan, Lan Zhang, Xuanke You, Xiang-Yang Li

The resource efficiency of video analytics workloads is critical for large-scale deployments on edge nodes and cloud clusters.

Video Compression

MLink: Linking Black-Box Models from Multiple Domains for Collaborative Inference

3 code implementations 28 Sep 2022 Mu Yuan, Lan Zhang, Zimu Zheng, Yi-Nan Zhang, Xiang-Yang Li

The cost efficiency of model inference is critical to real-world machine learning (ML) applications, especially for delay-sensitive tasks and resource-limited devices.

Collaborative Inference, Multi-Task Learning, +1

InFi: End-to-End Learning to Filter Input for Resource-Efficiency in Mobile-Centric Inference

3 code implementations 28 Sep 2022 Mu Yuan, Lan Zhang, Fengxiang He, Xueting Tong, Miao-Hui Song, Zhengyuan Xu, Xiang-Yang Li

Previous efforts have tailored effective solutions for many applications, but left two essential questions unanswered: (1) theoretical filterability of an inference workload to guide the application of input filtering techniques, thereby avoiding the trial-and-error cost for resource-constrained mobile applications; (2) robust discriminability of feature embedding to allow input filtering to be widely effective for diverse inference tasks and input content.

Comprehensive and Efficient Data Labeling via Adaptive Model Scheduling

no code implementations 8 Feb 2020 Mu Yuan, Lan Zhang, Xiang-Yang Li, Hui Xiong

With limited computing resources and stringent delay, given a data stream and a collection of applicable resource-hungry deep-learning models, we design a novel approach to adaptively schedule a subset of these models to execute on each data item, aiming to maximize the value of the model output (e.g., the number of high-confidence labels).

Image Retrieval, Management, +3
