Collaborative Inference
9 papers with code • 0 benchmarks • 0 datasets
In collaborative inference, a single inference task is split across multiple models distributed over two or more (typically resource-constrained IoT) devices.
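The partitioning idea can be sketched as follows: a network is cut at a "split point", the device computes the early layers, and an edge server finishes the forward pass from the transmitted intermediate activation. The two-layer network, random weights, and split point below are illustrative assumptions, not any specific paper's setup.

```python
import numpy as np

# Minimal split-inference sketch. The device runs the first layer of a
# (placeholder) two-layer MLP; the intermediate activation h is what would
# be transmitted over the network to the edge server, which runs the rest.
rng = np.random.default_rng(0)
W1 = rng.standard_normal((4, 8))   # device-side layer (hypothetical weights)
W2 = rng.standard_normal((8, 3))   # server-side layer (hypothetical weights)

def device_forward(x):
    # Runs on the resource-constrained device: first layer + ReLU.
    return np.maximum(x @ W1, 0.0)

def server_forward(h):
    # Runs on the edge server: remaining layer(s).
    return h @ W2

x = rng.standard_normal(4)
h = device_forward(x)   # intermediate feature, sent device -> server
y = server_forward(h)
print(y.shape)          # (3,)
```

Choosing the split point trades device compute against the size of the transmitted feature, which is why feature compression (see the DNN-decoupling paper below in this list) matters in practice.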
Most implemented papers
MLink: Linking Black-Box Models from Multiple Domains for Collaborative Inference
The cost efficiency of model inference is critical to real-world machine learning (ML) applications, especially for delay-sensitive tasks and resource-limited devices.
Dual Attention Networks for Multimodal Reasoning and Matching
We propose Dual Attention Networks (DANs) which jointly leverage visual and textual attention mechanisms to capture fine-grained interplay between vision and language.
PRICURE: Privacy-Preserving Collaborative Inference in a Multi-Party Setting
This paper presents PRICURE, a system that combines complementary strengths of secure multi-party computation (SMPC) and differential privacy (DP) to enable privacy-preserving collaborative prediction among multiple model owners.
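One SMPC building block that systems of this kind rely on is additive secret sharing: a party's prediction score is split into random shares that individually reveal nothing but sum back to the value, and shares can be added across parties without decryption. The field size, party count, and scores below are arbitrary assumptions for illustration, not PRICURE's actual protocol.

```python
import secrets

P = 2**61 - 1  # prime modulus for the sharing field (illustrative choice)

def share(value, n_parties=3):
    # Split `value` into n additive shares modulo P; any subset of fewer
    # than n shares is uniformly random and leaks nothing about `value`.
    shares = [secrets.randbelow(P) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % P)
    return shares

def reconstruct(shares):
    return sum(shares) % P

score = 123456
assert reconstruct(share(score)) == score

# Additive homomorphism: parties can sum two secret predictions
# share-by-share, revealing only the aggregate on reconstruction.
a_shares = share(100)
b_shares = share(250)
sum_shares = [(a + b) % P for a, b in zip(a_shares, b_shares)]
print(reconstruct(sum_shares))  # 350
```

Differential privacy would then be layered on top, e.g. by adding calibrated noise to the aggregate before release.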
SubTab: Subsetting Features of Tabular Data for Self-Supervised Representation Learning
Self-supervised learning has been shown to be very effective in learning useful representations, and yet much of the success is achieved in data types such as images, audio, and text.
Multi-Agent Collaborative Inference via DNN Decoupling: Intermediate Feature Compression and Edge Learning
In this paper, we study the multi-agent collaborative inference scenario, where a single edge server coordinates the inference of multiple UEs (user equipment devices).
Decentralized Low-Latency Collaborative Inference via Ensembles on the Edge
The success of deep neural networks (DNNs) is heavily dependent on computational resources.
DuetFace: Collaborative Privacy-Preserving Face Recognition via Channel Splitting in the Frequency Domain
To compensate, the method introduces a plug-in interactive block to allow attention transfer from the client-side by producing a feature mask.
Petals: Collaborative Inference and Fine-tuning of Large Models
However, these techniques have innate limitations: offloading is too slow for interactive inference, while APIs are not flexible enough for research that requires access to weights, attention or logits.
Architectural Vision for Quantum Computing in the Edge-Cloud Continuum
We discuss the necessity, challenges, and solution approaches for extending existing work on classical edge computing to integrate QPUs (quantum processing units).