Collaborative Inference

9 papers with code • 0 benchmarks • 0 datasets

In collaborative inference, a single inference task is performed by multiple models distributed across two or more devices, typically resource-constrained IoT devices.
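As a rough illustration of the definition above, the following minimal sketch splits one tiny model into two stages, each held by a different simulated device. Everything here is an assumption made for illustration: the `Device` class, the `collaborative_inference` helper, the device names, and the weights `W1` and `W2` are hypothetical and do not come from any of the papers listed on this page. In a real deployment the hand-off between devices would cross a network link rather than a function call.

```python
from typing import Callable, List

Vector = List[float]


def matvec(weights: List[Vector], x: Vector) -> Vector:
    """Multiply a row-major weight matrix by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def relu(x: Vector) -> Vector:
    """Element-wise ReLU activation."""
    return [max(0.0, v) for v in x]


class Device:
    """Hypothetical device that holds one partial model (one stage)."""

    def __init__(self, name: str, stage: Callable[[Vector], Vector]) -> None:
        self.name = name
        self.stage = stage

    def run(self, x: Vector) -> Vector:
        return self.stage(x)


def collaborative_inference(devices: List[Device], x: Vector) -> Vector:
    """Pass the intermediate result from each device to the next one."""
    for device in devices:
        x = device.run(x)  # stands in for a network transfer between devices
    return x


# Illustrative weights only: a 2x2 hidden layer and a 1x2 output layer.
W1 = [[1.0, -1.0], [0.5, 0.5]]
W2 = [[1.0, 2.0]]

sensor = Device("sensor", lambda x: relu(matvec(W1, x)))  # first half of the model
gateway = Device("gateway", lambda x: matvec(W2, x))      # second half of the model

result = collaborative_inference([sensor, gateway], [2.0, 1.0])
print(result)  # [4.0]
```

The split result matches what the unsplit model would produce on a single machine, which is the property a partitioning scheme of this kind is meant to preserve. Ensemble-style or privacy-preserving variants of collaborative inference combine device outputs differently and are not covered by this sketch.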

Most implemented papers

MLink: Linking Black-Box Models from Multiple Domains for Collaborative Inference

yuanmu97/mlink 28 Sep 2022

The cost efficiency of model inference is critical to real-world machine learning (ML) applications, especially for delay-sensitive tasks and resource-limited devices.

Dual Attention Networks for Multimodal Reasoning and Matching

iammrhelo/pytorch-vqa-dan CVPR 2017

We propose Dual Attention Networks (DANs) which jointly leverage visual and textual attention mechanisms to capture fine-grained interplay between vision and language.

PRICURE: Privacy-Preserving Collaborative Inference in a Multi-Party Setting

um-dsp/PRICURE 19 Feb 2021

This paper presents PRICURE, a system that combines complementary strengths of secure multi-party computation (SMPC) and differential privacy (DP) to enable privacy-preserving collaborative prediction among multiple model owners.

SubTab: Subsetting Features of Tabular Data for Self-Supervised Representation Learning

astrazeneca/subtab NeurIPS 2021

Self-supervised learning has been shown to be very effective in learning useful representations, and yet much of the success is achieved in data types such as images, audio, and text.

Multi-Agent Collaborative Inference via DNN Decoupling: Intermediate Feature Compression and Edge Learning

Hao840/MAHPPO 24 May 2022

In this paper, we study the multi-agent collaborative inference scenario, where a single edge server coordinates the inference of multiple user equipment devices (UEs).

Decentralized Low-Latency Collaborative Inference via Ensembles on the Edge

maymalka10/ensembles-on-the-edge 7 Jun 2022

The success of deep neural networks (DNNs) is heavily dependent on computational resources.

DuetFace: Collaborative Privacy-Preserving Face Recognition via Channel Splitting in the Frequency Domain

Tencent/TFace 15 Jul 2022

To compensate for the server-side model's reduced attention to facial features, the method introduces a plug-in interactive block that transfers attention from the client side by producing a feature mask.

Petals: Collaborative Inference and Fine-tuning of Large Models

bigscience-workshop/petals 2 Sep 2022

However, the cheaper ways of running large models (RAM offloading and hosted APIs) have innate limitations: offloading is too slow for interactive inference, while APIs are not flexible enough for research that requires access to weights, attention, or logits.

Architectural Vision for Quantum Computing in the Edge-Cloud Continuum

rezafuru/quantensplit 9 May 2023

We discuss the necessity, challenges, and solution approaches for extending existing work on classical edge computing to integrate quantum processing units (QPUs).