Privacy-preserving Task-Agnostic Vision Transformer for Image Processing

29 Sep 2021 · Boah Kim, Jeongsol Kim, Jong Chul Ye

Distributed collaborative learning approaches such as federated and split learning have attracted significant attention lately due to their ability to train neural networks using data from multiple sources without sharing data. However, they are not usually suitable in applications where each client carries out a different task with its own data. Inspired by the recent success of the Vision Transformer (ViT), here we present a new distributed learning framework for image processing applications, allowing clients to learn multiple tasks with their private data. The key idea arises from a novel task-agnostic Vision Transformer that is introduced to learn global attention independent of specific tasks. Specifically, by connecting task-specific heads and tails on the client side to a task-agnostic Transformer body on the server side, each client learns a translation from its own task to a common representation, while the Transformer body learns global attention between the features embedded in that common representation. To enable decomposition between the task-specific and common representations, we propose an alternating training strategy: task-specific learning for the heads and tails runs on the clients with the Transformer frozen, alternating with task-agnostic learning for the Transformer on the server with the heads and tails frozen. Experimental results on multi-task learning for various image processing tasks show that our method synergistically improves the performance of each client's task-specific network while maintaining privacy.
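To make the head-body-tail split and the alternating training schedule concrete, below is a minimal PyTorch sketch. The module names, token dimensions, patch size, and the `alternating_step` helper are illustrative assumptions for this summary, not the authors' released implementation.

```python
# Hypothetical sketch of the client heads/tails and the shared server-side
# Transformer body, plus one alternating update. Shapes and names are assumed.
import torch
import torch.nn as nn


class TaskHead(nn.Module):
    """Client-side head: embeds a task-specific image into shared tokens."""
    def __init__(self, in_ch=3, dim=256, patch=16):
        super().__init__()
        self.embed = nn.Conv2d(in_ch, dim, kernel_size=patch, stride=patch)

    def forward(self, x):
        return self.embed(x).flatten(2).transpose(1, 2)  # (B, N, dim)


class TaskTail(nn.Module):
    """Client-side tail: maps shared tokens back to a task-specific image."""
    def __init__(self, dim=256, out_ch=3, patch=16, img_size=256):
        super().__init__()
        self.patch, self.img_size = patch, img_size
        self.proj = nn.Linear(dim, out_ch * patch * patch)

    def forward(self, tokens):
        x = self.proj(tokens).transpose(1, 2)            # (B, C*p*p, N)
        return nn.functional.fold(
            x, (self.img_size, self.img_size),
            kernel_size=self.patch, stride=self.patch)


class SharedBody(nn.Module):
    """Server-side task-agnostic Transformer body shared by all clients."""
    def __init__(self, dim=256, depth=6, heads=8):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)

    def forward(self, tokens):
        return self.encoder(tokens)


def set_requires_grad(module, flag):
    for p in module.parameters():
        p.requires_grad = flag


def alternating_step(head, body, tail, x, target, loss_fn,
                     opt_client, opt_server, phase):
    """One alternating update: 'client' trains head/tail with the body frozen;
    'server' trains the body with the head/tail frozen."""
    set_requires_grad(body, phase == "server")
    set_requires_grad(head, phase == "client")
    set_requires_grad(tail, phase == "client")
    out = tail(body(head(x)))
    loss = loss_fn(out, target)
    opt = opt_client if phase == "client" else opt_server
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```

In this reading, only intermediate token features cross the client-server boundary, so raw images stay on the clients while every task contributes gradients to the single shared Transformer body.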
