Inspired by Bayesian hierarchical models, we develop ActPerFL, a self-aware personalized FL method where each client can automatically balance the training of its local personal model and the global model that implicitly contributes to other clients’ training.
On the other hand, due to the increasing demands for the protection of clients' data privacy, Federated Learning (FL) has been widely adopted: FL requires models to be trained in a multi-client system and restricts sharing of raw data among clients.
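As a rough illustration of how a multi-client system can train a shared model without exchanging raw data, the sketch below shows a FedAvg-style aggregation step: each client sends only its parameter vector and its local sample count, and the server averages the parameters weighted by sample count. This is a generic textbook scheme chosen for illustration, not the method of the work excerpted above; the function name `federated_average` and the flat-list parameter layout are assumptions.

```python
from typing import List, Sequence


def federated_average(client_weights: Sequence[Sequence[float]],
                      client_sizes: Sequence[int]) -> List[float]:
    """Sample-count-weighted average of client parameter vectors.

    Hypothetical helper for illustration: only parameters and sample
    counts reach the server; raw client data never leaves the client.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need at least one client and one size per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(client_weights[0])
    if any(len(w) != dim for w in client_weights):
        raise ValueError("all clients must share the same parameter shape")

    averaged = [0.0] * dim
    for weights, size in zip(client_weights, client_sizes):
        for i, value in enumerate(weights):
            # Clients with more local samples contribute proportionally more.
            averaged[i] += value * size / total
    return averaged


# Two clients: the second holds three times as many samples as the first.
global_params = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

In this toy call the aggregated parameters sit closer to the second client's vector, since it holds most of the data.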
To address this challenge, we reformulate the problem as a variant of the restless multi-armed bandit (RMAB) problem and leverage Whittle's index theory to design an index-based scheduling policy.
Meanwhile, decentralized applications have also attracted intense attention from the online gambling community, with more and more decentralized gambling platforms created with the help of smart contracts.
Extensive experiments on federated datasets and real-world datasets demonstrate that FedLC leads to a more accurate global model and much improved performance.
In this work we design a novel knowledge distillation framework to guide the learning of the object detector and thereby restrain the overfitting in both the pre-training stage on base classes and fine-tuning stage on novel classes.
Most existing methods for few-shot object detection follow the fine-tuning paradigm, which potentially assumes that class-agnostic, generalizable knowledge can be learned and transferred implicitly from base classes with abundant samples to novel classes with limited samples via such a two-stage training strategy.
To deal with this issue, we advocate a novel lexical enhancement method, InterFormer, that effectively reduces computational and memory costs by constructing non-flat lattices.
Ranked #9 on Chinese Named Entity Recognition on Resume NER
3) To enhance texture details, we encode facial features with geometric guidance and employ local GANs to refine the face, feet, and hands.
In the context of personalized federated learning (FL), the critical challenge is to balance local model improvement and global model tuning when the personal and global objectives may not be exactly aligned.
State-of-the-art methods strive to incorporate additional visual evidence from neighboring frames (supporting frames) to facilitate the pose estimation of the current frame (key frame).
To move towards a practical certifiable patch defense, we introduce Vision Transformer (ViT) into the framework of Derandomized Smoothing (DS).
Federated learning (FL) has been developed as a promising framework to leverage the resources of edge devices, enhance customers' privacy, comply with regulations, and reduce development costs.
Music and dance have always co-existed as pillars of human activities, contributing immensely to the cultural, social, and entertainment functions in virtually all societies.
In this paper, we introduce a novel convolutional neural model to effectively leverage explicit prior knowledge of motion anatomy, and simultaneously capture both spatial and temporal information of joint trajectory dynamics.
We first design a novel frequency enhancement module (FEM) to dig clues of camouflaged objects in the frequency domain.
The proposed method can efficiently imitate the target model through a small number of queries and achieve a high attack success rate.
One aspect that has been overlooked so far is that how we represent the skeletal pose has a critical impact on the prediction results.
One-shot Federated Learning (FL) has recently emerged as a promising approach, which allows the central server to learn a model in a single communication round.
We introduce an optimal transport distance for evaluating the authenticity of the generated dance distribution and a Gromov-Wasserstein distance to measure the correspondence between the dance distribution and the input music.
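To give a feel for what an optimal transport distance measures, the sketch below computes the 1-Wasserstein distance between two one-dimensional empirical samples of equal size, where the optimal coupling simply matches sorted values. This is a deliberately simplified special case for illustration only: the distances used in the work excerpted above operate on dance and music representations and are not reproduced here, and the function name `wasserstein_1d` is an assumption.

```python
from typing import Sequence


def wasserstein_1d(xs: Sequence[float], ys: Sequence[float]) -> float:
    """1-Wasserstein distance between two equal-size 1-D empirical samples.

    In one dimension with uniform weights, the optimal transport plan
    pairs the i-th smallest value of xs with the i-th smallest of ys.
    """
    if not xs or len(xs) != len(ys):
        raise ValueError("samples must be non-empty and of equal size")
    paired = zip(sorted(xs), sorted(ys))
    return sum(abs(a - b) for a, b in paired) / len(xs)


# Shifting every point by 1 moves each unit of mass a distance of 1.
shifted = wasserstein_1d([0.0, 1.0, 2.0], [1.0, 2.0, 3.0])
```

A distance of zero means the two samples are identical as distributions, regardless of the order in which the points are listed.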
The deep policy gradient method has demonstrated promising results in many large-scale games, where the agent learns purely from its own experience.
This paper presents a novel Multi-metadata Embedding based Cross-Transformer (MECT) to improve the performance of Chinese NER by fusing the structural information of Chinese characters.
Our method is compelling in that it enables manipulable motion prediction across activity types and allows customization of the human movement in a variety of fine-grained ways.
Multi-frame human pose estimation in complicated situations is challenging.
Ranked #1 on Multi-Person Pose Estimation on PoseTrack2017 (using extra training data)
In this way, all operations in training and inference can be bit-wise operations, pushing towards faster processing speed, decreased memory cost, and higher energy efficiency.
Compact convolutional neural networks gain efficiency mainly through depthwise convolutions, expanded channels and complex topologies, which in turn make training more difficult.
To address this problem, in this paper, we present a robust and efficient graph correspondence transfer (REGCT) approach for explicit spatial alignment in Re-ID.
In this paper, we propose a graph correspondence transfer (GCT) approach for person re-identification.
Batch Normalization (BN) has been proven to be quite effective at accelerating and improving the training of deep neural networks (DNNs).
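For readers unfamiliar with the operation, the sketch below shows the standard training-time BN forward pass for a single feature: subtract the batch mean, divide by the batch standard deviation, then apply a learnable scale and shift. It is a minimal pure-Python illustration of the textbook formula; the function name `batch_norm` and the default values for `gamma`, `beta`, and `eps` are assumptions, and running statistics for inference are omitted.

```python
import math
from typing import List, Sequence


def batch_norm(batch: Sequence[float], gamma: float = 1.0,
               beta: float = 0.0, eps: float = 1e-5) -> List[float]:
    """Normalize one feature across a mini-batch, then scale and shift."""
    if not batch:
        raise ValueError("batch must be non-empty")
    n = len(batch)
    mean = sum(batch) / n
    # Biased (population) variance, as used in the BN forward pass.
    var = sum((x - mean) ** 2 for x in batch) / n
    # eps keeps the division stable when the batch variance is near zero.
    inv_std = 1.0 / math.sqrt(var + eps)
    return [gamma * (x - mean) * inv_std + beta for x in batch]


normalized = batch_norm([1.0, 2.0, 3.0])
```

With the default `gamma` and `beta`, the output has zero mean and approximately unit variance across the batch.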
Research on deep neural networks with discrete parameters and their deployment in embedded systems has been an active and promising topic.
When the vocabulary size is large, the space taken to store the model parameters becomes the bottleneck for the use of recurrent neural language models.
Current state-of-the-art systems for visual content analysis require large training sets for each class of interest, and performance degrades rapidly with fewer examples.