In this paper, we propose a novel gender bias detection method that uses attention maps for transformer-based models.
Conventional wisdom in pruning Transformer-based language models is that pruning reduces model expressiveness and thus makes the model more likely to underfit than to overfit.
Recent top-$k$ computation efforts explore the possibility of revising various sorting algorithms to answer top-$k$ queries on GPUs.
Federated edge learning (FEEL) has emerged as a revolutionary paradigm to develop AI services at the edge of 6G wireless networks as it supports collaborative model training at a massive number of mobile devices.
Deep complex networks (DCN), in contrast, can learn from complex data, but have high computational costs; therefore, they cannot satisfy the instant decision-making requirements of many deployable systems dealing with short observations or short signal bursts.
This work demonstrates that it is practicable for blind people to feel the world through the brush in their hands.
Then, we analyze the model aggregation error in a single-relay case and show that our relay-assisted scheme achieves a smaller error than the one without relays provided that the relay transmit power and the relay channel gains are sufficiently large.
With weights stored in the ReRAM crossbar cells as conductance, when the input vector is applied to word lines, the matrix-vector multiplication results can be generated as the current in bit lines.
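The crossbar computation described above can be illustrated with a minimal sketch. It assumes an ideal crossbar (no wire resistance, device noise, or nonlinearity), and the function name `crossbar_mvm` and its argument layout are hypothetical choices for illustration, not taken from the source. Each bit-line current is the sum of conductance times word-line voltage over the cells attached to that bit line, which is exactly one entry of a matrix-vector product.

```python
def crossbar_mvm(conductance, voltages):
    """Bit-line currents of an idealized ReRAM crossbar (illustrative sketch).

    conductance[i][j]: conductance of the cell joining word line i and
                       bit line j (this is where the weight is stored).
    voltages[i]:       input voltage applied to word line i.
    Returns a list whose j-th entry is the current on bit line j.
    """
    n_rows = len(conductance)
    n_cols = len(conductance[0])
    if len(voltages) != n_rows:
        raise ValueError("one input voltage is needed per word line")

    currents = [0.0] * n_cols
    for i in range(n_rows):
        for j in range(n_cols):
            # Ohm's law per cell; Kirchhoff's current law sums along the bit line.
            currents[j] += conductance[i][j] * voltages[i]
    return currents


if __name__ == "__main__":
    weights = [[1.0, 2.0],
               [3.0, 4.0]]
    inputs = [1.0, 0.5]
    print(crossbar_mvm(weights, inputs))  # [2.5, 4.0]
```

In this idealized model all bit-line currents are produced in a single analog step, whereas the nested loop here only emulates that summation in software.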
While discrete-event simulators are essential tools for architecture research, design, and development, their practicality is limited by an extremely long time-to-solution for realistic applications under investigation.
In this paper, as a first attempt, we formulate the gradient attack problem on Transformer-based language models and propose a gradient attack algorithm, TAG, to reconstruct the local training data.
Federated Learning Cryptography and Security
We study over-the-air model aggregation in federated edge learning (FEEL) systems, where channel state information at the transmitters (CSIT) is assumed to be unavailable.
Conversational Query Rewriting (CQR) aims to simplify the multi-turn dialogue modeling into a single-turn problem by explicitly rewriting the conversational query into a self-contained utterance.
In a RIS-aided MIMO system, the acquisition of channel state information (CSI) is important for achieving passive beamforming gains of the RIS, but is also challenging due to the cascaded property of the transmitter-RIS-receiver channel and the lack of signal processing capability of the passive RIS elements.
Bayesian Inference Information Theory
However, due to the heterogeneity of communication capacities among edge devices, over-the-air FL suffers from the straggler issue in which the device with the weakest channel acts as a bottleneck of the model aggregation performance.
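One way to see the straggler effect is a small sketch under a simplifying assumption that is not stated in the source: every device has the same peak transmit power and must scale its signal so that all signals arrive at the receiver with a common amplitude. The function name `aligned_amplitude` and this channel-alignment model are hypothetical, chosen only to illustrate why the weakest channel limits the aggregate.

```python
import math


def aligned_amplitude(channel_gains, max_power):
    """Largest common receive amplitude under amplitude alignment (sketch).

    Assumption (not from the source): device k can deliver at most
    sqrt(max_power) * |h_k| at the receiver, and all devices must arrive
    with the same amplitude, so the smallest such value is the limit.
    """
    if not channel_gains:
        raise ValueError("at least one device is required")
    return min(math.sqrt(max_power) * abs(h) for h in channel_gains)


if __name__ == "__main__":
    gains = [1.0, 0.5, 0.1]  # illustrative channel gains, one per device
    print(aligned_amplitude(gains, max_power=4.0))
    # Dropping the weakest device raises the achievable amplitude.
    print(aligned_amplitude(gains[:2], max_power=4.0))
```

Under this assumed model, removing or assisting the weakest device increases the common amplitude, which is the sense in which that device acts as a bottleneck.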
In this paper, we propose, to the best of our knowledge, the first GPU-based framework for graph sampling/random walk.
Graph Sampling Distributed, Parallel, and Cluster Computing
Pre-trained large-scale language models have increasingly demonstrated high accuracy on many natural language processing (NLP) tasks.
Distributed learning, such as federated learning or collaborative learning, enables model training on decentralized user data and collects only local gradients, so that data is processed close to its source to preserve privacy.
Large model size, high computational cost, and vulnerability to membership inference attacks (MIA) have impeded the adoption of deep neural networks (DNNs), especially on mobile devices.
In natural language processing (NLP), the "Transformer" architecture was proposed as the first transduction model relying entirely on self-attention mechanisms without using sequence-aligned recurrent neural networks (RNNs) or convolution, and it achieved significant improvements on sequence-to-sequence tasks.
Using the proposed deep RL scheme, each MU in the system is able to make decisions without a priori statistical knowledge of dynamics.
Reconfigurable intelligent surfaces (RISs) are regarded as a promising emerging hardware technology to improve the spectrum and energy efficiency of wireless networks by artificially reconfiguring the propagation environment of electromagnetic waves.
Information Theory Signal Processing
A graph-based segmentation algorithm is used to segment the depth map from the depth sensor, and the segmented regions are used to guide a focus algorithm to locate in-focus image blocks from among multi-focus source images to construct the reference all-in-focus image.