We first characterize the proportion of data to sample from each region of a program's input space (corresponding to different execution paths of the program) based on the complexity of learning a surrogate of the corresponding execution path.
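As a rough illustration of this allocation idea, the sketch below splits a fixed sampling budget across execution paths in proportion to a per-path complexity score. Proportional allocation, the function name `allocate_samples`, and the example scores are assumptions made for illustration; the excerpt does not say how complexity is measured or how it maps to sampling proportions.

```python
def allocate_samples(complexity, budget):
    """Split `budget` samples across paths in proportion to complexity scores.

    `complexity` maps a path name to a non-negative score (hypothetical values).
    Largest-remainder rounding makes the counts sum exactly to `budget`.
    """
    total = sum(complexity.values())
    if total <= 0:
        raise ValueError("complexity scores must sum to a positive value")
    # Ideal (fractional) share of the budget for each path.
    raw = {path: budget * score / total for path, score in complexity.items()}
    counts = {path: int(share) for path, share in raw.items()}
    leftover = budget - sum(counts.values())
    # Give the remaining samples to the paths with the largest fractional parts.
    by_fraction = sorted(raw, key=lambda p: raw[p] - counts[p], reverse=True)
    for path in by_fraction[:leftover]:
        counts[path] += 1
    return counts


# Hypothetical scores: path_b is assumed three times harder to learn than path_a.
example = allocate_samples({"path_a": 1.0, "path_b": 3.0}, budget=100)
```

Under these assumed scores, the harder path receives three quarters of the budget.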
A natural method is to learn the temporal dynamic patterns.
Though numerous research groups and institutes collect a multitude of EEG datasets for the same BCI task, sharing EEG data from multiple sites is still challenging due to the heterogeneity of devices.
We apply our sparse accelerator to widely used Transformer-based language models, including BERT-Mini, DistilBERT, BERT-Base, and BERT-Large.
We found that the PRS313 achieved overlapping areas under the ROC curve (AUCs) in females of Latinx (AUC, 0.68; 95% CI, 0.65-0.71) and European ancestry (AUC, 0.70; 95% CI, 0.69-0.71) but lower AUCs for the AFR and EAA populations (AFR: AUC, 0.61; 95% CI, 0.56-0.65; EAA: AUC, 0.64; 95% CI, 0.60-0.680).
While it is tempting to use prior machine learning techniques for predicting job duration, we find that the structure of the maintenance job scheduling problem creates a unique challenge.
In this work, we propose a new pipeline for creating and running Fast Transformer models on CPUs, utilizing hardware-aware pruning, knowledge distillation, quantization, and our own Transformer inference runtime engine with optimized kernels for sparse and quantized operators.
Such solutions monitor past system executions to learn the system's behavior under different hardware resource allocations before dynamically tuning resources to optimize the application execution.
Our evaluation shows that, compared to the state-of-the-art SEML approach in computer systems optimization, Cello improves latency by 1.19X for minimizing latency under a power constraint, and improves energy by 1.18X for minimizing energy under a latency constraint.
The visual encoding from the visual block is concatenated with the attention feature to emphasize the visual information.
To predict stragglers accurately and early without labeled positive examples or assumptions on latency distributions, this paper presents NURD, a novel Negative-Unlabeled learning approach with Reweighting and Distribution-compensation that only trains on negative and unlabeled streaming data.
We show that our proposed Discretization and Regression with generalized fOlded concaVe penalty on Effect discontinuity (DROVE) approach enjoys desirable theoretical properties and allows for statistical inference of the optimal value associated with optimal decision-making.
We propose an audio-visual spatial-temporal deep neural network with: (1) a visual block containing a pretrained 2D-CNN followed by a temporal convolutional network (TCN); (2) an aural block containing several parallel TCNs; and (3) a leader-follower attentive fusion block combining the audio-visual information.
It captures temporal dynamics of EEG which then serves as input to the proposed local and global graph-filtering layers.
TSception consists of dynamic temporal, asymmetric spatial, and high-level fusion layers, which learn discriminative representations in the time and channel dimensions simultaneously.
In this paper, we evaluate different augmentation strategies for algorithms tackling the "learning with noisy labels" problem.
Ranked #8 on Image Classification on Clothing1M (using extra training data)
In this paper, a network called Brachial Plexus Multi-instance Segmentation Network (BPMSegNet) is proposed to identify different tissues (nerves, arteries, veins, muscles) in ultrasound images.
In this paper, a novel deep learning-based key generation network (DeepKeyGen) is proposed as a stream cipher generator to generate the private key, which can then be used for encrypting and decrypting medical images.
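To make the role of a stream-cipher key concrete, here is a minimal sketch in which the same keystream both encrypts and decrypts an image. The XOR combination is the conventional stream-cipher construction and is an assumption here, since the excerpt does not state how the key is applied. The helper `xor_stream`, the 16-byte stand-in image, and the random keystream from `secrets.token_bytes` are likewise hypothetical placeholders, not the network's output.

```python
import secrets


def xor_stream(data: bytes, keystream: bytes) -> bytes:
    """XOR each data byte with the matching keystream byte."""
    if len(keystream) < len(data):
        raise ValueError("keystream must be at least as long as the data")
    return bytes(d ^ k for d, k in zip(data, keystream))


# Stand-in for the flattened pixels of a 4x4 8-bit image (hypothetical data).
pixels = bytes(range(16))
# Stand-in for a generated private key; a real system would take it from the generator.
keystream = secrets.token_bytes(len(pixels))

ciphertext = xor_stream(pixels, keystream)     # encrypt
recovered = xor_stream(ciphertext, keystream)  # decrypt with the same key
```

Because XOR is its own inverse, applying `xor_stream` twice with the same keystream returns the original pixels.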
Moreover, the multi-view fusion loss, which consists of the segmentation loss, the transition loss and the decision loss, is proposed to facilitate the training process of multi-view learning networks so as to keep the consistency of appearance and space, not only in the process of fusing segmentation results, but also in the process of training the learning network.
Guided by the scale values generated by SCA for measuring channel importance, we further propose a new channel pruning approach called Channel Pruning guided by Spatial and Channel Attention (CPSCA).
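The selection step implied here can be sketched as ranking channels by their scale values and keeping the top fraction. This is only an illustrative assumption: the function `select_channels`, the `keep_ratio` parameter, and the example scale values are hypothetical, and the excerpt does not describe the actual pruning criterion or schedule used by CPSCA.

```python
def select_channels(scales, keep_ratio):
    """Return the sorted indices of the channels with the largest scale values."""
    if not 0 < keep_ratio <= 1:
        raise ValueError("keep_ratio must be in (0, 1]")
    # Keep at least one channel so the layer never becomes empty.
    n_keep = max(1, round(len(scales) * keep_ratio))
    ranked = sorted(range(len(scales)), key=lambda i: scales[i], reverse=True)
    return sorted(ranked[:n_keep])


# Hypothetical per-channel importance scores for a 4-channel layer.
scales = [0.9, 0.1, 0.5, 0.05]
kept = select_channels(scales, keep_ratio=0.5)
```

With these assumed scores, channels 0 and 2 are kept and the two lowest-scoring channels are pruned.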
Planimation is a modular and extensible open source framework to visualise sequential solutions of planning problems specified in PDDL.
Specifically, in DeepEDN, the Cycle-Generative Adversarial Network (Cycle-GAN) is employed as the main learning network to transfer the medical image from its original domain into the target domain.
TSception consists of temporal and spatial convolutional layers, which learn discriminative representations in the time and channel domains simultaneously.
On a large retrospective cohort, this mixture-based approach outperforms physician, kernel-only, and DRL-only experts.
In this setting, we propose to screen out control units that have a weak dynamical relationship to the single treated unit before the model is fit.
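A screening step of this kind might look like the sketch below, which drops control units whose pre-treatment series are only weakly related to the treated unit. Using absolute Pearson correlation as the measure of relationship is an assumption made for illustration, as are the names `pearson` and `screen_controls`, the threshold, and the toy series; the excerpt does not define the screening statistic.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def screen_controls(treated, controls, threshold):
    """Keep control units whose absolute correlation with the treated unit meets the threshold."""
    return [
        name
        for name, series in controls.items()
        if abs(pearson(treated, series)) >= threshold
    ]


# Hypothetical pre-treatment outcomes for one treated unit and two candidate controls.
treated = [1.0, 2.0, 3.0, 4.0]
controls = {
    "unit_a": [2.0, 4.0, 6.0, 8.0],    # moves with the treated unit
    "unit_b": [1.0, -1.0, 1.0, -1.0],  # oscillates independently
}
kept = screen_controls(treated, controls, threshold=0.5)
```

Only the retained units would then be passed to the downstream model fit.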
no code implementations • 31 May 2018 • Omer Gottesman, Fredrik Johansson, Joshua Meier, Jack Dent, Dong-hun Lee, Srivatsan Srinivasan, Linying Zhang, Yi Ding, David Wihl, Xuefeng Peng, Jiayu Yao, Isaac Lage, Christopher Mosch, Li-wei H. Lehman, Matthieu Komorowski, Aldo Faisal, Leo Anthony Celi, David Sontag, Finale Doshi-Velez
Much attention has been devoted recently to the development of machine learning algorithms with the goal of improving treatment policies in healthcare.
The early detection and early diagnosis of lung cancer are crucial for improving the survival rate of lung cancer patients.
Gaussian process regression generally does not scale beyond a few thousand data points without applying some sort of kernel approximation method.
Despite the encouraging results reported, existing online AUC maximization algorithms often adopt simple online gradient descent approaches that fail to exploit the geometrical knowledge of the data observed during the online learning process, and thus may suffer from relatively larger regret.
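For context, the kind of simple first-order baseline referred to here can be sketched as online gradient descent on a pairwise hinge loss, a common surrogate for AUC. This is a generic illustration, not any specific published algorithm: the function `ogd_auc_step`, the unbounded buffer of past examples, the linear scorer, and the learning rate are all assumptions.

```python
def ogd_auc_step(w, x, y, buffer, lr):
    """One online gradient descent step on the pairwise hinge loss.

    `w` is a linear scorer, `(x, y)` the new example with y in {0, 1}, and
    `buffer` a list of past (x, y) pairs that is updated in place.
    """
    opposite = [xb for xb, yb in buffer if yb != y]
    if opposite:
        grad = [0.0] * len(w)
        for xb in opposite:
            # Orient each pair as (positive - negative) so a good scorer has margin >= 1.
            if y == 1:
                diff = [a - b for a, b in zip(x, xb)]
            else:
                diff = [b - a for a, b in zip(x, xb)]
            margin = sum(wi * di for wi, di in zip(w, diff))
            if margin < 1:
                # Subgradient of max(0, 1 - w.diff) with respect to w is -diff.
                for i, di in enumerate(diff):
                    grad[i] -= di
        # Plain gradient step: no second-order or geometric information is used.
        w = [wi - lr * gi / len(opposite) for wi, gi in zip(w, grad)]
    buffer.append((x, y))
    return w


# Tiny hypothetical stream: one positive at x=1, then one negative at x=0.
buffer = []
w = [0.0]
w = ogd_auc_step(w, [1.0], 1, buffer, lr=0.1)
w = ogd_auc_step(w, [0.0], 0, buffer, lr=0.1)
```

The update uses only the current subgradient with a fixed step size, which is the limitation the sentence above points to.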