In this work, we show that Relation Graph augmented Learning (RGL) can improve the performance of few-shot natural language understanding tasks.
The first half of this tutorial will make deep nets more accessible to a broader audience, following “Deep Nets for Poets” and “A Gentle Introduction to Fine-Tuning.” We will also introduce GFT (general fine tuning), a little language for fine-tuning deep nets with short (one-line) programs that are as easy to write as a regression in statistics packages such as R using glm (generalized linear models).
In this work, we study a novel self-supervised pre-training pipeline, namely Multi-task Self-supervised Continual Learning (MUSCLE), for multiple medical imaging tasks, such as classification and segmentation, using X-ray images collected from multiple body parts, including heads, lungs, and bones.
Natural language understanding (NLU) is integral to various social media applications.
Word embedding has become ubiquitous and is widely used in various text mining and natural language processing (NLP) tasks, such as information retrieval, semantic analysis, and machine translation, among many others.
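Word embeddings are usually compared with cosine similarity, which is the basis of most of the downstream uses listed above. A minimal, self-contained sketch follows; the three-dimensional vectors are toy illustrative values, not real embeddings.

```python
import math

def cosine_similarity(u, v):
    # Cosine of the angle between two embedding vectors:
    # sim(u, v) = (u . v) / (||u|| * ||v||)
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy 3-dimensional embeddings (illustrative values only).
embeddings = {
    "king":  [0.8, 0.6, 0.1],
    "queen": [0.7, 0.7, 0.2],
    "apple": [0.1, 0.2, 0.9],
}

print(cosine_similarity(embeddings["king"], embeddings["queen"]))  # high
print(cosine_similarity(embeddings["king"], embeddings["apple"]))  # low
```

In practice the vectors come from a trained model (word2vec, GloVe, or a contextual encoder), but the similarity computation is the same.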
In this paper, we extend the pretraining method for cross-lingual multi-speaker speech synthesis tasks, including cross-lingual multi-speaker voice cloning and cross-lingual multi-speaker speech editing.
Interactive image segmentation aims to segment a target region through human-computer interaction.
In recent years, the rapid development of deep learning has brought great advancements to image and video segmentation methods based on neural networks.
Therefore, we designed a U-shaped High-Resolution Network (U-HRNet), which adds more stages after the feature map with the strongest semantic representation and relaxes the HRNet constraint that all resolutions must be computed in parallel for a newly added stage.
Though image classification datasets can provide the backbone networks with rich visual features and discriminative ability, they cannot fully pre-train the target model (i.e., backbone + segmentation modules) in an end-to-end manner.
PaddleSpeech is an open-source all-in-one speech toolkit.
Notably, it brings an average relative improvement of about 10% to triplet-based embedding methods on OGBL-WikiKG2 and needs only 5%-83% of the time to achieve results comparable to the state-of-the-art GC-OTE.
1 code implementation • 20 Apr 2022 • Guowei Chen, Yi Liu, Jian Wang, Juncai Peng, Yuying Hao, Lutao Chu, Shiyu Tang, Zewu Wu, Zeyu Chen, Zhiliang Yu, Yuning Du, Qingqing Dang, Xiaoguang Hu, dianhai yu
Also, we propose a semantic context branch (SCB) that adopts a semantic segmentation subtask.
Ranked #2 on Image Matting on Distinctions-646
3 code implementations • 6 Apr 2022 • Juncai Peng, Yi Liu, Shiyu Tang, Yuying Hao, Lutao Chu, Guowei Chen, Zewu Wu, Zeyu Chen, Zhiliang Yu, Yuning Du, Qingqing Dang, Baohua Lai, Qiwen Liu, Xiaoguang Hu, dianhai yu, Yanjun Ma
Real-world applications have high demands for semantic segmentation methods.
Ranked #4 on Real-Time Semantic Segmentation on Cityscapes val
This work is the first to construct a large-scale video portrait dataset, containing 291 videos from 23 conference scenes with 14K finely labeled frames, along with extensions to multi-camera teleconferencing.
In addition, with the proposed method, we develop an efficient interactive segmentation tool for practical data annotation tasks.
Ranked #2 on Interactive Segmentation on PASCAL VOC (NoC@85 metric)
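The NoC@85 metric cited above is the standard interactive-segmentation measure: the average number of user clicks needed before the predicted mask first reaches 85% IoU. A minimal sketch follows, assuming the common convention of capping the count at 20 clicks (the cap is an assumption here, not stated in this excerpt).

```python
def noc_at_threshold(iou_per_click, threshold=0.85, max_clicks=20):
    # iou_per_click[k] is the IoU achieved after k+1 simulated clicks.
    # NoC is the number of clicks needed to first reach the threshold,
    # counted as max_clicks if the threshold is never reached.
    for k, iou in enumerate(iou_per_click[:max_clicks]):
        if iou >= threshold:
            return k + 1
    return max_clicks

# IoU trajectory of one simulated annotation session (illustrative values).
print(noc_at_threshold([0.55, 0.72, 0.84, 0.88, 0.90]))  # 4
```

Averaging this count over a dataset such as PASCAL VOC yields the reported NoC@85 score; lower is better.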
In addition to the representations, we also use various statistical probabilities among head entities, relations, and tail entities for the final prediction.
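One simple way such statistics can be estimated is by frequency counting over the observed triples, e.g. the conditional probability of a tail entity given a relation. The sketch below is illustrative only; the function name and toy triples are assumptions, not the paper's actual formulation.

```python
from collections import Counter

def tail_given_relation_probs(triples):
    # Estimate P(tail | relation) by frequency counting over observed
    # (head, relation, tail) triples; such simple statistics can be
    # combined with learned representations for the final prediction.
    pair_counts = Counter((r, t) for _, r, t in triples)
    rel_counts = Counter(r for _, r, _ in triples)
    return {(r, t): c / rel_counts[r] for (r, t), c in pair_counts.items()}

triples = [
    ("alice", "lives_in", "paris"),
    ("bob", "lives_in", "paris"),
    ("carol", "lives_in", "rome"),
]
probs = tail_given_relation_probs(triples)
print(probs[("lives_in", "paris")])  # 2/3
```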
Our experimental results on multiple metrics show that our framework captures typical, micro, and dynamic facial features along spatiotemporal dimensions, contributing to mild fatigue detection in the wild.
The toolkit aims to help both developers and researchers through the whole process of designing, training, optimizing (for accuracy and inference speed), and deploying segmentation models.
This paper further presents a real-time feed-forward model that leverages Style Projection for arbitrary image style transfer and includes a regularization term for matching the semantics between input content and stylized outputs.
Our experiments on real-world data show that SecureGBM effectively secures the communication and computation of the LightGBM training and inference procedures for both parties, while losing less than 3% AUC with the same number of gradient-boosting iterations, on a wide range of benchmark datasets.
Instead of constraining the weights of the neural network, DELTA aims to preserve the outer-layer outputs of the target network.
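The idea of regularizing outputs rather than weights can be sketched as a penalty on the distance between the pre-trained (source) network's layer outputs and the fine-tuned (target) network's outputs on the same input. This is only a minimal illustration of the principle: DELTA's actual formulation additionally weights feature channels by attention, which is omitted here, and the function name and weight are assumptions.

```python
def feature_distance_penalty(source_feats, target_feats, weight=0.01):
    # Behaviour-based regularizer in the spirit of DELTA: instead of
    # penalizing deviations of the weights themselves, penalize the
    # squared distance between the source network's layer outputs and
    # the target network's outputs on the same input, so the target
    # preserves transferable representations while fine-tuning.
    sq = sum((s - t) ** 2 for s, t in zip(source_feats, target_feats))
    return weight * sq

# Identical outputs incur no penalty; diverging outputs are penalized.
print(feature_distance_penalty([1.0, 2.0], [1.0, 2.0]))  # 0.0
print(feature_distance_penalty([1.0, 2.0], [1.5, 2.5]))
```

In training, this penalty would be added to the task loss, so gradient descent trades task accuracy against output drift from the pre-trained model.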
Instance segmentation has attracted recent attention in computer vision, and most existing methods in this domain include an object detection stage.