Despite achieving remarkable performance, previous knowledge-enhanced works usually rely on a single-source homogeneous knowledge base with limited coverage.
Thanks to the strong representation learning capability of deep learning, especially pre-training techniques with a language model loss, dependency parsing has achieved a great performance boost in the in-domain scenario, where abundant labeled training data is available for the target domains.
We are interested in exploring its capability in human pose estimation, and thus propose a novel model based on the transformer architecture, enhanced with a feature pyramid fusion structure.
Experimental results show that the proposed method achieves much better performance than the comparison approaches, and that both adopting the proposed hybrid search space and grafting the transformer module improve classification accuracy.
To address these challenges, we propose a novel vertical federated learning framework named Cascade Vertical Federated Learning (CVFL) that fully utilizes all horizontally partitioned labels to train neural networks in a privacy-preserving manner.
no code implementations • 9 Apr 2021 • Prithwish Chakraborty, James Codella, Piyush Madan, Ying Li, Hu Huang, Yoonyoung Park, Chao Yan, Ziqi Zhang, Cheng Gao, Steve Nyemba, Xu Min, Sanjib Basak, Mohamed Ghalwash, Zach Shahn, Parthasararathy Suryanarayanan, Italo Buleje, Shannon Harrer, Sarah Miller, Amol Rajmane, Colin Walsh, Jonathan Wanderer, Gigi Yuen Reed, Kenney Ng, Daby Sow, Bradley A. Malin
Deep learning architectures have an extremely high capacity for modeling complex data in a wide variety of domains.
Discriminative correlation filters (DCF) and siamese networks have achieved promising performance on visual tracking tasks thanks to their superior computational efficiency and reliable similarity metric learning, respectively.
More importantly, the topological surface states in the LSM phase fill the gap between topological matter and silicon, which provides an opportunity to integrate topological quantum devices with silicon chips.
We demonstrate different ways to enhance a diverse range of quantum electrodynamic phenomena based on plasmonic configurations by using the classical dyadic tensor Green function formalism.
This study proposes a novel framework based on Mask R-CNN, named HTMask R-CNN, to extract new and old rural buildings even when labels are scarce.
Correspondingly, different models need to be designed for different datasets, which further increases the workload of designing architectures; 2) the mainstream framework is a patch-to-pixel framework.
For the inner search space, we propose a layer-wise architecture sharing strategy (LWAS), resulting in more flexible architectures and better performance.
In contrast to previous approaches, we do not impose restrictions on the source data sets: they do not have to be collected by the same sensors as the target data sets.
Point density varies significantly across such a long range, and different scanning patterns further diversify object representation in LiDAR.
This paper describes a novel multi-view classification model for knowledge graph completion, where multiple classification views are performed based on both content and context information for candidate triple evaluation.
The major challenge for current parsing research is to improve parsing performance on out-of-domain texts that are very different from the in-domain training data when only a small amount of labeled out-of-domain data is available.
Benefiting from their ability to efficiently learn how an object is changing, correlation filters have recently demonstrated excellent performance for rapidly tracking objects.
Given a query, our approach first retrieves a set of prototype dialogues that are relevant to the query.
There is a fundamental trade-off between the channel representation resolution of codebooks and the overheads of feedback communications in the fifth generation new radio (5G NR) frequency division duplex (FDD) massive multiple-input and multiple-output (MIMO) systems.
Stack interchanges are essential components of transportation systems.
However, the existing CNN-based models operate at the patch level, in which each pixel is classified separately using an image patch around it.
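As a rough illustration of the patch-level scheme described above, the following minimal sketch classifies every pixel from a zero-padded patch centered on it. The function names `extract_patch` and `classify_pixels`, the zero padding at the borders, and the `classify_patch` callback are all assumptions made for this example; they are not taken from any of the models the text refers to.

```python
def extract_patch(image, row, col, size=3):
    """Return a size x size patch centered at (row, col).

    Positions outside the image are filled with 0 (assumed padding choice).
    """
    half = size // 2
    height, width = len(image), len(image[0])
    return [
        [
            image[r][c] if 0 <= r < height and 0 <= c < width else 0
            for c in range(col - half, col + half + 1)
        ]
        for r in range(row - half, row + half + 1)
    ]


def classify_pixels(image, classify_patch, size=3):
    """Label each pixel independently from the patch around it.

    `classify_patch` is a hypothetical stand-in for a trained CNN: it takes
    a patch (list of lists) and returns a class label.
    """
    return [
        [classify_patch(extract_patch(image, r, c, size))
         for c in range(len(image[0]))]
        for r in range(len(image))
    ]


if __name__ == "__main__":
    toy_image = [[1, 2], [3, 4]]
    # Toy "classifier": threshold on the mean of the patch.
    labels = classify_pixels(
        toy_image, lambda p: int(sum(map(sum, p)) / 9 > 1.0)
    )
    print(labels)
```

Because every pixel triggers its own patch extraction and classifier call, neighboring patches overlap heavily and much of the computation is repeated, which is the inefficiency usually attributed to this scheme.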
Ship detection has been an active and vital topic in the field of remote sensing for a decade, but it remains a challenging problem due to large scale variations, high aspect ratios, dense arrangement, and background clutter.
Simulation results show that, compared to the existing random access scheme for crowded asynchronous massive MIMO systems, the proposed scheme can improve the uplink throughput and accurately estimate the effective timing offsets at the same time.
Then, we introduce a general route search algorithm coupled with an efficient station binding method for efficient route candidate generation.
We collect and build a large-scale Chinese dataset aligned with commonsense knowledge for dialogue generation.
Measuring the scholarly impact of a document without citations is an important and challenging problem.
In this paper, we provide a systematic review of existing compelling deep learning architectures applied to LiDAR point clouds, detailing specific tasks in autonomous driving such as segmentation, detection, and classification.
1 code implementation • 15 May 2020 • Francesco Piccoli, Rajarathnam Balakrishnan, Maria Jesus Perez, Moraldeepsingh Sachdeo, Carlos Nunez, Matthew Tang, Kajsa Andreasson, Kalle Bjurek, Ria Dass Raj, Ebba Davidsson, Colin Eriksson, Victor Hagman, Jonas Sjoberg, Ying Li, L. Srikar Muppirisetty, Sohini Roychowdhury
Pedestrian intention recognition is very important for developing robust and safe autonomous driving (AD) and advanced driver assistance systems (ADAS) functionalities for urban driving.
While these strategies have effectively dealt with the critical situations of outbreaks, the combination of the pandemic and mobility controls has slowed China's economic growth, resulting in the first quarterly decline of Gross Domestic Product (GDP) since GDP began to be calculated, in 1992.
Autonomous vehicles have experienced rapid development in the past few years.
Semantic segmentation of large-scale outdoor point clouds is essential for urban scene understanding in various applications, especially autonomous driving and urban high-definition (HD) mapping.
Hyperspectral image (HSI) classification has been improved with convolutional neural networks (CNNs) in recent years.
To the best of our knowledge, this paper is the first attempt to study cross-range LiDAR adaptation for object detection in point clouds.
The temporal consistency loss is combined with the spatial loss to update the model in an end-to-end fashion.
Ranked #4 on Monocular Depth Estimation on Mid-Air Dataset
Specifically, we first propose the Multi-Dilation (MD) module, which can synthesize the crack features of multiple context sizes via dilated convolution with multiple rates.
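The idea of gathering features at several context sizes through dilated convolution can be sketched in one dimension as follows. This is only an illustrative toy under stated assumptions, not the MD module itself: the function names `dilated_conv1d` and `multi_dilation`, the rates `(1, 2, 4)`, the shared kernel, and the zero padding are all choices made for this example.

```python
def dilated_conv1d(signal, kernel, rate):
    """Same-length 1D dilated convolution (cross-correlation form).

    Kernel taps are spaced `rate` samples apart; samples outside the
    signal are treated as 0 (assumed padding choice).
    """
    center = len(kernel) // 2
    output = []
    for i in range(len(signal)):
        acc = 0.0
        for j, weight in enumerate(kernel):
            idx = i + (j - center) * rate
            if 0 <= idx < len(signal):
                acc += weight * signal[idx]
        output.append(acc)
    return output


def multi_dilation(signal, kernel, rates=(1, 2, 4)):
    """Apply the same kernel at several dilation rates.

    Returns one feature sequence per rate; a real module would typically
    use learned kernels per rate and fuse the results.
    """
    return [dilated_conv1d(signal, kernel, rate) for rate in rates]


if __name__ == "__main__":
    impulse = [0, 0, 0, 1, 0, 0, 0]
    for rate, features in zip((1, 2, 4), multi_dilation(impulse, [1, 1, 1])):
        print(rate, features)
```

With a three-tap kernel, a rate of 1 covers 3 neighboring samples while a rate of 4 spans 9 samples with the same number of weights, which is how larger rates widen the context without adding parameters.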
no code implementations • 2 Feb 2019 • Yijiang Lian, Zhijie Chen, Jinlong Hu, Kefeng Zhang, Chunwei Yan, Muchenxuan Tong, Wenying Han, Hanju Guan, Ying Li, Ying Cao, Yang Yu, Zhigang Li, Xiaochun Liu, Yue Wang
In this paper, we present a generative retrieval method for sponsored search engines, which uses neural machine translation (NMT) to generate keywords directly from the query.
In contrast to RGB videos, the depth data in RGB-D videos provide key information complementary to the tristimulus visual data, which could potentially improve the accuracy of action recognition.
Recent years have witnessed a surge of interest in response generation for neural conversation systems.
The proposed method is favorable for healthcare applications because, in addition to improved prediction performance, relationships among the different risks and risk factors are also identified.
Entanglement is important evidence that a quantum device can potentially solve problems intractable for classical computers.
Generally, existing monolingual corpora are not suitable for large vocabulary continuous speech recognition (LVCSR) of code-switching speech.