The spatial attention mechanism captures long-range dependencies by aggregating global contextual information to each query location, which is beneficial for semantic segmentation.
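As a rough illustration of what aggregating global context for each query location can look like, the sketch below uses plain dot-product attention over flattened spatial positions. The function name `spatial_attention`, the list-of-vectors layout, and the choice of softmax-normalised dot products are assumptions made for this example; the source does not specify the exact formulation.

```python
import math


def spatial_attention(queries, keys, values):
    """Hypothetical sketch: dot-product attention over spatial locations.

    Each argument is a list of N feature vectors, one per flattened
    spatial location. Returns one aggregated vector per query location.
    """
    dim = len(values[0])
    outputs = []
    for q in queries:
        # Similarity between this query location and every key location.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        # Numerically stable softmax over all locations.
        peak = max(scores)
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Weighted sum of value vectors: the global context for this query.
        outputs.append(
            [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
        )
    return outputs
```

Because every query attends to all N locations, the cost grows quadratically with the number of positions, which is the usual trade-off of this kind of global aggregation.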
no code implementations • 8 Mar 2021 • Tian Meng, Yang Tao, Ziqi Chen, Jorge R. Salas Avila, Qiaoye Ran, Yuchun Shao, Ruochen Huang, Yuedong Xie, Qian Zhao, Zhijie Zhang, Hujun Yin, Anthony J. Peyton, Wuliang Yin
Eddy current testing (ECT) is an effective technique in the evaluation of the depth of metal surface defects.
This paper presents the development of a control system for vision-guided pick-and-place tasks using a robot arm equipped with a 3D camera.
An efficient training strategy is proposed to allow a robot to learn from both human experience data and self-exploratory data.
Graph theory is combined with the CNN output for detecting semantic changes in road networks with OpenStreetMap data.
Experimental results show that pretraining on ImageNet usually improves the segmentation performance for a number of models.
Driven by the insight that 3D data is inherently sparse, we visualise the features learnt by a voxel-based classification network and show that these features are also sparse and can be pruned relatively easily, leading to more efficient neural networks.
The third method is a scaled version of the PointNet++ method that works directly on outdoor point clouds and achieves an F-score of 82.1% on the ISPRS benchmark dataset, comparable to the state-of-the-art methods but with increased efficiency.
Although current deep learning methods have achieved impressive results for semantic segmentation, they incur high computational costs and have a huge number of parameters.
A semantic feature extraction method for multitemporal high resolution aerial image registration is proposed in this paper.
On the other hand, approaches that try to control a gradient-based update rule typically resort to computing gradients through the learning process to obtain their meta-gradients, leading to methods that cannot scale beyond few-shot task adaptation.
Specifically, a shallow branch is used to preserve low-level spatial information and a deep branch is employed to extract high-level contextual features.
Standard neural network architectures are non-linear only by virtue of a simple element-wise activation function, making them both brittle and excessively large.
In this paper, a new caricature dataset is built with the objective of facilitating research in caricature recognition.
To discover underlying local structures in the gradient domain, we compute image gradients from multiple directions and simplify them into a set of binary strings.
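One way to picture the step described above is to take the intensity difference between a pixel and each of its eight neighbours and keep only the sign of each difference. The sketch below does that. The eight-direction neighbourhood, the `>= 0` thresholding rule, and the names `DIRECTIONS` and `gradient_binary_string` are assumptions chosen for illustration, not details given in the source.

```python
# Assumed neighbourhood: eight directions, clockwise from the top-left.
DIRECTIONS = [
    (-1, -1), (-1, 0), (-1, 1), (0, 1),
    (1, 1), (1, 0), (1, -1), (0, -1),
]


def gradient_binary_string(image, row, col):
    """Hypothetical sketch: encode directional gradients at one pixel.

    `image` is a 2D list of intensities and (row, col) must be an
    interior pixel. Each bit is "1" when the gradient toward that
    neighbour is non-negative and "0" otherwise.
    """
    center = image[row][col]
    bits = []
    for d_row, d_col in DIRECTIONS:
        gradient = image[row + d_row][col + d_col] - center
        bits.append("1" if gradient >= 0 else "0")
    return "".join(bits)
```

For example, on the 3x3 patch `[[1, 2, 3], [4, 5, 6], [7, 8, 9]]` the centre pixel is encoded as `"00011110"` under these assumptions.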