Nevertheless, how to efficiently model the spatial-temporal skeleton graph without introducing extra computation burden is a challenging problem for industrial deployment.
Retrosynthesis prediction is one of the fundamental challenges in organic chemistry and related fields.
In this study, we investigate the expressive power of deep rectified quadratic unit (ReQU) neural networks for approximating the solution maps of parametric PDEs.
The experimental results show that the application of multi-stage transfer and class-balanced loss function can effectively improve the grading performance metrics such as accuracy and quadratic weighted kappa.
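For readers unfamiliar with the metric, quadratic weighted kappa can be sketched as below. This is a minimal illustration, not the evaluation code of the study; the function name `quadratic_weighted_kappa` and the assumption that grades are integers in `0..num_classes-1` are ours.

```python
import numpy as np


def quadratic_weighted_kappa(y_true, y_pred, num_classes):
    """Quadratic weighted kappa for integer grades in 0..num_classes-1.

    Assumes num_classes >= 2. Illustrative sketch only.
    """
    y_true = np.asarray(y_true, dtype=int)
    y_pred = np.asarray(y_pred, dtype=int)

    # Observed confusion matrix: rows are true grades, columns are predictions.
    observed = np.zeros((num_classes, num_classes), dtype=float)
    for t, p in zip(y_true, y_pred):
        observed[t, p] += 1.0

    # Expected matrix if true and predicted grades were independent.
    hist_true = observed.sum(axis=1)
    hist_pred = observed.sum(axis=0)
    expected = np.outer(hist_true, hist_pred) / observed.sum()

    # Quadratic penalty: grows with the squared distance between grades.
    idx = np.arange(num_classes)
    weights = (idx[:, None] - idx[None, :]) ** 2 / (num_classes - 1) ** 2

    denom = (weights * expected).sum()
    if denom == 0.0:
        # Both raters used one and the same grade throughout: perfect agreement.
        return 1.0
    return 1.0 - (weights * observed).sum() / denom
```

A value of 1 indicates perfect agreement, 0 indicates chance-level agreement, and negative values indicate agreement worse than chance.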
To overcome these issues, we propose unbiased Dense Contrastive Visual-Linguistic Pretraining (DCVLP), which replaces the region regression and classification with cross-modality region contrastive learning that requires no annotations.
In this work, we consider the decentralized optimization problem in which a network of $n$ agents, each possessing a smooth and convex objective function, wishes to collaboratively minimize the average of all the objective functions through peer-to-peer communication in a directed graph.
The second algorithm is a broadcast-like version of CPP (B-CPP), and it also achieves linear convergence rate under the same conditions on the objective functions.
A multi-modal framework to generate user intention distributions when operating a mobile vehicle is proposed in this work.
Learners may post their feelings of confusion and struggle in the respective MOOC forums, but with the large volume of posts and high workloads for MOOC instructors, it is unlikely that the instructors can identify all learners requiring intervention.
Existing methods for skeleton-based action recognition mainly focus on improving the recognition accuracy, whereas the efficiency of the model is rarely considered.
We experimentally analyze all four observed lasing BICs by imaging their far-field polarization vortices and their associated topological charges.
While Massive Open Online Course (MOOC) platforms provide knowledge in a new and unique way, the very high number of dropouts is a significant drawback.
An exploratory study on social interactions of MOOC students in FutureLearn was conducted, to answer "how can we cluster students based on their social interactions?"
We evaluate CVLP on several downstream tasks, including VQA, GQA and NLVR2, to validate the superiority of contrastive learning on multi-modality representation learning.
Trust and distrust are common in the opinion interactions among agents in social networks, and they are described by the edges with positive and negative weights in the signed digraph, respectively.
Besides, from the data aspect, we introduce a skeletal data decoupling technique to emphasize the specific characteristics of space/time and different motion scales, resulting in a more comprehensive understanding of the human actions. To test the effectiveness of the proposed method, extensive experiments are conducted on four challenging datasets for skeleton-based gesture and action recognition, namely, SHREC, DHG, NTU-60 and NTU-120, where DSTA-Net achieves state-of-the-art performance on all of them.
Ranked #11 on Skeleton Based Action Recognition on NTU RGB+D
In this paper, we study the asymptotic properties of regularized least squares with indefinite kernels in reproducing kernel Krein spaces (RKKS).
The two perspectives are orthogonal and complementary to each other; and by fusing them in a unified framework, our method achieves a more comprehensive understanding of the skeleton data.
In the first part, we examine the dynamics of bipartite tracking for first-order MASs, second-order MASs and general linear MASs in the presence of asynchronous interactions, respectively.
How can we infer such a mobility model from the single trajectory information?
To solve the issue for the intermediate layers, we propose an efficient Quaternion Block Network (QBN) to learn interaction not only for the last layer but also for all intermediate layers simultaneously.
Second, the second-order information of the skeleton data, i.e., the length and orientation of the bones, which is naturally more informative and discriminative for human action recognition, is rarely investigated.
Existing methods exploit the joint positions to extract the body-part features from the activation map of the convolutional networks to assist human action recognition.
However, the topology of the graph is set by hand and fixed over all layers, which may not be optimal for the action recognition task and the hierarchical CNN structures.
Ranked #36 on Skeleton Based Action Recognition on NTU RGB+D
Rapidly estimating the remaining wall thickness (RWT) is paramount for the non-destructive condition assessment of large critical metallic pipelines.
Skeleton data have been widely used for action recognition tasks since they can robustly accommodate dynamic circumstances and complex backgrounds.
Ranked #6 on Skeleton Based Action Recognition on UAV-Human
Fine-grained Named Entity Recognition is a task whereby we detect entity mentions and classify them into a large set of types.
The prevalence of networked sensors and actuators in many real-world systems such as smart buildings, factories, power plants, and data centers generates substantial amounts of multivariate time series data for these systems.
Based on refined covering number estimates, we find that, to realize some complex data features, deep nets can improve the performances of shallow neural networks (shallow nets for short) without requiring additional capacity costs.
This paper generalizes regularized regression problems in a hyper-reproducing kernel Hilbert space (hyper-RKHS), illustrates its utility for kernel learning and out-of-sample extensions, and proves asymptotic convergence results for the introduced regression models in an approximation theory view.
In addition, the second-order information (the lengths and directions of bones) of the skeleton data, which is naturally more informative and discriminative for action recognition, is rarely investigated in existing methods.
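The second-order information mentioned above (bone lengths and directions) can be derived from joint coordinates roughly as follows. This is a hedged sketch: the helper name `bone_features` and the `parents` array encoding the skeleton tree are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def bone_features(joints, parents):
    """Compute bone lengths and unit directions from joint positions.

    joints  : (J, 3) array of joint coordinates for one frame.
    parents : length-J sequence; parents[j] is the index of joint j's parent,
              or -1 for the root (hypothetical encoding of the skeleton tree).
    Returns (lengths, directions), one entry per non-root joint.
    """
    joints = np.asarray(joints, dtype=float)
    child_idx = [j for j, p in enumerate(parents) if p >= 0]
    parent_idx = [parents[j] for j in child_idx]

    # A bone is the vector pointing from the parent joint to the child joint.
    bones = joints[child_idx] - joints[parent_idx]
    lengths = np.linalg.norm(bones, axis=1)

    # Guard against zero-length bones before normalising.
    directions = bones / np.maximum(lengths, 1e-12)[:, None]
    return lengths, directions
```

For a three-joint chain with `parents = [-1, 0, 1]`, the function returns two bones, one per parent-child pair.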
Ranked #4 on Skeleton Based Action Recognition on UAV-Human
In this paper we study the convergence of online gradient descent algorithms in reproducing kernel Hilbert spaces (RKHSs) without regularization.
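As a rough illustration of the kind of algorithm studied, an unregularized online gradient descent step in an RKHS updates the current function by a multiple of the kernel section at the new sample. The sketch below assumes the least-squares loss and a Gaussian kernel; both choices, as well as the class name `OnlineKernelGD` and the step-size schedule, are our assumptions and not necessarily the setting of the paper.

```python
import numpy as np


def gaussian_kernel(x, z, sigma=1.0):
    """Gaussian (RBF) kernel; the kernel choice is an assumption."""
    diff = np.asarray(x, dtype=float) - np.asarray(z, dtype=float)
    return float(np.exp(-np.dot(diff, diff) / (2.0 * sigma ** 2)))


class OnlineKernelGD:
    """Unregularized online gradient descent for the least-squares loss.

    Update rule: f_{t+1} = f_t - eta_t * (f_t(x_t) - y_t) * K(x_t, .)
    The function is stored as a kernel expansion over past samples.
    """

    def __init__(self, kernel, step_size):
        self.kernel = kernel
        self.step_size = step_size  # callable: t -> eta_t
        self.centers = []
        self.coeffs = []

    def predict(self, x):
        return sum(c * self.kernel(z, x)
                   for c, z in zip(self.coeffs, self.centers))

    def update(self, x, y):
        t = len(self.centers) + 1
        residual = self.predict(x) - y
        # No regularization term: earlier coefficients are never shrunk.
        self.centers.append(x)
        self.coeffs.append(-self.step_size(t) * residual)
        return residual
```

A typical use would be `model = OnlineKernelGD(gaussian_kernel, lambda t: 0.5 / np.sqrt(t))`, followed by one `model.update(x, y)` call per incoming sample.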
An ultrasonic sensor array is employed to provide the range information from the target person to the robot and Gaussian Process Regression is used for partial location estimation (2-D).
Aiming at overexposure correction for computed tomography (CT) reconstruction, in this paper we propose mixed one-bit compressive sensing (M1bit-CS) to acquire information from both regular and saturated measurements.
Among these parameters, visual and step are particularly significant because artificial fish basically move based on these parameters.
As the complexity of deep neural networks (DNNs) tends to grow to absorb increasing amounts of data, memory and energy consumption have been receiving more and more attention in industrial applications, especially on mobile devices.
As scene images have larger diversity than the iconic object images, it is more challenging for deep learning methods to automatically learn features from scene images with fewer samples.
Furthermore, results show that the features automatically learned from the raw input range data can achieve results competitive with those of features constructed from statistical and geometrical information.
The one-sided $\ell_1$ loss and the linear loss are two popular loss functions for 1bit-CS.
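The two losses named above can be written down in a few lines. The sketch below follows the forms commonly used in the 1bit-CS literature, with measurements `y_i = sign(a_i^T x)`; the averaging over measurements and the function names are our assumptions.

```python
import numpy as np


def one_sided_l1_loss(A, y, x):
    """One-sided l1 loss: penalises only sign-inconsistent measurements.

    A : (m, n) sensing matrix, y : (m,) signs in {-1, +1}, x : (n,) candidate.
    """
    margins = y * (A @ x)
    return float(np.maximum(0.0, -margins).mean())


def linear_loss(A, y, x):
    """Linear loss: rewards large margins y_i * a_i^T x, unbounded below."""
    return float(-(y * (A @ x)).mean())
```

Because the linear loss is unbounded below and the one-sided loss is minimised by `x = 0`, both are normally paired with a norm constraint on `x`, such as restricting it to the unit sphere.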
This paper proposes an extension, the sparse additive model with low-rank background (SAM-LRB), together with a simple yet efficient estimation method.
Here we propose a simple mechanism for Bayesian inference which involves averaging over a few feature detection neurons which fire at a rate determined by their similarity to a sensory stimulus.