Finally, to incorporate a general motion space for high-quality prediction, we build a memory-based dictionary that preserves the global motion patterns in the training data to guide the predictions.
GRAPHCACHE aggregates the features from sentences in the whole dataset to learn global representations of properties, and uses them to augment the local features within individual sentences.
In this paper, we propose CORE (Counterfactual Analysis based Relation Extraction), a debiasing method that guides relation extraction (RE) models to focus on the main effects of textual context without losing the entity information.
In this work, we propose the TNS (Time-aware Neighbor Sampling) method: TNS learns from temporal information to provide an adaptive receptive neighborhood for every node at any time.
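As a rough illustration of what sampling a time-dependent neighborhood can look like, the sketch below draws neighbors of a node with probability proportional to a fixed exponential recency weight. This is a hand-set heuristic standing in for the idea, not TNS itself, which learns its sampling from temporal information. The function names, the `decay` parameter, and the toy data are all assumptions made for this example.

```python
import math
import random


def recency_weights(edge_times, query_time, decay=1.0):
    """Weight each interaction by exp(-decay * age); edges after query_time get 0."""
    weights = []
    for t in edge_times:
        if t > query_time:
            # An interaction that has not happened yet must not be visible.
            weights.append(0.0)
        else:
            weights.append(math.exp(-decay * (query_time - t)))
    return weights


def sample_neighbors(neighbors, edge_times, query_time, k, decay=1.0, rng=None):
    """Sample k neighbors (with replacement), favoring recent interactions.

    Hypothetical helper: the fixed decay replaces the learned sampling policy.
    """
    rng = rng or random.Random()
    weights = recency_weights(edge_times, query_time, decay)
    if sum(weights) == 0.0:
        return []  # no interaction happened before query_time
    return rng.choices(neighbors, weights=weights, k=k)


# Toy node with three interactions at times 1, 5 and 9, queried at time 6.
neighbors = ["a", "b", "c"]
edge_times = [1.0, 5.0, 9.0]
picked = sample_neighbors(neighbors, edge_times, query_time=6.0, k=4,
                          rng=random.Random(0))
```

Because the weights depend on `query_time`, the same node gets a different effective neighborhood at different times, which is the property the abstract emphasizes.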
Meanwhile, to achieve higher rendering efficiency, we introduce a progressive rendering pipeline through geometry guidance, which leverages the geometric feature volume and the predicted density values to progressively reduce the number of sampling points and speed up the rendering process.
Representing a label distribution as a one-hot vector is a common practice in training node classification models.
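For readers unfamiliar with the term, the following minimal sketch shows what a one-hot label distribution is and how it enters a cross-entropy loss. The helper names and the example probabilities are assumptions for illustration and are not taken from the paper.

```python
import math


def one_hot(label: int, num_classes: int) -> list[float]:
    """Return a label distribution that puts all probability mass on `label`."""
    if not 0 <= label < num_classes:
        raise ValueError("label must be in [0, num_classes)")
    return [1.0 if c == label else 0.0 for c in range(num_classes)]


def cross_entropy(target: list[float], predicted: list[float]) -> float:
    """Cross-entropy between a target distribution and predicted probabilities."""
    eps = 1e-12  # guards against log(0)
    return -sum(t * math.log(p + eps) for t, p in zip(target, predicted))


# A node whose true class is 2 out of 4 classes.
target = one_hot(2, num_classes=4)
predicted = [0.1, 0.2, 0.6, 0.1]  # made-up model output
loss = cross_entropy(target, predicted)
```

With a one-hot target, only the predicted probability of the true class contributes to the loss, so the target carries no information about how the remaining classes relate to each other.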
To address this issue, our idea is to transform the temporal graphs using data augmentation (DA) with adaptive magnitudes, so as to effectively augment the input features and preserve the essential semantic information.
Instead of estimating 3D joint locations from costly volumetric representation or reconstructing the per-person 3D pose from multiple detected 2D poses as in previous methods, MvP directly regresses the multi-person 3D poses in a clean and efficient way, without relying on intermediate tasks.
Ranked #2 on 3D Multi-Person Pose Estimation on Panoptic
In this work, we propose Mixup methods for two fundamental tasks in graph learning: node and graph classification.
Ranked #15 on Node Classification on Pubmed
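As background, the sketch below shows generic input Mixup: a convex combination of two feature vectors and of their label distributions, with the mixing coefficient drawn from a Beta distribution. It is an assumption-laden illustration of the basic operation only. The graph-specific handling that the abstract refers to is not reproduced here, and the function names and `alpha` default are hypothetical.

```python
import random


def sample_lambda(alpha: float = 1.0, rng=None) -> float:
    """Draw the mixing coefficient from Beta(alpha, alpha)."""
    rng = rng or random.Random()
    return rng.betavariate(alpha, alpha)


def mixup(x_i, y_i, x_j, y_j, lam):
    """Convexly interpolate two feature vectors and their label distributions."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x_i, x_j)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y_i, y_j)]
    return x, y


# Two toy samples with one-hot labels for a 2-class problem.
x_mixed, y_mixed = mixup([1.0, 0.0], [1.0, 0.0],
                         [0.0, 1.0], [0.0, 1.0], lam=0.25)
lam = sample_lambda(alpha=1.0, rng=random.Random(0))
```

The mixed label remains a valid probability distribution because it is a convex combination of two distributions.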
No code implementations • Yujun Cai, Yiwei Wang, Yiheng Zhu, Tat-Jen Cham, Jianfei Cai, Junsong Yuan, Jun Liu, Chuanxia Zheng, Sijie Yan, Henghui Ding, Xiaohui Shen, Ding Liu, Nadia Magnenat Thalmann
Notably, by considering this problem as a conditional generation process, we estimate a parametric distribution of the missing regions based on the input conditions, from which to sample and synthesize the full motion series.
We present a new method to regularize graph neural networks (GNNs) for better generalization in graph classification.
We employ the Earth Mover's Distance (EMD) as a metric to compute a structural distance between dense image representations to determine image relevance.
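To make the distance concrete, here is a toy sketch of the Earth Mover's Distance between two small sets of local feature vectors. It assumes both sets have the same size and uniform weights, in which case an optimal transport plan is a one-to-one matching and can be found by brute force over permutations. The cosine-based ground cost, the uniform weights, and the function names are assumptions for this example. The method described above may weight and solve the problem differently.

```python
import math
from itertools import permutations


def cosine_distance(u, v):
    """Ground cost between two nonzero vectors: 1 - cosine similarity."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm


def emd_uniform(features_a, features_b):
    """EMD between two equal-size sets of vectors, each element weighted 1/n.

    With uniform weights and equal sizes, some optimal plan is a permutation,
    so we enumerate all matchings. Only practical for very small n.
    """
    n = len(features_a)
    if n == 0 or n != len(features_b):
        raise ValueError("both sets must be non-empty and of equal size")
    cost = [[cosine_distance(u, v) for v in features_b] for u in features_a]
    best_total = min(
        sum(cost[i][perm[i]] for i in range(n))
        for perm in permutations(range(n))
    )
    return best_total / n


# The same two local features in a different order: nothing needs to move.
image_a = [[1.0, 0.0], [0.0, 1.0]]
image_b = [[0.0, 1.0], [1.0, 0.0]]
# A set in which one of the two features has no good match.
image_c = [[1.0, 0.0], [1.0, 0.0]]

d_ab = emd_uniform(image_a, image_b)
d_ac = emd_uniform(image_a, image_c)
```

The distance is insensitive to the order of local features, which is why it compares image structure rather than position-by-position similarity.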
Despite great progress in 3D pose estimation from single-view images or videos, it remains a challenging task due to the substantial depth ambiguity and severe self-occlusions.
Ranked #110 on 3D Human Pose Estimation on Human3.6M
Compared with depth-based 3D hand pose estimation, it is more challenging to infer 3D hand pose from monocular RGB images, due to substantial depth ambiguity and the difficulty of obtaining fully-annotated training data.
Convolutional Neural Networks (CNNs) have shown promising results for 3D hand pose estimation in depth images.
Ranked #6 on Hand Pose Estimation on HANDS 2017