Few-shot fine-grained classification and person search appear to be distinct tasks, and the literature has treated them separately.
Pushing back the frontiers of collaborative robots in industrial environments, we propose a new Separable-Sparse Graph Convolutional Network (SeS-GCN) for pose forecasting.
Our experiments show the effectiveness of our segmentation approach on thousands of real-world point clouds.
We propose a new approach of sample mixing for point cloud UDA, namely Compositional Semantic Mix (CoSMix), the first UDA approach for point cloud segmentation based on sample mixing.
We demonstrate that DL models based on Long Short-Term Memory (LSTM) and Convolutional Neural Networks predict labquakes under several conditions, and that fault zone stress can be predicted with fidelity, confirming that acoustic energy is a fingerprint of fault zone stress.
This paper proposes the first in-depth study of Transformer Networks (TF) and Bidirectional Transformers (BERT) for the forecasting of the individual motion of people, without bells and whistles.
For the first time, STS-GCN models the human pose dynamics only with a graph convolutional network (GCN), including the temporal evolution and the spatial joint interaction within a single-graph framework, which allows the cross-talk of motion and spatial correlations.
Ranked #1 on Human Pose Forecasting on Human3.6M
Clustering may reduce heterogeneity by identifying the domains, but it deprives each cluster model of the data and supervision of others.
Unsupervised Domain Adaptation (UDA) is a key issue in visual recognition, as it bridges different visual domains, enabling robust performance in the real world.
In the case of LiDAR, in fact, domain shift is not only due to changes in the environment and in the object appearances, as for visual data from RGB cameras, but is also related to the geometry of the point clouds (e.g., point density variations).
In experimental evaluation, the combination of CIR and a plain Siamese-net with triplet loss yields the best few-shot learning performance on the challenging tieredImageNet.
Notably, our joint optimization maintains the detector performance, a typical multi-task challenge.
Illumination is important for well-being, productivity and safety across several environments, including offices, retail shops and industrial warehouses.
In particular, the TF model without bells and whistles yields the best score on the largest and most challenging trajectory forecasting benchmark of TrajNet.
Ranked #10 on Trajectory Prediction on ETH/UCY
We employ this to supervise the detector of our person search model at various levels using a specialized detector.
We extend this with (i) a query-guided Siamese squeeze-and-excitation network (QSSE-Net) that uses global context from both the query and gallery images, (ii)
ILS may therefore dim those luminaires that are not seen by the user, resulting in effective energy savings, especially in large open offices (where light may otherwise be ON everywhere for a single person).
In this work, we explore the correlation between people's trajectories and their head orientations.
The proposed method uses both depth data and images from the sensor to provide a dense measure of light intensity in the field of view of the camera.
Recent approaches to trajectory forecasting use tracklets to predict the future positions of pedestrians, exploiting Long Short-Term Memory (LSTM) architectures.
Neural network compression has recently received much attention due to the computational requirements of modern deep models.
In this paper, we show the importance of head pose estimation for the task of trajectory forecasting.
Video segmentation has become an important and active research area with a large diversity of proposed approaches.
In contrast to previous work, the reduced graph is reweighted such that the resulting segmentation is equivalent, under certain assumptions, to that of the full graph.