In this article, we propose a simulated crowd counting dataset CrowdX, which has a large scale, accurate labeling, parameterized realization, and high fidelity.
In this work, we propose a novel framework which follows the anchor-based idea and aims at conveying distance information implicitly along the MPNN message passing steps for encoding position information, node attributes, and graph structure in a more flexible way.
Specifically, our model use the graph neural network framework with powerful representation capabilities to represent the interaction between group-user-items in the topological structure of the graph, and at the same time, analyze the interaction pattern of the graph to adjust the feature output of the graph neural network, the feature representations of groups, and items are obtained to calculate the group's preference for items.
We show that GIRs get outperformed results in position-aware scenarios, and performances on typical GNNs could be improved by fusing GIR embeddings.
Place recognition is one of the hot research fields in automation technology and is still an open issue, Camera and Lidar are two mainstream sensors used in this task, Camera-based methods are easily affected by illumination and season changes, LIDAR cannot get the rich data as the image could , In this paper, we propose the PIC-Net (Point cloud and Image Collaboration Network), which use attention mechanism to fuse the features of image and point cloud, and mine the complementary information between the two.
Recognizing objects from subcategories with very subtle differences remains a challenging task due to the large intra-class and small inter-class variation.
Ranked #9 on Fine-Grained Image Classification on NABirds (using extra training data)