In this article, we propose Human-M3, an outdoor multi-modal multi-view multi-person human pose database which includes not only multi-view RGB videos of outdoor scenes but also corresponding pointclouds.
The experimental results show that multi-view point clouds greatly improve 3D object detection and tracking accuracy regardless of complex and various outdoor environments.
Accurate cerebrovascular segmentation from Magnetic Resonance Angiography (MRA) and Computed Tomography Angiography (CTA) is of great significance in diagnosis and treatment of cerebrovascular pathology.
As each sample is annotated with multiple attribute labels, these "words" will naturally form an unordered but meaningful "sentence", which depicts the semantic information of the corresponding sample.
Unlike most existing personalized methods that learn the parameters of a personalized estimator for each person in the training set, our method learns the mapping from identity information to age estimator parameters.
Compared with contact-based fingerprint acquisition techniques, contactless acquisition has the advantages of less skin distortion, larger fingerprint area, and hygienic acquisition.
The generated text prompts are paired with corresponding video clips, and together co-train the text encoder and the video encoder via a contrastive approach.
Ranked #4 on Action Segmentation on GTEA (using extra training data)
To address this, we propose a Star-shaped Reasoning Graph Network (S-RGN).
1 code implementation • 10 Jun 2021 • Michela Antonelli, Annika Reinke, Spyridon Bakas, Keyvan Farahani, AnnetteKopp-Schneider, Bennett A. Landman, Geert Litjens, Bjoern Menze, Olaf Ronneberger, Ronald M. Summers, Bram van Ginneken, Michel Bilello, Patrick Bilic, Patrick F. Christ, Richard K. G. Do, Marc J. Gollub, Stephan H. Heckers, William R. Jarnagin, Maureen K. McHugo, Sandy Napel, Jennifer S. Goli Pernicka, Kawal Rhode, Catalina Tobon-Gomez, Eugene Vorontsov, Henkjan Huisman, James A. Meakin, Sebastien Ourselin, Manuel Wiesenfarth, Pablo Arbelaez, Byeonguk Bae, Sihong Chen, Laura Daza, Jianjiang Feng, Baochun He, Fabian Isensee, Yuanfeng Ji, Fucang Jia, Namkug Kim, Ildoo Kim, Dorit Merhof, Akshay Pai, Beomhee Park, Mathias Perslev, Ramin Rezaiifar, Oliver Rippel, Ignacio Sarasua, Wei Shen, Jaemin Son, Christian Wachinger, Liansheng Wang, Yan Wang, Yingda Xia, Daguang Xu, Zhanwei Xu, Yefeng Zheng, Amber L. Simpson, Lena Maier-Hein, M. Jorge Cardoso
Segmentation is so far the most widely investigated medical image processing task, but the various segmentation challenges have typically been organized in isolation, such that algorithm development was driven by the need to tackle a single specific clinical problem.
Segmentation of infected areas in chest CT volumes is of great significance for further diagnosis and treatment of COVID-19 patients.
In our method, we exploit semantic segmentation information to mitigate the effects of dynamic objects and occlusions in the scene, and to improve depth prediction performance by considering the correlation between depth and semantics.
Given a pair of fingerprints to match, we bypass the minutiae extraction step and take uniformly sampled points as key points.
In this paper, we propose a graph-based kinship reasoning (GKR) network for kinship verification, which aims to effectively perform relational reasoning on the extracted features of an image pair.
In addition, based on the proposed registration algorithm, we propose a fingerprint mosaicking method based on optimal seam selection.
Existing methods for age estimation usually apply a divide-and-conquer strategy to deal with heterogeneous data caused by the non-stationary aging process.
In this paper, we propose a novel heart segmentation pipeline Combining Faster R-CNN and U-net Network (CFUN).
In this paper, we propose an unsupervised feature learning method called deep binary descriptor with multi-quantization (DBD-MQ) for visual matching.
In this paper, we propose a consistent-aware deep learning (CADL) framework for person re-identification in a camera network.
Specifically, we construct a pair of deep convolutional neural networks (CNNs) for the RGB and depth data, and concatenate them at the top layer of the network with a loss function which learns a new feature space where both correlated part and the individual part of the RGB-D information are well modelled.
It is desirable to combine multiple feature descriptors to improve the visual tracking performance because different features can provide complementary information to describe objects of interest.