Text-based Person Search
18 papers with code • 0 benchmarks • 2 datasets
Most implemented papers
Contextual Non-Local Alignment over Full-Scale Representation for Text-Based Person Search
A BERT with locality-constrained attention is proposed to obtain representations of descriptions at different scales.
Learning Granularity-Unified Representations for Text-to-Image Person Re-identification
In PGU, we adopt a set of shared and learnable prototypes as the queries to extract diverse and semantically aligned features for both modalities in the granularity-unified feature space, which further promotes the ReID performance.
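The prototype-as-query idea can be sketched as cross-attention pooling, where the same set of learnable prototypes attends over the token features of either modality. This is an illustrative NumPy sketch, not the paper's actual PGU module; the prototype count, feature dimension, and plain scaled dot-product attention are assumptions:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def prototype_pool(features, prototypes):
    """Cross-attention pooling: shared prototypes act as queries over the
    per-token features of one modality (image patches or text tokens).
    features: (n_tokens, d); prototypes: (k, d). Returns (k, d)."""
    attn = softmax(prototypes @ features.T / np.sqrt(features.shape[1]))  # (k, n_tokens)
    return attn @ features  # each prototype pools a weighted mix of tokens

rng = np.random.default_rng(0)
protos = rng.normal(size=(4, 8))       # 4 shared prototypes (assumed count)
img_tokens = rng.normal(size=(16, 8))  # e.g. 16 image patch features
txt_tokens = rng.normal(size=(12, 8))  # e.g. 12 text token features
img_repr = prototype_pool(img_tokens, protos)
txt_repr = prototype_pool(txt_tokens, protos)
assert img_repr.shape == txt_repr.shape == (4, 8)
```

Because both modalities are pooled by the same queries, the k output slots are semantically aligned across image and text, which is the "granularity-unified" property the snippet describes.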
TIPCB: A Simple but Effective Part-based Convolutional Baseline for Text-based Person Search
Text-based person search is a sub-task in the field of image retrieval, which aims to retrieve target person images according to a given textual description.
Text-based Person Search in Full Images via Semantic-Driven Proposal Generation
Finding target persons in full scene images with a text-description query has important practical applications in intelligent video surveillance. However, unlike real-world scenarios where bounding boxes are not available, existing text-based person retrieval methods mainly focus on cross-modal matching between query text descriptions and a gallery of cropped pedestrian images.
Text-Based Person Search with Limited Data
To fully utilize the existing small-scale benchmark datasets for more discriminative feature learning, we introduce a cross-modal momentum contrastive learning framework to enrich the training data for a given mini-batch.
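The core of momentum contrastive learning is an InfoNCE loss whose negatives come from a queue filled by a slowly updated momentum encoder, which effectively enlarges the negative set beyond the mini-batch. A minimal NumPy sketch, assuming a generic MoCo-style setup rather than this paper's exact formulation (the temperature and momentum values are arbitrary choices):

```python
import numpy as np

def info_nce_with_queue(q, k_pos, queue, tau=0.07):
    """Cross-modal InfoNCE: query embedding q (d,) from one modality,
    positive key k_pos (d,) from the other modality's momentum encoder,
    and a queue (m, d) of negative keys accumulated from past batches."""
    q = q / np.linalg.norm(q)
    k_pos = k_pos / np.linalg.norm(k_pos)
    queue = queue / np.linalg.norm(queue, axis=1, keepdims=True)
    logits = np.concatenate([[q @ k_pos], queue @ q]) / tau  # positive first
    logits -= logits.max()  # numerical stability
    return -np.log(np.exp(logits[0]) / np.exp(logits).sum())

def momentum_update(k_params, q_params, m=0.999):
    """EMA update of the momentum (key) encoder from the query encoder."""
    return m * k_params + (1 - m) * q_params
```

A matched image-text pair drives the loss toward zero, while queue entries act as extra negatives without requiring a huge batch.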
Learning Semantic-Aligned Feature Representation for Text-based Person Search
In this paper, we propose a semantic-aligned embedding method for text-based person search, in which the feature alignment across modalities is achieved by automatically learning the semantic-aligned visual features and textual features.
CLIP-Driven Fine-grained Text-Image Person Re-identification
Cross-grained feature refinement (CFR) and fine-grained correspondence discovery (FCD) modules are proposed to establish the cross-grained and fine-grained interactions between modalities, which can filter out non-modality-shared image patches/words and mine cross-modal correspondences from coarse to fine.
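Filtering non-modality-shared patches can be approximated by dropping image patches whose best similarity to any word is low. A crude NumPy illustration of that filtering idea, not the paper's CFR/FCD modules; the cosine-similarity criterion and threshold are assumptions:

```python
import numpy as np

def filter_shared_patches(patch_feats, word_feats, thresh=0.3):
    """Keep only image patches whose best cosine similarity to any word
    exceeds `thresh` -- a stand-in for discarding non-modality-shared
    patches. `thresh` is an arbitrary illustrative value."""
    p = patch_feats / np.linalg.norm(patch_feats, axis=1, keepdims=True)
    w = word_feats / np.linalg.norm(word_feats, axis=1, keepdims=True)
    best = (p @ w.T).max(axis=1)  # best-matching word per patch
    return patch_feats[best > thresh]
```

The surviving patches are the ones plausibly described by the text, on which finer-grained correspondence mining can then operate.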
A Simple and Robust Correlation Filtering Method for Text-based Person Search
Text-based person search aims to associate pedestrian images with natural language descriptions.
Person Text-Image Matching via Text-Feature Interpretability Embedding and External Attack Node Implantation
Specifically, we improve the interpretability of text features by endowing them with semantic information consistent with image features, so that the text aligns with the image region features it describes. To address the challenges posed by the diversity of texts and of the corresponding person images, we treat the feature variation caused by this diversity as perturbation information and propose a novel adversarial attack and defense method to handle it.
Asymmetric Cross-Scale Alignment for Text-Based Person Search
To implement this task, one needs to extract multi-scale features from both image and text domains, and then perform the cross-modal alignment.
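Cross-scale alignment can be sketched as scoring every (image scale, text scale) pair and aggregating asymmetrically, since image scales need not correspond one-to-one with text granularities. This NumPy sketch is illustrative only; the max-over-text aggregation is an assumed choice, not the paper's method:

```python
import numpy as np

def cross_scale_similarity(img_scales, txt_scales):
    """img_scales: list of (d,) features at different image scales;
    txt_scales: list of (d,) features at different text granularities.
    Each image scale matches its best text granularity (asymmetric),
    and the per-scale best scores are averaged into one matching score."""
    def norm(v):
        return v / np.linalg.norm(v)
    sims = np.array([[norm(i) @ norm(t) for t in txt_scales] for i in img_scales])
    return sims.max(axis=1).mean()
```

At retrieval time, gallery images would be ranked by this score against the query description's multi-granularity features.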