Text-based Person Retrieval
8 papers with code • 0 benchmarks • 0 datasets
Most implemented papers
DSSL: Deep Surroundings-person Separation Learning for Text-based Person Retrieval
Many previous methods for text-based person retrieval are devoted to learning a latent common-space mapping, with the purpose of extracting modality-invariant features from both the visual and textual modalities.
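Most entries on this page share that common-space recipe. Below is a minimal sketch, assuming PyTorch and illustrative feature dimensions (2048-d image features, 768-d text features, a 256-d shared space), of a dual-branch projection trained with a symmetric contrastive loss; it is a generic illustration of the idea, not the DSSL architecture.

```python
# Minimal common latent-space sketch: project image and text features into one
# embedding space and train with a symmetric InfoNCE-style contrastive loss.
# All dimensions and the temperature are illustrative placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CommonSpaceMapper(nn.Module):
    def __init__(self, img_dim=2048, txt_dim=768, embed_dim=256):
        super().__init__()
        self.img_proj = nn.Linear(img_dim, embed_dim)   # visual branch
        self.txt_proj = nn.Linear(txt_dim, embed_dim)   # textual branch

    def forward(self, img_feats, txt_feats):
        v = F.normalize(self.img_proj(img_feats), dim=-1)
        t = F.normalize(self.txt_proj(txt_feats), dim=-1)
        return v, t

def contrastive_loss(v, t, temperature=0.07):
    # Matched image-text pairs lie on the diagonal of the similarity matrix.
    logits = v @ t.t() / temperature
    labels = torch.arange(v.size(0), device=v.device)
    return (F.cross_entropy(logits, labels) + F.cross_entropy(logits.t(), labels)) / 2

model = CommonSpaceMapper()
v, t = model(torch.randn(8, 2048), torch.randn(8, 768))
print(contrastive_loss(v, t).item())
```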
Text-based Person Search in Full Images via Semantic-Driven Proposal Generation
Finding target persons in full scene images with a text description as the query has important practical applications in intelligent video surveillance. However, unlike real-world scenarios where bounding boxes are not available, existing text-based person retrieval methods mainly focus on cross-modal matching between the query text descriptions and a gallery of cropped pedestrian images.
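The two-stage idea (generate person proposals, then match them against the text query) can be sketched as follows, with random boxes and embeddings standing in for a real detector and real encoders; this is a generic illustration, not the paper's semantic-driven proposal generation.

```python
# Two-stage sketch for person search in full images: score candidate person
# proposals against a text-query embedding and keep the best-matching box.
# Boxes and embeddings are random placeholders for a detector and encoders.
import torch
import torch.nn.functional as F

boxes = torch.rand(20, 4)                               # candidate boxes (x1, y1, x2, y2)
box_feats = F.normalize(torch.randn(20, 256), dim=-1)   # proposal embeddings
query = F.normalize(torch.randn(256), dim=-1)           # text-query embedding

scores = box_feats @ query                              # similarity per proposal
best = scores.argmax()
print("best box:", boxes[best].tolist(), "score:", scores[best].item())
```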
See Finer, See More: Implicit Modality Alignment for Text-based Person Retrieval
To explore fine-grained alignment, we further propose two implicit semantic alignment paradigms: multi-level alignment (MLA) and bidirectional mask modeling (BMM).
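As a rough illustration of the masked-modeling ingredient, the sketch below masks a fraction of text tokens and predicts them with a small transformer over concatenated image and text embeddings; the vocabulary size, masking rate, and model sizes are placeholders, and this is not the paper's BMM module.

```python
# Masked-token modeling sketch: hide some text tokens and predict them from
# the joint image-text context. Everything here is an illustrative placeholder.
import torch
import torch.nn as nn
import torch.nn.functional as F

vocab_size, dim, mask_id = 1000, 64, 0
token_emb = nn.Embedding(vocab_size, dim)
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True), num_layers=2)
head = nn.Linear(dim, vocab_size)

tokens = torch.randint(1, vocab_size, (2, 16))          # text token ids (batch, length)
img_tokens = torch.randn(2, 8, dim)                     # assumed image patch embeddings
mask = torch.rand(tokens.shape) < 0.15                  # mask ~15% of text tokens
mask[:, 0] = True                                       # guarantee at least one masked token
masked = tokens.masked_fill(mask, mask_id)

x = torch.cat([img_tokens, token_emb(masked)], dim=1)   # image + text sequence
logits = head(encoder(x)[:, img_tokens.size(1):])       # predictions at text positions
loss = F.cross_entropy(logits[mask], tokens[mask])      # supervise only masked positions
print(loss.item())
```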
A Simple and Robust Correlation Filtering Method for Text-based Person Search
Text-based person search aims to associate pedestrian images with natural language descriptions.
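The retrieval and evaluation step shared by these methods can be sketched as ranking gallery image embeddings by cosine similarity to each text query and reporting Rank-k accuracy; the embeddings and identity labels below are random placeholders.

```python
# Retrieval evaluation sketch: rank gallery embeddings per text query by
# cosine similarity and compute Rank-k accuracy against identity labels.
import torch
import torch.nn.functional as F

txt = F.normalize(torch.randn(50, 256), dim=-1)         # query text embeddings
img = F.normalize(torch.randn(200, 256), dim=-1)        # gallery image embeddings
txt_ids = torch.randint(0, 50, (50,))                   # person id per query
img_ids = torch.randint(0, 50, (200,))                  # person id per gallery image

sims = txt @ img.t()                                    # (num queries, gallery size)
ranked = sims.argsort(dim=1, descending=True)           # best match first

def rank_k(k):
    hits = (img_ids[ranked[:, :k]] == txt_ids[:, None]).any(dim=1)
    return hits.float().mean().item()

print({f"Rank-{k}": rank_k(k) for k in (1, 5, 10)})
```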
Cross-Modal Implicit Relation Reasoning and Aligning for Text-to-Image Person Retrieval
To alleviate these issues, we present IRRA: a cross-modal Implicit Relation Reasoning and Aligning framework that learns relations between local visual-textual tokens and enhances global image-text matching without requiring additional prior supervision.
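A loose sketch of cross-modal token interaction in this spirit: text tokens attend to image patch tokens, and the fused representation is pooled for a global matching score. The shapes and pooling choices are illustrative assumptions, not the IRRA architecture.

```python
# Cross-modal token interaction sketch: text tokens attend over image patch
# tokens, then pooled representations give a global image-text matching score.
import torch
import torch.nn as nn
import torch.nn.functional as F

dim = 64
cross_attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)

txt_tokens = torch.randn(2, 16, dim)                     # local textual tokens
img_tokens = torch.randn(2, 49, dim)                     # local visual (patch) tokens

fused, _ = cross_attn(query=txt_tokens, key=img_tokens, value=img_tokens)
global_txt = F.normalize(fused.mean(dim=1), dim=-1)      # pooled text-side view
global_img = F.normalize(img_tokens.mean(dim=1), dim=-1) # pooled image-side view
score = (global_txt * global_img).sum(dim=-1)            # global matching score per pair
print(score.tolist())
```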
PLIP: Language-Image Pre-training for Person Representation Learning
Extensive experiments demonstrate that our model not only significantly improves existing methods on all these tasks, but also shows great ability in the few-shot and domain generalization settings.
Towards Unified Text-based Person Retrieval: A Large-scale Multi-Attribute and Language Search Benchmark
To verify the feasibility of learning from the generated data, we develop a new joint Attribute Prompt Learning and Text Matching Learning (APTM) framework, considering the shared knowledge between attributes and text.
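A minimal sketch of such joint training, assuming a shared image embedding, a hypothetical 26-attribute label set, multi-label BCE for the attribute term, an in-batch contrastive loss for the text-matching term, and an illustrative loss weighting; it follows the spirit of APTM but is not its implementation.

```python
# Joint attribute + text-matching sketch: one image embedding feeds an
# attribute head (multi-label BCE) and a contrastive text-matching loss.
import torch
import torch.nn as nn
import torch.nn.functional as F

num_attrs, dim = 26, 256                                # attribute count is hypothetical
attr_head = nn.Linear(dim, num_attrs)

img = F.normalize(torch.randn(8, dim), dim=-1)          # image embeddings
txt = F.normalize(torch.randn(8, dim), dim=-1)          # paired text embeddings
attrs = torch.randint(0, 2, (8, num_attrs)).float()     # binary attribute labels

# Text-matching term: symmetric contrastive loss over the in-batch pairs.
logits = img @ txt.t() / 0.07
labels = torch.arange(8)
match_loss = (F.cross_entropy(logits, labels) + F.cross_entropy(logits.t(), labels)) / 2

# Attribute term: multi-label classification on the shared image embedding.
attr_loss = F.binary_cross_entropy_with_logits(attr_head(img), attrs)

total = match_loss + 0.5 * attr_loss                    # illustrative weighting
print(total.item())
```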
Noisy-Correspondence Learning for Text-to-Image Person Re-identification
Text-to-image person re-identification (TIReID) is a compelling topic in the cross-modal community; it aims to retrieve the target person based on a textual query.
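One common way to cope with noisy correspondences is to down-weight pairs whose matching loss is large, since mismatched pairs tend to incur higher loss. The sketch below uses that generic small-loss heuristic with random embeddings; it is not necessarily the mechanism proposed in the paper.

```python
# Noisy-correspondence sketch: weight each image-text pair by how confidently
# it matches, so suspected mismatches contribute less to the training loss.
import torch
import torch.nn.functional as F

img = F.normalize(torch.randn(16, 256), dim=-1)
txt = F.normalize(torch.randn(16, 256), dim=-1)

logits = img @ txt.t() / 0.07
labels = torch.arange(16)
per_pair = F.cross_entropy(logits, labels, reduction="none")   # loss per pair

# Soft weights: low-loss (confidently matched) pairs contribute more.
weights = torch.softmax(-per_pair, dim=0) * per_pair.numel()
robust_loss = (weights.detach() * per_pair).mean()
print(robust_loss.item())
```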