Dependency-Aware Entity–Attribute Relationship Learning for Text-Based Person Search

Text-based person search (TPS), a critical technology for security and surveillance, aims to retrieve target individuals from image galleries using textual descriptions. The existing methods face two challenges: (1) ambiguous attribute–noun association (AANA), where syntactic ambiguities lead to inc...

Bibliographic Details
Main Authors: Wei Xia, Wenguang Gan, Xinpan Yuan
Format: Article
Language: English
Published: MDPI AG 2025-07-01
Series: Big Data and Cognitive Computing
Subjects: text-based person search; syntactic knowledge; semantic alignment
Online Access: https://www.mdpi.com/2504-2289/9/7/182
collection DOAJ
description Text-based person search (TPS), a critical technology for security and surveillance, aims to retrieve target individuals from image galleries using textual descriptions. The existing methods face two challenges: (1) ambiguous attribute–noun association (AANA), where syntactic ambiguities lead to incorrect associations between attributes and the intended nouns; and (2) textual noise and relevance imbalance (TNRI), where irrelevant or non-discriminative tokens (e.g., ‘wearing’) reduce the saliency of critical visual attributes in the textual description. To address these aspects, we propose the dependency-aware entity–attribute alignment network (DEAAN), a novel framework that explicitly tackles AANA through dependency-guided attention and TNRI via adaptive token filtering. The DEAAN introduces two modules: (1) dependency-assisted implicit reasoning (DAIR) to resolve AANA through syntactic parsing, and (2) relevance-adaptive token selection (RATS) to suppress TNRI by learning token saliency. Experiments on CUHK-PEDES, ICFG-PEDES, and RSTPReid demonstrate state-of-the-art performance, with the DEAAN achieving a Rank-1 accuracy of 76.71% and an mAP of 69.07% on CUHK-PEDES, surpassing RDE by 0.77% in Rank-1 and 1.51% in mAP. Ablation studies reveal that DAIR and RATS individually improve Rank-1 by 2.54% and 3.42%, while their combination elevates the performance by 6.35%, validating their synergy. This work bridges structured linguistic analysis with adaptive feature selection, demonstrating practical robustness in surveillance-oriented TPS scenarios.
format Article
id doaj-art-2bc8fde231a74029ab0d9eadd6d1edac
institution Matheson Library
issn 2504-2289
spelling doaj-art-2bc8fde231a74029ab0d9eadd6d1edac
2025-07-25T13:14:09Z
eng
MDPI AG
Big Data and Cognitive Computing
2504-2289
2025-07-01
9(7), 182
10.3390/bdcc9070182
Wei Xia (School of Computer, Hunan University of Technology, Zhuzhou 412000, China)
Wenguang Gan (School of Computer, Hunan University of Technology, Zhuzhou 412000, China)
Xinpan Yuan (School of Computer, Hunan University of Technology, Zhuzhou 412000, China)
title Dependency-Aware Entity–Attribute Relationship Learning for Text-Based Person Search
topic text-based person search
syntactic knowledge
semantic alignment
url https://www.mdpi.com/2504-2289/9/7/182