Interactive Content Retrieval in Egocentric Videos Based on Vague Semantic Queries

Retrieving specific, often instantaneous, content from hours-long egocentric video footage based on hazily remembered details is challenging. Vision–language models (VLMs) have been employed to enable zero-shot text-based content retrieval from videos. However, they fall short if the textual query co...

Bibliographic Details
Main Authors:	Linda Ablaoui, Wilson Estecio Marcilio-Jr, Lai Xing Ng, Christophe Jouffrais, Christophe Hurter
Format:	Article
Language:	English
Published:	MDPI AG 2025-06-01
Series:	Multimodal Technologies and Interaction
Subjects:	human–computer interaction; zero-shot content retrieval; multimodal querying; egocentric videos
Online Access:	https://www.mdpi.com/2414-4088/9/7/66

Similar Items