A Framework for Constructing Large-Scale Dynamic Datasets for Water Conservancy Image Recognition Using Multi-Role Collaboration and Intelligent Annotation

Bibliographic Details
Main Authors: Xueying Song, Xiaofeng Wang, Ganggang Zuo, Jiancang Xie
Format: Article
Language: English
Published: MDPI AG 2025-07-01
Series: Applied Sciences
Online Access: https://www.mdpi.com/2076-3417/15/14/8002
Description
Summary: The construction of large-scale, dynamic datasets for specialized domain models often suffers from low efficiency and poor consistency. This paper proposes a method that integrates multi-role collaboration with automated annotation to address these issues. The framework introduces two new roles, data augmentation specialists and automatic annotation operators, to establish a closed-loop process comprising dynamic classification adjustment, data augmentation, and intelligent annotation. Two supporting tools were developed: an image classification modification tool that automatically adapts to changes in categories, and an automatic annotation tool with rotation-angle perception based on the rotation matrix algorithm. Experimental results show that this method increases annotation efficiency by 40% compared to traditional approaches while achieving 100% annotation consistency after classification modifications. The method's effectiveness was validated using the WATER-DET dataset, a collection of 1500 annotated images from the water conservancy engineering field. A model trained on this dataset achieved an F1-score of 0.9 for identifying water environment problems in rivers and lakes. This research offers an efficient framework for dynamic dataset construction, and the developed methods and tools are expected to promote the application of artificial intelligence in specialized domains.
ISSN: 2076-3417
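
The summary mentions an automatic annotation tool based on the rotation matrix algorithm but gives no implementation details. The following is only a minimal sketch of the general idea, assuming that a bounding-box annotation is carried over to a rotated (augmented) image by rotating its corners with a 2x2 rotation matrix and taking the enclosing axis-aligned box. The function names `rotate_points` and `rotate_box`, the `(x_min, y_min, x_max, y_max)` box layout, and the choice of rotation center are hypothetical and are not taken from the paper.

```python
import math


def rotate_points(points, angle_deg, center):
    """Rotate (x, y) points about `center` by `angle_deg` degrees.

    Uses the standard rotation matrix [[cos, -sin], [sin, cos]].
    In image coordinates (y pointing down) a positive angle therefore
    appears as a clockwise rotation on screen.
    """
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    cx, cy = center
    rotated = []
    for x, y in points:
        dx, dy = x - cx, y - cy  # shift so the center is the origin
        rotated.append((cx + cos_t * dx - sin_t * dy,
                        cy + sin_t * dx + cos_t * dy))
    return rotated


def rotate_box(box, angle_deg, center):
    """Map an axis-aligned box to the rotated image.

    `box` is (x_min, y_min, x_max, y_max), an assumed layout.
    Returns the axis-aligned box that encloses the four rotated corners.
    """
    x_min, y_min, x_max, y_max = box
    corners = [(x_min, y_min), (x_max, y_min), (x_max, y_max), (x_min, y_max)]
    rotated = rotate_points(corners, angle_deg, center)
    xs = [p[0] for p in rotated]
    ys = [p[1] for p in rotated]
    return (min(xs), min(ys), max(xs), max(ys))
```

For example, `rotate_box((10, 20, 30, 40), 90, (0, 0))` yields a box close to `(-40, 10, -20, 30)`, up to floating-point rounding. For angles that are not multiples of 90 degrees the enclosing box is larger than the original, which is one reason a real tool might store the rotated corners themselves rather than an axis-aligned box.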