Conformal taxonomic validation: A semi-automated validation framework for citizen science records
Main Authors: | , , , , , , |
---|---|
Format: | Article |
Language: | English |
Published: | Elsevier, 2025-12-01 |
Series: | Ecological Informatics |
Subjects: | |
Online Access: | http://www.sciencedirect.com/science/article/pii/S1574954125002997 |
Summary: | Citizen science records are a valuable source of marine biodiversity data, especially where standardized sampling campaigns are limited in spatial or temporal scope. However, such records often contain biases and errors and typically require expert validation before they can reliably support scientific research. Validating large volumes of citizen science data remains an important challenge. In this paper, we present a semi-automated validation framework that combines a deep learning classifier with conformal prediction to generate sets of plausible taxonomic labels at multiple ranks, while providing rigorous control over prediction confidence. Extensive evaluation was carried out using 25,000 jellyfish records, both with and without prior validation, as well as against 800 expert-validated entries. Our results show that the method frequently produces singleton prediction sets that can be accepted automatically, offering a high-confidence and scalable solution for validating marine citizen science data. |
---|---|
ISSN: | 1574-9541 |
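
The summary describes pairing a classifier with conformal prediction so that singleton prediction sets can be accepted automatically. The following is a minimal sketch of that general idea using split conformal prediction at a single taxonomic rank. It is not the authors' implementation: the nonconformity score (one minus the softmax probability of the true class), the function names, and the miscoverage level `alpha` are all assumptions made for illustration, and the multi-rank handling mentioned in the summary is not covered.

```python
import numpy as np

def conformal_threshold(cal_probs, cal_labels, alpha=0.1):
    """Compute the score threshold from a held-out calibration set.

    cal_probs:  (n, k) array of classifier probabilities (assumed softmax output).
    cal_labels: (n,) array of validated class indices.
    alpha:      target miscoverage rate (hypothetical default).
    """
    cal_probs = np.asarray(cal_probs, dtype=float)
    cal_labels = np.asarray(cal_labels, dtype=int)
    n = len(cal_labels)
    # Assumed nonconformity score: 1 - probability assigned to the true class.
    scores = 1.0 - cal_probs[np.arange(n), cal_labels]
    # Finite-sample corrected quantile level, capped at 1.
    level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
    # method="higher" requires NumPy >= 1.22.
    return np.quantile(scores, level, method="higher")

def prediction_sets(probs, qhat):
    """Return, per record, the class indices whose score is within the threshold."""
    probs = np.asarray(probs, dtype=float)
    return [np.flatnonzero(1.0 - p <= qhat).tolist() for p in probs]

def triage(sets):
    """Accept singleton sets automatically; route everything else to an expert."""
    return ["auto-accept" if len(s) == 1 else "expert review" for s in sets]
```

In this sketch, empty sets and sets with more than one label are both sent to expert review; only a record whose set contains exactly one label is accepted without a human check.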