OpenFungi: A Machine Learning Dataset for Fungal Image Recognition Tasks
A key aspect driving advancements in machine learning applications in medicine is the availability of publicly accessible datasets. Evidently, there are studies conducted in the past with promising results, but they are not reproducible due to the fact that the data used are closed or proprietary or...
Saved in:
Main Authors: | , , |
---|---|
Format: | Article |
Language: | English |
Published: |
MDPI AG
2025-07-01
|
Series: | Life |
Subjects: | |
Online Access: | https://www.mdpi.com/2075-1729/15/7/1132 |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Summary: | A key aspect driving advancements in machine learning applications in medicine is the availability of publicly accessible datasets. Evidently, there are studies conducted in the past with promising results, but they are not reproducible due to the fact that the data used are closed or proprietary or the authors were not able to publish them. The current study aims to narrow this gap for researchers who focus on image recognition tasks in microbiology, specifically in fungal identification and classification. An open database named OpenFungi is made available in this work; it contains high-quality images of macroscopic and microscopic fungal genera. The fungal cultures were grown from food products such as green leaf spices and cereals. The quality of the dataset is demonstrated by solving a classification problem with a simple convolutional neural network. A thorough experimental analysis was conducted, where six performance metrics were measured in three distinct validation scenarios. The results obtained demonstrate that in the fungal species classification task, the model achieved an overall accuracy of 99.79%, a true-positive rate of 99.55%, a true-negative rate of 99.96%, and an F1 score of 99.63% on the macroscopic dataset. On the microscopic dataset, the model reached a 97.82% accuracy, a 94.89% true-positive rate, a 99.19% true-negative rate, and a 95.20% F1 score. The results also reveal that the model maintains promising performance even when trained on smaller datasets, highlighting its robustness and generalization capabilities. |
---|---|
ISSN: | 2075-1729 |