Optimizing tomato yield prediction using phenologically timed UAV-based spectral data and machine learning


Bibliographic Details
Main Authors: Carolina Trentin, Yiannis Ampatzidis, Sotirios Tasioulas, Pavlos Tsouvaltzis
Format: Article
Language: English
Published: Elsevier 2025-12-01
Series: Smart Agricultural Technology
Online Access: http://www.sciencedirect.com/science/article/pii/S2772375525003909
Description
Summary: Accurate yield prediction is critical for optimizing agricultural practices and ensuring food security. This study evaluated the performance of machine learning models in predicting tomato yield using weather data, spectral bands, and vegetation indices under varying nitrogen rates and bio-stimulant treatments to induce plant growth variability. UAV-based spectral data were collected across seven dates from October 27 to December 15, 2023, corresponding to key phenological stages: vegetative growth (data collection date 1), flowering (dates 2 and 3), fruit development (dates 4, 5, and 6), and early ripening (date 7). Significant input features were identified using the Pearson correlation coefficient (r > 0.65, p < 0.05), including Near Infrared (NIR), Red Edge, and Red spectral bands, as well as vegetation indices such as NDVI, GNDVI, NDRE, and SAVI. Aerial spectral data collected during fruit development (dates 5 and 6) showed the strongest correlations with yield (r = 0.66–0.74), emphasizing the importance of mid-to-late-season spectral information. Among the models evaluated, linear regression (LR) and XGBoost achieved the best performance, with root mean squared error (RMSE) values of 16.13 kg and 16.15 kg, respectively, and R² values of 0.63. Support vector machine (SVM) and decision tree (DT) also performed well, with RMSE values of 17.15 kg and 17.18 kg, respectively. In contrast, the deep learning model underperformed (RMSE = 23.49 kg, R² = 0.23), likely due to the limited data. This study highlights the predictive potential of spectral bands and emphasizes the significance of phenologically timed spectral data for yield estimation.
ISSN: 2772-3755
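
The workflow named in the summary (vegetation indices, a Pearson correlation filter, and RMSE/R² scoring) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The index formulas are the commonly used definitions, and the record does not state which variants the study applied. The function names, the SAVI soil factor of 0.5, and the use of the absolute value of r are assumptions made for illustration. The p < 0.05 significance test and the model training step are omitted.

```python
import math


def vegetation_indices(nir, red, red_edge, green, soil_factor=0.5):
    """Common vegetation indices from band reflectances (assumed 0-1 scale).

    soil_factor is the SAVI adjustment term L; 0.5 is a conventional
    default and an assumption here, not a value taken from the study.
    """
    return {
        "NDVI": (nir - red) / (nir + red),
        "GNDVI": (nir - green) / (nir + green),
        "NDRE": (nir - red_edge) / (nir + red_edge),
        "SAVI": (1 + soil_factor) * (nir - red) / (nir + red + soil_factor),
    }


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def select_features(features, target, threshold=0.65):
    """Keep feature names whose |r| with the target exceeds the threshold.

    Using |r| is an assumption so that strongly negative correlations
    are also retained; the significance (p-value) check is not done here.
    """
    return [
        name
        for name, values in features.items()
        if abs(pearson_r(values, target)) > threshold
    ]


def rmse(y_true, y_pred):
    """Root mean squared error, in the same unit as the target (e.g. kg)."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot
```

In use, each plot's band reflectances for a given flight date would be passed to `vegetation_indices`, the resulting per-plot columns filtered with `select_features` against measured yield, and a fitted model's predictions scored with `rmse` and `r_squared`.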