The research on enhancing LA estimation accuracy across domains for small sample data based on data augmentation and data transfer integration optimization system
| Main Authors: | , , , , , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Elsevier, 2025-12-01 |
| Series: | Smart Agricultural Technology |
| Online Access: | http://www.sciencedirect.com/science/article/pii/S2772375525003806 |
| Summary: | Context: The efficient and precise monitoring of rice leaf area (LA) is essential for variety selection and agricultural management. At present, LA estimation models based on high-throughput phenotyping technologies primarily depend on homogenized large sample datasets. These models encounter generalization challenges when applied to heterogeneous scenarios with small sample sizes. Objective: In this research, our goal is to develop a novel framework to mitigate prediction biases in LA caused by sample limitations and data heterogeneity. This framework integrates machine learning models to establish a universal solution for cross-domain LA estimation in data-scarce situations. Methods: This research utilizes canopy image data acquired from the 2023–2024 rice full-cycle multi-view RGB imaging system (with dual front and side camera positions). Fourteen morphological feature parameters are constructed, and the leaf area values are measured through destructive sampling, together forming the dataset. A comprehensive comparison of six algorithms (linear regression, support vector regression, random forest, XGBoost, CatBoost, and K-nearest neighbors) is conducted, assessing their performance under a combined strategy of data augmentation (noise injection, generative adversarial networks, Gaussian mixture model, variational autoencoders) and transfer learning (random, clustering, and hierarchical parameter transfer). Results and conclusions: The results demonstrate that the integrated optimization system (Gaussian Mixture Model Generation-Cluster-Based Transfer, GMM-CBT) achieved optimal performance when combined with XGBoost (validation R² = 0.85, test R² = 0.85), outperforming both standalone approaches: data augmentation (validation R² = 0.87, test R² = -0.37) and transfer learning (validation R² = 0.84, test R² = 0.84). The framework clusters heterogeneous data based on morphological features (such as size, compactness, and roundness) and constructs a transfer sample library with feature coverage. Significance: The proposed methodology advances precision agriculture by enabling single-plant LA monitoring, with potential extensions to other crops and trait-phenotyping applications. |
| ISSN: | 2772-3755 |
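
Note on the reported metrics: the summary gives a negative test R² (-0.37) for data augmentation used alone. Assuming the article uses the standard coefficient of determination, R² = 1 - SS_res / SS_tot, a negative value means the model's predictions fit the test data worse than simply predicting the mean of the measured values. The sketch below illustrates that definition in plain Python. The function name `r2_score` and all leaf-area numbers are hypothetical and chosen only for illustration; they are not taken from the article.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error of the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when y_true is constant")
    return 1.0 - ss_res / ss_tot


# Hypothetical leaf-area values (cm^2), invented purely for illustration.
measured = [100.0, 120.0, 140.0, 160.0]
close_fit = [105.0, 118.0, 138.0, 163.0]    # small errors around the truth
biased_fit = [150.0, 170.0, 190.0, 210.0]   # every prediction 50 cm^2 too high

r2_close = r2_score(measured, close_fit)    # close to 1
r2_biased = r2_score(measured, biased_fit)  # negative: worse than the mean

print(f"close fit:  R2 = {r2_close:.3f}")
print(f"biased fit: R2 = {r2_biased:.3f}")
```

In this toy case the biased predictions follow the trend of the measurements perfectly but are shifted, and R² still drops below zero. A systematic shift between domains is one way a model can score well on validation data yet negatively on test data, which is consistent with the cross-domain gap the summary describes.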