Physically-constrained evapotranspiration models with machine learning parameterization outperform pure machine learning: Critical role of domain knowledge.

Bibliographic Details
Main Authors: Yeonuk Kim, Monica Garcia, T Andrew Black, Mark S Johnson
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2025-01-01
Series: PLoS ONE
Online Access: https://doi.org/10.1371/journal.pone.0328798
Description
Summary: Physics-informed machine learning techniques have emerged to tackle challenges inherent in pure machine learning (ML) approaches. One such technique, the hybrid approach, has been introduced to estimate terrestrial evapotranspiration (ET), a crucial variable linking the water, energy, and carbon cycles. A key advantage of hybrid ET models is their improved performance, particularly under extreme conditions, compared with ET estimates that rely solely on ML. However, the mechanisms driving this improvement are not well understood. To address this gap, we developed six hybrid approaches based on different physical formulations of ET and compared them with a pure ML model. All models employed the random forest algorithm and were trained on daily-scale ET observations, in situ meteorological data, and satellite remote sensing data. We found a strong correlation (r = 0.93) between the sensitivity of ET estimates to machine-learned parameters and model error (root-mean-square error; RMSE), indicating that reduced sensitivity minimizes error propagation and improves performance. Notably, the most accurate hybrid model (RMSE = 17.8 W m⁻², in energy units) used a novel empirical parameter that is relatively stable owing to land-atmosphere equilibrium, and it outperformed both the pure ML model and hybrid models requiring conventional parameters (e.g., surface conductance). These results imply that conventional parameterizations may need to be reevaluated to integrate physical models effectively with machine learning, because conventional choices may not be optimal for this new hybrid paradigm. This study underscores the critical role of domain knowledge in setting up hybrid models and may guide future hybrid model development beyond ET estimation.
ISSN: 1932-6203
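The summary's argument about sensitivity and error propagation can be illustrated with a minimal sketch. It assumes the Penman-Monteith equation as one conventional physical formulation with surface conductance as the machine-learned parameter; the record does not say which six formulations the authors used, so this choice, the function names, the constants, and the forcing values are all illustrative assumptions rather than the study's setup. The sketch computes how strongly latent heat flux (ET in energy units) responds to a relative error in the parameter.

```python
import math

# Illustrative constants (assumed typical near-surface values, not from the study).
RHO_AIR = 1.2      # air density, kg m-3
CP_AIR = 1013.0    # specific heat of air, J kg-1 K-1
GAMMA = 0.066      # psychrometric constant, kPa K-1


def sat_vapor_pressure_slope(t_air_c):
    """Slope of the saturation vapor pressure curve (kPa K-1), Tetens form."""
    es = 0.6108 * math.exp(17.27 * t_air_c / (t_air_c + 237.3))
    return 4098.0 * es / (t_air_c + 237.3) ** 2


def penman_monteith_le(avail_energy, vpd, t_air_c, ga, gs):
    """Latent heat flux (W m-2) from the Penman-Monteith equation.

    avail_energy: available energy, W m-2; vpd: vapor pressure deficit, kPa;
    ga, gs: aerodynamic and surface conductance, m s-1.
    """
    delta = sat_vapor_pressure_slope(t_air_c)
    numerator = delta * avail_energy + RHO_AIR * CP_AIR * vpd * ga
    denominator = delta + GAMMA * (1.0 + ga / gs)
    return numerator / denominator


def relative_sensitivity(model, param_value, rel_step=1e-4):
    """Dimensionless sensitivity (dLE/LE)/(dp/p) by central finite difference."""
    h = param_value * rel_step
    slope = (model(param_value + h) - model(param_value - h)) / (2.0 * h)
    return slope * param_value / model(param_value)


def propagated_error(model, param_value, rel_param_error):
    """Change in LE (W m-2) when the parameter is off by a relative error."""
    return model(param_value * (1.0 + rel_param_error)) - model(param_value)


# Hypothetical midday forcing; in a hybrid model, gs would come from the ML step.
forcing = dict(avail_energy=400.0, vpd=1.5, t_air_c=25.0, ga=0.02)
gs_predicted = 0.01


def le_of_gs(gs):
    return penman_monteith_le(gs=gs, **forcing)


sensitivity = relative_sensitivity(le_of_gs, gs_predicted)
le_error = propagated_error(le_of_gs, gs_predicted, rel_param_error=0.2)
print(f"relative sensitivity of LE to gs: {sensitivity:.3f}")
print(f"LE error from a 20% error in gs: {le_error:.1f} W m-2")
```

Under these assumptions, a formulation whose parameter has a smaller relative sensitivity passes less of the ML prediction error into the ET estimate, which is the mechanism the summary describes. Swapping `le_of_gs` for another formulation's parameter-to-ET function would allow the same comparison.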