Computing streamflow from hydrological basin similar groups and country scale via deep learning

Pedro Alberto Pereira Zamboni

Institute of Photogrammetry and Remote Sensing, Dresden University of Technology, Dresden, Germany

About

The application of deep learning in hydrology has demonstrated promising results in recent years, particularly with the use of Long Short-Term Memory (LSTM) networks for streamflow prediction. However, large-scale implementations remain underexplored in many regions, including Brazil. This study presents the first comprehensive assessment of LSTM-based streamflow modeling across Brazilian catchments using the CABra dataset, comprising 660 basins. We evaluate the performance of LSTM and Entity-Aware LSTM (EA-LSTM) models under two spatial strategies: a country-scale approach and regionally grouped catchments based on hydrological similarity. Models were trained using multiple combinations of dynamic inputs and validated through five-fold cross-validation, leave-one-group-out strategies, and independent test sets. Results show that country-scale models generally outperform region-specific models, particularly in humid regions and large catchments, while performance declines in arid areas with non-perennial rivers. Moreover, increasing the number of dynamic inputs does not consistently improve performance, suggesting potential issues with multicollinearity. The study highlights the feasibility and limitations of large-sample deep learning hydrological modeling in data-sparse environments and provides insights into model generalizability and transferability across diverse hydroclimatic regions.

Methodology

Methodology Overview

We used a Long Short-Term Memory (LSTM) network for streamflow prediction on the CABra dataset of 660 Brazilian basins. We trained models in three configurations: country-scale, regionally grouped by hydrological similarity, and leave-one-group-out. The country-scale models were trained on all basins, while the regionally grouped models were trained only on catchments with similar hydrological characteristics. In the leave-one-group-out strategy, one catchment group was held out for testing in each iteration while the model was trained on the remaining groups.

Country scale model

We trained country-scale models using five-fold cross-validation, repeating each fold three times with different random seeds. This allowed us to evaluate two aspects: the effect of varying the training catchments (each fold excludes a different 20% subset of basins) and the influence of stochasticity arising from random initialization. The best-performing model was then evaluated on both the training and test catchments during the designated test period, so that the test set was independent in both time and space. Within the test catchments, we also assessed the model's performance for each hydrological group individually. Finally, we evaluated the impact of different dynamic input configurations and model types using the optimized hyperparameters. Four combinations of dynamic inputs were tested: (1) precipitation only; (2) precipitation and potential evapotranspiration calculated using the Penman-Monteith method; (3) precipitation, minimum and maximum temperature, relative humidity, wind speed, solar radiation, and evapotranspiration; and (4) all of the variables in (3) plus potential evapotranspiration calculated using the Penman-Monteith method. Each configuration was assessed on the test set during the designated test period to analyze the influence of dynamic inputs on model accuracy.
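The fold-and-seed scheme described above can be sketched as follows. This is a minimal illustration, not the study's actual pipeline: the basin identifiers, the function names, the shuffle seed, and the three seed values are all assumptions made for the example, and the training step itself is left out.

```python
import random


def make_catchment_folds(catchment_ids, n_folds=5, shuffle_seed=0):
    """Split catchment IDs into n_folds disjoint hold-out subsets."""
    ids = list(catchment_ids)
    random.Random(shuffle_seed).shuffle(ids)
    # Striding keeps the fold sizes within one catchment of each other.
    return [ids[i::n_folds] for i in range(n_folds)]


def cross_validation_runs(catchment_ids, n_folds=5, init_seeds=(0, 1, 2)):
    """Yield (fold, seed, train_ids, held_out_ids) for every fold/seed pair."""
    folds = make_catchment_folds(catchment_ids, n_folds)
    for fold_idx, held_out in enumerate(folds):
        held_out_set = set(held_out)
        train = [c for c in catchment_ids if c not in held_out_set]
        for seed in init_seeds:
            # In a real run, the seed would control model weight initialization.
            yield fold_idx, seed, train, held_out


# Hypothetical identifiers standing in for the 660 CABra basins.
basins = [f"basin_{i:03d}" for i in range(660)]
runs = list(cross_validation_runs(basins))

print(len(runs))                         # 15 (5 folds x 3 seeds)
print(len(runs[0][2]), len(runs[0][3]))  # 528 132
```

Separating the shuffle seed (which fixes the folds) from the initialization seeds keeps the two sources of variation apart: differences between folds reflect the choice of training catchments, while differences between seeds within a fold reflect random initialization only.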

Regionally grouped by hydrological similarity models

Brazil is environmentally heterogeneous, with a wide variety of biomes and hydroclimatic characteristics. A recent analysis of catchment-scale streamflow signatures identified six distinct hydrological groups among Brazil's catchments (Almagro et al., 2024). Catchments within each group show similar hydrologic behavior and characteristics and are spatially clustered. To explore these similarities further, we developed an individual model for each group. We first ran a preliminary training phase to select the most suitable hyperparameters for each group, and then trained one model per group with those hyperparameters. Each model was evaluated on the test catchments of its group during the designated test period, and the results were compared with those of the country-scale model. In addition, a model trained on one group was applied to the remaining groups to assess model transferability and generalizability.

Leave-one-group-out strategy

An independent test set is essential for evaluating how well a deep learning model generalizes to unseen data and for avoiding overly optimistic performance estimates. It gives a more accurate picture of the models' ability to learn hydrological behavior and extrapolate it across different regions. In this study, we employed a leave-one-group-out strategy based on the six hydrological groups, with five groups used for model training and one for evaluation. This yields a test set drawn entirely from a hydrological group that is absent from training, which reduces the similarity between training and test catchments. Each hydrological group was used once as the test set, in order to determine whether a specific combination of training groups produces better results than the others.
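A minimal sketch of how such leave-one-group-out splits could be generated is shown below. The function name, the basin identifiers, and the toy assignment of twelve catchments to six groups are hypothetical; the actual group membership comes from the hydrological classification cited above.

```python
def leave_one_group_out(group_of):
    """Yield (test_group, train_ids, test_ids), holding each group out once.

    group_of maps a catchment ID to its hydrological group label.
    """
    for test_group in sorted(set(group_of.values())):
        train = [c for c, g in group_of.items() if g != test_group]
        test = [c for c, g in group_of.items() if g == test_group]
        yield test_group, train, test


# Hypothetical toy assignment: 12 catchments spread over six groups (1..6).
group_of = {f"basin_{i:02d}": i % 6 + 1 for i in range(12)}
splits = list(leave_one_group_out(group_of))

for test_group, train, test in splits:
    print(test_group, len(train), len(test))
```

Each iteration trains on five groups and tests on the sixth, so no catchment from the held-out group ever appears in training.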

Results & Applications

Country-Scale Model Performance

Performance on the training catchments over the test period.

Mean Nash-Sutcliffe efficiency (NSE) of 0.557, with a median of 0.658. Mean Kling-Gupta efficiency (KGE) of 0.637, with a median of 0.690.

Performance on the test catchments over the test period.

Mean NSE of 0.442, with a median of 0.590. Mean KGE of 0.554, with a median of 0.626.
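For reference, the two scores reported here, the Nash-Sutcliffe efficiency (NSE) and the Kling-Gupta efficiency (KGE), can be computed per catchment as sketched below. The page does not state which KGE variant was used, so this sketch assumes the original formulation (correlation, variability ratio, and bias ratio with equal weights). The function names and the toy series are illustrative only.

```python
import math


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_obs = sum(obs) / len(obs)
    squared_error = sum((o - s) ** 2 for o, s in zip(obs, sim))
    obs_variation = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - squared_error / obs_variation


def kge(obs, sim):
    """Kling-Gupta efficiency (assumed original, equally weighted formulation)."""
    n = len(obs)
    mu_o, mu_s = sum(obs) / n, sum(sim) / n
    sd_o = math.sqrt(sum((o - mu_o) ** 2 for o in obs) / n)
    sd_s = math.sqrt(sum((s - mu_s) ** 2 for s in sim) / n)
    cov = sum((o - mu_o) * (s - mu_s) for o, s in zip(obs, sim)) / n
    r = cov / (sd_o * sd_s)  # linear correlation
    alpha = sd_s / sd_o      # variability ratio
    beta = mu_s / mu_o       # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


# Toy discharge series (hypothetical values, arbitrary units).
observed = [1.0, 2.0, 3.0, 4.0]
simulated = [1.2, 1.9, 3.3, 3.8]

print(round(nse(observed, simulated), 3))
print(round(kge(observed, simulated), 3))
```

Both scores are unbounded below, so a few poorly simulated catchments can pull the mean well under the median, which is why the two statistics are reported together.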

Regionally grouped by hydrological similarity models

Impact of dynamic inputs and model size on performance for the test catchments over the test period.

Hydrologically similar group models performance

Regionally grouped by hydrological similarity models

Performance per hydrological group on the test catchments over the test period.

Regionally grouped by hydrological similarity models

Comparison between the country-scale model and the per-group models over the test period.

More yet to come....