Data-driven insights into groundwater quality: machine and deep learning approaches

Bibliographic Details
Main Authors: Gift Mbuzi, Abdur Rashid Sangi, Baha Ihnaini, Anil Carie, Sruthi Sivarajan, Satish Anamalamudi
Format: Article
Language: English
Published: Mehran University of Engineering and Technology 2025-07-01
Series: Mehran University Research Journal of Engineering and Technology
Online Access: https://murjet.muet.edu.pk/index.php/home/article/view/317
Description
Summary: Arsenic and nitrate contamination of groundwater is a major concern for both the environment and public health, and poses a significant risk to drinking water quality. In this study, machine learning (ML) and deep learning (DL) models are applied to predict groundwater contamination trends in different parts of India. Using a five-year historical time series (2016–2021) of key physicochemical parameters, including conductivity, pH, biochemical oxygen demand (BOD), fluoride, arsenic, and nitrate, the paper compares several ML and DL models. Feature importance analysis identified BOD, total dissolved solids (TDS), and conductivity as important predictors of arsenic contamination, while nitrate contamination was driven by agricultural and industrial activities. Temporal analysis showed arsenic levels decreasing after 2019, possibly because of dilution effects and regulatory measures, while nitrate contamination fluctuated from region to region. After hyperparameter tuning, XGBoost was the most predictive model (R² = 0.70), outperforming traditional regression analysis. Partial dependence plots (PDPs) also captured detailed non-linear relationships among water quality parameters. The findings indicate the potential of AI-based predictive models for real-time groundwater monitoring and better mitigation of contamination. The study contributes toward reliable AI-based groundwater monitoring systems for real-world use and toward sustainable resource management planning.
ISSN: 0254-7821, 2413-7219