Application of Surface Water Quality Classification Models Using Principal Components Analysis and Cluster Analysis

Mohamed Hamed

Abstract

Water quality monitoring is among the highest priorities in surface water protection policy. Many techniques and methods focus on analyzing the latent parameters that determine the variance in observed water quality across source points. A considerable proportion of them rely on statistical methods, multivariate statistical techniques in particular. In the present study, multivariate techniques are used to reduce the large number of water quality variables for the Nile River upstream of the Cairo Drinking Water Plants (CDWPs) and to determine the relationships among them for easy and robust evaluation. Using principal components analysis (PCA) together with Fuzzy C-Means (FCM) and the K-means algorithm for cluster analysis, the study attempts to determine the dominant factors responsible for variations in Nile River water quality upstream of the CDWPs. Cluster analysis classified 21 sampling stations into three clusters based on similarities in water quality features. The PCA results show that six principal components contain the key variables and account for 75.82% of the total variance in surface water quality in the study area; the dominant water quality parameters were conductivity, iron, Biological Oxygen Demand (BOD), Total Coliform (TC), ammonia (NH3), and pH. Both FCM clustering and the K-means algorithm, applied to the concentrations of the dominant parameters, identified three cluster groups and produced cluster centers (prototypes). Under this classification, water quality deteriorated as the cluster number increased from one to three, so the cluster grouping can be used to identify the physical, chemical and biological processes creating the variations in the water quality parameters.
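The workflow summarized in the abstract (standardize the monitored parameters, reduce them with PCA, then group stations with K-means and FCM) can be illustrated with a minimal NumPy sketch. It is not the study's implementation. The data are synthetic, the count of 12 parameters is assumed, the 0.75 variance threshold is chosen only for illustration, and the function names pca, kmeans and fuzzy_c_means are hypothetical. Only the 21 stations and the three clusters come from the abstract.

```python
import numpy as np

def pca(X, var_threshold=0.75):
    """Standardize columns, then keep the leading components that
    together explain at least var_threshold of the total variance."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Rows of Vt are the principal axes of the standardized data.
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    k = int(np.searchsorted(np.cumsum(explained), var_threshold)) + 1
    k = min(k, len(s))
    return Z @ Vt[:k].T, Vt[:k], explained[:k]

def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm with random initial centers."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Keep the old center if a cluster happens to become empty.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

def fuzzy_c_means(X, c, m=2.0, n_iter=100, tol=1e-5, seed=0):
    """Standard FCM updates with fuzzifier m; returns the membership
    matrix (stations x clusters) and the cluster centers (prototypes)."""
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U**m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.fmax(d, 1e-12)  # avoid division by zero
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return U, centers

# Synthetic stand-in data: 21 stations x 12 parameters (12 is assumed).
rng = np.random.default_rng(42)
group_means = rng.normal(0, 3, size=(3, 12))
X = np.vstack([mu + rng.normal(0, 1, size=(7, 12)) for mu in group_means])

scores, axes, explained = pca(X)
labels, centers = kmeans(scores, k=3)
memberships, prototypes = fuzzy_c_means(scores, c=3)
```

Here labels gives a hard cluster assignment per station, while each row of memberships gives that station's degree of belonging to each of the three clusters. Comparing the two is one way to see which stations sit between water quality classes.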
