Author ORCID Identifier

https://orcid.org/0000-0003-2007-697X

Date of Award

9-1-2025

Document Type

Thesis

School

School of Computing

Programme

Ph.D.-Doctor of Philosophy

First Advisor

Dr. V. Ramaswamy

Keywords

Rainfall Forecasting, Machine Learning, Deep Learning, Stacking Ensemble Model, Imputation Methods

Abstract

Rainfall forecasting is critical because of its substantial impact on many sectors of the community and the environment. It helps farmers decide on planting schedules, crop choices and irrigation techniques, all of which directly affect food production and agricultural yields. It is also vital in sectors such as hydroelectric power generation, where knowledge of water availability is essential. Accurate rainfall forecasts play an important role in disaster planning and flood control: they enable authorities to take precautionary measures and, where necessary, plan the evacuation of those at risk. Traditional forecasting approaches are often found to be biased and to yield low accuracy.

Deep Learning (DL) and Machine Learning (ML) techniques can handle complicated data patterns and thereby increase prediction accuracy. Rainfall patterns often exhibit complex, non-linear behaviour, and traditional statistical approaches such as Seasonal ARIMA (SARIMA) and Linear Regression (LR) often fail to capture these irregular relationships in meteorological data. Because ML and DL models can learn non-linear relationships, they are better equipped to capture the complex interactions between meteorological variables. Missing data add a degree of uncertainty to data analysis, which can alter the characteristics of statistical estimators, reduce their power and lead to false conclusions. Our work therefore begins with the Variable Specific Hot Deck (VSHD) imputation technique, which provides a customized approach for addressing missing values in datasets.
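The abstract does not spell out the VSHD procedure, so the following is only a minimal sketch of generic hot deck imputation applied to one variable at a time, not the author's method. The function name `hot_deck_impute`, the field names and the sample records are all hypothetical. Each missing value is copied from the "donor" record that is closest on the other observed variables.

```python
def hot_deck_impute(rows, target, predictors):
    """Fill missing `target` values (None) with the value from the nearest donor row.

    Assumption: a plain nearest-neighbour hot deck on unscaled predictors.
    In practice the predictors would usually be standardised first.
    """
    # Donors are rows where the target and every predictor are observed.
    donors = [r for r in rows
              if r[target] is not None and all(r[p] is not None for p in predictors)]
    imputed = []
    for r in rows:
        filled = dict(r)  # copy, so the input records are left untouched
        if r[target] is None and donors:
            usable = [p for p in predictors if r[p] is not None]
            nearest = min(
                donors,
                key=lambda d: sum((d[p] - r[p]) ** 2 for p in usable),
            )
            filled[target] = nearest[target]
        imputed.append(filled)
    return imputed


# Hypothetical daily records: temperature (C), humidity (%), rainfall (mm).
rows = [
    {"temp": 30.0, "humidity": 60.0, "rain": 0.0},
    {"temp": 22.0, "humidity": 90.0, "rain": 12.5},
    {"temp": 23.0, "humidity": 88.0, "rain": None},
    {"temp": 29.0, "humidity": 62.0, "rain": None},
]
result = hot_deck_impute(rows, target="rain", predictors=["temp", "humidity"])
```

Running the same function once per variable, each time with its own predictor list, gives a per-variable treatment in the spirit of the name, although the donor-selection rule used in the thesis may differ.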

The datasets considered in this work come from the Australian meteorological department and NASA’s POWER access portal. Three Australian sites (Brisbane, Sydney and Melbourne) and two regions in South India (Cuddalore and Karaikal) are considered for rainfall forecasting. The datasets are examined using several machine learning classifiers: K-Nearest Neighbors (KNN), Decision Trees (DT), Support Vector Machines (SVM) and Random Forests (RF). Random Forests stand out for their higher accuracy after imputation, which indicates that the VSHD method maintains data integrity.

A meta-learner selection mechanism is incorporated into a stacking ensemble approach, which significantly improves accuracy and performance. XGBoost, the meta-learner selected for its robustness and efficiency, leads to notable improvements in accuracy. To refine the model further, we use Recursive Feature Elimination with Cross-Validation (RFECV) to select eight pertinent features from a set of eighteen. The STEM-XG model is then fitted with these selected features, and predictions are obtained by aggregating the outputs of the base models. A comparative study shows that this combination yields high performance for all the locations, from which we infer that the proposed STEM-XG enhances prediction performance.
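The mechanics of stacking can be illustrated with a small, dependency-free sketch. It is not the STEM-XG model: the two base learners (`nearest_centroid`, `knn`) are hypothetical stand-ins, a logistic regression stands in for the XGBoost meta-learner, the RFECV feature-selection step is omitted, and the data are synthetic. What it does show is the core idea, namely that the meta-learner is trained on out-of-fold predictions of the base models.

```python
import math
import random

def nearest_centroid(X, y):
    """Base learner: positive score when x is closer to the class-1 centroid."""
    def centroid(label):
        pts = [x for x, t in zip(X, y) if t == label]
        return [sum(col) / len(pts) for col in zip(*pts)]
    c0, c1 = centroid(0), centroid(1)
    return lambda x: math.dist(x, c0) - math.dist(x, c1)

def knn(X, y, k=5):
    """Base learner: fraction of the k nearest training points labelled 1."""
    def score(x):
        nearest = sorted(zip(X, y), key=lambda pair: math.dist(x, pair[0]))[:k]
        return sum(t for _, t in nearest) / k
    return score

def fit_logistic(Z, y, lr=0.1, epochs=200):
    """Meta-learner stand-in (the thesis uses XGBoost): logistic regression by SGD."""
    w = [0.0] * (len(Z[0]) + 1)  # w[0] is the bias term
    def logit(z):
        return w[0] + sum(wj * zj for wj, zj in zip(w[1:], z))
    for _ in range(epochs):
        for z, t in zip(Z, y):
            a = max(-30.0, min(30.0, logit(z)))  # clamp to avoid overflow in exp
            err = 1.0 / (1.0 + math.exp(-a)) - t
            w[0] -= lr * err
            for j, zj in enumerate(z):
                w[j + 1] -= lr * err * zj
    return lambda z: int(logit(z) > 0)

def fit_stack(X, y, base_factories, n_folds=5):
    """Train the meta-learner on out-of-fold base scores, then refit bases on all data."""
    n = len(X)
    meta_X = [[0.0] * len(base_factories) for _ in range(n)]
    for fold in range(n_folds):
        train = [i for i in range(n) if i % n_folds != fold]
        held_out = [i for i in range(n) if i % n_folds == fold]
        for j, make in enumerate(base_factories):
            model = make([X[i] for i in train], [y[i] for i in train])
            for i in held_out:
                meta_X[i][j] = model(X[i])
    meta = fit_logistic(meta_X, y)
    bases = [make(X, y) for make in base_factories]
    return lambda x: meta([b(x) for b in bases])

def make_blobs(n, rng):
    """Synthetic two-class data: Gaussian blobs centred at (0, 0) and (3, 3)."""
    X, y = [], []
    for i in range(n):
        label = i % 2
        centre = 3.0 * label
        X.append([rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)])
        y.append(label)
    return X, y

rng = random.Random(0)
X_train, y_train = make_blobs(80, rng)
X_test, y_test = make_blobs(100, rng)
predict = fit_stack(X_train, y_train, [nearest_centroid, knn])
accuracy = sum(predict(x) == t for x, t in zip(X_test, y_test)) / len(y_test)
```

Using out-of-fold scores keeps the meta-learner from seeing base-model outputs on data those models were trained on, which would otherwise make the base models look more reliable than they are.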

We have also explored the integration of Particle Swarm Optimization (PSO) with LSTM, Bi-LSTM and GRU networks. PSO-LSTM emerges as a powerful approach to complex prediction tasks because it leverages the strengths of both techniques. PSO dynamically adjusts the LSTM’s parameters, and our empirical evaluation demonstrates that PSO-LSTM outperforms PSO-GRU and PSO-Bi-LSTM. The evaluation uses RMSE and MAE for prediction, and Accuracy and F1-Score for classification.
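As a rough illustration of how PSO can drive such a search, the sketch below implements a basic global-best particle swarm in plain Python. The abstract does not say which LSTM parameters are tuned or how, so everything specific here is an assumption: the two search dimensions (hidden units and log10 of the learning rate), the bounds, the swarm coefficients and especially `surrogate_loss`, a toy quadratic that stands in for "train an LSTM and return its validation RMSE".

```python
import random

def pso(objective, bounds, n_particles=20, n_iters=100, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimise `objective` over the box `bounds` with a global-best particle swarm."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                 # best position seen by each particle
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # best position seen by the swarm
    for _ in range(n_iters):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move the particle and clamp it to the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def surrogate_loss(x):
    """Toy stand-in for validation RMSE; a real run would train an LSTM here."""
    hidden_units, log10_lr = x
    return ((hidden_units - 64.0) / 64.0) ** 2 + (log10_lr + 3.0) ** 2


best, best_loss = pso(surrogate_loss, bounds=[(8.0, 256.0), (-5.0, -1.0)])
```

In a real setting each objective call is a full training run, so the swarm size and iteration count would be chosen with that cost in mind, and integer-valued parameters such as the number of hidden units would be rounded before training.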
