Author ORCID Identifier
https://orcid.org/0000-0003-2007-697X
Date of Award
9-1-2025
Document Type
Thesis
School
School of Computing
Programme
Ph.D.-Doctor of Philosophy
First Advisor
Dr.V.Ramaswamy
Keywords
Rainfall Forecasting, Machine Learning, Deep Learning, Stacking Ensemble Model, Imputation Methods
Abstract
Rainfall forecasting is critical because of its substantial impact on many sectors of the community and the environment. It helps farmers with planting schedules, crop choices and irrigation techniques, all of which directly affect food production and agricultural yields. It is also vital in sectors such as hydroelectric power generation, where knowledge of water availability is essential for electricity generation. Accurate rainfall forecasts play an important role in disaster planning and flood control: they enable authorities to take precautionary measures and, where necessary, plan the evacuation of those at risk. Traditional forecasting approaches are often found to be biased and to yield low accuracy.
Deep Learning (DL) and Machine Learning (ML) techniques can handle complicated data patterns and thereby increase prediction accuracy. Traditional statistical approaches, such as Seasonal ARIMA (SARIMA) and Linear Regression (LR), often struggle to capture the irregular relationships in meteorological data, because rainfall patterns often exhibit complex, non-linear behaviours that these models fail to represent accurately. Since ML and DL models can learn non-linear correlations, they are better equipped to capture the complex relationships that exist between meteorological variables. Our work begins with the Variable Specific Hot Deck (VSHD) imputation technique, which provides a customized approach for addressing missing values in datasets. Missing data add a degree of uncertainty to data analysis, which can alter the characteristics of statistical estimators, reduce their power and lead to false conclusions.
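The abstract does not describe how VSHD works internally, so the following is only a generic hot-deck sketch under stated assumptions, not the thesis's method. It assumes that each variable is imputed separately from its own donor pool and that donors are rows sharing a grouping key; the `month` key, the field names and the sample records are all hypothetical.

```python
import random

def hot_deck_impute(rows, variable, group_key, seed=0):
    """Fill missing values (None) of `variable` with an observed value
    drawn from a donor row that shares the same `group_key` value.
    Generic hot-deck illustration; NOT the VSHD algorithm itself."""
    rng = random.Random(seed)
    # Build one donor pool per group from the observed values.
    donors = {}
    for row in rows:
        if row[variable] is not None:
            donors.setdefault(row[group_key], []).append(row[variable])
    # Fallback pool for groups that have no observed donor.
    all_donors = [v for pool in donors.values() for v in pool]
    imputed = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows stay untouched
        if new_row[variable] is None:
            pool = donors.get(row[group_key]) or all_donors
            if pool:  # leave None when nothing was observed at all
                new_row[variable] = rng.choice(pool)
        imputed.append(new_row)
    return imputed

# Hypothetical daily records with gaps marked as None.
records = [
    {"month": 1, "humidity": 70.0, "rain_mm": 12.0},
    {"month": 1, "humidity": None, "rain_mm": 8.5},
    {"month": 7, "humidity": 45.0, "rain_mm": None},
    {"month": 7, "humidity": 50.0, "rain_mm": 0.0},
]
# Impute each variable on its own, one pass per variable.
filled = hot_deck_impute(records, "humidity", "month")
filled = hot_deck_impute(filled, "rain_mm", "month")
```

Because donors are real observed values, a hot-deck fill keeps imputed entries within the range the variable actually takes, which is one reason the approach is often preferred over mean substitution.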
The datasets considered in this work are from the Australian meteorological department and NASA’s POWER access portal. Brisbane, Sydney and Melbourne are the three Australian sites taken into consideration for rainfall forecasting. Cuddalore and Karaikal, two regions in South India, are also included. The datasets are thoroughly examined using several machine learning classifiers: K-Nearest Neighbors (KNN), Decision Trees (DT), Support Vector Machines (SVM) and Random Forests (RF). Random Forests are especially notable for their higher accuracy after imputation, demonstrating how well the VSHD method maintains data integrity.
A meta-learner selection mechanism is incorporated into a stacking ensemble approach, which significantly improves accuracy and performance. XGBoost, the meta-learner selected for its robustness and efficiency, has led to notable improvements in accuracy. To refine the model further, our research work uses Recursive Feature Elimination with Cross-Validation (RFECV) to extract eight pertinent features from a set of eighteen. The STEM-XG model is then fitted with these selected features. Prediction results are obtained by aggregating the outcomes of the base models. A comparative study shows that the combination yields high performance for all the locations. From the outcomes, it is inferred that the proposed STEM-XG enhances prediction performance.
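The pipeline described above (RFECV feature selection followed by a stacking ensemble with a boosted meta-learner) can be sketched with scikit-learn as follows. This is an illustration under assumptions, not the STEM-XG implementation: the data are synthetic, the base learners are assumed, and scikit-learn's `GradientBoostingClassifier` stands in for XGBoost so that the block has no extra dependency. The number of features RFECV keeps is data-dependent and need not be eight here.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (GradientBoostingClassifier,
                              RandomForestClassifier, StackingClassifier)
from sklearn.feature_selection import RFECV
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for an 18-feature rain / no-rain dataset.
X, y = make_classification(n_samples=300, n_features=18,
                           n_informative=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y)

# RFECV drops the weakest feature one at a time and keeps the
# subset size with the best cross-validated score.
selector = RFECV(DecisionTreeClassifier(max_depth=5, random_state=0),
                 step=1, cv=3)
selector.fit(X_train, y_train)
X_train_sel = selector.transform(X_train)
X_test_sel = selector.transform(X_test)

# Base learners are assumptions; the abstract does not list them.
base_learners = [
    ("knn", KNeighborsClassifier()),
    ("dt", DecisionTreeClassifier(random_state=0)),
    ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
]
# The meta-learner is trained on out-of-fold base-model predictions.
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=GradientBoostingClassifier(random_state=0),  # XGBoost stand-in
    cv=3,
)
stack.fit(X_train_sel, y_train)
test_accuracy = stack.score(X_test_sel, y_test)
predictions = stack.predict(X_test_sel)
```

If the `xgboost` package is installed, its `XGBClassifier` could be passed as `final_estimator` instead, since it follows the scikit-learn estimator interface.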
We have also explored the integration of Particle Swarm Optimization (PSO) with LSTM, Bi-LSTM and GRU. PSO-LSTM emerges as a powerful approach to complex prediction tasks because it leverages the strengths of both techniques, with PSO dynamically adjusting the LSTM parameters. Through empirical evaluation, we have demonstrated that PSO-LSTM outperforms PSO-GRU and PSO-Bi-LSTM. The evaluation considers performance metrics such as RMSE and MAE for prediction, and Accuracy and F1-Score for classification.
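To illustrate how PSO can tune a model's parameters, the sketch below implements a basic global-best PSO in plain Python. It is not the thesis's PSO-LSTM: the abstract does not say which LSTM parameters are tuned, so the two searched parameters (a log learning rate and a hidden-unit count) and the `toy_validation_loss` function, which stands in for training an LSTM and returning its validation error, are hypothetical.

```python
import random

def pso_minimize(objective, bounds, n_particles=20, n_iters=60,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Global-best particle swarm optimisation over box bounds."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]               # best position per particle
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # best position overall
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Move, then clamp the particle to the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

def toy_validation_loss(params):
    """Hypothetical stand-in for 'train an LSTM, return validation RMSE'."""
    log_lr, units = params
    return (log_lr + 3.0) ** 2 + ((units - 64.0) / 32.0) ** 2

# Search log10(learning rate) in [-5, -1] and hidden units in [8, 256].
best_params, best_loss = pso_minimize(
    toy_validation_loss, bounds=[(-5.0, -1.0), (8.0, 256.0)])
```

In a real setting, each call to the objective would train and validate a network, so the swarm size and iteration count would be chosen with that cost in mind.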
Recommended Citation
P, Umamaheswari Ms, "An Integrated Approach to Enhance The Performance of Rainfall Forecasting by Leveraging Stacking Based Machine Learning and Deep Learning Techniques" (2025). Theses and Dissertations. 143.
https://knowledgeconnect.sastra.edu/theses/143