Regularized Regression Model for Predicting Hypertension and Type 2 Diabetes Mellitus in Patients
Abstract
Regularized regression methods such as the Least Absolute Shrinkage and Selection Operator (LASSO) and ridge regression are among the techniques that remain usable when ordinary least squares assumptions are violated, for example by multicollinearity. Modeling hypertension and diabetes involves several explanatory variables, some of which are interrelated. Thus, there is a need for an estimation technique that can handle interrelated variables. Hypertension and diabetes are considered together in this study on the premise that the two are related and have some predictors in common. The study had four dependent variables, Fasting Blood Sugar (FBS), urea, Systolic Blood Pressure (SBP) and Diastolic Blood Pressure (DBP), and thirteen independent variables. Models were compared using Mean Squared Error (MSE) and Root Mean Squared Error (RMSE). The results showed that the SBP model had the best performance among the final LASSO models, which retained Height, Religion, Age, Sex, Marital Status, Creatinine, Family History, Temperature and Type of disease, as well as among the ridge regression models. One implication is that certain levels of these independent variables can indicate levels of the dependent variables that signify the presence of type 2 diabetes or hypertension. The LASSO method performed better than ridge regression for FBS, urea and SBP. Both LASSO and ridge regression removed the multicollinearity problem among the independent variables.
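To illustrate the kind of comparison described in the abstract, the following is a minimal sketch that fits a LASSO and a ridge model to data with two interrelated predictors and compares them by MSE and RMSE on held-out observations. It is not the study's code or data: the data are synthetic, the penalty value `lam=0.1` is arbitrary, and the function names (`fit_penalized`, `make_data`, and so on) are hypothetical. Both models are fitted with a plain coordinate-descent loop written with the standard library only, assuming zero-mean predictors and response so that no intercept is needed.

```python
import math
import random


def soft_threshold(rho, lam):
    """Soft-thresholding operator used in the LASSO coordinate update."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0


def fit_penalized(X, y, lam, penalty, n_iter=200):
    """Coordinate descent for LASSO or ridge (no intercept; data assumed centered).

    LASSO minimizes (1/(2n))*||y - Xb||^2 + lam*||b||_1.
    Ridge minimizes (1/(2n))*||y - Xb||^2 + (lam/2)*||b||_2^2.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0
            z = 0.0
            for i in range(n):
                # Residual with predictor j left out of the current fit.
                partial = y[i] - sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * partial
                z += X[i][j] ** 2
            rho /= n
            z /= n
            if penalty == "lasso":
                beta[j] = soft_threshold(rho, lam) / z
            else:  # ridge
                beta[j] = rho / (z + lam)
    return beta


def predict(X, beta):
    return [sum(x * b for x, b in zip(row, beta)) for row in X]


def mse(y_true, y_pred):
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    return math.sqrt(mse(y_true, y_pred))


def make_data(n, rng):
    """Synthetic data (an assumption): x2 is strongly correlated with x1."""
    X, y = [], []
    for _ in range(n):
        x1 = rng.gauss(0, 1)
        x2 = x1 + 0.3 * rng.gauss(0, 1)  # interrelated predictor
        x3 = rng.gauss(0, 1)             # unrelated predictor
        X.append([x1, x2, x3])
        y.append(2.0 * x1 + 0.5 * rng.gauss(0, 1))
    return X, y


rng = random.Random(0)
X_train, y_train = make_data(200, rng)
X_test, y_test = make_data(100, rng)

results = {}
for penalty in ("lasso", "ridge"):
    beta = fit_penalized(X_train, y_train, lam=0.1, penalty=penalty)
    y_hat = predict(X_test, beta)
    results[penalty] = (beta, mse(y_test, y_hat), rmse(y_test, y_hat))
    print(penalty, [round(b, 3) for b in beta],
          "MSE:", round(results[penalty][1], 4),
          "RMSE:", round(results[penalty][2], 4))
```

In practice the penalty would be chosen by cross-validation rather than fixed, and an established library would normally replace the hand-written loop. The sketch only shows how the L1 penalty can set coefficients exactly to zero, which is what allows LASSO to retain a subset of predictors, while the L2 penalty shrinks coefficients without removing them.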
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.