Computational Approaches to Diabetes Risk Assessment: A Review of Data-Driven Techniques

Authors

DOI:

https://doi.org/10.35566/jbds/thakur

Keywords:

Diabetes prediction, Machine learning, Healthcare analytics, Predictive modeling, Clinical decision support

Abstract

Diabetes mellitus affects over 540 million people worldwide, making it a serious global health concern. Machine learning techniques have shown significant potential for building robust predictive models that surpass traditional risk assessment approaches. This review summarizes the state of the art in machine learning-based diabetes prediction systems by examining algorithmic approaches, dataset properties, and performance metrics. The analysis shows that advanced ensemble and deep learning techniques have increasingly replaced conventional statistical methods in pursuit of better predictive performance. Critical limitations remain, however, including heavy dependence on datasets with restricted demographic coverage, a lack of real-world validation, and model interpretability that is inadequate for clinical acceptance. The main obstacles include regulatory hurdles, population-specific dataset variability, and gaps between algorithmic performance and therapeutic impact. To translate these advances into clinically useful systems, future priorities include creating representative datasets, implementing explainable artificial intelligence (AI), and conducting prospective clinical studies.

Author Biographies

  • Agrimaa Singh Thakur, Maharaja Agrasen University

    Research Scholar, Department of Computer Science and Engineering, Maharaja Agrasen Institute of Technology, Maharaja Agrasen University, India

    ORCID: https://orcid.org/0000-0002-9996-0321

  • Amit Verma, Maharaja Agrasen University

    Associate Professor, Department of Computer Science and Engineering, Maharaja Agrasen Institute of Technology, Maharaja Agrasen University, India


    ORCID: https://orcid.org/0000-0002-4132-4082

Published

2026-02-17

Issue

Section

Literature Review

How to Cite

Singh Thakur, A., & Verma, A. (2026). Computational Approaches to Diabetes Risk Assessment: A Review of Data-Driven Techniques. Journal of Behavioral Data Science, 6(1), 1–16. https://doi.org/10.35566/jbds/thakur