Predicting diabetes mellitus metabolic goals and chronic complications transitions—analysis based on natural language processing and machine learning models

Claudia C. Colmenares-Mejia, Andrés F. García-Suaza, Paul Rodríguez-Lesmes, Christian Lochmuller, Sara C. Atehortúa, J. E. Camacho-Cogollo, Juan P. Martínez, Juliana Rincón, Yohan R. Céspedes, Esteban Morales-Mendoza, Mario A. Isaza-Ruget

Research output: Contribution to journal, Research Article, peer-review

Abstract

Objective: To estimate diabetes mellitus (DM) progression at one and two years in terms of glycemic targets and development of complications.

Research design and methods: We analyzed a retrospective cohort of adult DM patients treated in a Health Maintenance Organization in Colombia, including those with at least one glycosylated hemoglobin (HbA1c) measurement in 2018, 2019, and 2020. We defined four disease transition stages based on metabolic goals according to HbA1c levels and complications: (1) within HbA1c goals and without complications; (2) outside goals and without complications; (3) within goals but with complications; and (4) outside goals and with complications. We applied Natural Language Processing (NLP) techniques to extract relevant clinical information from Electronic Health Records. Machine learning (ML) models were used to predict patient progression.

Results: A total of 23,802 patients were included. Despite achieving initial glycemic control, more than 60% of patients who started within HbA1c targets and without complications developed chronic complications within two years. Our models, which achieved up to 80% accuracy and F1 scores above 74%, identified key predictors of disease progression. Adherence to dyslipidemia treatment guidelines significantly reduced the likelihood of HbA1c deterioration and complications, whereas non-adherence to pharmacological treatments increased the risk of complications. These findings suggest that HbA1c control alone is insufficient to prevent disease progression and that a more comprehensive management approach (including lipid control, kidney function monitoring, and improved adherence to clinical guidelines) is necessary.

Conclusions: Patient compliance with pharmacological treatments, professional adherence to clinical practice guidelines, and lifestyle interventions play a crucial role in diabetes progression. While our models provide strong predictive capabilities, improving data quality and integration remains essential for better forecasting and intervention strategies.
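
The four disease transition stages described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the names `Stage`, `classify_stage`, and `count_transitions` are hypothetical, and the function takes a precomputed "within HbA1c goal" flag because the abstract does not state the HbA1c cutoff used.

```python
from collections import Counter
from enum import IntEnum


class Stage(IntEnum):
    """Stage numbering follows the abstract; the member names are illustrative."""
    WITHIN_GOALS_NO_COMPLICATIONS = 1
    OUTSIDE_GOALS_NO_COMPLICATIONS = 2
    WITHIN_GOALS_WITH_COMPLICATIONS = 3
    OUTSIDE_GOALS_WITH_COMPLICATIONS = 4


def classify_stage(within_hba1c_goal: bool, has_complications: bool) -> Stage:
    """Map the two binary criteria to one of the four stages.

    Assumption: the caller has already decided whether the patient is within
    the HbA1c goal; no numeric threshold is encoded here.
    """
    if has_complications:
        if within_hba1c_goal:
            return Stage.WITHIN_GOALS_WITH_COMPLICATIONS
        return Stage.OUTSIDE_GOALS_WITH_COMPLICATIONS
    if within_hba1c_goal:
        return Stage.WITHIN_GOALS_NO_COMPLICATIONS
    return Stage.OUTSIDE_GOALS_NO_COMPLICATIONS


def count_transitions(stage_pairs):
    """Count (baseline stage, follow-up stage) pairs across patients."""
    return Counter(stage_pairs)
```

Given one `(baseline, follow_up)` pair per patient, `count_transitions` yields the raw counts from which transition proportions, such as the share of stage 1 patients who later develop complications, could be derived.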

Original language: English (US)
Article number: e0321258
Journal: PLOS ONE
Volume: 20
Issue number: 4 April
State: Published - Apr 2025

All Science Journal Classification (ASJC) codes

  • General
