Predicting politicians' misconduct: Evidence from Colombia

Jorge Gallego, Mounu Prem, Juan F. Vargas

Research output: Contribution to journal, Research Article, peer-review

2 Scopus citations


Corruption has pervasive effects on economic development and the well-being of the population. Fighting corruption is crucial, but it is not an easy task because corruption is difficult to measure and detect. However, recent advances in the field of artificial intelligence may help in this quest. In this article, we propose the use of machine-learning models to predict municipality-level corruption in a developing country. Using data from disciplinary prosecutions conducted by an anti-corruption agency in Colombia, we train four canonical models (Random Forests, Gradient Boosting Machines, Lasso, and Neural Networks) and ensemble their predictions to predict whether a mayor will commit acts of corruption. Our models achieve acceptable levels of performance, based on metrics such as precision and the area under the receiver-operating characteristic curve, demonstrating that these tools are useful in predicting where misbehavior is most likely to occur. Moreover, our feature-importance analysis identifies which groups of variables are most important in predicting corruption.
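As a rough illustration of the ensembling and evaluation steps named in the abstract, the sketch below averages the predicted probabilities of four models and scores the result with precision and the area under the ROC curve. It is not the authors' implementation: the abstract does not state how the predictions are combined, so unweighted averaging is an assumption, as are the 0.5 threshold, the function names, and the toy labels and scores, which are invented for illustration and are not data from the study.

```python
from statistics import mean


def ensemble_average(model_scores):
    """Average each municipality's score across models (assumed unweighted ensemble)."""
    return [mean(scores) for scores in zip(*model_scores)]


def precision(y_true, y_score, threshold=0.5):
    """Share of flagged cases (score >= threshold) that are truly positive."""
    flagged = [y for y, s in zip(y_true, y_score) if s >= threshold]
    return sum(flagged) / len(flagged) if flagged else 0.0


def roc_auc(y_true, y_score):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels: 1 = mayor prosecuted for corruption, 0 = not prosecuted.
y_true = [1, 0, 1, 0, 0, 1]

# Hypothetical predicted probabilities, one row per model, one column per municipality.
model_scores = [
    [0.9, 0.2, 0.6, 0.7, 0.1, 0.7],  # stand-in for a random forest
    [0.8, 0.3, 0.4, 0.6, 0.2, 0.6],  # stand-in for gradient boosting
    [0.7, 0.4, 0.5, 0.5, 0.3, 0.5],  # stand-in for lasso
    [0.6, 0.1, 0.7, 0.7, 0.2, 0.8],  # stand-in for a neural network
]

ensemble = ensemble_average(model_scores)
ensemble_precision = precision(y_true, ensemble)
ensemble_auc = roc_auc(y_true, ensemble)

print("ensemble scores:", ensemble)
print("precision at 0.5:", ensemble_precision)
print("ROC AUC:", ensemble_auc)
```

Precision depends on the chosen threshold, whereas the AUC depends only on how the scores rank municipalities, which is one reason the two metrics are often reported together.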

Original language: English (US)
Article number: e41
Journal: Data and Policy
State: Published - Nov 14 2022

All Science Journal Classification (ASJC) codes

  • Computer Science (miscellaneous)
  • Artificial Intelligence
  • Social Sciences (miscellaneous)
  • Public Administration

