Predicting politicians' misconduct: Evidence from Colombia

Jorge Gallego; Mounu Prem; Juan F. Vargas

doi:10.1017/dap.2022.35

Faculty of Economics

Research output: Contribution to journal › Article › peer-review

Abstract

Corruption has pervasive effects on economic development and the well-being of the population. Fighting corruption is crucial, but it is not an easy task because corruption is difficult to measure and detect. However, recent advances in the field of artificial intelligence may help in this quest. In this article, we propose the use of machine-learning models to predict municipality-level corruption in a developing country. Using data from disciplinary prosecutions conducted by an anti-corruption agency in Colombia, we train four canonical models (Random Forests, Gradient Boosting Machine, Lasso, and Neural Networks) and ensemble their predictions to predict whether a mayor will commit acts of corruption. Our models achieve acceptable levels of performance, based on metrics such as precision and the area under the receiver-operating characteristic curve, demonstrating that these tools are useful in predicting where misbehavior is most likely to occur. Moreover, our feature-importance analysis shows which groups of variables are most important in predicting corruption.

Original language	English (US)
Article number	e41
Journal	Data and Policy
Volume	4
DOIs	https://doi.org/10.1017/dap.2022.35
State	Published - Nov 14 2022

All Science Journal Classification (ASJC) codes

Computer Science (miscellaneous)
Artificial Intelligence
Social Sciences (miscellaneous)
Public Administration

Cite this

@article{452e3717d44f498c899bf283ff6a0616,

title = "Predicting politicians' misconduct: Evidence from Colombia",

abstract = "Corruption has pervasive effects on economic development and the well-being of the population. Despite being crucial and necessary, fighting corruption is not an easy task because it is a difficult phenomenon to measure and detect. However, recent advances in the field of artificial intelligence may help in this quest. In this article, we propose the use of machine-learning models to predict municipality-level corruption in a developing country. Using data from disciplinary prosecutions conducted by an anti-corruption agency in Colombia, we trained four canonical models (Random Forests, Gradient Boosting Machine, Lasso, and Neural Networks), and ensemble their predictions, to predict whether or not a mayor will commit acts of corruption. Our models achieve acceptable levels of performance, based on metrics such as the precision and the area under the receiver-operating characteristic curve, demonstrating that these tools are useful in predicting where misbehavior is most likely to occur. Moreover, our feature-importance analysis shows us which groups of variables are most important in predicting corruption.",

author = "Jorge Gallego and Mounu Prem and Vargas, {Juan F.}",

note = "Funding Information: Acknowledgments. We thank Misi{\'o}n de Observaci{\'o}n Electoral, Contralor{\'i}a General de la Rep{\'u}blica, and Luis Mart{\'i}nez for sharing with us the data used in this project. Erika Corzo and Andr{\'e}s Rivera provided excellent research assistance. We also thank seminar participants at the World Bank and University of Pennsylvania. Funding Statement. This work received no specific grant from any funding agency, commercial, or not-for-profit sectors. Publisher Copyright: {\textcopyright} The Author(s), 2022. Published by Cambridge University Press.",

year = "2022",

month = nov,

day = "14",

doi = "10.1017/dap.2022.35",

language = "English (US)",

volume = "4",

journal = "Data and Policy",

issn = "2632-3249",

publisher = "Cambridge University Press",

}

TY - JOUR

T1 - Predicting politicians' misconduct

T2 - Evidence from Colombia

AU - Gallego, Jorge

AU - Prem, Mounu

AU - Vargas, Juan F.

N1 - Funding Information: Acknowledgments. We thank Misión de Observación Electoral, Contraloría General de la República, and Luis Martínez for sharing with us the data used in this project. Erika Corzo and Andrés Rivera provided excellent research assistance. We also thank seminar participants at the World Bank and University of Pennsylvania. Funding Statement. This work received no specific grant from any funding agency, commercial, or not-for-profit sectors. Publisher Copyright: © The Author(s), 2022. Published by Cambridge University Press.

PY - 2022/11/14

Y1 - 2022/11/14

N2 - Corruption has pervasive effects on economic development and the well-being of the population. Despite being crucial and necessary, fighting corruption is not an easy task because it is a difficult phenomenon to measure and detect. However, recent advances in the field of artificial intelligence may help in this quest. In this article, we propose the use of machine-learning models to predict municipality-level corruption in a developing country. Using data from disciplinary prosecutions conducted by an anti-corruption agency in Colombia, we trained four canonical models (Random Forests, Gradient Boosting Machine, Lasso, and Neural Networks), and ensemble their predictions, to predict whether or not a mayor will commit acts of corruption. Our models achieve acceptable levels of performance, based on metrics such as the precision and the area under the receiver-operating characteristic curve, demonstrating that these tools are useful in predicting where misbehavior is most likely to occur. Moreover, our feature-importance analysis shows us which groups of variables are most important in predicting corruption.

AB - Corruption has pervasive effects on economic development and the well-being of the population. Despite being crucial and necessary, fighting corruption is not an easy task because it is a difficult phenomenon to measure and detect. However, recent advances in the field of artificial intelligence may help in this quest. In this article, we propose the use of machine-learning models to predict municipality-level corruption in a developing country. Using data from disciplinary prosecutions conducted by an anti-corruption agency in Colombia, we trained four canonical models (Random Forests, Gradient Boosting Machine, Lasso, and Neural Networks), and ensemble their predictions, to predict whether or not a mayor will commit acts of corruption. Our models achieve acceptable levels of performance, based on metrics such as the precision and the area under the receiver-operating characteristic curve, demonstrating that these tools are useful in predicting where misbehavior is most likely to occur. Moreover, our feature-importance analysis shows us which groups of variables are most important in predicting corruption.

UR - http://www.scopus.com/inward/record.url?scp=85151812572&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85151812572&partnerID=8YFLogxK

U2 - 10.1017/dap.2022.35

DO - 10.1017/dap.2022.35

M3 - Article

AN - SCOPUS:85151812572

SN - 2632-3249

VL - 4

JO - Data and Policy

JF - Data and Policy

M1 - e41

ER -
