Escherichia coli: Analysis of Features for Protein Localization Classification Employing Fusion Data

Alvaro David Orjuela-Cañón, Diana C. Rodriguez, Oscar Perdomo

Research output: Chapter in Book/Report > Chapter > Research


Machine learning models can be used to assess the relevance of features in classification systems. Interest in protein analysis based on biomolecular information has grown rapidly. In this work, two sources of such information were compared for determining protein localization in Escherichia coli cells. Support vector machines, artificial neural networks, and random forests were compared for predicting protein localization. The models were trained on two data sources, targeting signal information and protein sequence information, to determine the localization sites of the proteins. A third scenario used a fusion of both sources. Four classes were established according to the subcellular localization of the protein: cytoplasm, periplasmic space, outer membrane, and inner membrane. Balanced accuracy ranged from 77% to 92%. The best-performing models were based on random forests and support vector machines. In terms of features, the first source, based on the targeting signal, performed best with respect to relevance for the classification.
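
The abstract reports results as balanced accuracy, which is the mean of the per-class recalls and so gives each of the four localization classes equal weight regardless of how many proteins it contains. The following is a minimal sketch of that metric only, not the authors' code; the function name and the toy labels are hypothetical and chosen purely for illustration.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Return the mean of per-class recall over the classes present in y_true."""
    total = defaultdict(int)    # number of true samples per class
    correct = defaultdict(int)  # number of correctly predicted samples per class
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Hypothetical toy labels (NOT data from the chapter): 10 proteins, 4 classes.
y_true = (["cytoplasm"] * 4 + ["inner membrane"] * 2
          + ["periplasmic space"] * 2 + ["outer membrane"] * 2)
y_pred = (["cytoplasm"] * 4
          + ["inner membrane", "cytoplasm"]
          + ["periplasmic space"] * 2
          + ["outer membrane", "periplasmic space"])

score = balanced_accuracy(y_true, y_pred)
plain = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(score)  # 0.75: per-class recalls are 1.0, 0.5, 1.0, 0.5
print(plain)  # 0.8: plain accuracy is inflated by the larger cytoplasm class
```

In this toy case plain accuracy (0.8) is higher than balanced accuracy (0.75) because the majority class is predicted perfectly, which illustrates why a balanced metric is a natural choice when class sizes differ.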

Original language: American English
Title of host publication: Applications of Computational Intelligence - 5th IEEE Colombian Conference, ColCACI 2022, Revised Selected Papers
Editors: Alvaro David Orjuela-Cañón, Jesus Lopez, Julian David Arias-Londoño, Juan Carlos Figueroa-García
Publisher: Springer Science and Business Media Deutschland GmbH
Number of pages: 13
ISBN (Print): 9783031297823
Status: Published - Jan 1 2023
Event: 5th IEEE Colombian Conference on Applications of Computational Intelligence, ColCACI 2022 - Cali, Colombia
Duration: Jul 27 2022 to Jul 29 2022

Publication series

Name: Communications in Computer and Information Science
Volume: 1746 CCIS


Conference: 5th IEEE Colombian Conference on Applications of Computational Intelligence, ColCACI 2022

ASJC Scopus subject areas

  • General Computer Science
  • General Mathematics

