Escherichia coli: Analysis of Features for Protein Localization Classification Employing Fusion Data

Alvaro David Orjuela-Cañon, Diana C. Rodriguez, Oscar Perdomo

Research output: Chapter in Book/Report › Chapter

Abstract

Machine learning models can be used to assess the relevance of features in classification systems. Interest in protein analysis based on biomolecular information has grown rapidly. In this work, two sources of such information were compared for determining protein localization in Escherichia coli cells. Support vector machines, artificial neural networks, and random forests were compared for the prediction of protein localization. The models were trained on two sources of data, targeting signal information and protein sequences, to determine the localization sites of the protein. A third scenario used a fusion of both sources of data. Four classes were established according to the subcellular localization of the protein: cytoplasm, periplasmic space, outer membrane, and inner membrane. Balanced accuracy ranged from 77% to 92%. The best-performing models were based on random forest and support vector machines. In terms of features, the first source, based on the targeting signal, was the one with the best performance in terms of relevance for the classification.
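The abstract reports results as balanced accuracy over four localization classes. As a minimal sketch of that metric, assuming the common definition (the mean of per-class recall, which the record does not spell out), the following pure-Python function computes it. The class label strings and the toy labels are hypothetical and are not data from the chapter.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true.

    Assumed definition: the abstract names the metric but not its formula.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("inputs must not be empty")

    total = defaultdict(int)    # true examples seen per class
    correct = defaultdict(int)  # correctly predicted examples per class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1

    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Toy, made-up labels for the four classes (illustrative only).
y_true = (["cytoplasm"] * 4 + ["periplasm"] * 2
          + ["outer_membrane"] * 2 + ["inner_membrane"] * 2)
y_pred = ["cytoplasm", "cytoplasm", "cytoplasm", "periplasm",
          "periplasm", "periplasm",
          "outer_membrane", "inner_membrane",
          "inner_membrane", "inner_membrane"]

score = balanced_accuracy(y_true, y_pred)
print(f"balanced accuracy: {score:.4f}")
```

Because each class contributes equally regardless of how many proteins it contains, this metric is less sensitive to class imbalance than plain accuracy, which may be why it suits a localization data set with unevenly sized classes.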

Original language: English (US)
Title of host publication: Applications of Computational Intelligence - 5th IEEE Colombian Conference, ColCACI 2022, Revised Selected Papers
Editors: Alvaro David Orjuela-Cañón, Jesus Lopez, Julian David Arias-Londoño, Juan Carlos Figueroa-García
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 31-43
Number of pages: 13
ISBN (Print): 9783031297823
State: Published - Jan 1 2023
Event: 5th IEEE Colombian Conference on Applications of Computational Intelligence, ColCACI 2022 - Cali, Colombia
Duration: Jul 27 2022 → Jul 29 2022

Publication series

Name: Communications in Computer and Information Science
Volume: 1746 CCIS

Conference

Conference: 5th IEEE Colombian Conference on Applications of Computational Intelligence, ColCACI 2022
Country/Territory: Colombia
City: Cali
Period: 7/27/22 → 7/29/22

All Science Journal Classification (ASJC) codes

  • General Computer Science
  • General Mathematics
