NClassG+: A classifier for non-classically secreted Gram-positive bacterial proteins

Daniel Restrepo-Montoya, Camilo Pino, Luis F. Nino, Manuel E. Patarroyo, Manuel A. Patarroyo

Research output: Contribution to journal, Article

8 Citations (Scopus)

Abstract

Background: Most predictive methods currently available for the identification of protein secretion mechanisms have focused on classically secreted proteins. In fact, only two methods have been reported for predicting non-classically secreted proteins of Gram-positive bacteria. This study describes the implementation of a sequence-based classifier, denoted as NClassG+, for identifying non-classically secreted Gram-positive bacterial proteins.

Results: Several feature-based classifiers were trained using different sequence transformation vectors (frequencies, dipeptides, physicochemical factors and PSSM) and Support Vector Machines (SVMs) with Linear, Polynomial and Gaussian kernel functions. Nested k-fold cross-validation (CV) was applied to select the best models, using the inner CV loop to tune the model parameters and the outer CV group to compute the error. The parameters and kernel functions and the combinations between all possible feature vectors were optimized using grid search.

Conclusions: The final model was tested against an independent set not previously seen by the model, obtaining better predictive performance compared to SecretomeP V2.0 and SecretP V2.0 for the identification of non-classically secreted proteins. NClassG+ is freely available on the web at http://www.biolisi.unal.edu.co/web-servers/nclassgpositive/. © 2011 Restrepo-Montoya et al; licensee BioMed Central Ltd.
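As an illustration of two steps named in the abstract, the sketch below turns a protein sequence into amino acid frequency and dipeptide frequency vectors, and generates index splits for nested k-fold cross-validation. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not define the exact feature encodings or fold construction, so the function names (`aa_frequencies`, `dipeptide_frequencies`, `kfold`, `nested_kfold`), the relative-frequency normalization, and the unshuffled, unstratified fold assignment are all hypothetical choices. SVM training and grid search are left out.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
# All 400 ordered residue pairs, in a fixed order (assumed encoding).
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def aa_frequencies(seq):
    """Return a 20-dimensional vector of relative amino acid frequencies."""
    seq = seq.upper()
    counts = {aa: 0 for aa in AMINO_ACIDS}
    for residue in seq:
        if residue in counts:  # skip non-standard symbols such as X or B
            counts[residue] += 1
    total = sum(counts.values())
    return [counts[aa] / total if total else 0.0 for aa in AMINO_ACIDS]


def dipeptide_frequencies(seq):
    """Return a 400-dimensional vector of relative dipeptide frequencies."""
    seq = seq.upper()
    counts = {dp: 0 for dp in DIPEPTIDES}
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]  # overlapping windows of two residues
        if pair in counts:
            counts[pair] += 1
    total = sum(counts.values())
    return [counts[dp] / total if total else 0.0 for dp in DIPEPTIDES]


def kfold(indices, k):
    """Yield k (train, test) index lists; fold i takes every k-th item from i."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, folds[i]


def nested_kfold(n_samples, outer_k, inner_k):
    """Yield (outer_train, outer_test, inner_splits) for nested k-fold CV.

    The inner splits partition only the outer training indices, so
    parameter tuning never touches the outer test fold.
    """
    for outer_train, outer_test in kfold(list(range(n_samples)), outer_k):
        inner_splits = list(kfold(outer_train, inner_k))
        yield outer_train, outer_test, inner_splits
```

In this arrangement a grid search would score each parameter setting on the inner splits, and the selected setting would then be refit on `outer_train` and evaluated once on `outer_test`. A practical version would likely shuffle and stratify the folds by class, which this sketch omits.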
Original language: English (US)
Journal: BMC Bioinformatics
DOI: 10.1186/1471-2105-12-21
Status: Published - Jan 14 2011

Fingerprint

Bacterial Proteins
Classifiers
Classifier
Proteins
Protein
Cross-validation
Kernel Function
Dipeptides
Gram-Positive Bacteria
Gaussian Kernel
Gaussian Function
Secretion
Support vector machines
Web Server
Bacteria
Independent Set
Feature Vector
Servers
Model
Polynomials

Cite this

Restrepo-Montoya, Daniel ; Pino, Camilo ; Nino, Luis F. ; Patarroyo, Manuel E. ; Patarroyo, Manuel A. / NClassG+: A classifier for non-classically secreted Gram-positive bacterial proteins. In: BMC Bioinformatics. 2011.
@article{68cbb30be84d4a129dffc1456a5aa69e,
title = "NClassG+: A classifier for non-classically secreted Gram-positive bacterial proteins",
abstract = "Background: Most predictive methods currently available for the identification of protein secretion mechanisms have focused on classically secreted proteins. In fact, only two methods have been reported for predicting non-classically secreted proteins of Gram-positive bacteria. This study describes the implementation of a sequence-based classifier, denoted as NClassG+, for identifying non-classically secreted Gram-positive bacterial proteins. Results: Several feature-based classifiers were trained using different sequence transformation vectors (frequencies, dipeptides, physicochemical factors and PSSM) and Support Vector Machines (SVMs) with Linear, Polynomial and Gaussian kernel functions. Nested k-fold cross-validation (CV) was applied to select the best models, using the inner CV loop to tune the model parameters and the outer CV group to compute the error. The parameters and kernel functions and the combinations between all possible feature vectors were optimized using grid search. Conclusions: The final model was tested against an independent set not previously seen by the model, obtaining better predictive performance compared to SecretomeP V2.0 and SecretP V2.0 for the identification of non-classically secreted proteins. NClassG+ is freely available on the web at http://www.biolisi.unal.edu.co/web-servers/nclassgpositive/. {\textcopyright} 2011 Restrepo-Montoya et al; licensee BioMed Central Ltd.",
author = "Daniel Restrepo-Montoya and Camilo Pino and Nino, {Luis F.} and Patarroyo, {Manuel E.} and Patarroyo, {Manuel A.}",
year = "2011",
month = "1",
day = "14",
doi = "10.1186/1471-2105-12-21",
language = "English (US)",
journal = "BMC Bioinformatics",
issn = "1471-2105",
publisher = "BioMed Central",

}

NClassG+: A classifier for non-classically secreted Gram-positive bacterial proteins. / Restrepo-Montoya, Daniel; Pino, Camilo; Nino, Luis F.; Patarroyo, Manuel E.; Patarroyo, Manuel A.

In: BMC Bioinformatics, 14.01.2011.

TY - JOUR

T1 - NClassG+: A classifier for non-classically secreted Gram-positive bacterial proteins

AU - Restrepo-Montoya, Daniel

AU - Pino, Camilo

AU - Nino, Luis F.

AU - Patarroyo, Manuel E.

AU - Patarroyo, Manuel A.

PY - 2011/1/14

Y1 - 2011/1/14

N2 - Background: Most predictive methods currently available for the identification of protein secretion mechanisms have focused on classically secreted proteins. In fact, only two methods have been reported for predicting non-classically secreted proteins of Gram-positive bacteria. This study describes the implementation of a sequence-based classifier, denoted as NClassG+, for identifying non-classically secreted Gram-positive bacterial proteins. Results: Several feature-based classifiers were trained using different sequence transformation vectors (frequencies, dipeptides, physicochemical factors and PSSM) and Support Vector Machines (SVMs) with Linear, Polynomial and Gaussian kernel functions. Nested k-fold cross-validation (CV) was applied to select the best models, using the inner CV loop to tune the model parameters and the outer CV group to compute the error. The parameters and kernel functions and the combinations between all possible feature vectors were optimized using grid search. Conclusions: The final model was tested against an independent set not previously seen by the model, obtaining better predictive performance compared to SecretomeP V2.0 and SecretP V2.0 for the identification of non-classically secreted proteins. NClassG+ is freely available on the web at http://www.biolisi.unal.edu.co/web-servers/nclassgpositive/. © 2011 Restrepo-Montoya et al; licensee BioMed Central Ltd.

U2 - 10.1186/1471-2105-12-21

DO - 10.1186/1471-2105-12-21

M3 - Article

JO - BMC Bioinformatics

JF - BMC Bioinformatics

SN - 1471-2105

ER -