Full Length Research Paper
S. B. Muley, V. Bastikar*, S.
Abstract
Pathogenic bacteria cause infectious disease through a variety of virulence mechanisms. A reliable system for predicting bacterial virulent proteins is therefore important, both for discovering novel drugs and vaccines and for understanding virulence mechanisms in pathogens. Using features such as amino acid and dipeptide composition, this study attempts to identify the virulence potential of a given bacterial protein sequence with statistical methods such as regression analysis, which are of great use in strategies for predicting virulence proteins. In this work, a bacterial virulent protein prediction model, virprob, is proposed based on classifiers whose features are extracted directly from the amino acid sequence of a given protein. It is a probabilistic model that predicts the virulence potential of a protein from a human pathogenic bacterium. An extensive evaluation under a blind testing protocol, in which the parameters of the system are calculated on the training set and the system is validated on an independent dataset, demonstrated the validity of virprob with an accuracy of 53.6%. Combining the statistical analysis method with machine learning techniques may increase the prediction accuracy. The results of this analysis might help to rapidly advance knowledge of infectious agents.
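The sequence-derived features named in the abstract can be illustrated with a minimal sketch. It assumes the conventional definitions of amino acid composition (residue count divided by sequence length, 20 values) and dipeptide composition (adjacent-pair count divided by the number of adjacent pairs, 400 values); the abstract does not state the exact normalisation used for virprob, and the function names below are hypothetical.

```python
from itertools import product

# The 20 standard amino acids (one-letter codes) and all 400 ordered pairs.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def amino_acid_composition(seq):
    """Fraction of each standard residue in the sequence (assumed definition)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return {aa: seq.count(aa) / len(seq) for aa in AMINO_ACIDS}


def dipeptide_composition(seq):
    """Fraction of each ordered residue pair among all adjacent pairs."""
    seq = seq.upper()
    total_pairs = len(seq) - 1
    if total_pairs < 1:
        raise ValueError("sequence needs at least two residues")
    counts = dict.fromkeys(DIPEPTIDES, 0)
    for i in range(total_pairs):
        pair = seq[i:i + 2]
        # Pairs containing non-standard symbols (e.g. X) are not counted,
        # but they still contribute to the denominator.
        if pair in counts:
            counts[pair] += 1
    return {dp: n / total_pairs for dp, n in counts.items()}


def feature_vector(seq):
    """Concatenate both compositions into one 420-dimensional vector."""
    aac = amino_acid_composition(seq)
    dpc = dipeptide_composition(seq)
    return [aac[a] for a in AMINO_ACIDS] + [dpc[d] for d in DIPEPTIDES]
```

A vector of this kind could serve as the input to a regression model or classifier; which of the 420 values are actually retained as predictors is a modelling decision the abstract does not specify.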