Machine learning model for HIV1 and HIV2 enzyme secondary structure classification

Anubha Dubey, Bhaskar Pant, Us

Abstract

The structure of a protein can reveal its function and its evolutionary history. Extracting this information requires knowledge of the structure and of its relationship to other proteins. Protein secondary structures are compact and contain helices and strands. Hence there is a need for computational techniques that predict and classify the structures of HIV-1 and HIV-2 proteins (enzymes). In this paper, a machine learning model has been developed to classify the alpha, beta, and residues of four HIV enzymes: ribonuclease, reverse transcriptase, protease, and integrase. These four enzymes are present in the HIV-1 and HIV-2 cycle. Several machine learning algorithms, namely J48, Rotation Forest, and Random Forest, were used for this classification, and the resulting model gives fair accuracy. The information generated from these models can be of great use in clinical applications.
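The abstract does not state how residues are presented to the classifiers. As a rough illustration only, the sketch below shows one common way to prepare per-residue data for tree-based learners such as J48 or Random Forest: each residue is described by a window of its neighbours and paired with a class label. The function names, the window size, the label letters (H for alpha, E for beta, C for other residues), and the toy sequence are all assumptions made for this example and are not taken from the paper.

```python
def window_instances(sequence, labels, half_width=2, pad="-"):
    """Pair each residue's neighbourhood window with its class label.

    Hypothetical helper: the paper does not describe its feature encoding.
    """
    if len(sequence) != len(labels):
        raise ValueError("sequence and labels must have the same length")
    # Pad both ends so that terminal residues also get a full-width window.
    padded = pad * half_width + sequence + pad * half_width
    instances = []
    for i, label in enumerate(labels):
        window = padded[i:i + 2 * half_width + 1]
        instances.append((window, label))
    return instances


def accuracy(predicted, actual):
    """Fraction of positions where the predicted label matches the true label."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        return 0.0
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Toy, made-up data (not a real HIV enzyme sequence or annotation).
toy_sequence = "MKVLAAGIVG"
toy_labels = "CHHHHEEEEC"
instances = window_instances(toy_sequence, toy_labels)
```

Each `(window, label)` pair could then be handed to a decision-tree or ensemble learner as one training instance, and `accuracy` gives the simple per-residue score that a phrase like "fair accuracy" would typically refer to.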

Relevant Publications in Journal of Computational Methods in Molecular Design