Bioinformatics in High Throughput Sequencing: Application in Evolving Genetic Diseases

Mohammad MS Al-Haggar, Balk

Abstract

Bioinformatics is a branch of computational biology that applies “informatics” techniques to understand and organize the information associated with biological macromolecules. These data are the product of large-scale molecular biology projects, such as the various genome sequencing projects, gene expression analyses, and studies in genomics, proteomics and protein-protein interactions. They are collected and stored in different databases. Bioinformatic analysis in molecular biology focuses on three types of data: macromolecular structures, genome sequences and gene expression data. Techniques developed by computer scientists have enabled researchers to sequence the nearly 3 billion base pairs of the human genome. Recent scientific discoveries resulting from the application of next-generation DNA sequencing technologies have given rise to the science of genomics and have enabled critical advances in other fields, including epidemiology, forensics, evolutionary biology and medical diagnostics. This review highlights the technologies for high-throughput sequencing, their limitations and their applications. Sequencing known genes enables the discovery of novel mutations that could help scientists understand the evolving features of some genetic diseases, explain how many genetic diseases arise from mutant variants of a single gene or of gene clusters, or even explain the overlapping features of some genetic diseases mapped to nearby or distant loci.
