An Analysis of Palindromes and n-nary Tract Frequencies found in a Genomic Sequence

Dan Ophir

Abstract

The motivation to investigate n-nary and palindrome tracts arose from the discovery by Chargaff and coworkers that certain binary DNA tracts are over-represented in genomes. They investigated the frequencies of various ternary tracts at diverse locations in the genes of various species. The current research further examines ternary tracts and palindromes, which are hereafter collectively called designated tracts. Do designated tracts show any extraordinary frequencies with respect to their length and location? A theoretical mathematical analysis has been performed to estimate the number of designated tracts from the frequencies of their single elements. The designated tracts are divided into those that genuinely mix elements from a subset of the set composing the sequence and those that mainly consist of a long tract of lower n-nary order. For example, the tract analysis investigates whether a special phenomenon is due to the ternary tract itself or to a long binary tract included in it. The maximal n-nary tract order of interest in the genome is three (ternary); an order of four is the whole gene itself. However, higher orders of n-nary tracts are of interest in other areas, such as reliability theory. Therefore, a general formulation and treatment of designated tracts is presented here and demonstrated for the genomic aspects, which have been thoroughly investigated over the past two decades.
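As a rough illustration of the distinction drawn above, the following minimal sketch locates maximal tracts drawn from a chosen subset of bases and reports how many distinct bases each tract actually uses, so that a nominally ternary tract that is really a binary one can be recognized. It is not the method of the paper: the function names (`maximal_tracts`, `tract_order`, `is_palindrome`) are hypothetical, a tract is assumed to be a maximal contiguous run of allowed symbols, and a palindrome is assumed to be a string that reads the same in both directions (not a reverse-complement palindrome).

```python
def maximal_tracts(seq, allowed):
    """Return (start, length) for every maximal run of seq whose
    symbols all belong to the set `allowed` (assumed tract definition)."""
    tracts = []
    start = None
    for i, ch in enumerate(seq):
        if ch in allowed:
            if start is None:
                start = i  # a new run begins here
        elif start is not None:
            tracts.append((start, i - start))  # run ended at i - 1
            start = None
    if start is not None:
        tracts.append((start, len(seq) - start))  # run reaches the end
    return tracts


def tract_order(seq, start, length):
    """Number of distinct symbols actually present in the tract."""
    return len(set(seq[start:start + length]))


def is_palindrome(s):
    """True if s reads the same forwards and backwards (literal sense)."""
    return s == s[::-1]


if __name__ == "__main__":
    seq = "ACGGTTTACA"
    for start, length in maximal_tracts(seq, {"A", "C", "G"}):
        tract = seq[start:start + length]
        print(start, tract, tract_order(seq, start, length), is_palindrome(tract))
```

In this toy sequence the search over the ternary subset {A, C, G} returns two tracts, but only the first uses all three bases; the second contains only A and C, so it is a binary tract found inside a ternary search.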

Relevant Publications in Data Mining in Genomics & Proteomics