Python in ChIP-Seq data analysis

Li Zhang, Yuansen Hu, Jinshui

Abstract

Python is an interpreted programming language that is simple, clear and powerful. For many life scientists it has become the language of choice for routine work such as text processing, plotting, basic statistics, GUI programming and even prototype development. To introduce Python to more scientists, we present our experience and examples of using it in Illumina ChIP-Seq data analysis. Five in-house Python scripts illustrate the simplicity and clarity of Python in data analysis and presentation of results: Illumina Q30 analysis, read distribution around transcription start sites (TSS), read intensity plots, read distribution along chromosomes, and sequence retrieval from genome FASTA files. Finally, we describe three Python programs for ChIP-Seq data analysis: MACS, SICER and CEAS.
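
As an illustration of the kind of task listed above, a Q30 analysis can be sketched in a few lines of standard-library Python. This is a minimal sketch and not the authors' in-house script: the function name `q30_fraction` is hypothetical, and the code assumes four-line FASTQ records with Phred+33 quality encoding (the default for recent Illumina pipelines).

```python
import io


def q30_fraction(fastq_handle, offset=33):
    """Return the fraction of base calls with Phred quality >= 30.

    Assumes four-line FASTQ records and, by default, Phred+33 encoding.
    """
    total = 0
    q30 = 0
    for line_number, line in enumerate(fastq_handle):
        # In a four-line FASTQ record the quality string is the fourth line.
        if line_number % 4 != 3:
            continue
        quality = line.rstrip("\r\n")
        total += len(quality)
        # Decode each character to its Phred score and count those >= 30.
        q30 += sum(1 for ch in quality if ord(ch) - offset >= 30)
    return q30 / total if total else 0.0


# Two toy reads: 'I' encodes Q40, '?' Q30, '5' Q20 and '#' Q2.
example = io.StringIO(
    "@read1\nACGT\n+\nIII5\n"
    "@read2\nGGCA\n+\n?#II\n"
)
print(q30_fraction(example))  # 6 of 8 bases reach Q30, so this prints 0.75
```

For a real run, the in-memory example would be replaced by an opened FASTQ file, for instance `with open(path) as handle: q30_fraction(handle)`, where `path` is whatever file the reader supplies.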
