A Computational Approach for MicroRNA Identification in Plants: Combining Genome-Based Predictions with RNA-Seq Data

Jorge S Oliveira, Nuno D Mendes

Abstract

MicroRNAs are endogenous molecules that silence targeted messenger RNAs and play an important regulatory role in many physiological processes in both plants and animals. Here, we propose a pipeline that combines CRAVELA, a single-genome microRNA finding tool originally developed for microRNA discovery in animals, with an NGS data analysis algorithm that provides a novel scoring function to evaluate the expression profile of candidates. The scoring function takes advantage of the expected relative abundance of RNA fragments originating from the mature sequence compared with other portions of the microRNA precursor. This approach was tested in Eucalyptus spp., for which, despite their economic importance, no microRNAs have been documented. The outcome of our approach was a short list of candidates, including both conserved and non-conserved sequences. Experimental validation showed amplification in 6 of the 8 candidates chosen from the best-scoring non-conserved sequences.
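
The idea behind an expression-profile score of this kind can be illustrated with a minimal sketch. This is not the scoring function used in the pipeline, whose definition is not given in this abstract. It assumes, hypothetically, that reads have already been mapped to precursor coordinates and that a candidate mature region is known. The names `Read` and `mature_enrichment` are illustrative only.

```python
from typing import Iterable, Tuple

# Hypothetical read representation: (start, end, count) in precursor
# coordinates, 0-based, with end exclusive.
Read = Tuple[int, int, int]


def mature_enrichment(reads: Iterable[Read], mature_start: int, mature_end: int) -> float:
    """Fraction of read counts lying entirely within the candidate mature region.

    Illustrative assumption: a genuine precursor is expected to yield most of
    its sequenced fragments from the mature sequence, so values near 1.0
    support the candidate and values near 0.0 do not.
    """
    inside = 0
    total = 0
    for start, end, count in reads:
        total += count
        # Count a read as mature-derived only if it is fully contained
        # in the [mature_start, mature_end) window.
        if start >= mature_start and end <= mature_end:
            inside += count
    # A candidate with no mapped reads carries no expression evidence.
    return inside / total if total else 0.0


if __name__ == "__main__":
    # Toy data: 90 reads from the mature region, 10 from elsewhere in the hairpin.
    toy_reads = [(10, 31, 90), (40, 60, 10)]
    print(mature_enrichment(toy_reads, mature_start=10, mature_end=32))
```

A ratio like this only captures the relative-abundance intuition. A practical score would likely also need to account for read depth, for the star sequence on the opposite arm, and for small offsets in where fragments start and end.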

Relevant Publications in Data Mining in Genomics & Proteomics