Since Ambros' discovery of small non-protein-coding RNAs in the early 1990s, the past two decades have seen an upsurge in the number of reports of predicted microRNAs (miRs), which have been implicated in various functions. The correlation of miRs with cancer has spurred the use of this class of non-coding RNAs in various cancer therapies, although most of them are still at the trial stage. However, experimentally identifying a miR that is associated with cancer remains an elaborate, time-consuming process.
The computational prediction of novel microRNAs within a full genome involves identifying the sequences with the highest chance of being miRNA precursors (pre-miRNAs). These sequences are usually called miRNA candidates. The well-known pre-miRNAs are usually only a few in comparison to the hundreds of thousands of potential miRNA candidates that have to be analyzed, which makes this task a classification problem with high class imbalance.
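To make the class imbalance concrete, the following minimal sketch computes the imbalance ratio and inverse-frequency class weights for a labeled candidate set. The counts (10 known pre-miRNAs against 990 other candidates) and the function name `balanced_class_weights` are illustrative assumptions, not figures or methods taken from the work described here; the weighting formula is the common "balanced" heuristic, n_samples / (n_classes * class_count).

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical data set: 1 = known pre-miRNA, 0 = other stem-loop candidate.
labels = [1] * 10 + [0] * 990

weights = balanced_class_weights(labels)

# Number of negative candidates per positive example.
imbalance_ratio = labels.count(0) / labels.count(1)

print(f"imbalance ratio: {imbalance_ratio:.0f}:1")
print(f"weight for positives: {weights[1]:.2f}")
print(f"weight for negatives: {weights[0]:.3f}")
```

Weights of this kind are one way to keep a supervised classifier from simply predicting the majority class; at genome scale the ratio is far more extreme than in this toy example.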
Although many machine learning techniques have been proposed for distinguishing miRNA hairpins from other stem-loop sequences, most of the current methods use supervised learning, which requires a very good set of positive and negative examples. Those methods have important practical limitations when they have to be applied to a real prediction task. First, there is the challenge of dealing with a scarce number of positive (well-known) pre-miRNA examples.
miRNAfe is a comprehensive tool for extracting features from RNA sequences. It is freely available as a web service, providing a single access point to almost all state-of-the-art feature extraction methods used today in a variety of works from different authors. It has a very simple user interface, where the user only needs to load a file containing the input sequences and select the features to extract. As a result, the user obtains a text file with the extracted features, which can be used to analyze the sequences or as input to miRNA prediction software.
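As a rough illustration of the sequences-in, features-out workflow, the sketch below parses FASTA-formatted text and computes a few simple sequence features (length, GC content, dinucleotide frequencies). It is not miRNAfe's implementation or interface: the function names, the feature names, and the choice of features are assumptions made for this example, and structural features that depend on secondary-structure prediction are left out.

```python
from itertools import product


def parse_fasta(text):
    """Parse FASTA text into {name: sequence}, converting T to U (RNA alphabet)."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].split()
            name = header[0] if header else "unnamed"
            records[name] = []
        elif name is not None:
            records[name].append(line.upper().replace("T", "U"))
    return {n: "".join(parts) for n, parts in records.items()}


def extract_features(seq):
    """Compute length, GC content, and the 16 dinucleotide frequencies."""
    length = len(seq)
    gc = (seq.count("G") + seq.count("C")) / length if length else 0.0
    features = {"length": length, "gc_content": gc}
    # Overlapping dinucleotides: positions (0,1), (1,2), ...
    pairs = [seq[i:i + 2] for i in range(length - 1)]
    total = len(pairs)
    for a, b in product("ACGU", repeat=2):
        di = a + b
        features["freq_" + di] = pairs.count(di) / total if total else 0.0
    return features


# Hypothetical input: one short sequence split over two lines.
fasta_text = ">seq1 demo\nGGCAUU\nGC\n"
sequences = parse_fasta(fasta_text)
feats = extract_features(sequences["seq1"])

# Write one tab-separated row per sequence, similar in spirit to a text-file export.
columns = sorted(feats)
print("\t".join(["name"] + columns))
for seq_name, seq in sequences.items():
    row = extract_features(seq)
    print("\t".join([seq_name] + [f"{row[c]:.4g}" for c in columns]))
```

The resulting table has one row per input sequence and one column per feature, a shape that most classifiers accept directly as input.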
Small RNA sequencing is the most widely used tool for microRNA (miRNA) discovery, and shows great potential for the efficient study of miRNA cross-species transport, i.e., by detecting the presence of exogenous miRNA sequences in the host species. Because of the increased appreciation of dietary miRNAs and their far-reaching implications for human health, research interest is currently growing in the bioavailability of exogenous miRNAs, the mechanisms of cross-species transport, and miRNA function in cellular biological processes.
MicroRNAs (miRNAs) are endogenous non-coding small RNAs (of about 22 nucleotides), which play an important role in the post-transcriptional regulation of gene expression via either mRNA cleavage or translation inhibition. Several machine learning-based approaches have been developed to identify novel miRNAs from next generation sequencing (NGS) data. Typically, precursor/genomic sequences are required as references for most methods. However, the non-availability of genomic sequences is often a limitation in miRNA discovery in non-model plants.
MicroRNAs are a predominant type of small non-coding RNA, approximately 21 nucleotides in length, that play an essential role at the post-transcriptional level through RNA degradation, translational repression, or both, mediated by an RNA-induced silencing complex. Identifying these molecules can aid in dissecting their regulatory functions. The secondary structures of plant pre-miRNAs are much more complex than those of animal pre-miRNAs. Compared with prediction tools for animal pre-miRNAs, much less effort has been devoted to predicting plant pre-miRNAs.