SeqBuster

Submitted by ChenLiang on Fri, 09/02/2016 - 21:59

High-throughput sequencing technologies enable direct approaches to catalog and analyze snapshots of the total small RNA content of living cells. Characterization of high-throughput sequencing data requires bioinformatic tools offering a wide perspective of the small RNA transcriptome. Here we present SeqBuster, a highly versatile and reliable web-based toolkit to process and analyze large-scale small RNA datasets. The flexibility of this tool is illustrated by the multiple choices offered in the pre-analysis for mapping purposes and in the different analysis modules for data manipulation. To overcome the storage capacity limitations of the web-based tool, SeqBuster offers a stand-alone version that permits annotation against any custom database. SeqBuster integrates multiple analysis modules in a single platform and constitutes the first bioinformatic tool offering a deep characterization of miRNA variants (isomiRs). The application of SeqBuster to small RNA datasets of human embryonic stem cells revealed that most miRNAs present different types of isomiRs, some of them being associated with stem cell differentiation. The exhaustive description of the isomiRs provided by SeqBuster could help to identify miRNA variants that are relevant in physiological and pathological processes. SeqBuster is available at http://estivill_lab.crg.es/seqbuster.[1]
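
To make the idea of an isomiR concrete, the following is a minimal Python sketch of how a read could be described relative to a reference mature miRNA. It is not SeqBuster's implementation: the function name `classify_isomir`, the labels, and the toy precursor sequence are all assumptions made for illustration, and only exact matches to the precursor are handled (no substitutions or non-templated additions).

```python
# Toy precursor sequence, invented for this example (not a real miRNA).
PRECURSOR = "AACCGGTTACGTACGATCGATTGCAAGGCC"
REF_START, REF_END = 4, 26  # hypothetical mature miRNA, 0-based, end-exclusive


def classify_isomir(precursor, ref_start, ref_end, read):
    """Describe a read relative to the reference mature miRNA.

    Returns (label, shift5, shift3), or None when the read does not
    match the precursor exactly. Only the first match is considered.
    """
    read_start = precursor.find(read)
    if read_start == -1:
        return None
    read_end = read_start + len(read)
    shift5 = read_start - ref_start  # negative: read starts upstream of the reference
    shift3 = read_end - ref_end      # positive: read ends downstream of the reference
    if shift5 == 0 and shift3 == 0:
        label = "reference"
    elif shift5 != 0 and shift3 != 0:
        label = "5p+3p variant"
    elif shift5 != 0:
        label = "5p variant"
    else:
        label = "3p variant"
    return label, shift5, shift3


if __name__ == "__main__":
    for read in (PRECURSOR[4:26], PRECURSOR[5:26], PRECURSOR[4:28]):
        print(read, classify_isomir(PRECURSOR, REF_START, REF_END, read))
```

In this sketch a read that starts one nucleotide later than the reference is reported as a 5p variant with a shift of 1, and a read extended by two nucleotides at its end as a 3p variant with a shift of 2.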

Recent progress in high-throughput sequencing technologies has contributed greatly to revealing a highly complex landscape of small non-coding RNAs (sRNAs), including novel non-canonical sRNAs derived from long non-coding RNAs, repeated elements, transcription start sites and splice-site regions, among others. The published frameworks for sRNA data analysis focus on miRNA detection and prediction, ignoring further information in the dataset. As a consequence, tools for the identification and classification of sRNAs that do not belong to the miRNA family are currently lacking.
Here we present SeqCluster, an extension of the currently available SeqBuster tool that identifies and analyzes, at different levels, the sRNAs not annotated or predicted as miRNAs. This new module deals with sequences mapping onto multiple locations and permits a highly versatile and user-friendly interaction with the data in order to easily classify sRNA sequences with a putative functional importance. Using SeqCluster with different sRNA datasets, we were able to detect all known classes of sRNAs described to date.[2]

Most computational tools for small non-coding RNA (sRNA) sequencing data analysis focus on microRNAs (miRNAs), overlooking other types of sRNAs that show multi-mapping hits. Here, we have developed a pipeline to non-redundantly quantify all types of sRNAs and extract patterns of expression in biologically defined groups. We have used our tool to characterize and profile sRNAs in post-mortem brain samples of control individuals and Parkinson's disease (PD) cases at early-premotor and late-symptomatic stages.
Clusters of co-expressed sRNAs mapping onto tRNAs significantly separated premotor and motor cases from controls. A similar result was obtained using a matrix of miRNAs slightly varying in sequence (isomiRs). The present framework revealed sRNA alterations at premotor stages of PD, which might reflect initial pathogenic perturbations. This tool may be useful to discover sRNA expression patterns linked to different biological conditions.
The full code is available at http://github.com/lpantano/seqbuster
Contact: lpantano@hsph.harvard.edu or eulalia.marti@crg.eu
Supplementary data are available at Bioinformatics online.[3]
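
One way to picture non-redundant quantification of multi-mapping sequences is sketched below in Python. This is an illustrative simplification, not the pipeline's actual algorithm: loci that share at least one sequence are merged into a group, and each sequence then contributes its count to exactly one group. The function name `count_non_redundant`, the input layout, and the locus names in the usage lines are hypothetical.

```python
from collections import defaultdict


def count_non_redundant(seq_counts, seq_loci):
    """Group loci that share at least one sequence, then count each sequence once.

    seq_counts: {sequence: read count}
    seq_loci:   {sequence: list of locus names the sequence maps to}
    Returns {frozenset of grouped loci: total read count}.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Loci hit by the same sequence end up in the same group.
    for loci in seq_loci.values():
        loci = list(loci)
        for locus in loci:
            find(locus)  # register the locus
        for other in loci[1:]:
            union(loci[0], other)

    members = defaultdict(set)
    for locus in parent:
        members[find(locus)].add(locus)

    totals = defaultdict(int)
    for seq, loci in seq_loci.items():
        loci = list(loci)
        if not loci:
            continue  # unmapped sequences are skipped
        group = frozenset(members[find(loci[0])])
        totals[group] += seq_counts.get(seq, 0)
    return dict(totals)


if __name__ == "__main__":
    counts = {"s1": 10, "s2": 5, "s3": 7}
    loci = {"s1": ["tRNA_a", "tRNA_b"], "s2": ["tRNA_b"], "s3": ["snoRNA_x"]}
    print(count_non_redundant(counts, loci))
```

With these toy inputs, counting per locus would tally 32 reads (s1 is counted at both tRNA loci), whereas the grouped totals sum to the 22 reads actually present.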


References