Rfam

Submitted by ChenLiang on Fri, 09/02/2016 - 21:59

MicroRNAs (miRNAs) are small noncoding RNA gene products about 22 nt long that are processed by Dicer from precursors with a characteristic hairpin secondary structure. Guidelines are presented for the identification and annotation of new miRNAs from diverse organisms, particularly so that miRNAs can be reliably distinguished from other RNAs such as small interfering RNAs. We describe specific criteria for the experimental verification of miRNAs, and conventions for naming miRNAs and miRNA genes. Finally, an online clearinghouse for miRNA gene name assignments is provided by the Rfam database of RNA families.[1]

Rfam is a comprehensive collection of non-coding RNA (ncRNA) families, represented by multiple sequence alignments and profile stochastic context-free grammars. Rfam aims to facilitate the identification and classification of new members of known sequence families, and distributes annotation of ncRNAs in over 200 complete genome sequences. The data provide the first glimpses of conservation of multiple ncRNA families across a wide taxonomic range. A small number of large families are essential in all three kingdoms of life, with large numbers of smaller families specific to certain taxa. Recent improvements in the database are discussed, together with challenges for the future. Rfam is available on the Web at http://www.sanger.ac.uk/Software/Rfam/ and http://rfam.wustl.edu/.[2]

The online encyclopedia Wikipedia has become one of the most important online references in the world and has a substantial and growing scientific content. A search of Google with many RNA-related keywords identifies a Wikipedia article as the top hit. We believe that the RNA community has an important and timely opportunity to maximize the content and quality of RNA information in Wikipedia. To this end, we have formed the RNA WikiProject (http://en.wikipedia.org/wiki/Wikipedia:WikiProject_RNA) as part of the larger Molecular and Cellular Biology WikiProject. We have created over 600 new Wikipedia articles describing families of noncoding RNAs based on the Rfam database, and invite the community to update, edit, and correct these articles. The Rfam database now redistributes this Wikipedia content as the primary textual annotation of its RNA families. Users can, therefore, for the first time, directly edit the content of one of the major RNA databases. We believe that this Wikipedia/Rfam link acts as a functioning model for incorporating community annotation into molecular biology databases.[3]

Rfam is a collection of RNA sequence families, represented by multiple sequence alignments and covariance models (CMs). The primary aim of Rfam is to annotate new members of known RNA families on nucleotide sequences, particularly complete genomes, using sensitive BLAST filters in combination with CMs. A minority of families with a very broad taxonomic range (e.g. tRNA and rRNA) provide the majority of the sequence annotations, whilst the majority of Rfam families (e.g. snoRNAs and miRNAs) have a limited taxonomic range and provide a limited number of annotations. Recent improvements to the website, methodologies and data used by Rfam are discussed. Rfam is freely available on the Web at http://rfam.sanger.ac.uk/ and http://rfam.janelia.org/.[4]
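As a rough illustration of why covariance models are useful for structured RNAs, the sketch below measures mutual information between two columns of a multiple sequence alignment. Base-paired positions tend to change together (for example G-C becoming A-U), and that signal is what a covariance model can exploit. This is only an intuition aid and not how Rfam or Infernal build or score CMs; the function name `mutual_information` and the toy alignment are hypothetical, invented for this example.

```python
from collections import Counter
from math import log2


def mutual_information(alignment, i, j):
    """Mutual information (in bits) between alignment columns i and j.

    `alignment` is a list of equal-length strings. This is an illustrative
    covariation statistic, not the scoring used by covariance models.
    """
    n = len(alignment)
    pair_counts = Counter((seq[i], seq[j]) for seq in alignment)
    left_counts = Counter(seq[i] for seq in alignment)
    right_counts = Counter(seq[j] for seq in alignment)

    mi = 0.0
    for (a, b), count in pair_counts.items():
        p_ab = count / n
        p_a = left_counts[a] / n
        p_b = right_counts[b] / n
        mi += p_ab * log2(p_ab / (p_a * p_b))
    return mi


# Toy, hand-made alignment (NOT Rfam data): the first and last columns
# always form a Watson-Crick pair, the middle columns never vary.
toy_alignment = [
    "GAAAC",
    "CAAAG",
    "AAAAU",
    "UAAAA",
]

print(mutual_information(toy_alignment, 0, 4))  # 2.0 (perfect covariation)
print(mutual_information(toy_alignment, 1, 2))  # 0.0 (invariant columns)
```

The paired columns reach the maximum of 2 bits for a four-letter alphabet, while the invariant columns carry no covariation signal even though they are perfectly conserved.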

The Rfam database aims to catalogue non-coding RNAs through the use of sequence alignments and statistical profile models known as covariance models. In this contribution, we discuss the pros and cons of using the online encyclopedia, Wikipedia, as a source of community-derived annotation. We discuss the addition of groupings of related RNA families into clans and new developments to the website. Rfam is available on the Web at http://rfam.sanger.ac.uk.[5]

The Rfam database (available via the website at http://rfam.sanger.ac.uk and through our mirror at http://rfam.janelia.org) is a collection of non-coding RNA families, primarily RNAs with a conserved RNA secondary structure, including both RNA genes and mRNA cis-regulatory elements. Each family is represented by a multiple sequence alignment, predicted secondary structure and covariance model. Here we discuss updates to the database in the latest release, Rfam 11.0, including the introduction of genome-based alignments for large families, the introduction of the Rfam Biomart as well as other user interface improvements. Rfam is available under the Creative Commons Zero license.[6]

The Rfam database (available at http://rfam.xfam.org) is a collection of non-coding RNA families represented by manually curated sequence alignments, consensus secondary structures and annotation gathered from corresponding Wikipedia, taxonomy and ontology resources. In this article, we detail updates and improvements to the Rfam data and website for the Rfam 12.0 release. We describe the upgrade of our search pipeline to use Infernal 1.1 and demonstrate its improved homology detection ability by comparison with the previous version. The new pipeline is easier for users to apply to their own data sets, and we illustrate its ability to annotate RNAs in genomic and metagenomic data sets of various sizes. Rfam has been expanded to include 260 new families, including the well-studied large subunit ribosomal RNA family, and for the first time includes information on short sequence- and structure-based RNA motifs present within families.[7]


References

  1. A uniform system for microRNA annotation.
    Ambros Victor, Bartel Bonnie, Bartel David P., Burge Christopher B., Carrington James C., Chen Xuemei, Dreyfuss Gideon, Eddy Sean R., Griffiths-Jones Sam, Marshall Mhairi, et al.
    RNA, 2003 Mar, Volume 9, Issue 3, p. 277-9 (2003)
  2. Rfam: annotating non-coding RNAs in complete genomes.
    Griffiths-Jones Sam, Moxon Simon, Marshall Mhairi, Khanna Ajay, Eddy Sean R., and Bateman Alex
    Nucleic Acids Res, 2005 Jan 1, Volume 33, Database issue, p. D121-4 (2005)
  3. The RNA WikiProject: community annotation of RNA families.
    Daub Jennifer, Gardner Paul P., Tate John, Ramsköld Daniel, Manske Magnus, Scott William G., Weinberg Zasha, Griffiths-Jones Sam, and Bateman Alex
    RNA, 2008 Dec, Volume 14, Issue 12, p. 2462-4 (2008)
  4. Rfam: updates to the RNA families database.
    Gardner Paul P., Daub Jennifer, Tate John G., Nawrocki Eric P., Kolbe Diana L., Lindgreen Stinus, Wilkinson Adam C., Finn Robert D., Griffiths-Jones Sam, Eddy Sean R., et al.
    Nucleic Acids Res, 2009 Jan, Volume 37, Database issue, p. D136-40 (2009)
  5. Rfam: Wikipedia, clans and the "decimal" release.
    Gardner Paul P., Daub Jennifer, Tate John, Moore Benjamin L., Osuch Isabelle H., Griffiths-Jones Sam, Finn Robert D., Nawrocki Eric P., Kolbe Diana L., Eddy Sean R., et al.
    Nucleic Acids Res, 2011 Jan, Volume 39, Database issue, p. D141-5 (2011)
  6. Rfam 11.0: 10 years of RNA families.
    Burge Sarah W., Daub Jennifer, Eberhardt Ruth, Tate John, Barquist Lars, Nawrocki Eric P., Eddy Sean R., Gardner Paul P., and Bateman Alex
    Nucleic Acids Res, 2013 Jan, Volume 41, Database issue, p. D226-32 (2013)
  7. Rfam 12.0: updates to the RNA families database.
    Nawrocki Eric P., Burge Sarah W., Bateman Alex, Daub Jennifer, Eberhardt Ruth Y., Eddy Sean R., Floden Evan W., Gardner Paul P., Jones Thomas A., Tate John, et al.
    Nucleic Acids Res, 2015 Jan, Volume 43, Database issue, p. D130-7 (2015)