Rat Strain Ontology: structured controlled vocabulary designed to facilitate access to strain data at RGD

J Biomed Semantics. 2013 Nov 22;4(1):36. doi: 10.1186/2041-1480-4-36.

Abstract

Background: The Rat Genome Database (RGD) (http://rgd.mcw.edu/) is the premier site for comprehensive data on the different strains of the laboratory rat (Rattus norvegicus). The strain data are collected from various publications, direct submissions from individual researchers, and rat providers worldwide. Rat strain and substrain designation and nomenclature follow the Guidelines for Nomenclature of Mouse and Rat Strains, instituted by the International Committee on Standardized Genetic Nomenclature for Mice. While symbols and names aid in identifying strains correctly, the flat nature of this information hinders search and retrieval, as well as other data mining functions. To improve these functions, particularly in ontology-based tools, the Rat Strain Ontology (RS) was developed.

Results: The Rat Strain Ontology (RS) reflects the breeding history, parental background, and genetic manipulation of rat strains. This controlled vocabulary organizes strains by type: inbred, outbred, chromosome altered, congenic, mutant and so on. In addition, under the chromosome altered category, strains are organized by chromosome, and further by type of manipulation, such as mutant or congenic. This allows users to easily retrieve strains of interest with modifications in specific genomic regions. The ontology was developed using the Open Biological and Biomedical Ontology (OBO) file format, and is structured as a directed acyclic graph (DAG). Rat Strain Ontology IDs are included as part of the strain report (RS: ######).
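The retrieval described above can be illustrated with a minimal sketch, assuming a simplified OBO-style excerpt in which each term lists its parents through is_a lines. The term IDs and names below are hypothetical placeholders, not entries from the actual RS file, and the helper names parse_obo and descendants are introduced only for this illustration. The sketch shows how a DAG lets one query return every strain beneath a category, including a strain that has more than one parent.

```python
from collections import defaultdict

# Hypothetical OBO-style excerpt: IDs and names are invented for illustration
# and are NOT taken from the real Rat Strain Ontology file.
OBO_TEXT = """\
[Term]
id: RS:9000001
name: rat strain

[Term]
id: RS:9000002
name: inbred strain
is_a: RS:9000001 ! rat strain

[Term]
id: RS:9000003
name: chromosome altered
is_a: RS:9000001 ! rat strain

[Term]
id: RS:9000004
name: chromosome 1
is_a: RS:9000003 ! chromosome altered

[Term]
id: RS:9000005
name: chromosome 1 congenic
is_a: RS:9000004 ! chromosome 1

[Term]
id: RS:9000006
name: example congenic strain
is_a: RS:9000005 ! chromosome 1 congenic
is_a: RS:9000002 ! inbred strain
"""


def parse_obo(text):
    """Parse [Term] stanzas into (names, children) lookup tables."""
    names = {}
    children = defaultdict(set)  # parent ID -> set of direct child IDs
    current = None
    in_term = False
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            # A new stanza begins; only [Term] stanzas are of interest here.
            in_term = line == "[Term]"
            current = None
        elif in_term and line.startswith("id: "):
            current = line[4:].strip()
        elif in_term and current and line.startswith("name: "):
            names[current] = line[6:].strip()
        elif in_term and current and line.startswith("is_a: "):
            # Drop the trailing "! comment" and keep only the parent ID.
            parent = line[6:].split("!")[0].strip()
            children[parent].add(current)
    return names, children


def descendants(term_id, children):
    """Collect every term reachable below term_id in the DAG."""
    seen = set()
    stack = [term_id]
    while stack:
        node = stack.pop()
        for child in children.get(node, ()):
            if child not in seen:  # a term with two parents is visited once
                seen.add(child)
                stack.append(child)
    return seen


names, children = parse_obo(OBO_TEXT)
for term in sorted(descendants("RS:9000003", children)):
    print(term, names[term])
```

Because the hypothetical strain RS:9000006 has two is_a parents, it is returned both when querying the chromosome altered branch and when querying the inbred branch, which is the property that distinguishes a DAG from a strict tree.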

Conclusions: As rat researchers are often unaware of the number of substrains or altered strains within a breeding line, this vocabulary now provides an easy way to retrieve all substrains and accompanying information. Its usefulness is particularly evident in tools such as PhenoMiner at RGD, where users can now easily retrieve phenotype measurement data for related strains, strains with similar backgrounds or those with similar introgressed regions. This controlled vocabulary also allows better retrieval and filtering for QTLs and in genomic tools such as the GViewer. The Rat Strain Ontology has been incorporated into the RGD Ontology Browser (http://rgd.mcw.edu/rgdweb/ontology/view.html?acc_id=RS:0000457#s) and is available through the National Center for Biomedical Ontology (http://bioportal.bioontology.org/ontologies/1150) or the RGD FTP site (ftp://rgd.mcw.edu/pub/ontology/rat_strain/).