Registry

The registry contains metadata about ontologies, controlled vocabularies, and resources including their preferred prefix, name, description, homepage, mappings to other registries, and more.
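
As a rough illustration of the kind of record described above, the following Python sketch models one registry entry and indexes entries by prefix. The class name `RegistryEntry`, its field names, the helper `index_by_prefix`, and the local identifier in the usage line are all hypothetical and chosen only for this example; they are not the registry's actual schema or API. Names, prefixes, and descriptions are copied from the listing, while homepages and mappings are left empty because the listing does not show them.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Optional


@dataclass
class RegistryEntry:
    """Hypothetical in-memory model of one registry record."""
    prefix: str                      # preferred prefix, e.g. "3dmet"
    name: str                        # human-readable name
    description: str = ""            # free-text description (may be empty)
    homepage: Optional[str] = None   # not shown in the listing, so left unset
    mappings: Dict[str, str] = field(default_factory=dict)  # other registry -> its prefix

    def curie(self, local_id: str) -> str:
        """Build a compact identifier of the form prefix:local_id."""
        return f"{self.prefix}:{local_id}"


def index_by_prefix(entries: Iterable[RegistryEntry]) -> Dict[str, RegistryEntry]:
    """Index entries by lower-cased prefix, rejecting duplicate prefixes."""
    index: Dict[str, RegistryEntry] = {}
    for entry in entries:
        key = entry.prefix.lower()
        if key in index:
            raise ValueError(f"duplicate prefix: {entry.prefix}")
        index[key] = entry
    return index


# A few rows taken from the listing; "ato" has no description there.
ENTRIES = [
    RegistryEntry("3dmet", "3D Metabolites",
                  "3DMET is a database collecting three-dimensional "
                  "structures of natural metabolites."),
    RegistryEntry("aao", "Amphibian gross anatomy",
                  "A structured controlled vocabulary of the anatomy of Amphibians."),
    RegistryEntry("ato", "Amphibian taxonomy"),
]

if __name__ == "__main__":
    registry = index_by_prefix(ENTRIES)
    # "0000001" is a made-up local identifier used only to show the format.
    print(registry["aao"].curie("0000001"))
```

Keying the index on the prefix reflects that the prefix, not the name, is the stable handle for a record: several rows in the listing share a description or have none at all.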

Name Prefix Description
3D Metabolites 3dmet 3DMET is a database collecting three-dimensional structures of natural metabolites.
4D Nucleome Data Portal Biosource 4dn.biosource The 4D Nucleome Data Portal hosts data generated by the 4DN Network and other reference nucleomics data sets. The 4D Nucleome Network aims to understand the principles underlying nuclear organization in space and time, the role nuclear organization plays in gene expression and cellular function, and how changes in nuclear organization affect normal development as well as various diseases.
4D Nucleome Data Portal Experiment Replicate 4dn.replicate Database portal containing replicate experiments of different assays and samples
Amphibian gross anatomy aao A structured controlled vocabulary of the anatomy of Amphibians. Note that AAO is currently being integrated into Uberon.
AntiBodies Chemically Defined database abcd The ABCD (AntiBodies Chemically Defined) database is a manually curated depository of sequenced antibodies
Annotated Regulatory Binding Sites abs The database of Annotated regulatory Binding Sites (from orthologous promoters), ABS, is a public database of known binding sites identified in promoters of orthologous vertebrate genes that have been manually curated from bibliography.
Activity Streams ac Activity Streams is an open format specification for activity stream protocols, which are used to syndicate activities taken in social web applications and services, similar to those in Facebook, Instagram, and Twitter. The standard provides a general way to represent activities.
Aceview Worm aceview.worm AceView provides a curated sequence representation of all public mRNA sequences (mRNAs from GenBank or RefSeq, and single pass cDNA sequences from dbEST and Trace). These are aligned on the genome and clustered into a minimal number of alternative transcript variants and grouped into genes. In addition, alternative features such as promoters and expression in tissues are recorded. This collection references C. elegans genes and expression.
Addgene Plasmid Repository addgene Addgene is a non-profit plasmid repository. Addgene facilitates the exchange of genetic material between laboratories by offering plasmids and their associated cloning data to not-for-profit laboratories around the world.
Animal natural history and life history adw Animal Diversity Web (ADW) is an online database of animal natural history, distribution, classification, and conservation biology.
Anatomical Entity Ontology aeo AEO is an ontology of anatomical structures that expands CARO, the Common Anatomy Reference Ontology
Adverse Event Reporting Ontology aero The Adverse Event Reporting Ontology (AERO) is an ontology aimed at supporting clinicians at the time of data entry, increasing quality and accuracy of reported adverse events
Affymetrix Probeset affy.probeset An Affymetrix ProbeSet is a collection of up to 11 short (~22 nucleotide) microarray probes designed to measure a single gene or a family of genes as a unit. Multiple probe sets may be available for each gene under consideration.
Allotrope Merged Ontology Suite afo Allotrope Merged Ontology Suite
Assembling the Fungal Tree of Life - Taxonomy aftol.taxonomy The Assembling the Fungal Tree of Life (AFTOL) project is dedicated to significantly enhancing our understanding of the evolution of the Kingdom Fungi, which represents one of the major clades of life. There are roughly 80,000 described species of Fungi, but the actual diversity in the group has been estimated to be about 1.5 million species.
Agricultural Online Access agricola AGRICOLA (AGRICultural OnLine Access) serves as the catalog and index to the collections of the National Agricultural Library, as well as a primary public source for world-wide access to agricultural information. The database covers materials in all formats and periods, including printed works from as far back as the 15th century.
Agronomy Ontology agro Ontology of agronomic practices, agronomic techniques, and agronomic variables used in agronomic experiments
Ontology for the Anatomy of the Insect SkeletoMuscular system aism The ontology for the Anatomy of the Insect SkeletoMuscular system (AISM) contains terms used to describe the cuticle - as a single anatomical structure - and the skeletal muscle system, to be used in insect biodiversity research.
Allergome allergome Allergome is a repository of data related to all IgE-binding compounds. Its purpose is to collect a list of allergenic sources and molecules by using the widest selection criteria and sources.
Alzforum Mutations alzforum.mutation Alzforum mutations is a repository of genes and rare variants associated with Alzheimer's disease.
AmoebaDB amoebadb AmoebaDB is one of the databases that can be accessed through the EuPathDB (http://EuPathDB.org; formerly ApiDB) portal, covering eukaryotic pathogens of the genera Cryptosporidium, Giardia, Leishmania, Neospora, Plasmodium, Toxoplasma, Trichomonas and Trypanosoma. While each of these groups is supported by a taxon-specific database built upon the same infrastructure, the EuPathDB portal offers an entry point to all these resources, and the opportunity to leverage orthology for searches across genera.
The Amphioxus Development and Anatomy Ontology amphx An ontology for the development and anatomy of Amphioxus (Branchiostoma lanceolatum).
Antibody Registry antibodyregistry The Antibody Registry provides identifiers for antibodies used in publications. It lists commercial antibodies from numerous vendors, each assigned with a unique identifier. Unlisted antibodies can be submitted by providing the catalog number and vendor information.
Ant Database antweb AntWeb is a website documenting the known species of ants, with records for each species that include pictures and link to its geographical distribution and life history.
AOPWiki aop International repository of Adverse Outcome Pathways.
AOPWiki (Key Event) aop.events International repository of Adverse Outcome Pathways.
AOPWiki (Key Event Relationship) aop.relationships International repository of Adverse Outcome Pathways.
AOPWiki (Stressor) aop.stressor International repository of Adverse Outcome Pathways.
Antimicrobial Peptide Database apd The antimicrobial peptide database (APD) provides information on anticancer, antiviral, antifungal and antibacterial peptides.
AphidBase Transcript aphidbase.transcript AphidBase is a centralized bioinformatic resource that was developed to facilitate community annotation of the pea aphid genome by the International Aphid Genomics Consortium (IAGC). The AphidBase Information System was designed to organize and distribute genomic data and annotations for a large international community. This collection references the transcript report, which describes genomic location, sequence and exon information.
APID Interactomes apid.interactions APID (Agile Protein Interactomes DataServer) provides information on the protein interactomes of numerous organisms, based on the integration of known experimentally validated protein-protein physical interactions (PPIs). Interactome data includes a report on quality levels and coverage over the proteomes for each organism included. APID integrates PPIs from primary databases of molecular interactions (BIND, BioGRID, DIP, HPRD, IntAct, MINT) and also from experimentally resolved 3D structures (PDB) where more than two distinct proteins have been identified. This collection references protein interactors, through a UniProt identifier.
Yeast phenotypes apo A structured controlled vocabulary for the phenotypes of Ascomycete fungi.
Apollo Structured Vocabulary apollosv Defines terms and relations necessary for interoperation between epidemic models and public health application software that interface with these models
ArachnoServer arachnoserver ArachnoServer (www.arachnoserver.org) is a manually curated database providing information on the sequence, structure and biological activity of protein toxins from spider venoms. It includes a molecular target ontology designed specifically for venom toxins, as well as current and historic taxonomic information.
Arabidopsis Information Portal araport Website with general information about Arabidopsis and functionalities such as a genomic viewer
Antibiotic Resistance Genes Database ardb The Antibiotic Resistance Genes Database (ARDB) is a manually curated database which characterises genes involved in antibiotic resistance. Each gene and resistance type is annotated with information, including resistance profile, mechanism of action, ontology, COG and CDD annotations, as well as external links to sequence and protein databases. This collection references resistance genes.
Archival Resource Key ark An Archival Resource Key (ARK) is a Uniform Resource Locator (URL) that is a multi-purpose persistent identifier for information objects of any type.
Antibiotic Resistance Ontology aro Antibiotic resistance genes and mutations
ArrayExpress arrayexpress ArrayExpress is a public repository for microarray data, which is aimed at storing MIAME-compliant data in accordance with Microarray Gene Expression Data (MGED) recommendations.
ArrayExpress Platform arrayexpress.platform ArrayExpress is a public repository for microarray data, which is aimed at storing MIAME-compliant data in accordance with Microarray Gene Expression Data (MGED) recommendations. This collection references the specific platforms used in the generation of experimental results.
ArrayMap arraymap arrayMap is a collection of pre-processed oncogenomic array data sets and CNA (somatic copy number aberrations) profiles. CNA are a type of mutation commonly found in cancer genomes. arrayMap data is assembled from public repositories and supplemented with additional sources, using custom curation pipelines. This information has been mapped to multiple editions of the reference human genome.
arXiv arxiv arXiv is an e-print service in the fields of physics, mathematics, non-linear science, computer science, and quantitative biology.
A Systematic Annotation Package for Community Analysis of Genomes asap ASAP (a systematic annotation package for community analysis of genomes) stores bacterial genome sequence and functional characterization data. It includes multiple genome sequences at various stages of analysis, corresponding experimental data and access to collections of related genome resources.
Astrophysics Source Code Library ascl The Astrophysics Source Code Library (ASCL) is a free online registry for software that has been used in research that has appeared in, or been submitted to, peer-reviewed publications. The ASCL is indexed by the SAO/NASA Astrophysics Data System (ADS) and Web of Science's Data Citation Index (WoS DCI), and is citable by using the unique ascl ID assigned to each code. The ascl ID can be used to link to the code entry by prefacing the number with ascl.net (e.g., ascl.net/1201.001).
Amazon Standard Identification Number asin Almost every product on our site has its own ASIN, a unique code we use to identify it. For books, the ASIN is the same as the ISBN number, but for all other products a new ASIN is created when the item is uploaded to our catalogue.
Aspergillus Genome Database aspgd.locus The Aspergillus Genome Database (AspGD) is a repository for information relating to fungi of the genus Aspergillus, which includes organisms of clinical, agricultural and industrial importance. AspGD facilitates comparative genomics by providing a full-featured genomics viewer, as well as matched and standardized sets of genomic information for the sequenced aspergilli. This collection references gene information.
AspGD Protein aspgd.protein The Aspergillus Genome Database (AspGD) is a repository for information relating to fungi of the genus Aspergillus, which includes organisms of clinical, agricultural and industrial importance. AspGD facilitates comparative genomics by providing a full-featured genomics viewer, as well as matched and standardized sets of genomic information for the sequenced aspergilli. This collection references protein information.
Arabidopsis Small RNA Project asrp Arabidopsis Small RNA Project is a repository of data on Arabidopsis small RNA genes.
Anatomical Therapeutic Chemical Classification System atc The Anatomical Therapeutic Chemical (ATC) classification system divides active substances into different groups according to the organ or system on which they act and their therapeutic, pharmacological and chemical properties. Drugs are classified in groups at five different levels: they are divided into fourteen main groups (1st level), with pharmacological/therapeutic subgroups (2nd level). The 3rd and 4th levels are chemical/pharmacological/therapeutic subgroups and the 5th level is the chemical substance. Together, the ATC classification system and the Defined Daily Dose (DDD) serve as a tool for exchanging and comparing data on drug use at international, national or local levels.
American Type Culture Collection atcc The American Type Culture Collection (ATCC) is a private, nonprofit biological resource center whose mission focuses on the acquisition, authentication, production, preservation, development and distribution of standard reference microorganisms, cell lines and other materials for research in the life sciences.
Anatomical Therapeutic Chemical Veterinary atcvet The ATCvet system for the classification of veterinary medicines is based on the same overall principles as the ATC system for substances used in human medicine. In the ATCvet system, preparations are divided into groups, according to their therapeutic use. First, they are divided into 15 anatomical groups (1st level), classified as QA-QV in the ATCvet system, on the basis of their main therapeutic use.
Animal TFDB Family atfdb.family The Animal Transcription Factor DataBase (AnimalTFDB) classifies TFs in sequenced animal genomes, as well as collecting the transcription co-factors and chromatin remodeling factors of those genomes. This collection refers to transcription factor families, and the species in which they are found.
Amphibian taxonomy ato
Animal Trait Ontology for Livestock atol ATOL (Animal Trait Ontology for Livestock) is an ontology of characteristics defining phenotypes of livestock in their environment (EOL). ATOL aims to: - provide a reference ontology of phenotypic traits of farm animals for the international scientific and educational communities, farmers, etc.; - deliver this reference ontology in a language which can be used by computers in order to support database management, semantic analysis and modeling; - represent traits as generically as possible for livestock vertebrates; - make the ATOL ontology as operational as possible and closely related to measurement techniques; - structure the ontology in relation to animal production.
AutDB autdb AutDB is a curated database for autism research. It is built on information extracted from the studies on molecular genetics and biology of Autism Spectrum Disorders (ASD). The four modules of AutDB include information on Human Genes, Animal models, Protein Interactions (PIN) and Copy Number Variants (CNV) respectively. It provides an annotated list of ASD candidate genes in the form of reference dataset for interrogating molecular mechanisms underlying the disorder.
Bacterial Diversity Metadatabase bacdive BacDive, the Bacterial Diversity Metadatabase, merges detailed strain-linked information on the different aspects of bacterial and archaeal biodiversity.
BacMap Biography bacmap.biog BacMap is an electronic, interactive atlas of fully sequenced bacterial genomes. It contains labeled, zoomable and searchable chromosome maps for sequenced prokaryotic (archaebacterial and eubacterial) species. Each map can be zoomed to the level of individual genes and each gene is hyperlinked to a richly annotated gene card. All bacterial genome maps are supplemented with separate prophage genome maps as well as separate tRNA and rRNA maps. Each bacterial chromosome entry in BacMap contains graphs and tables on a variety of gene and protein statistics. Likewise, every bacterial species entry contains a bacterial 'biography' card, with taxonomic details, phenotypic details, textual descriptions and images. This collection references 'biography' information.
BacMap Map bacmap.map BacMap is an electronic, interactive atlas of fully sequenced bacterial genomes. It contains labeled, zoomable and searchable chromosome maps for sequenced prokaryotic (archaebacterial and eubacterial) species. Each map can be zoomed to the level of individual genes and each gene is hyperlinked to a richly annotated gene card. All bacterial genome maps are supplemented with separate prophage genome maps as well as separate tRNA and rRNA maps. Each bacterial chromosome entry in BacMap contains graphs and tables on a variety of gene and protein statistics. Likewise, every bacterial species entry contains a bacterial 'biography' card, with taxonomic details, phenotypic details, textual descriptions and images. This collection references genome map information.
Bactibase: a database dedicated to bacteriocins bactibase Bactibase is a database describing the physical and chemical properties of bacteriocins from gram-negative and gram-positive bacteria.
Brain Architecture Knowledge Management System Neuroanatomical Ontology bams BAMS (Brain Architecture Management System) describes vertebrate neuroinformatics data at four levels of organization: expressed molecules, neuron types and classes, brain regions, and networks of brain regions.
BioAssay Ontology bao The BioAssay Ontology (BAO) describes chemical biology screening assays and their results including high-throughput screening (HTS) data for the purpose of categorizing assays and data analysis.
Beta Cell Genomics Ontology bcgo An application ontology built for beta cell genomics studies.
The Behaviour Change Intervention Ontology bcio The Behaviour Change Intervention Ontology is an ontology for all aspects of human behaviour change interventions and their evaluation.
Biological Collections Ontology bco An ontology to support the interoperability of biodiversity data, including data on museum collections, environmental/metagenomic samples, and ecological surveys.
Berkeley Drosophila Genome Project EST database bdgp.est The BDGP EST database collects the expressed sequence tags (ESTs) derived from a variety of tissues and developmental stages for Drosophila melanogaster. All BDGP ESTs are available at dbEST (NCBI).
BDGP insertion DB bdgp.insertion BDGP gene disruption collection provides a public resource of gene disruptions of Drosophila genes using a single transposable element.
Bloomington Drosophila Stock Center bdsc The Bloomington Drosophila Stock Center collects, maintains and distributes Drosophila melanogaster strains for research.
Tribolium Genome Database -- Insertion beetlebase BeetleBase is a comprehensive sequence database and community resource for Tribolium genetics, genomics and developmental biology. It incorporates information about genes, mutants, genetic markers, expressed sequence tags and publications.
Benchmark Energy & Geometry Database begdb The Benchmark Energy & Geometry Database (BEGDB) collects results of highly accurate quantum mechanics (QM) calculations of molecular structures, energies and properties. These data can serve as benchmarks for testing and parameterization of other computational methods.
Biological Expression Language bel The Biological Expression Language is a domain-specific language for describing causal, correlative, and associative relationships between a variety of biological agents.
Basic Formal Ontology bfo The upper level ontology upon which OBO Foundry ontologies are built.
Bgee family bgee.family Bgee is a database of gene expression patterns within particular anatomical structures within a species, and between different animal species. This collection refers to expression across species.
Bgee gene bgee.gene Bgee is a database to retrieve and compare gene expression patterns in multiple species, produced from multiple data types (RNA-Seq, Affymetrix, in situ hybridization, and EST data). This collection references genes in Bgee.
Bgee organ bgee.organ Bgee is a database of gene expression patterns within particular anatomical structures within a species, and between different animal species. This collection refers to anatomical structures.
Bgee stage bgee.stage Bgee is a database of gene expression patterns within particular anatomical structures within a species, and between different animal species. This collection refers to developmental stages.
BiGG Compartment bigg.compartment BiGG is a knowledgebase of Biochemically, Genetically and Genomically structured genome-scale metabolic network reconstructions. It integrates published genome-scale metabolic networks into a single database with a set of standardized identifiers called BiGG IDs. Genes in the BiGG models are mapped to NCBI genome annotations, and metabolites are linked to many external databases (KEGG, PubChem, and many more). This collection references model compartments.
BiGG Metabolite bigg.metabolite BiGG is a knowledgebase of Biochemically, Genetically and Genomically structured genome-scale metabolic network reconstructions. It integrates published genome-scale metabolic networks into a single database with a set of standardized identifiers called BiGG IDs. Genes in the BiGG models are mapped to NCBI genome annotations, and metabolites are linked to many external databases (KEGG, PubChem, and many more). This collection references individual metabolites.
BiGG Model bigg.model BiGG is a knowledgebase of Biochemically, Genetically and Genomically structured genome-scale metabolic network reconstructions. It integrates published genome-scale metabolic networks into a single database with a set of standardized identifiers called BiGG IDs. Genes in the BiGG models are mapped to NCBI genome annotations, and metabolites are linked to many external databases (KEGG, PubChem, and many more). This collection references individual models.
BiGG Reaction bigg.reaction BiGG is a knowledgebase of Biochemically, Genetically and Genomically structured genome-scale metabolic network reconstructions. It integrates published genome-scale metabolic networks into a single database with a set of standardized identifiers called BiGG IDs. Genes in the BiGG models are mapped to NCBI genome annotations, and metabolites are linked to many external databases (KEGG, PubChem, and many more). This collection references reactions.
Bilateria anatomy bila
BindingDB bindingdb BindingDB is the first public database of protein-small molecule affinity data.
BioCarta Pathway biocarta.pathway BioCarta is a supplier and distributor of characterized reagents and assays for biopharmaceutical and academic research. It catalogs community produced online maps depicting molecular relationships from areas of active research, generating classical pathways as well as suggestions for new pathways. This collection references pathway maps.
BioCatalogue Service biocatalogue.service The BioCatalogue provides a common interface for registering, browsing and annotating Web Services to the Life Science community. Registered services are monitored, allowing the identification of service problems and changes and the filtering-out of unavailable or unreliable resources. BioCatalogue is free to use, for all.
BioCyc collection of metabolic pathway databases biocyc BioCyc is a collection of Pathway/Genome Databases (PGDBs) which provides an electronic reference source on the genomes and metabolic pathways of sequenced organisms.
BioGRID biogrid BioGRID is a database of physical and genetic interactions in Saccharomyces cerevisiae, Caenorhabditis elegans, Drosophila melanogaster, Homo sapiens, and Schizosaccharomyces pombe.
BioGRID Interactions biogrid.interaction BioGRID is a database of physical and genetic interactions in Saccharomyces cerevisiae, Caenorhabditis elegans, Drosophila melanogaster, Homo sapiens, and Schizosaccharomyces pombe.
BioLink Model biolink A high level datamodel of biological entities (genes, diseases, phenotypes, pathways, individuals, substances, etc) and their associations.
Bio-MINDER Tissue Database biominder Database of the dielectric properties of biological tissues.
BioModels Database biomodels.db BioModels Database is a data resource that allows biologists to store, search and retrieve published mathematical models of biological interests.
Kinetic Simulation Algorithm Ontology biomodels.kisao The Kinetic Simulation Algorithm Ontology (KiSAO) is an ontology that describes simulation algorithms and methods used for biological kinetic models, and the relationships between them. This provides a means to unambiguously refer to simulation algorithms when describing a simulation experiment.
Terminology for Description of Dynamics biomodels.teddy The Terminology for Description of Dynamics (TEDDY) is an ontology for dynamical behaviours, observable dynamical phenomena, and control elements of bio-models and biological systems in Systems Biology and Synthetic Biology.
SBML RDF Vocabulary biomodels.vocabulary Vocabulary used in the RDF representation of SBML models.
BioNumbers bionumbers BioNumbers is a database of key numerical information that may be used in molecular biology. Along with the numbers, it contains references to the original literature, useful comments, and related numeric data.
BioPortal bioportal BioPortal is an open repository of biomedical ontologies that provides access via Web services and Web browsers to ontologies developed in OWL, RDF, OBO format and Protégé frames. BioPortal functionality includes the ability to browse, search and visualize ontologies.
BioProject bioproject BioProject provides an organizational framework to access metadata about research projects and the data from the projects that are deposited into different databases. It provides information about a project’s scope, material, objectives, funding source and general relevance categories.
Bioregistry bioregistry The Bioregistry is an integrative meta-registry of biological databases, ontologies, and nomenclatures that is backed by an open database.
Bioregistry Collections bioregistry.collection Manually curated collections of resources stored in the bioregistry
bioRxiv biorxiv The bioRxiv is a preprint server for biology
BioSample biosample The BioSample Database stores information about biological samples used in molecular experiments, such as sequencing, gene expression or proteomics. It includes reference samples, such as cell lines, which are repeatedly used in experiments. Accession numbers for the reference samples will be exchanged with a similar database at NCBI, and DDBJ (Japan). Record access may be affected due to different release cycles and inter-institutional synchronisation.
biosimulations biosimulations BioSimulations is an open repository of simulation projects, including simulation experiments, their results, and data visualizations of their results. BioSimulations supports a broad range of model languages, modeling frameworks, simulation algorithms, and simulation software tools.
BioSimulators biosimulators BioSimulators is a registry of containerized simulation tools that support a common interface. The containers in BioSimulators support a range of modeling frameworks (e.g., logical, constraint-based, continuous kinetic, discrete kinetic), simulation algorithms (e.g., CVODE, FBA, SSA), and modeling formats (e.g., BNGL, SBML, SED-ML).
BioStudies database biostudies The BioStudies database holds descriptions of biological studies, links to data from these studies in other databases at EMBL-EBI or outside, as well as data that do not fit in the structured archives at EMBL-EBI. The database can accept a wide range of types of studies described via a simple format. It also enables manuscript authors to submit supplementary information and link to it from the publication.
BioSystems biosystems The NCBI BioSystems database centralizes and cross-links existing biological systems databases, increasing their utility and target audience by integrating their pathways and systems into NCBI resources.
BioTools biotools BioTools is a registry of databases and software with tools, services, and workflows for biological and biomedical research.
Biomedical Informatics Research Network Lexicon birnlex The BIRN Project lexicon will provide entities for data and database annotation for the BIRN project, covering anatomy, disease, data collection, project management and experimental design.
BitterDB Compound bitterdb.cpd BitterDB is a database of compounds reported to taste bitter to humans. The compounds can be searched by name, chemical structure, similarity to other bitter compounds, association with a particular human bitter taste receptor, and so on. The database also contains information on mutations in bitter taste receptors that were shown to influence receptor activation by bitter compounds. The aim of BitterDB is to facilitate studying the chemical features associated with bitterness. This collection references compounds.
BitterDB Receptor bitterdb.rec BitterDB is a database of compounds reported to taste bitter to humans. The compounds can be searched by name, chemical structure, similarity to other bitter compounds, association with a particular human bitter taste receptor, and so on. The database also contains information on mutations in bitter taste receptors that were shown to influence receptor activation by bitter compounds. The aim of BitterDB is to facilitate studying the chemical features associated with bitterness. This collection references receptors.
Biological Magnetic Resonance Data Bank bmrb BMRB collects, annotates, archives, and disseminates (worldwide in the public domain) the important spectral and quantitative data derived from NMR spectroscopic investigations of biological macromolecules and metabolites. The goal is to empower scientists in their analysis of the structure, dynamics, and chemistry of biological systems and to support further development of the field of biomolecular NMR spectroscopy.
NMR Restraints Grid bmrb.restraint The NMR Restraints Grid contains the original NMR data as collected for over 2500 protein and nucleic acid structures with corresponding PDB entries. In addition to the original restraints, most of the distance, dihedral angle and RDC restraint data (>85%) were parsed, and those in over 500 entries were converted and filtered. The converted and filtered data sets constitute the DOCR and FRED databases respectively.
Barcode of Life database bold.taxonomy The Barcode of Life Data System (BOLD) is an informatics workbench aiding the acquisition, storage, analysis and publication of DNA barcode records. The associated taxonomy browser shows the progress of DNA barcoding and provides sample collection site distribution and taxon occurrence information.
Gene Regulation Ontology bootstrep
Bio-Pesticides DataBase bpdb Database of biopesticides maintained by the University of Hertfordshire
BRENDA, The Comprehensive Enzyme Information System brenda BRENDA is a collection of enzyme functional data available to the scientific community. Data on enzyme function are extracted directly from the primary literature. The database covers information on classification and nomenclature, reaction and specificity, functional parameters, occurrence, enzyme structure and stability, mutants and enzyme engineering, preparation and isolation, the application of enzymes, and ligand-related data.
Broad Fungal Genome Initiative broad Magnaporthe grisea, the causal agent of rice blast disease, is one of the most devastating threats to food security worldwide and is a model organism for studying fungal phytopathogenicity and host-parasite interactions. The Magnaporthe comparative genomics database provides access to multiple fungal genomes from the Magnaporthaceae family to facilitate comparative analysis. As part of the Broad Fungal Genome Initiative, the Magnaporthe comparative project includes the finished M. oryzae (formerly M. grisea) genome, as well as the draft assemblies of Gaeumannomyces graminis var. tritici and M. poae.
Biosapiens Protein Feature Ontology bs SO is a collaborative ontology project for the definition of sequence features used in biological sequence annotation. It is part of the Open Biomedical Ontologies library.
Biological Spatial Ontology bspo An ontology for representing spatial concepts, anatomical axes, gradients, regions, planes, sides and surfaces. These concepts can be used at multiple biological scales and in a diversity of taxa, including plants, animals and fungi. The BSPO is used to provide a source of anatomical location descriptors for logically defining anatomical entity classes in anatomy ontologies.
BRENDA tissue / enzyme source bto The Brenda tissue ontology is a structured controlled vocabulary established to identify the source of an enzyme cited in the Brenda enzyme database. It comprises terms of tissues, cell lines, cell types and cell cultures from uni- and multicellular organisms.
BugBase Expt bugbase.expt BugBase is a MIAME-compliant microbial gene expression and comparative genomic database. It stores experimental annotation and multiple raw and analysed data formats, as well as protocols for bacterial microarray designs. This collection references microarray experiments.
BugBase Protocol bugbase.protocol BugBase is a MIAME-compliant microbial gene expression and comparative genomic database. It stores experimental annotation and multiple raw and analysed data formats, as well as protocols for bacterial microarray designs. This collection references design protocols.
Bacterial Tyrosine Kinase Database bykdb The bacterial tyrosine kinase database (BYKdb) collects sequences of putative and authentic bacterial tyrosine kinases, providing structural and functional information.
Common Access to Biological Resources and Information Project cabri CABRI (Common Access to Biotechnological Resources and Information) is an online service where users can search a number of European Biological Resource Centre catalogues. It lists the availability of a particular organism or genetic resource and defines the set of technical specifications and procedures which should be used to handle it.
Cancer Data Standards Registry and Repository cadsr The US National Cancer Institute (NCI) maintains and administers data elements, forms, models, and components of these items in a metadata registry referred to as the Cancer Data Standards Registry and Repository, or caDSR.
CALIPHO Group Ontology of Human Anatomy caloha This is a code repository for the SIB - Swiss Institute of Bioinformatics CALIPHO group neXtProt project, which is a comprehensive human-centric discovery platform that offers an integration of and navigation through protein-related data. CALIPHO is an interdisciplinary team which aims to use a variety of methodologies to help uncover the function of uncharacterized human proteins.
Continuous Automated Model Evaluation cameo The goal of the CAMEO (Continuous Automated Model EvaluatiOn) community project is to continuously evaluate the accuracy and reliability of protein structure prediction servers, offering scores on tertiary and quaternary structure prediction, model quality estimation, accessible surface area prediction, ligand binding site residue prediction and contact prediction services in a fully automated manner. These predictions are regularly compared against reference structures from PDB.
CAPS-DB caps CAPS-DB is a structural classification of helix-cappings or caps compiled from protein structures. The regions of the polypeptide chain immediately preceding or following an alpha-helix are known as Nt- and Ct cappings, respectively. Caps extracted from protein structures have been structurally classified based on geometry and conformation and organized in a tree-like hierarchical classification where the different levels correspond to different properties of the caps.
Common Anatomy Reference Ontology caro An upper level ontology to facilitate interoperability between existing anatomy ontologies for different species
CAS Chemical Registry cas CAS (Chemical Abstracts Service) is a division of the American Chemical Society and is the producer of comprehensive databases of chemical information.
CATH Protein Structural Domain Superfamily cath CATH is a classification of protein structural domains. We group protein domains into superfamilies when there is sufficient evidence they have diverged from a common ancestor. CATH can be used to predict structural and functional information directly from protein sequence.
CATH domain cath.domain The CATH database is a hierarchical domain classification of protein structures in the Protein Data Bank. Protein structures are classified using a combination of automated and manual procedures. There are four major levels in this hierarchy: Class (secondary structure classification, e.g. mostly alpha), Architecture (classification based on overall shape), Topology (fold family) and Homologous superfamily (protein domains which are thought to share a common ancestor). This collection is concerned with CATH domains.
CATH superfamily cath.superfamily The CATH database is a hierarchical domain classification of protein structures in the Protein Data Bank. Protein structures are classified using a combination of automated and manual procedures. There are four major levels in this hierarchy: Class (secondary structure classification, e.g. mostly alpha), Architecture (classification based on overall shape), Topology (fold family) and Homologous superfamily (protein domains which are thought to share a common ancestor). This collection is concerned with superfamily classification.
Animal Genome Cattle QTL cattleqtldb The Animal Quantitative Trait Loci (QTL) database (Animal QTLdb) is designed to house publicly all available QTL and single-nucleotide polymorphism/gene association data on livestock animal species. This collection references cattle QTLs.
Carbohydrate Active EnZYmes cazy The Carbohydrate-Active Enzyme (CAZy) database is a resource specialized in enzymes that build and break down complex carbohydrates and glycoconjugates. These enzymes are classified into families based on structural features.
Chinese Biological Abstracts cba CBA: http://www.cba.ac.cn/ and the Shanghai Institutes for Biological Sciences (SIBS at http://www.sibs.ac.cn/) provide EBI with citation data not available in MEDLINE.
The cBioPortal for Cancer Genomics cbioportal The cBioPortal for Cancer Genomics provides visualization, analysis and download of large-scale cancer genomics data sets.
CCDC Number ccdc The Cambridge Crystallographic Data Centre (CCDC) develops and maintains the Cambridge Structural Database, the world's most comprehensive archive of small-molecule crystal structure data. A CCDC Number is a unique identifier assigned to a dataset deposited with the CCDC.
Consensus CDS ccds The Consensus CDS (CCDS) project is a collaborative effort to identify a core set of human and mouse protein coding regions that are consistently annotated and of high quality. The CCDS set is calculated following coordinated whole genome annotation updates carried out by the NCBI, WTSI, and Ensembl. The long term goal is to support convergence towards a standard set of gene annotations.
Cancer Cell Line Encyclopedia (CCLE) Cells ccle Datasets around different cancer cell lines generated by the Broad Institute and Novartis
Cell Cycle Ontology cco The Cell Cycle Ontology is an application ontology that captures and integrates detailed knowledge on the cell cycle process.
Comparative Data Analysis Ontology cdao The Comparative Data Analysis Ontology (CDAO) provides a framework for understanding data in the context of evolutionary-comparative analysis. This comparative approach is used commonly in bioinformatics and other areas of biology to draw inferences from a comparison of differently evolved versions of something, such as differently evolved versions of a protein. In this kind of analysis, the things-to-be-compared typically are classes called 'OTUs' (Operational Taxonomic Units). The OTUs can represent biological species, but also may be drawn from higher or lower in a biological hierarchy, anywhere from molecules to communities. The features to be compared among OTUs are rendered in an entity-attribute-value model sometimes referred to as the 'character-state data model'. For a given character, such as 'beak length', each OTU has a state, such as 'short' or 'long'. The differences between states are understood to emerge by a historical process of evolutionary transitions in state, represented by a model (or rules) of transitions along with a phylogenetic tree. CDAO provides the framework for representing OTUs, trees, transformations, and characters. The representation of characters and transformations may depend on imported ontologies for a specific type of character.
Conserved Domain Database at NCBI cdd The Conserved Domain Database (CDD) is a collection of multiple sequence alignments and derived database search models, which represent protein domains conserved in molecular evolution.
Compositional Dietary Nutrition Ontology cdno
Canadian Drug Product Database cdpd The Canadian Drug Product Database (DPD) contains product specific information on drugs approved for use in Canada, and includes human pharmaceutical and biological drugs, veterinary drugs and disinfectant products. This information includes 'brand name', 'route of administration' and a Canadian 'Drug Identification Number' (DIN).
CellBank Australia cellbank.australia CellBank Australia collects novel cell lines, developed by Australian researchers, submits these cell lines to rigorous testing to confirm their integrity, and then distributes the cell lines to researchers throughout the world.
Cell Image Library cellimage The Cell: An Image Library™ is a freely accessible, public repository of reviewed and annotated images, videos, and animations of cells from a variety of organisms, showcasing cell architecture, intracellular functionalities, and both normal and abnormal processes.
Cellosaurus ID cellosaurus The Cellosaurus is a knowledge resource on cell lines. It attempts to describe all cell lines used in biomedical research. Its scope includes: Immortalized cell lines; naturally immortal cell lines (example: stem cell lines); finite life cell lines when those are distributed and used widely; vertebrate cell line with an emphasis on human, mouse and rat cell lines; and invertebrate (insects and ticks) cell lines. Its scope does not include primary cell lines (with the exception of the finite life cell lines described above) and plant cell lines.
Cell Version Control Repository cellrepo The Cell Version Control Repository is the single worldwide version control repository for engineered and natural cell lines
Cephalopod Ontology ceph An anatomical and developmental ontology for cephalopods
Candida Genome Database cgd The Candida Genome Database (CGD) provides access to genomic sequence data and manually curated functional information about genes and proteins of the human pathogen Candida albicans. It collects gene names and aliases, and assigns gene ontology terms to describe the molecular function, biological process, and subcellular localization of gene products.
Chicken Gene Nomenclature Consortium cgnc Nomenclature Consortium around Chicken genes (analogous to the HGNC for humans)
Coli Genetic Stock Center cgsc The CGSC Database of E. coli genetic information includes genotypes and reference information for the strains in the CGSC collection, the names, synonyms, properties, and map position for genes, gene product information, and information on specific mutations and references to primary literature.
CharProt charprot CharProt is a database of biochemically characterized proteins designed to support automated annotation pipelines. Entries are annotated with gene name, symbol and various controlled vocabulary terms, including Gene Ontology terms, Enzyme Commission number and TransportDB accession.
Chemical Entities of Biological Interest chebi Chemical Entities of Biological Interest (ChEBI) is a freely available dictionary of molecular entities focused on 'small' chemical compounds.
ChEMBL chembl ChEMBL is a database of bioactive compounds, their quantitative properties and bioactivities (binding constants, pharmacology and ADMET, etc). The data is abstracted and curated from the primary scientific literature.
ChEMBL ID chembl.compound ChEMBL is a database of bioactive compounds, their quantitative properties and bioactivities (binding constants, pharmacology and ADMET, etc). The data is abstracted and curated from the primary scientific literature.
ChEMBL target chembl.target ChEMBL is a database of bioactive compounds, their quantitative properties and bioactivities (binding constants, pharmacology and ADMET, etc). The data is abstracted and curated from the primary scientific literature.
ChemDB chemdb ChemDB is a chemical database containing commercially available small molecules, important for use as synthetic building blocks, probes in systems biology and as leads for the discovery of drugs and other useful compounds.
ChemIDplus chemidplus ChemIDplus is a web-based search system that provides access to structure and nomenclature authority files used for the identification of chemical substances cited in National Library of Medicine (NLM) databases. It also provides structure searching and direct links to many biomedical resources at NLM and on the Internet for chemicals of interest.
Chemical Information Ontology cheminf Includes terms for the descriptors commonly used in cheminformatics software applications and the algorithms which generate them.
ChemSpider ID chemspider ChemSpider is a collection of compound data from across the web, which aggregates chemical structures and their associated information into a single searchable repository entry. These entries are supplemented with additional properties, related information and links back to original data sources.
Animal Genome Chicken QTL chickenqtldb The Animal Quantitative Trait Loci (QTL) database (Animal QTLdb) is designed to house publicly all available QTL and single-nucleotide polymorphism/gene association data on livestock animal species. This collection references chicken QTLs.
ChEBI Integrated Role Ontology chiro CHEBI provides a distinct role hierarchy. Chemicals in the structural hierarchy are connected via a 'has role' relation. CHIRO provides links from these roles to useful other classes in other ontologies. This will allow direct connection between chemical structures (small molecules, drugs) and what they do. This could be formalized using 'capable of', in the same way Uberon and the Cell Ontology link structures to processes.
Chemical Methods Ontology chmo CHMO, the chemical methods ontology, describes methods used to collect data in chemical experiments (such as mass spectrometry and electron microscopy), to prepare and separate material for further analysis (such as sample ionisation, chromatography, and electrophoresis), and to synthesise materials (such as epitaxy and continuous vapour deposition). It also describes the instruments used in these experiments, such as mass spectrometers and chromatography columns. It is intended to be complementary to the Ontology for Biomedical Investigations (OBI).
Coronavirus Infectious Disease Ontology cido The Ontology of Coronavirus Infectious Disease (CIDO) is a community-driven open-source biomedical ontology in the area of coronavirus infectious disease. The CIDO is developed to provide standardized human- and computer-interpretable annotation and representation of various coronavirus infectious diseases, including their etiology, transmission, pathogenesis, diagnosis, prevention, and treatment.
Confidence Information Ontology cio An ontology to capture confidence information about annotations.
CiteXplore citexplore One of the precursors to the EuropePMC project. Now EuropePMC is able to resolve CiteXplore codes.
CIViC Assertion civic.aid A CIViC assertion classifies the clinical significance of a variant-disease relationship under recognized guidelines. The CIViC Assertion (AID) summarizes a collection of Evidence Items (EIDs) that covers predictive/therapeutic, diagnostic, prognostic or predisposing clinical information for a variant in a specific cancer context. CIViC currently has two main types of Assertions: those based on variants of primarily somatic origin (predictive/therapeutic, prognostic, and diagnostic) and those based on variants of primarily germline origin (predisposing). When the number and quality of Predictive, Prognostic, Diagnostic or Predisposing Evidence Items (EIDs) in CIViC sufficiently cover what is known for a particular variant and cancer type, then a corresponding assertion can be created in CIViC.
CIViC Evidence civic.eid Evidence Items are the central building block of the Clinical Interpretation of Variants in Cancer (CIViC) knowledgebase. The clinical Evidence Item is a piece of information that has been manually curated from trustable medical literature about a Variant or genomic ‘event’ that has implications in cancer Predisposition, Diagnosis (aka molecular classification), Prognosis, Predictive response to therapy, Oncogenicity or protein Function. For example, an Evidence Item might describe a line of evidence supporting the notion that tumors with a somatic BRAF V600 mutation generally respond well to the drug dabrafenib. A Variant may be a single nucleotide substitution, a small insertion or deletion, an RNA gene fusion, a chromosomal rearrangement, an RNA expression pattern (e.g. over-expression), etc. Each clinical Evidence statement corresponds to a single citable Source (a publication or conference abstract).
Cell Ontology cl The Cell Ontology is designed as a structured controlled vocabulary for cell types. The ontology was constructed for use by the model organism and other bioinformatics databases, incorporating cell types from prokaryotes to mammals, and includes plants and fungi.
Collembola Anatomy Ontology clao CLAO is an ontology of anatomical terms employed in morphological descriptions for the Class Collembola (Arthropoda: Hexapoda).
ClassyFire classyfire ClassyFire is a web-based application for automated structural classification of chemical entities. This application uses a rule-based approach that relies on a comprehensible, comprehensive, and computable chemical taxonomy. ClassyFire provides a hierarchical chemical classification of chemical entities (mostly small molecules and short peptide sequences), as well as a structure-based textual description, based on a chemical taxonomy named ChemOnt, which covers 4825 chemical classes of organic and inorganic compounds. Moreover, ClassyFire allows for text-based search via its web interface. It can be accessed via the web interface or via the ClassyFire API.
Cell Line Database cldb The Cell Line Data Base (CLDB) is a reference information source for human and animal cell lines. It provides the characteristics of the cell lines and their availability through distributors, allowing cell line requests to be made from collections and laboratories.
ClinGen Allele Registry clingene The allele registry provides and maintains identifiers for genetic variants
ClinicalTrials.gov clinicaltrials ClinicalTrials.gov provides free access to information on clinical studies for a wide range of diseases and conditions. Studies listed in the database are conducted in 175 countries
ClinVar Variation ID clinvar ClinVar archives reports of relationships among medically important variants and phenotypes. It records human variation, interpretations of the relationship of specific variations to human health, and supporting evidence for each interpretation. Each ClinVar record (RCV identifier) represents an aggregated view of interpretations of the same variation and condition from one or more submitters. Submissions for individual variation/phenotype combinations (SCV identifier) are also collected and made available separately. This collection references the Variant identifier.
ClinVar Record clinvar.record ClinVar archives reports of relationships among medically important variants and phenotypes. It records human variation, interpretations of the relationship of specific variations to human health, and supporting evidence for each interpretation. Each ClinVar record (RCV identifier) represents an aggregated view of interpretations of the same variation and condition from one or more submitters. Submissions for individual variation/phenotype combinations (SCV identifier) are also collected and made available separately. This collection references the Record Report, based on RCV accession.
ClinVar Submission clinvar.submission ClinVar archives reports of relationships among medically important variants and phenotypes. It records human variation, interpretations of the relationship of specific variations to human health, and supporting evidence for each interpretation. Each ClinVar record (RCV identifier) represents an aggregated view of interpretations of the same variation and condition from one or more submitters. Submissions for individual variation/phenotype combinations (SCV identifier) are also collected and made available separately. This collection references submissions, and is based on SCV accession.
ClinVar Submitter clinvar.submitter ClinVar archives reports of relationships among medically important variants and phenotypes. It records human variation, interpretations of the relationship of specific variations to human health, and supporting evidence for each interpretation. Each ClinVar record (RCV identifier) represents an aggregated view of interpretations of the same variation and condition from one or more submitters (Submitter IDs). Submissions for individual variation/phenotype combinations (SCV identifier) are also collected and made available separately. This collection references submitters (submitter IDs) that submit the submissions (SCVs).
Cell Line Ontology clo The Cell Line Ontology (CLO) is a community-based ontology of cell lines. The CLO is developed to unify publicly available cell line entry data from multiple sources to a standardized logically defined format based on consensus design patterns.
Clytia hemisphaerica Development and Anatomy Ontology clyh Anatomy, development and life cycle stages - planula, polyp, medusa/jellyfish - of the cnidarian hydrozoan species, Clytia hemisphaerica.
CranioMaxilloFacial ontology cmf This ontology represents the clinical findings and procedures used in the oral and maxillo-facial surgical domain
Clinical measurement ontology cmo Morphological and physiological measurement records generated from clinical and model organism research and health programs.
Cellular Microscopy Phenotype Ontology cmpo CMPO is a species neutral ontology for describing general phenotypic observations relating to the whole cell, cellular components, cellular processes and cell populations.
Rice ontology co_320 Rice Trait Dictionary in template v 5.0 - IRRI - March 2016 - Based on SES, RD, UPOV variables and on variables used by CIAT, FLAR and the GRISP Phenotyping Network variables
Wheat ontology co_321 Sept 2020
Maize ontology co_322 Maize Trait Dictionary in template 5 - CIMMYT- September 2016
Barley ontology co_323 ICARDA - TDv5 - Sept 2018
Sorghum ontology co_324 Sorghum TDv5 March 2021
Banana ontology co_325 Banana Trait Dictionary in template 5 - Bioversity & IITA - April 2019
Pearl millet ontology co_327 Pearl millet Trait Dictionary in template 5 - ICRISAT/INERA - April 2016
Potato ontology co_330 CIP - potato ontology - November 2020
Sweet Potato ontology co_331 Sweet Potato Trait Dictionary in template v5 - CIP - November 2019
Beet Ontology ontology co_333 This ontology was built as part of the AKER project. It describes variables used in beet phenotyping (experimental properties and measurement scale) for each institution (INRAE, Geves, ITB) and breeding companies (Florimond Desprez). Curator: Dorothee Charruaud (ADRINORD - URGI) Daphne Verdelet (Florimond Desprez) - First submission in November 2017.
Cassava ontology co_334 Cassava Trait Dictionary in template 5 - IITA - July 2015, updated in February 2016
Common Bean ontology co_335 CIAT Common bean trait dictionary - version August 2014
Soybean ontology co_336 Soybean Trait Dictionary in template v5 - IITA - July 2015
Groundnut ontology co_337 Groundnut Trait Dictionary in template v5 - ICRISAT/ISRA/DARS/USDA-ARS - Sept 2019
Chickpea ontology co_338 Chickpea Trait Dictionary in template v5 - ICRISAT - July 2015
Lentil ontology co_339 Lentil Trait Dictionary in template v5 - ICARDA - July 2015
Cowpea ontology co_340 Cowpea Trait Dictionary in template v5 - IITA - August 2015
Pigeonpea ontology co_341 Pigeonpea Trait Dictionary in template v5 - ICRISAT - July 2015
Yam ontology co_343 version 2019 - pvs
Brachiaria ontology co_345 Brachiaria (forages) ontology TD v5 - Version Oct 2016
Mungbean ontology co_346 Oct 2016
Castor bean ontology co_347 March 2017 version
Brassica ontology co_348 Brassica Trait Ontology (BRaTO) hosts trait information to describe brassica crop data. Terms are collected from various projects including OREGIN, RIPR (UK) and Rapsodyn (France). BRATO development is conducted by Earlham Institute (UK), Southern Cross University (Australia) and INRA (France).
Oat ontology co_350 Oat trait dictionary started by Oat Global (http://oatglobal.org/) and improved by NIAB and PepsiCo
Vitis ontology co_356 Grape Ontology including OIV and bioversity descriptors. INRAE Jan 2021
Woody Plant Ontology ontology co_357 This ontology lists all variables used for woody plant observations. Terms are collected from various sources (past and ongoing projects at national and international levels). Curators: Celia Michotey (INRAE) & Ines Chaves (IBET) - Version 2 submitted on Jun 2020 by INRAE.
Cotton ontology co_358 Cotton ontology from CottonGen database - June 2019
Sunflower ontology co_359 December 2019
Sugar Kelp trait ontology co_360 Sugar Kelp trait ontology
Fababean ontology co_365 developed by ICARDA - Dec 2018
Bambara groundnut ontology co_366 version Dec 2019
Core Ontology for Biology and Biomedicine cob COB brings together key terms from a wide range of OBO projects to improve interoperability.
Cluster of orthologous genes cog COGs stands for Clusters of Orthologous Genes. The database was initially created in 1997 (Tatusov et al., PMID: 9381173) followed by several updates, most recently in 2014 (Galperin et al., PMID: 25428365). The current update includes complete genomes of 1,187 bacteria and 122 archaea that map into 1,234 genera. The new features include ~250 updated COG annotations with corresponding references and PDB links, where available; new COGs for proteins involved in CRISPR-Cas immunity, sporulation, and photosynthesis, and the lists of COGs grouped by pathways and functional systems.
COG Categories cog.category Higher-level classifications of COG Pathways
COG Pathways cog.pathway Database of Clusters of Orthologous Genes grouped by pathways and functional systems. It includes the complete genomes of 1,187 bacteria and 122 archaea that map into 1,234 genera.
MIMIC III Database cohd MIMIC-III is a dataset comprising health-related data associated with over 40,000 patients who stayed in critical care units of the Beth Israel Deaconess Medical Center between 2001 and 2012
COMBINE specifications combine.specifications The 'COmputational Modeling in BIology' NEtwork (COMBINE) is an initiative to coordinate the development of the various community standards and formats for computational models, initially in Systems Biology and related fields. This collection pertains to specifications of the standard formats developed by the Computational Modeling in Biology Network.
The Bioinorganic Motif Database come COMe (Co-Ordination of Metals) at the EBI represents an ontology for bioinorganic and other small molecule centres in complex proteins, using a classification system based on the concept of a bioinorganic motif.
Complex Portal accession ID complexportal A database that describes manually curated macromolecular complexes and provides links to details about these complexes in other databases.
CompTox Chemistry Dashboard comptox The Chemistry Dashboard is a part of a suite of databases and web applications developed by the US Environmental Protection Agency's Chemical Safety for Sustainability Research Program. These databases and apps support EPA's computational toxicology research efforts to develop innovative methods to change how chemicals are currently evaluated for potential health risks.
Compluyeast-2D-DB compulyeast Compluyeast-2D-DB is a two-dimensional polyacrylamide gel electrophoresis federated database. This collection references a subset of Uniprot, and contains general information about the protein record.
ConoServer conoserver ConoServer is a database specialized in the sequence and structures of conopeptides, which are peptides expressed by carnivorous marine cone snails.
Curation of Neurodegeneration Supporting Ontology conso An ontology describing phenomena encountered in the literature surrounding neurodegenerative diseases such as Alzheimer's disease, Parkinson's disease, Huntington's disease, tauopathies, and related protein aggregation diseases.
Coriell Institute for Medical Research coriell The Coriell Cell Repositories provide essential research reagents to the scientific community by establishing, verifying, maintaining, and distributing cell cultures and DNA derived from cell cultures. These collections, supported by funds from the National Institutes of Health (NIH) and several foundations, are extensively utilized by research scientists around the world.
CorrDB corrdb A genetic correlation is the proportion of shared variance between two traits that is due to genetic causes; a phenotypic correlation is the degree to which two traits co-vary among individuals in a population. In the genomics era, while gene expression, genetic association, and network analysis provide unprecedented means to decode the genetic basis of complex phenotypes, it is important to recognize the possible effects genetic progress in one trait can have on other traits. This database is designed to collect all published livestock genetic/phenotypic trait correlation data, aimed at facilitating genetic network analysis or systems biology studies.
CORUM - the Comprehensive Resource of Mammalian protein complexes corum The CORUM database provides a resource of manually annotated protein complexes from mammalian organisms. Annotation includes protein complex function, localization, subunit composition, literature references and more. All information is obtained from individual experiments published in scientific articles; data from high-throughput experiments is excluded.
COSMIC Gene cosmic COSMIC is a comprehensive global resource for information on somatic mutations in human cancer, combining curation of the scientific literature with tumor resequencing data from the Cancer Genome Project at the Sanger Institute, U.K. This collection references genes.
COSMIC Cell Lines cosmic.cell COSMIC, the Catalogue Of Somatic Mutations In Cancer, is the world's largest and most comprehensive resource for exploring the impact of somatic mutations in human cancer
COVID-19 Ontology covid19 Curated contextual database gathering samples related to SARS-CoV-2 virus and covid-19 disease.
CoVoc Coronavirus Vocabulary covoc The COVID-19 Vocabulary (COVoc) is an ontology containing terms related to the research of the COVID-19 pandemic. This includes host organisms, pathogenicity, gene and gene products, barrier gestures, treatments and more.
Cellular Phenotypes cp
Cooperative Patent Classification cpc The Cooperative Patent Classification (CPC) is a patent classification system, developed jointly by the European Patent Office (EPO) and the United States Patent and Trademark Office (USPTO). It is based on the previous European classification system (ECLA), which itself was a version of the International Patent Classification (IPC) system. The CPC patent classification system has been used by EPO and USPTO since 1st January, 2013.
CRISPRdb crisprdb Repeated CRISPR ("clustered regularly interspaced short palindromic repeats") elements found in archaebacteria and eubacteria are believed to defend against viral infection, potentially targeting invading DNA for degradation. CRISPRdb is a database that stores information on CRISPRs that are automatically extracted from newly released genome sequence data.
Contributor Role Ontology cro A classification of the diverse roles performed in the work leading to a published research output in the sciences. Its purpose is to provide transparency in contributions to scholarly published work, to enable improved systems of attribution, credit, and accountability.
Cryo Electron Microscopy ontology cryoem Ontology that describes data types and image processing operations in Cryo Electron Microscopy of Single Particles
CryptoDB cryptodb CryptoDB is one of the databases that can be accessed through the EuPathDB (http://EuPathDB.org; formerly ApiDB) portal, covering eukaryotic pathogens of the genera Cryptosporidium, Giardia, Leishmania, Neospora, Plasmodium, Toxoplasma, Trichomonas and Trypanosoma. While each of these groups is supported by a taxon-specific database built upon the same infrastructure, the EuPathDB portal offers an entry point to all these resources, and the opportunity to leverage orthology for searches across genera.
Catalytic Site Atlas csa The Catalytic Site Atlas (CSA) is a database documenting enzyme active sites and catalytic residues in enzymes of 3D structure. It uses a defined classification for catalytic residues which includes only those residues thought to be directly involved in some aspect of the reaction catalysed by an enzyme.
Cambridge Structural Database csd The Cambridge Structural Database (CSD) is the world's most comprehensive collection of small-molecule crystal structures. Entries curated into the CSD are identified by a CSD Refcode.
Computer Retrieval of Information on Science Projects (CRISP) Thesaurus csp
Cancer Staging Terms cst Cell Signaling Technology is a commercial organisation which provides a pathway portal to showcase their phospho-antibody products. This collection references pathways.
Cell Signaling Technology Antibody cst.ab Cell Signaling Technology is a commercial organisation which provides a pathway portal to showcase their phospho-antibody products. This collection references antibody products.
CTD Chemical ctd.chemical The Comparative Toxicogenomics Database (CTD) presents scientifically reviewed and curated information on chemicals, relevant genes and proteins, and their interactions in vertebrates and invertebrates. It integrates sequence, reference, species, microarray, and general toxicology information to provide a unique centralized resource for toxicogenomic research. The database also provides visualization capabilities that enable cross-species comparisons of gene and protein sequences.
CTD Disease ctd.disease The Comparative Toxicogenomics Database (CTD) presents scientifically reviewed and curated information on chemicals, relevant genes and proteins, and their interactions in vertebrates and invertebrates. It integrates sequence, reference, species, microarray, and general toxicology information to provide a unique centralized resource for toxicogenomic research. The database also provides visualization capabilities that enable cross-species comparisons of gene and protein sequences.
CTD Gene ctd.gene The Comparative Toxicogenomics Database (CTD) presents scientifically reviewed and curated information on chemicals, relevant genes and proteins, and their interactions in vertebrates and invertebrates. It integrates sequence, reference, species, microarray, and general toxicology information to provide a unique centralized resource for toxicogenomic research. The database also provides visualization capabilities that enable cross-species comparisons of gene and protein sequences.
Ctenophore Ontology cteno An anatomical and developmental ontology for ctenophores (Comb Jellies)
CTO: Core Ontology of Clinical Trials cto The core Ontology of Clinical Trials (CTO) will serve as a structured resource integrating basic terms and concepts in the context of clinical trials, thereby covering clinicaltrials.gov. CoreCTO will serve as a basic ontology to generate extended versions for specific applications such as annotation of variables in study documents from clinical trials.
Cube db cubedb Cube-DB is a database of pre-evaluated results for detection of functional divergence in human/vertebrate protein families. It analyzes comparable taxonomical samples for all paralogues under consideration, storing functional specialisation at the level of residues. The data are presented as a table of per-residue scores, and mapped onto related structures where available.
Cardiovascular Disease Ontology cvdo An ontology to describe entities related to cardiovascular diseases
DataONE d1id DataONE provides infrastructure facilitating long-term access to scientific research data of relevance to the earth sciences.
DailyMed dailymed DailyMed provides information about marketed drugs. This information includes FDA labels (package inserts). The Web site provides a standard, comprehensive, up-to-date, look-up and download resource of medication content and labeling as found in medication package inserts. Drug labeling is the most recent submitted to the Food and Drug Administration (FDA) and currently in use; it may include, for example, strengthened warnings undergoing FDA review or minor editorial changes. These labels have been reformatted to make them easier to read.
DANDI: Distributed Archives for Neurophysiology Data Integration dandi DANDI works with BICCN and other BRAIN Initiative awardees to curate data using community data standards such as NWB and BIDS, and to make data and software for cellular neurophysiology FAIR (Findable, Accessible, Interoperable, and Reusable). DANDI references electrical and optical cellular neurophysiology recordings and associated MRI and/or optical imaging data. These data will help scientists uncover and understand cellular level mechanisms of brain function. Scientists will study the formation of neural networks, how cells and networks enable functions such as learning and memory, and how these functions are disrupted in neurological disorders.
Database of Aligned Ribosomal Complexes darc DARC (Database of Aligned Ribosomal Complexes) stores available cryo-EM (electron microscopy) data and atomic coordinates of ribosomal particles from the PDB, which are aligned within a common coordinate system. The aligned coordinate system simplifies direct visualization of conformational changes in the ribosome, such as subunit rotation and head-swiveling, as well as direct comparison of bound ligands, such as antibiotics or translation factors.
Database of small human noncoding RNAs dashr DASHR reports the annotation, expression and evidence for specific RNA processing (cleavage specificity scores/entropy) of human sncRNA genes, precursor and mature sncRNA products across different human tissues and cell types. DASHR integrates information from multiple existing annotation resources for small non-coding RNAs, including microRNAs (miRNAs), Piwi-interacting (piRNAs), small nuclear (snRNAs), nucleolar (snoRNAs), cytoplasmic (scRNAs), transfer (tRNAs), tRNA fragments (tRFs), and ribosomal RNAs (rRNAs). These datasets were obtained from non-diseased human tissues and cell types and were generated for studying or profiling small non-coding RNAs. This collection references RNA records.
DASHR expression dashr.expression DASHR reports the annotation, expression and evidence for specific RNA processing (cleavage specificity scores/entropy) of human sncRNA genes, precursor and mature sncRNA products across different human tissues and cell types. DASHR integrates information from multiple existing annotation resources for small non-coding RNAs, including microRNAs (miRNAs), Piwi-interacting (piRNAs), small nuclear (snRNAs), nucleolar (snoRNAs), cytoplasmic (scRNAs), transfer (tRNAs), tRNA fragments (tRFs), and ribosomal RNAs (rRNAs). These datasets were obtained from non-diseased human tissues and cell types and were generated for studying or profiling small non-coding RNAs. This collection references RNA expression.
Datanator Gene datanator.gene Datanator is an integrated database of genomic and biochemical data designed to help investigators find data about specific molecules and reactions in specific organisms and specific environments for meta-analyses and mechanistic models. Datanator currently includes metabolite concentrations, RNA modifications and half-lives, protein abundances and modifications, and reaction kinetics integrated from several databases and numerous publications. The Datanator website and REST API provide tools for extracting clouds of data about specific molecules and reactions in specific organisms and specific environments, as well as data about similar molecules and reactions in taxonomically similar organisms.
Datanator Metabolite datanator.metabolite Datanator is an integrated database of genomic and biochemical data designed to help investigators find data about specific molecules and reactions in specific organisms and specific environments for meta-analyses and mechanistic models. Datanator currently includes metabolite concentrations, RNA modifications and half-lives, protein abundances and modifications, and reaction kinetics integrated from several databases and numerous publications. The Datanator website and REST API provide tools for extracting clouds of data about specific molecules and reactions in specific organisms and specific environments, as well as data about similar molecules and reactions in taxonomically similar organisms.
Datanator Reaction datanator.reaction Datanator is an integrated database of genomic and biochemical data designed to help investigators find data about specific molecules and reactions in specific organisms and specific environments for meta-analyses and mechanistic models. Datanator currently includes metabolite concentrations, RNA modifications and half-lives, protein abundances and modifications, and reaction kinetics integrated from several databases and numerous publications. The Datanator website and REST API provide tools for extracting clouds of data about specific molecules and reactions in specific organisms and specific environments, as well as data about similar molecules and reactions in taxonomically similar organisms.
Database of Arabidopsis Transcription Factors datf DATF contains known and predicted Arabidopsis transcription factors (1827 genes in 56 families) with the unique information of 1177 cloned sequences and many other features including 3D structure templates, EST expression information, transcription factor binding sites and nuclear location signals.
Transcription Factor Database dbd The DBD (transcription factor database) provides genome-wide transcription factor predictions for organisms across the tree of life. The prediction method identifies sequence-specific DNA-binding transcription factors through homology using profile hidden Markov models (HMMs) of domains from Pfam and SUPERFAMILY. It does not include basal transcription factors or chromatin-associated proteins.
EST database maintained at the NCBI. dbest The dbEST contains sequence data and other information on "single-pass" cDNA sequences, or "Expressed Sequence Tags", from a number of organisms.
DBG2 Introns dbg2introns The Database for Bacterial Group II Introns provides a catalogue of full-length, non-redundant group II introns present in bacterial DNA sequences in GenBank.
Database of Genotypes and Phenotypes dbgap The database of Genotypes and Phenotypes (dbGaP) archives and distributes the results of studies that have investigated the interaction of genotype and phenotype.
Database of human Major Histocompatibility Complex dbmhc
NCBI Probe database Public registry of nucleic acid reagents dbprobe The NCBI Probe Database is a public registry of nucleic acid reagents designed for use in a wide variety of biomedical research applications, together with information on reagent distributors, probe effectiveness, and computed sequence similarities.
NCBI dbSNP dbsnp The dbSNP database is a repository for both single base nucleotide substitutions and short deletion and insertion polymorphisms.
Disease Class dc
Dendritic cell dc_cl
Data Catalog dcat DCAT is an RDF vocabulary designed to facilitate interoperability between data catalogs published on the Web
Dublin Core Metadata Vocabulary dcterms This document is an up-to-date specification of all metadata terms maintained by the Dublin Core Metadata Initiative, including properties, vocabulary encoding schemes, syntax encoding schemes, and classes.
Dublin Core Types dctypes This document is an up-to-date specification of all metadata terms maintained by the Dublin Core Metadata Initiative, including properties, vocabulary encoding schemes, syntax encoding schemes, and classes.
Dictyostelium discoideum anatomy ddanat A structured controlled vocabulary of the anatomy of the slime-mold Dictyostelium discoideum
Dictyostelium discoideum phenotype ontology ddpheno A structured controlled vocabulary of phenotypes of the slime-mould Dictyostelium discoideum.
Degradome Database degradome The Degradome Database contains information on the complete set of predicted proteases present in a variety of mammalian species that have been subjected to whole genome sequencing. Each protease sequence is curated and, when necessary, cloned and sequenced.
DepMap Cell Lines depmap Cell lines used in the Dependency Map (DepMap). Highly related to CCLE Cells.
Human Dephosphorylation Database depod The human DEPhOsphorylation Database (DEPOD) contains information on known human active phosphatases and their experimentally verified protein and nonprotein substrates. Reliability scores are provided for dephosphorylation interactions, according to the type of assay used, as well as the number of laboratories that have confirmed such interaction. Phosphatase and substrate entries are listed along with the dephosphorylation site, bioassay type, and original literature, and contain links to other resources.
Human Dermatological Disease Ontology dermo DermO is an ontology with broad coverage of the domain of dermatologic disease and we demonstrate here its utility for text mining and investigation of phenotypic relationships between dermatologic disorders
Development Data Object Service dev.ga4ghdos Assists in resolving data across cloud resources.
BioData Catalyst dg.4503 Full implementation of the DRS 1.1 standard with support for persistent identifiers. Open source DRS server that follows the Gen3 implementation. Gen3 is a GA4GH compliant open source platform for developing framework services and data commons. Data commons accelerate and democratize the process of scientific discovery, especially over large or complex datasets. Gen3 is maintained by the Center for Translational Data Science at the University of Chicago. https://gen3.org
NCI Data Commons Framework Services dg.4dfc DRS server that follows the Gen3 implementation. Gen3 is a GA4GH compliant open source platform for developing framework services and data commons. Data commons accelerate and democratize the process of scientific discovery, especially over large or complex datasets. Gen3 is maintained by the Center for Translational Data Science at the University of Chicago. https://gen3.org
JCOIN dg.6vts Full implementation of the DRS 1.1 standard with support for persistent identifiers. Open source DRS server that follows the Gen3 implementation. Gen3 is a GA4GH compliant open source platform for developing framework services and data commons. Data commons accelerate and democratize the process of scientific discovery, especially over large or complex datasets. Gen3 is maintained by the Center for Translational Data Science at the University of Chicago. https://gen3.org
Anvil dg.anv0 DRS server that follows the Gen3 implementation. Gen3 is a GA4GH compliant open source platform for developing framework services and data commons. Data commons accelerate and democratize the process of scientific discovery, especially over large or complex datasets. Gen3 is maintained by the Center for Translational Data Science at the University of Chicago. https://gen3.org
DICOM Controlled Terminology dicom DICOM Controlled Terminology
dictyBase dictybase A resource for Dictyostelium discoideum (a soil-dwelling amoeba) genomics
dictyBase Expressed Sequence Tag dictybase.est The dictyBase database provides data on the model organism Dictyostelium discoideum and related species. It contains the complete genome sequence, ESTs, gene models and functional annotations. This collection references expressed sequence tag (EST) information.
Dictybase Gene dictybase.gene The dictyBase database provides data on the model organism Dictyostelium discoideum and related species. It contains the complete genome sequence, ESTs, gene models and functional annotations. This collection references gene information.
Decentralized Identifier did DIDs are an effort by the W3C Credentials Community Group and the wider Internet identity community to define identifiers that can be registered, updated, resolved, and revoked without any dependency on a central authority or intermediary.
Drug-drug Interaction and Drug-drug Interaction Evidence Ontology dideo The Potential Drug-drug Interaction and Potential Drug-drug Interaction Evidence Ontology
The Drug-Drug Interactions Ontology dinto A formal representation for drug-drug interactions knowledge.
Database of Interacting Proteins dip The Database of Interacting Proteins (DIP) stores experimentally determined interactions between proteins. It combines information from a variety of sources to create a single, consistent set of protein-protein interactions.
Diseases Database diseasesdb The Diseases Database is a cross-referenced index of human disease, medications, symptoms, signs, abnormal investigation findings etc. This site provides a medical textbook-like index and search portal covering areas including: internal medical disorders, symptoms and signs, congenital and inherited disorders, infectious diseases and organisms, drugs and medications, common haematology and biochemistry investigation abnormalities.
DisProt disprot DisProt is a database of intrinsically disordered proteins and protein disordered regions, manually curated from literature.
DisProt region disprot.region DisProt is a database of intrinsically disordered proteins and protein disordered regions, manually curated from literature.
Linear double stranded DNA sequences dlxb DOULIX lab-tested standard biological parts, in this case linear double stranded DNA sequences.
Circular double stranded DNA sequences composed dlxc DOULIX lab-tested standard biological parts, in this case, full length constructs.
Digital Object Identifier doi The Digital Object Identifier System is for identifying content objects in the digital environment.
Human Disease Ontology doid The Disease Ontology has been developed as a standardized ontology for human disease with the purpose of providing the biomedical community with consistent, reusable and sustainable descriptions of human disease terms, phenotype characteristics and related medical vocabulary disease concepts.
Database of Macromolecular Interactions dommino DOMMINO is a database of macromolecular interactions that includes the interactions between protein domains, interdomain linkers, N- and C-terminal regions and protein peptides.
Database for Prokaryotic Operons door DOOR (Database for prOkaryotic OpeRons) contains computationally predicted operons of all the sequenced prokaryotic genomes. It includes operons for RNA genes.
Database of Quantitative Cellular Signaling: Model doqcs.model The Database of Quantitative Cellular Signaling is a repository of models of signaling pathways. It includes reaction schemes, concentrations, rate constants, as well as annotations on the models. The database provides a range of search, navigation, and comparison functions. This datatype provides access to specific models.
Database of Quantitative Cellular Signaling: Pathway doqcs.pathway The Database of Quantitative Cellular Signaling is a repository of models of signaling pathways. It includes reaction schemes, concentrations, rate constants, as well as annotations on the models. The database provides a range of search, navigation, and comparison functions. This datatype provides access to pathways.
Drosophila Phenotype Ontology dpo An ontology for the description of Drosophila melanogaster phenotypes.
Description of Plant Viruses dpv Description of Plant Viruses (DPV) provides information about viruses, viroids and satellites of plants, fungi and protozoa. It provides taxonomic information, including brief descriptions of each family and genus, and classified lists of virus sequences. The database also holds detailed information for all sequences of viruses, viroids and satellites of plants, fungi and protozoa that are complete or that contain at least one complete gene.
DragonDB Allele dragondb.allele DragonDB is a genetic and genomic database for Antirrhinum majus (Snapdragon). This collection refers to allele information.
DragonDB DNA dragondb.dna DragonDB is a genetic and genomic database for Antirrhinum majus (Snapdragon). This collection refers to DNA sequence information.
DragonDB Locus dragondb.locus DragonDB is a genetic and genomic database for Antirrhinum majus (Snapdragon). This collection refers to Locus information.
DragonDB Protein dragondb.protein DragonDB is a genetic and genomic database for Antirrhinum majus (Snapdragon). This collection refers to protein sequence information.
The Drug Ontology dron We built this ontology primarily to support comparative effectiveness researchers studying claims data. They need to be able to query U.S. National Drug Codes (NDCs) by ingredient, mechanism of action (beta-adrenergic blockade), physiological effect (diuresis), and therapeutic intent (anti-hypertensive).
Drosophila RNAi Screening Center drsc The DRSC (Drosophila RNAi Screening Center) tracks both production of reagents for RNA interference (RNAi) screening in Drosophila cells and RNAi screen results. It maintains a list of Drosophila gene names, identifiers, symbols and synonyms and provides information for cell-based or in vivo RNAi reagents, other types of reagents, screen results, etc. corresponding to a given gene.
DrugBank drugbank The DrugBank database is a bioinformatics and chemoinformatics resource that combines detailed drug (i.e. chemical, pharmacological and pharmaceutical) data with comprehensive drug target (i.e. sequence, structure, and pathway) information. This collection references drug information.
DrugBank Salts drugbank.salt DrugBank is a web-enabled database containing comprehensive molecular information about drugs, their mechanisms, their interactions and their targets.
DrugBank Target v4 drugbankv4.target The DrugBank database is a bioinformatics and chemoinformatics resource that combines detailed drug (i.e. chemical, pharmacological and pharmaceutical) data with comprehensive drug target (i.e. sequence, structure, and pathway) information. This collection references target information from version 4 of the database.
Drug Central drugcentral DrugCentral provides information on active ingredients, chemical entities, pharmaceutical products, drug mode of action, indications, and pharmacologic action.
Deutsche Sammlung von Mikroorganismen und Zellkulturen dsmz The Leibniz Institute DSMZ is the most diverse biological resource center in the world and one of the largest collections of microorganisms and cell cultures worldwide (bacteria, archaea, protists, yeasts, fungi, bacteriophages, plant viruses, genomic bacterial DNA as well as human and animal cell lines).
Drug Target Ontology dto DTO integrates and harmonizes knowledge of the most important druggable protein families: kinases, GPCRs, ion channels and nuclear hormone receptors.
Data Use Ontology duo DUO is an ontology which represents data use conditions.
eagle-i eaglei Discovery tool for biomedical research resources available at institutions throughout the U.S.
European Collection of Authenticated Cell Culture ecacc The European Collection of Authenticated Cell Cultures (ECACC) is one of four Culture Collections of Public Health England. We supply authenticated and quality controlled cell lines, nucleic acids and induced Pluripotent Stem Cells (iPSCs).
The Echinoderm Anatomy and Development Ontology ecao
Enzyme Nomenclature eccode The Enzyme Classification contains the recommendations of the Nomenclature Committee of the International Union of Biochemistry and Molecular Biology on the nomenclature and classification of enzyme-catalysed reactions.
Electrocardiogram Ontology ecg The Electrocardiography (ECG) Ontology is a Driving Biological Project of the NCBO. The ECG Ontology will contain terms for describing electrocardiograms, their capture method(s) and their waveforms.
EchoBASE post-genomic database for Escherichia coli echobase EchoBASE is a database designed to contain and manipulate information from post-genomic experiments using the model bacterium Escherichia coli K-12. The database is built on an enhanced annotation of the updated genome sequence of strain MG1655 and the association of experimental data with the E.coli genes and their products.
E. coli Metabolite Database ecmdb The ECMDB is an expertly curated database containing extensive metabolomic data and metabolic pathway diagrams about Escherichia coli (strain K12, MG1655). This database includes significant quantities of “original” data compiled by members of the Wishart laboratory as well as additional material derived from hundreds of textbooks, scientific journals, metabolic reconstructions and other electronic databases. Each metabolite is linked to more than 100 data fields describing the compound, its ontology, physical properties, reactions, pathways, references, external links and associated proteins or enzymes.
Evidence ontology eco Evidence codes can be used to specify the type of supporting evidence for a piece of knowledge. This allows inference of a 'level of support' between an entity and an annotation made to an entity.
An ontology of core ecological entities ecocore Ecocore is a community ontology for the concise and controlled description of ecological traits of organisms.
EcoCyc ecocyc EcoCyc is a scientific database for the bacterium Escherichia coli K-12 MG1655. The EcoCyc project performs literature-based curation of its genome, and of transcriptional regulation, transporters, and metabolic pathways.
Database of Escherichia coli Sequence and Function ecogene The EcoGene database contains updated information about the E. coli K-12 genome and proteome sequences, including extensive gene bibliographies. A major EcoGene focus has been the re-evaluation of translation start sites.
EcoliWiki from EcoliHub ecoliwiki EcoliWiki is a wiki-based resource to store information related to non-pathogenic E. coli, its phages, plasmids, and mobile genetic elements. This collection references genes.
Environmental conditions, treatments and exposures ontology ecto ECTO describes exposures to experimental treatments of plants and model organisms (e.g. exposures to modification of diet, lighting levels, temperature); exposures of humans or any other organisms to stressors through a variety of routes, for purposes of public health, environmental monitoring etc, stimuli, natural and experimental, any kind of environmental condition or change in condition that can be experienced by an organism or population of organisms on earth. The scope is very general and can include for example plant treatment regimens, as well as human clinical exposures (although these may better be handled by a more specialized ontology).
E-cyanobacterium entity ecyano.entity E-cyanobacterium.org is a web-based platform for public sharing, annotation, analysis, and visualisation of dynamical models and wet-lab experiments related to cyanobacteria. It allows models to be represented at different levels of abstraction, as biochemical reaction networks or ordinary differential equations. It provides concise mappings of mathematical models to a formalised consortium-agreed biochemical description, with the aim of connecting the world of biological knowledge with the benefits of mathematical description of dynamic processes. This collection references entities.
E-cyanobacterium Experimental Data ecyano.experiment E-cyanobacterium experiments is a repository of wet-lab experiments related to cyanobacteria. The emphasis is placed on annotation via mapping to a local database of biological knowledge and mathematical models, along with the complete experimental setup supporting the reproducibility of the experiments.
E-cyanobacterium model ecyano.model E-cyanobacterium.org is a web-based platform for public sharing, annotation, analysis, and visualisation of dynamical models and wet-lab experiments related to cyanobacteria. It allows models to be represented at different levels of abstraction, as biochemical reaction networks or ordinary differential equations. It provides concise mappings of mathematical models to a formalised consortium-agreed biochemical description, with the aim of connecting the world of biological knowledge with the benefits of mathematical description of dynamic processes. This collection references models.
E-cyanobacterium rule ecyano.rule E-cyanobacterium.org is a web-based platform for public sharing, annotation, analysis, and visualisation of dynamical models and wet-lab experiments related to cyanobacteria. It allows models to be represented at different levels of abstraction, as biochemical reaction networks or ordinary differential equations. It provides concise mappings of mathematical models to a formalised consortium-agreed biochemical description, with the aim of connecting the world of biological knowledge with the benefits of mathematical description of dynamic processes. This collection references rules.
Bioinformatics operations, data types, formats, identifiers and topics edam EDAM is an ontology of general bioinformatics concepts, including topics, data types, formats, identifiers and operations. EDAM provides a controlled vocabulary for the description, in semantic terms, of things such as: web services (e.g. WSDL files), applications, tool collections and packages, work-benches and workflow software, databases and ontologies, XSD data schema and data objects, data syntax and file formats, web portals and pages, resource catalogues and documents (such as scientific publications).
Experimental Factor Ontology efo The Experimental Factor Ontology (EFO) provides a systematic description of many experimental variables available in EBI databases. It combines parts of several biological ontologies, such as anatomy, disease and chemical compounds. The scope of EFO is to support the annotation, analysis and visualization of data handled by the EBI Functional Genomics Team.
European Genome-phenome Archive Dataset ega.dataset The EGA is a service for permanent archiving and sharing of all types of personally identifiable genetic and phenotypic data resulting from biomedical research projects. The EGA contains exclusive data collected from individuals whose consent agreements authorize data release only for specific research use or to bona fide researchers. Strict protocols govern how information is managed, stored and distributed by the EGA project. This collection references 'Datasets'.
European Genome-phenome Archive Study ega.study The EGA is a service for permanent archiving and sharing of all types of personally identifiable genetic and phenotypic data resulting from biomedical research projects. The EGA contains exclusive data collected from individuals whose consent agreements authorize data release only for specific research use or to bona fide researchers. Strict protocols govern how information is managed, stored and distributed by the EGA project. This collection references 'Studies' which are experimental investigations of a particular phenomenon, often drawn from different datasets.
eggNOG eggnog eggNOG (evolutionary genealogy of genes: Non-supervised Orthologous Groups) is a database of orthologous groups of genes. The orthologous groups are annotated with functional description lines (derived by identifying a common denominator for the genes based on their various annotations) and with functional categories (i.e. derived from the original COG/KOG categories).
Human developmental anatomy, timed version ehda
Human developmental anatomy, abstract version ehdaa
Human developmental anatomy, abstract ehdaa2 A structured controlled vocabulary of stage-specific anatomical structures of the developing human.
Eukaryotic Linear Motif Resource elm Linear motifs are short, evolutionarily plastic components of regulatory proteins. Mainly focused on eukaryotic sequences, the Eukaryotic Linear Motif resource (ELM) is a database of curated motif classes and instances.
Mouse gross anatomy and development, timed emap A structured controlled vocabulary of stage-specific anatomical structures of the mouse (Mus).
Mouse Developmental Anatomy Ontology emapa An ontology for mouse anatomy covering embryonic development and postnatal stages.
Electron Microscopy Data Bank emdb The Electron Microscopy Data Bank (EMDB) is a public repository for electron microscopy density maps of macromolecular complexes and subcellular structures. It covers a variety of techniques, including single-particle analysis, electron tomography, and electron (2D) crystallography. The EMDB map distribution format follows the CCP4 definition, which is widely recognized by software packages used by the structural biology community.
Reaxys eMolecules emolecules Catalog of purchasable reagents and building blocks
Electron Microscopy Public Image Archive empiar EMPIAR, the Electron Microscopy Public Image Archive, is a public resource for raw, 2D electron microscopy images. Here, you can browse, upload and download the raw images used to build a 3D structure
European Nucleotide Archive ena.embl The European Nucleotide Archive (ENA) captures and presents information relating to experimental workflows that are based around nucleotide sequencing. ENA is made up of a number of distinct databases that includes EMBL-Bank, the Sequence Read Archive (SRA) and the Trace Archive each with their own data formats and standards. This collection references Embl-Bank identifiers.
ENCODE: Encyclopedia of DNA Elements encode The ENCODE Consortium is integrating multiple technologies and approaches in a collective effort to discover and define the functional elements encoded in the human genome, including genes, transcripts, and transcriptional regulatory regions, together with their attendant chromatin states and DNA methylation patterns.
eNanoMapper Ontology enm The eNanoMapper project (www.enanomapper.net) is creating a pan-European computational infrastructure for toxicological data management for ENMs, based on semantic web standards and ontologies. This ontology is an application ontology targeting the full domain of nanomaterial safety assessment. It re-uses several other ontologies including the NPO, CHEMINF, ChEBI, and ENVO.
Ensembl gene ID ensembl Ensembl is a joint project between EMBL-EBI and the Sanger Institute to develop a software system which produces and maintains automatic annotation on selected eukaryotic genomes. This collection also references outgroup organisms.
Ensembl Bacteria ensembl.bacteria Ensembl Genomes consists of five sub-portals (for bacteria, protists, fungi, plants and invertebrate metazoa) designed to complement the availability of vertebrate genomes in Ensembl. This collection is concerned with bacterial genomes.
Ensembl Fungi, the Ensembl database for accessing genome-scale data from fungi. ensembl.fungi Ensembl Genomes consists of five sub-portals (for bacteria, protists, fungi, plants and invertebrate metazoa) designed to complement the availability of vertebrate genomes in Ensembl. This collection is concerned with fungal genomes.
Ensembl Metazoa, the Ensembl database for accessing genome-scale data from non-vertebrate metazoa. ensembl.metazoa Ensembl Genomes consists of five sub-portals (for bacteria, protists, fungi, plants and invertebrate metazoa) designed to complement the availability of vertebrate genomes in Ensembl. This collection is concerned with metazoa genomes.
Ensembl Plants ensembl.plant Ensembl Genomes consists of five sub-portals (for bacteria, protists, fungi, plants and invertebrate metazoa) designed to complement the availability of vertebrate genomes in Ensembl. This collection is concerned with plant genomes.
Ensembl Protists ensembl.protist Ensembl Genomes consists of five sub-portals (for bacteria, protists, fungi, plants and invertebrate metazoa) designed to complement the availability of vertebrate genomes in Ensembl. This collection is concerned with protist genomes.
Ensembl Glossary ensemblglossary The Ensembl glossary lists the terms, data types and file types that are used in Ensembl and describes how they are used.
enviPath envipath enviPath is a database and prediction system for the microbial biotransformation of organic environmental contaminants. The database provides the possibility to store and view experimentally observed biotransformation pathways. The pathway prediction system provides different relative reasoning models to predict likely biotransformation pathways and products.
Environment Ontology envo The Environment Ontology is a resource and research target for the semantically controlled description of environmental entities. The ontology's initial aim was the representation of the biomes, environmental features, and environmental materials pertinent to genomic and microbiome-related investigations.
Environment Ontology for Livestock eol The EOL ontology describes the environmental conditions of livestock farms. More specifically, it describes feeding modalities, the environment, the structure of livestock farms, and farming systems.
European Paediatric Cardiac Codes epcc Collection of European paediatric cardiac coding files
Eukaryotic Promoter Database epd The Eukaryotic Promoter Database (EPD) is an annotated non-redundant collection of eukaryotic POL II promoters, for which the transcription start site has been determined experimentally. Access to promoter sequences is provided by pointers to positions in nucleotide sequence entries. The annotation part of an entry includes description of the initiation site mapping data, cross-references to other databases, and bibliographic references. EPD is structured in a way that facilitates dynamic extraction of biologically meaningful promoter subsets for comparative sequence analysis.
Epidemiology Ontology epo An ontology designed to support the semantic annotation of epidemiology resources
European Registry of Materials erm The European Registry of Materials is a simple registry with the sole purpose to mint material identifiers to be used by research projects throughout the life cycle of their project.
eagle-i resource ontology ero An ontology of research resources such as instruments, protocols, reagents, animal models and biospecimens.
Human Endogenous Retrovirus Database erv Endogenous retroviruses (ERVs) are common in vertebrate genomes; a typical mammalian genome contains tens to hundreds of thousands of ERV elements. Most ERVs are evolutionarily old and have accumulated multiple mutations, playing important roles in physiology and disease processes. The Human Endogenous Retrovirus Database (hERV) is compiled from the human genome nucleotide sequences obtained from Human Genome Projects, and screens those sequences for hERVs, whilst continuously improving classification and characterization of retroviral families. It provides access to individual reconstructed HERV elements, their sequence, structure and features.
JRC Data Catalogue eu89h The JRC Data Catalogue gives access to the multidisciplinary data produced and maintained by the Joint Research Centre, the European Commission's in-house science service providing independent scientific advice and support to policies of the European Union.
EU Clinical Trials euclinicaltrials The EU Clinical Trials Register contains information on clinical trials conducted in the European Union (EU), or the European Economic Area (EEA) which started after 1 May 2004. It also includes trials conducted outside these areas if they form part of a paediatric investigation plan (PIP), or are sponsored by a marketing authorisation holder, and involve the use of a medicine in the paediatric population.
VEuPathDB ontology eupath The VEuPathDB ontology is an application ontology developed to encode our understanding of what data is about in the public resources developed and maintained by the Eukaryotic Pathogen, Host and Vector Genomics Resource (VEuPathDB; https://veupathdb.org). The VEuPathDB ontology was named the EuPathDB ontology before EuPathDB joined with VectorBase. The ontology was built based on the Ontology of Biomedical Investigations (OBI) with integration of other OBO ontologies such as PATO, OGMS, DO, etc. as needed for coverage. Currently the VEuPathDB ontology is primarily intended to be used for support of the VEuPathDB sites. Terms with VEuPathDB ontology IDs that are not specific to VEuPathDB will be submitted to OBO Foundry ontologies for subsequent import and replacement of those terms when they are available.
European Food Information Resource Network eurofir EuroFir (European Food Information Resource Network), the world-leading European Network of Excellence on Food Composition Databank systems, is a partnership between 48 universities, research institutes and small-to-medium sized enterprises (SMEs) from 25 European countries.
eVOC (Expressed Sequence Annotation for Humans) ev
eVOC mouse development stage evm
ExAC Gene exac.gene The Exome Aggregation Consortium (ExAC) is a coalition of investigators seeking to aggregate and harmonize exome sequencing data from a variety of large-scale sequencing projects, and to make summary data available for the wider scientific community. The data pertains to unrelated individuals sequenced as part of various disease-specific and population genetic studies and serves as a reference set of allele frequencies for severe disease studies. This collection references gene information.
ExAC Transcript exac.transcript The Exome Aggregation Consortium (ExAC) is a coalition of investigators seeking to aggregate and harmonize exome sequencing data from a variety of large-scale sequencing projects, and to make summary data available for the wider scientific community. The data pertains to unrelated individuals sequenced as part of various disease-specific and population genetic studies and serves as a reference set of allele frequencies for severe disease studies. This collection references transcript information.
ExAC Variant exac.variant The Exome Aggregation Consortium (ExAC) is a coalition of investigators seeking to aggregate and harmonize exome sequencing data from a variety of large-scale sequencing projects, and to make summary data available for the wider scientific community. The data pertains to unrelated individuals sequenced as part of various disease-specific and population genetic studies and serves as a reference set of allele frequencies for severe disease studies. This collection references variant information.
Exposure ontology exo ExO is intended to bridge the gap between exposure science and diverse environmental health disciplines including toxicology, epidemiology, disease surveillance, and epigenetics.
FaceBase Data Repository facebase FaceBase is a collaborative NIDCR-funded consortium to generate data in support of advancing research into craniofacial development and malformation. It serves as a community resource by generating large datasets of a variety of types and making them available to the wider research community via this website. Practices emphasize a comprehensive and multidisciplinary approach to understanding the developmental processes that create the face. The data offered spotlights high-throughput genetic, molecular, biological, imaging and computational techniques. One of the missions of this consortium is to facilitate cooperation and collaboration between projects.
FAIRsharing fairsharing The web-based FAIRsharing catalogues aim to centralize bioscience data policies, reporting standards and links to other related portals. This collection references bioinformatics data exchange standards, which include 'Reporting Guidelines', Format Specifications and Terminologies.
Feature Annotation Location Description Ontology faldo It is a simple ontology to describe sequence feature positions and regions as found in GFF3, DDBJ, EMBL, GenBank files, UniProt, and many other bioinformatics resources
Fungal gross anatomy fao A structured controlled vocabulary for the anatomy of fungi.
Biological Imaging Methods Ontology fbbi A structured controlled vocabulary of sample preparation, visualization and imaging methods used in biomedical research.
Drosophila gross anatomy fbbt An ontology of Drosophila melanogaster anatomy.
FlyBase Controlled Vocabulary fbcv A miscellaneous ontology of terms used for curation in FlyBase, including the DPO.
Drosophila development fbdv An ontology of Drosophila melanogaster developmental stages.
International Fungal Working Group Fungal Barcoding. fbol DNA barcoding is the use of short standardised segments of the genome for identification of species in all the Kingdoms of Life. The goal of the Fungal Barcoding site is to promote the DNA barcoding of fungi and other fungus-like organisms.
FlyBase Reference Report fbrf FlyBase internal citation identifiers
Fly taxonomy fbsp The taxonomy of the family Drosophilidae (largely after Baechli) and of other taxa referred to in FlyBase.
Flybase Cell Line fbtc The cell line vocabulary inside FlyBase
Food Interactions with Drugs Evidence Ontology fideo The Food Interactions with Drugs Evidence Ontology (FIDEO) represents Food-Drug Interactions and underlying interaction mechanisms described in scientific publications, drug and adverse effects databases, and drug interactions compendia. The ontology builds on previous efforts from the FoodOn, DRON, ChEBI, and DIDEO ontologies as well as the Thériaque database. This ontology is maintained at https://gitub.u-bordeaux.fr/erias/fideo, and requests for changes or additions should be filed at the issue tracker there.
FishBase fishbase.species Global biodiversity database on finfishes. It offers a wide range of information on all species currently known in the world: taxonomy, biology, trophic ecology, life history, and uses, as well as historical data reaching back 250 years.
Physico-chemical methods and properties fix An ontology of physico-chemical methods and properties.
Flora Phenotype Ontology flopo Traits and phenotypes of flowering plants occurring in digitized Floras
FlowRepository flowrepository FlowRepository is a database of flow cytometry experiments where you can query and download data collected and annotated according to the MIFlowCyt standard. It is primarily used as a data deposition place for experimental findings published in peer-reviewed journals in the flow cytometry field.
Influenza Ontology flu
FlyBase Gene ID flybase FlyBase is the database of the Drosophila Genome Projects and of associated literature.
Foundational Model of Anatomy fma The Foundational Model of Anatomy Ontology (FMA) is a biomedical informatics ontology. It is concerned with the representation of classes or types and relationships necessary for the symbolic representation of the phenotypic structure of the human body. Specifically, the FMA is a domain ontology that represents a coherent body of explicit declarative knowledge about human anatomy.
Friend of a Friend foaf FOAF is a project devoted to linking people and information using the Web. Regardless of whether information is in people's heads, in physical or digital documents, or in the form of factual data, it can be linked. FOAF integrates three kinds of network: social networks of human collaboration, friendship and association; representational networks that describe a simplified view of a cartoon universe in factual terms, and information networks that use Web-based linking to share independently published descriptions of this inter-connected world.
Food-Biomarker Ontology fobi FOBI (Food-Biomarker Ontology) is an ontology to represent food intake data and associate it with metabolomic data
FooDB Compound foodb.compound FooDB is a resource on food and its constituent compounds. It includes data on the compound’s nomenclature, its description, information on its structure, chemical class, its physico-chemical data, its food source(s), its color, its aroma, its taste, its physiological effect, presumptive health effects (from published studies), and concentrations in various foods. This collection references compounds.
The Food Ontology foodon FoodOn is a comprehensive and easily accessible global farm-to-fork ontology about food that accurately and consistently describes foods commonly known in cultures from around the world. It is a consortium-driven project built to interoperate with the Open Biological and Biomedical Ontology Foundry library of ontologies.
FuTRES Ontology of Vertebrate Traits fovt
FamPlex fplx FamPlex is a collection of resources for grounding biological entities from text and describing their hierarchical relationships.
F-SNP fsnp The Functional Single Nucleotide Polymorphism (F-SNP) database integrates information obtained from databases about the functional effects of SNPs. These effects are predicted and indicated at the splicing, transcriptional, translational and post-translational level. In particular, users can retrieve SNPs that disrupt genomic regions known to be functional, including splice sites and transcriptional regulatory regions. Users can also identify non-synonymous SNPs that may have deleterious effects on protein structure or function, interfere with protein translation or impede post-translational modification.
FuncBase Fly funcbase.fly Computational gene function prediction can serve to focus experimental resources on high-priority experimental tasks. FuncBase is a web resource for viewing quantitative machine learning-based gene function annotations. Quantitative annotations of genes, including fungal and mammalian genes, with Gene Ontology terms are accompanied by a community feedback system. Evidence underlying function annotations is shown. FuncBase provides links to external resources, and may be accessed directly or via links from species-specific databases. This collection references Drosophila data.
FuncBase Human funcbase.human Computational gene function prediction can serve to focus experimental resources on high-priority experimental tasks. FuncBase is a web resource for viewing quantitative machine learning-based gene function annotations. Quantitative annotations of genes, including fungal and mammalian genes, with Gene Ontology terms are accompanied by a community feedback system. Evidence underlying function annotations is shown. FuncBase provides links to external resources, and may be accessed directly or via links from species-specific databases. This collection references human data.
FuncBase Mouse funcbase.mouse Computational gene function prediction can serve to focus experimental resources on high-priority experimental tasks. FuncBase is a web resource for viewing quantitative machine learning-based gene function annotations. Quantitative annotations of genes, including fungal and mammalian genes, with Gene Ontology terms are accompanied by a community feedback system. Evidence underlying function annotations is shown. FuncBase provides links to external resources, and may be accessed directly or via links from species-specific databases. This collection references mouse data.
FuncBase Yeast funcbase.yeast Computational gene function prediction can serve to focus experimental resources on high-priority experimental tasks. FuncBase is a web resource for viewing quantitative machine learning-based gene function annotations. Quantitative annotations of genes, including fungal and mammalian genes, with Gene Ontology terms are accompanied by a community feedback system. Evidence underlying function annotations is shown. FuncBase provides links to external resources, and may be accessed directly or via links from species-specific databases. This collection references yeast data.
FunderRegistry funderregistry The Funder Registry is an open registry of persistent identifiers for grant-giving organizations around the world.
FungiDB fungidb FungiDB is a genomic resource for fungal genomes. It contains genome sequence and annotation from several fungal classes, including the Ascomycota classes, Eurotiomycetes, Sordariomycetes, Saccharomycetes and the Basidiomycota orders, Pucciniomycetes and Tremellomycetes, and the basal 'Zygomycete' lineage Mucormycotina.
Fyler fyler A hierarchical classification of congenital heart disease
Fission Yeast Phenotype Ontology fypo A formal ontology of phenotypes observed in fission yeast.
Data Object Service ga4ghdos Assists in resolving data across cloud resources.
Network of Different Plant Genomic Research Projects gabi GabiPD (Genome Analysis of Plant Biological Systems Primary Database) constitutes a repository for a wide array of heterogeneous data from high-throughput experiments in several plant species. These data (i.e. genomics, transcriptomics, proteomics and metabolomics), originating from different model or crop species, can be accessed through a central gene 'Green Card'.
Galen Ontology galen
Genetic and Rare Diseases Information Center gard Database of rare diseases and related terms, including symptoms, healthcare resources, and organizations supporting research of the disease.
Health Data Research Innovation Gateway gateway The Health Data Research Innovation Gateway (the 'Gateway') provides a common entry point to discover and enquire about access to UK health datasets for research and innovation. It provides detailed information about the datasets, which are held by members of the UK Health Data Research Alliance, such as a description, size of the population, and the legal basis for access.
Gazetteer gaz A gazetteer constructed on ontological principles
Global Biodiversity Information Facility gbif Taxonomic database of living organisms. The Global Biodiversity Information Facility (GBIF) is an international network and data infrastructure funded by the world's governments and aimed at providing anyone, anywhere, open access to data about all types of life on Earth.
Genetic Code gc Genetic code, mitochondrial genetic code, and other information linked to NCBI taxonomy entries.
GWAS Catalog gcst The GWAS Catalog provides a consistent, searchable, visualisable and freely available database of published SNP-trait associations, which can be easily integrated with other resources, and is accessed by scientists, clinicians and other users worldwide.
Genomic Data Commons Data Portal gdc The GDC Data Portal is a robust data-driven platform that allows cancer researchers and bioinformaticians to search and download cancer data for analysis.
Genomics of Drug Sensitivity in Cancer gdsc The Genomics of Drug Sensitivity in Cancer (GDSC) database is designed to facilitate an increased understanding of the molecular features that influence drug response in cancer cells and which will enable the design of improved cancer therapies.
Genomics Cohorts Knowledge Ontology gecko An ontology to represent genomics cohort attributes.
Genatlas genatlas GenAtlas is a database containing information on human genes, markers and phenotypes.
GenBank genbank GenBank® is the NIH genetic sequence database, an annotated collection of all publicly available DNA sequences (Nucleic Acids Research, 2013 Jan;41(D1):D36-42).
GeneCards genecards The GeneCards human gene database stores gene related transcriptomic, genetic, proteomic, functional and disease information. It uses standard nomenclature and approved gene symbols. GeneCards presents a complete summary for each human gene.
GeneDB ID genedb GeneDB is a genome database for prokaryotic and eukaryotic organisms and provides a portal through which data generated by the "Pathogen Genomics" group at the Wellcome Trust Sanger Institute and other collaborating sequencing centres can be accessed.
GeneFarm genefarm GeneFarm is a database whose purpose is to store traceable annotations for Arabidopsis nuclear genes and gene products.
Genomic Epidemiology Ontology genepio The Genomic Epidemiology Ontology (GenEpiO) covers vocabulary necessary to identify, document and research foodborne pathogens and associated outbreaks.
GeneTree genetree Genetree displays the maximum likelihood phylogenetic (protein) trees representing the evolutionary history of the genes. These are constructed using the canonical protein for every gene in Ensembl.
Gene Wiki genewiki The Gene Wiki is a project which seeks to provide detailed information on human genes. Initial 'stub' articles are created in an automated manner, with further information added by the community. Gene Wiki can be accessed in Wikipedia using Gene identifiers from NCBI.
Genotype Ontology geno GENO is an OWL model of genotypes, their more fundamental sequence components, and links to related biological and experimental entities. At present many parts of the model are exploratory and set to undergo refactoring. In addition, many classes and properties have GENO URIs but are placeholders for classes that will be imported from an external ontology (e.g. SO, ChEBI, OBI, etc). Furthermore, ongoing work will implement a model of genotype-to-phenotype associations. This will support description of asserted and inferred relationships between genotypes, phenotypes, and environments, and the evidence/provenance behind these associations. Documentation is under development as well, and for now a slidedeck is available at http://www.slideshare.net/mhb120/brush-icbo-2013
GenPept genpept The GenPept database is a collection of sequences based on translations from annotated coding regions in GenBank.
Genome Properties genprop Genome properties is an annotation system whereby functional attributes can be assigned to a genome, based on the presence of a defined set of protein signatures within that genome.
NCBI Gene Expression Omnibus geo The Gene Expression Omnibus (GEO) is a gene expression repository providing a curated, online resource for gene expression data browsing, query and retrieval.
Geographical Entity Ontology geogeo An ontology and inventory of geopolitical entities such as nations and their components (states, provinces, districts, counties) and the actual physical territories over which they have jurisdiction. We thus distinguish and assign different identifiers to the US in "The US declared war on Germany" vs. the US in "The plane entered US airspace".
Gene Expression Ontology gexo Gene Expression Ontology
Genetics Home Reference ghr MedlinePlus Genetics contains detailed information about the effects of genetic variation on human health, covering more than 1,300 genetic conditions and 1,400 genes, all of the human chromosomes, and mitochondrial DNA (mtDNA).
GiardiaDB giardiadb GiardiaDB is one of the databases that can be accessed through the EuPathDB (http://EuPathDB.org; formerly ApiDB) portal, covering eukaryotic pathogens of the genera Cryptosporidium, Giardia, Leishmania, Neospora, Plasmodium, Toxoplasma, Trichomonas and Trypanosoma. While each of these groups is supported by a taxon-specific database built upon the same infrastructure, the EuPathDB portal offers an entry point to all these resources, and the opportunity to leverage orthology for searches across genera.
github github GitHub is an online host of Git source code repositories.
GLIDA GPCR glida.gpcr The GPCR-LIgand DAtabase (GLIDA) is a GPCR-related chemical genomic database that is primarily focused on the correlation of information between GPCRs and their ligands. It provides correlation data between GPCRs and their ligands, along with chemical information on the ligands. This collection references G-protein coupled receptors.
GLIDA Ligand glida.ligand The GPCR-LIgand DAtabase (GLIDA) is a GPCR-related chemical genomic database that is primarily focused on the correlation of information between GPCRs and their ligands. It provides correlation data between GPCRs and their ligands, along with chemical information on the ligands. This collection references ligands.
GlycoEpitope glycoepitope GlycoEpitope is a database containing useful information about carbohydrate antigens (glyco-epitopes) and the antibodies (polyclonal or monoclonal) that can be used to analyze their expression. This collection references Glycoepitopes.
GlycomeDB glycomedb GlycomeDB is the result of a systematic data integration effort, and provides an overview of all carbohydrate structures available in public databases, as well as cross-links.
GlycoNAVI glyconavi GlycoNAVI is a website for carbohydrate research. It consists of the "GlycoNAVI Database" that provides information such as existence ratios and names of glycans, 3D structures of glycans and complex glycoconjugates, and the "GlycoNAVI tools" such as editing of 2D structures of glycans, glycan structure viewers, and conversion tools.
GlycoPOST glycopost GlycoPOST is a mass spectrometry data repository for glycomics and glycoproteomics. Users can release their "raw/processed" data via this site with a unique identifier number for the paper publication. Submission conditions are in accordance with the Minimum Information Required for a Glycomics Experiment (MIRAGE) guidelines.
GlyGen: Computational and Informatics Resources for Glycoscience glygen GlyGen is a data integration and dissemination project for carbohydrate and glycoconjugate related data. GlyGen retrieves information from multiple international data sources and integrates and harmonizes this data.
GlyTouCan glytoucan GlyTouCan is the single worldwide registry of glycan (carbohydrate sugar chain) data.
Golm Metabolome Database gmd Golm Metabolome Database (GMD) provides public access to custom mass spectral libraries, metabolite profiling experiments as well as additional information and tools. This collection references metabolite information, relating the biologically active substance to metabolic pathways or signalling phenomena.
Golm Metabolome Database Analyte gmd.analyte Golm Metabolome Database (GMD) provides public access to custom mass spectral libraries, metabolite profiling experiments as well as additional information and tools. For GC-MS profiling analyses, polar metabolite extracts are chemically converted, i.e. derivatised into less polar and volatile compounds, so called analytes. This collection references analytes.
Golm Metabolome Database GC-MS spectra gmd.gcms Golm Metabolome Database (GMD) provides public access to custom mass spectral libraries, metabolite profiling experiments as well as additional information and tools. Analytes are subjected to a gas chromatograph coupled to a mass spectrometer, which records the mass spectrum and the retention time linked to an analyte. This collection references GC-MS spectra.
Golm Metabolome Database Profile gmd.profile Golm Metabolome Database (GMD) provides public access to custom mass spectral libraries, metabolite profiling experiments as well as additional information and tools. GMD's metabolite profiles provide relative metabolite concentrations normalised according to fresh weight (or comparable quantitative data, such as volume, cell count, etc.) and internal standards (e.g. ribitol) of biological reference conditions and tissues.
Golm Metabolome Database Reference Substance gmd.ref Golm Metabolome Database (GMD) provides public access to custom mass spectral libraries, metabolite profiling experiments as well as additional information and tools. Since metabolites often cannot be obtained in their respective native biological state, for example organic acids may be only acquirable as salts, the concept of reference substance was introduced. This collection references reference substances.
Gmelins Handbuch der anorganischen Chemie gmelin The Gmelin database is a large database of organometallic and inorganic compounds updated quarterly. It is based on the German publication Gmelins Handbuch der anorganischen Chemie which was originally published by Leopold Gmelin in 1817; the last print edition, the 8th, appeared in the 1990s.
Glycan Naming and Subsumption Ontology gno An ontology for glycans based on GlyTouCan, but organized by subsumption.
GnpIS gnpis GnpIS is an integrative information system focused on plants and fungal pests. It provides both genetic (e.g. genetic maps, quantitative trait loci, markers, single nucleotide polymorphisms, germplasms and genotypes) and genomic data (e.g. genomic sequences, physical maps, genome annotation and expression data) for species of agronomical interest.
Gene Ontology go The Gene Ontology project provides a controlled vocabulary to describe gene and gene product attributes in any organism.
Gene Ontology Causal Assembly Model go.model GO-Causal Activity Models (GO-CAMs) use a defined “grammar” for linking multiple GO annotations into larger models of biological function (such as “pathways”) in a semantically structured manner. GO-CAMs are created by expert biocurators from the GO Consortium, using the Noctua Curation Platform.
Gene Ontology Database references go.ref The GO reference collection is a set of abstracts that can be cited in the GO ontologies (e.g. as dbxrefs for term definitions) and annotation files (in the Reference column). It provides two types of reference: it can be used to provide details of why specific Evidence codes (see http://identifiers.org/eco/) are assigned, or to present abstract-style descriptions of "GO content" meetings at which substantial changes in the ontologies are discussed and made.
Gene Ontology Annotation Database Identifier goa The GOA (Gene Ontology Annotation) project provides high-quality Gene Ontology (GO) annotations to proteins in the UniProt Knowledgebase (UniProtKB) and International Protein Index (IPI). This involves electronic annotation and the integration of high-quality manual GO annotation from all GO Consortium model organism groups and specialist groups.
GO Evidence Code goeco A GO annotation is a statement about the function of a particular gene. Each annotation includes an evidence code to indicate how the annotation to a particular term is supported.
Genomes Online Database gold The Genomes OnLine Database (GOLD) catalogues genome and metagenome sequencing projects from around the world, along with their associated metadata. Information in GOLD is organized into four levels: Study, Biosample/Organism, Sequencing Project and Analysis Project.
GOLD genome gold.genome - DEPRECATION NOTE - Please keep in mind that this namespace has been superseded by the ‘gold’ prefix at https://registry.identifiers.org/registry/gold. This namespace is kept here to support existing citations; new ones should use the ‘gold’ namespace. The GOLD (Genomes OnLine Database) is a resource for centralised monitoring of genome and metagenome projects worldwide. It stores information on complete and ongoing projects, along with their associated metadata. This collection references the sequencing status of individual genomes.
GOLD metadata gold.meta - DEPRECATION NOTE - Please keep in mind that this namespace has been superseded by the ‘gold’ prefix at https://registry.identifiers.org/registry/gold. This namespace is kept here to support existing citations; new ones should use the ‘gold’ namespace. The GOLD (Genomes OnLine Database) is a resource for centralized monitoring of genome and metagenome projects worldwide. It stores information on complete and ongoing projects, along with their associated metadata. This collection references metadata associated with samples.
Google Patents google.patent Google Patents covers the entire collection of granted patents and published patent applications from the USPTO, EPO, and WIPO. US patent documents date back to 1790, EPO and WIPO to 1978. Google Patents can be searched using patent number, inventor, classification, and filing date.
Google Scholar Researcher ID google.scholar Google Scholar provides a simple way to broadly search for scholarly literature. You can search across many disciplines and sources: articles, theses, books, abstracts and court opinions, from academic publishers, professional societies, online repositories, universities and other web sites.
GO Relations gorel Documentation of GO that provides a description of some of the commonly used relationships and conventions in GO.
G protein-coupled receptor database gpcrdb The G protein-coupled receptor database (GPCRDB) collects large amounts of heterogeneous data on GPCRs. It contains experimental data on sequences, ligand-binding constants, mutations and oligomers, and derived data such as multiple sequence alignments and homology models.
Global Proteome Machine Database gpmdb The Global Proteome Machine Database was constructed to utilize the information obtained by GPM servers to aid in the difficult process of validating peptide MS/MS spectra as well as protein coverage patterns.
Gramene Gene gramene.gene Gramene is a comparative genome mapping database for grasses and crop plants. It combines a semi-automatically generated database of cereal genomic and expressed sequence tag sequences, genetic maps, map relations, quantitative trait loci (QTL), and publications, with a curated database of mutants (genes and alleles), molecular markers, and proteins. This datatype refers to genes in Gramene.
Gramene Growth Stage Ontology gramene.growthstage Gramene is a comparative genome mapping database for grasses and crop plants. It combines a semi-automatically generated database of cereal genomic and expressed sequence tag sequences, genetic maps, map relations, quantitative trait loci (QTL), and publications, with a curated database of mutants (genes and alleles), molecular markers, and proteins. This collection refers to growth stage ontology information in Gramene.
Gramene protein gramene.protein Gramene is a comparative genome mapping database for grasses and crop plants. It combines a semi-automatically generated database of cereal genomic and expressed sequence tag sequences, genetic maps, map relations, quantitative trait loci (QTL), and publications, with a curated database of mutants (genes and alleles), molecular markers, and proteins. This datatype refers to proteins in Gramene.
Gramene QTL gramene.qtl Gramene is a comparative genome mapping database for grasses and crop plants. It combines a semi-automatically generated database of cereal genomic and expressed sequence tag sequences, genetic maps, map relations, quantitative trait loci (QTL), and publications, with a curated database of mutants (genes and alleles), molecular markers, and proteins. This datatype refers to quantitative trait loci identified in Gramene.
Gramene Taxonomy gramene.taxonomy Gramene is a comparative genome mapping database for grasses and crop plants. It combines a semi-automatically generated database of cereal genomic and expressed sequence tag sequences, genetic maps, map relations, quantitative trait loci (QTL), and publications, with a curated database of mutants (genes and alleles), molecular markers, and proteins. This datatype refers to taxonomic information in Gramene.
16S rRNA gene database greengenes A 16S rRNA gene database which provides chimera screening, standard alignment, and taxonomic classification using multiple published taxonomies.
Global Research Identifier Database grid International coverage of the world's leading research organisations, indexing 92% of funding allocated globally.
Germplasm Resources Information Network grin.taxonomy GRIN (Germplasm Resources Information Network) Taxonomy for Plants provides information on scientific and common names, classification, distribution, references, and economic impact.
Cereal Plant Gross Anatomy gro
G-Rich Sequences Database grsdb GRSDB is a database of G-quadruplexes and contains information on composition and distribution of putative Quadruplex-forming G-Rich Sequences (QGRS) mapped in the eukaryotic pre-mRNA sequences, including those that are alternatively processed (alternatively spliced or alternatively polyadenylated). The data stored in the GRSDB is based on computational analysis of NCBI Entrez Gene entries and their corresponding annotated genomic nucleotide sequences of RefSeq/GenBank.
General Standard for Food Additives (GSFA) Online Database gsfa The "Codex General Standard for Food Additives" (GSFA, Codex STAN 192-1995) sets forth the conditions under which permitted food additives may be used in all foods, whether or not they have previously been standardized by Codex. The Preamble of the GSFA contains additional information for interpreting the data. Users are encouraged to consult the Preamble when using this database.
Gender, Sex, and Sexual Orientation Ontology gsso The Gender, Sex, and Sexual Orientation (GSSO) ontology is an interdisciplinary ontology connecting terms from biology, medicine, psychology, sociology, and gender studies, aiming to bridge gaps between linguistic variations inside and outside of the health care environment. A large focus of the ontology is its consideration of LGBTQIA+ terminology.
Genotype-Tissue Expression gtex The Genotype-Tissue Expression (GTEx) project aims to provide to the scientific community a resource with which to study human gene expression and regulation and its relationship to genetic variation.
Genetic Testing Registry gtr The Genetic Testing Registry (GTR®) provides a central location for voluntary submission of genetic test information by providers. The scope includes the test's purpose, methodology, validity, evidence of the test's usefulness, and laboratory contacts and credentials. The overarching goal of the GTR is to advance public health and research into the genetic basis of health and disease.
Genitourinary Development Molecular Anatomy Project gudmap The GenitoUrinary Development Molecular Anatomy Project (GUDMAP) is a consortium of laboratories working to provide the scientific and medical community with tools to facilitate research on the GenitoUrinary (GU) tract.
GWAS Central Marker gwascentral.marker GWAS Central (previously the Human Genome Variation database of Genotype-to-Phenotype information) is a database of summary level findings from genetic association studies, both large and small. It gathers datasets from public domain projects, and accepts direct data submission. It is based upon Marker information encompassing SNP and variant information from public databases, to which allele and genotype frequency data, and genetic association findings are additionally added. A Study (most generic level) contains one or more Experiments, one or more Sample Panels of test subjects, and one or more Phenotypes. This collection references a GWAS Central Marker.
GWAS Central Phenotype gwascentral.phenotype GWAS Central (previously the Human Genome Variation database of Genotype-to-Phenotype information) is a database of summary level findings from genetic association studies, both large and small. It gathers datasets from public domain projects, and accepts direct data submission. It is based upon Marker information encompassing SNP and variant information from public databases, to which allele and genotype frequency data, and genetic association findings are additionally added. A Study (most generic level) contains one or more Experiments, one or more Sample Panels of test subjects, and one or more Phenotypes. This collection references a GWAS Central Phenotype.
GWAS Central Study gwascentral.study GWAS Central (previously the Human Genome Variation database of Genotype-to-Phenotype information) is a database of summary level findings from genetic association studies, both large and small. It gathers datasets from public domain projects, and accepts direct data submission. It is based upon Marker information encompassing SNP and variant information from public databases, to which allele and genotype frequency data, and genetic association findings are additionally added. A Study (most generic level) contains one or more Experiments, one or more Sample Panels of test subjects, and one or more Phenotypes. This collection references a GWAS Central Study.
GXA Expt gxa.expt The Gene Expression Atlas (GXA) is a semantically enriched database of meta-analysis based summary statistics over a curated subset of ArrayExpress Archive, servicing queries for condition-specific gene expression patterns as well as broader exploratory searches for biologically interesting genes/samples. This collection references experiments.
GXA Gene gxa.gene The Gene Expression Atlas (GXA) is a semantically enriched database of meta-analysis based summary statistics over a curated subset of ArrayExpress Archive, servicing queries for condition-specific gene expression patterns as well as broader exploratory searches for biologically interesting genes/samples. This collection references genes.
Habronattus courtship habronattus
High-quality Automated and Manual Annotation of microbial Proteomes hamap HAMAP is a system that identifies and semi-automatically annotates proteins that are part of well-conserved and orthologous microbial families or subfamilies. These are used to build rules which are used to propagate annotations to member bacterial, archaeal and plastid-encoded protein entries.
Human Ancestry Ontology hancestro Human ancestry ontology for the NHGRI GWAS Catalog
Hymenoptera Anatomy Ontology hao A structured controlled vocabulary of the anatomy of the Hymenoptera (bees, wasps, and ants)
Human Cell Atlas Ontology hcao Application ontology for human cell types, anatomy and development stages for the Human Cell Atlas.
Healthcare Common Procedure Coding System hcpcs HCPCS is a collection of standardized codes that represent medical procedures, supplies, products and services. The codes are used to facilitate the processing of health insurance claims by Medicare and other insurers.
Hepatitis C Virus Database hcvdb The European Hepatitis C Virus Database (euHCVdb, http://euhcvdb.ibcp.fr) is a collection of computer-annotated sequences based on reference genomes, mainly dedicated to HCV protein sequences, 3D structures and functional analyses.
Homeodomain Research hdr The Homeodomain Resource is a curated collection of sequence, structure, interaction, genomic and functional information on the homeodomain family. It contains sets of curated homeodomain sequences from fully sequenced genomes, including experimentally derived homeodomain structures, homeodomain protein-protein interactions, homeodomain DNA-binding sites and homeodomain proteins implicated in human genetic disorders.
Human Gene Mutation Database hgmd The Human Gene Mutation Database (HGMD) collates data on germ-line mutations in nuclear genes associated with human inherited disease. It includes information on single base-pair substitutions in coding, regulatory and splicing-relevant regions; micro-deletions and micro-insertions; indels; triplet repeat expansions as well as gross deletions; insertions; duplications; and complex rearrangements. Each mutation entry is unique, and includes cDNA reference sequences for most genes, splice junction sequences, disease-associated and functional polymorphisms, as well as links to data present in publicly available online locus-specific mutation databases.
HUGO Gene Nomenclature Committee hgnc The HGNC (HUGO Gene Nomenclature Committee) provides an approved gene name and symbol (short-form abbreviation) for each known human gene. All approved symbols are stored in the HGNC database, and each symbol is unique. HGNC identifiers refer to records in the HGNC symbol database.
HGNC Family hgnc.genefamily The HGNC (HUGO Gene Nomenclature Committee) provides an approved gene name and symbol (short-form abbreviation) for each known human gene. All approved symbols are stored in the HGNC database, and each symbol is unique. In addition, HGNC also provides symbols for both structural and functional gene families. This collection refers to records using the HGNC family symbol.
HGNC Gene Group hgnc.genegroup The HGNC (HUGO Gene Nomenclature Committee) provides an approved gene name and symbol (short-form abbreviation) for each known human gene. All approved symbols are stored in the HGNC database, and each symbol is unique. In addition, HGNC also provides a unique numerical ID to identify gene families, providing a display of curated hierarchical relationships between families.
HGNC Symbol hgnc.symbol The HGNC (HUGO Gene Nomenclature Committee) provides an approved gene name and symbol (short-form abbreviation) for each known human gene. All approved symbols are stored in the HGNC database, and each symbol is unique. This collection refers to records using the HGNC symbol.
H-InvDb Locus hinv.locus H-Invitational Database (H-InvDB) is an integrated database of human genes and transcripts. It provides curated annotations of human genes and transcripts including gene structures, alternative splicing isoforms, non-coding functional RNAs, protein functions, functional domains, sub-cellular localizations, metabolic pathways, protein 3D structure, genetic polymorphisms (SNPs, indels and microsatellite repeats), relation with diseases, gene expression profiling, molecular evolutionary features, protein-protein interactions (PPIs) and gene families/groups. This datatype provides access to the 'Locus' view.
H-InvDb Protein hinv.protein H-Invitational Database (H-InvDB) is an integrated database of human genes and transcripts. It provides curated annotations of human genes and transcripts including gene structures, alternative splicing isoforms, non-coding functional RNAs, protein functions, functional domains, sub-cellular localizations, metabolic pathways, protein 3D structure, genetic polymorphisms (SNPs, indels and microsatellite repeats), relation with diseases, gene expression profiling, molecular evolutionary features, protein-protein interactions (PPIs) and gene families/groups. This datatype provides access to the 'Protein' view.
H-InvDb Transcript hinv.transcript H-Invitational Database (H-InvDB) is an integrated database of human genes and transcripts. It provides curated annotations of human genes and transcripts including gene structures, alternative splicing isoforms, non-coding functional RNAs, protein functions, functional domains, sub-cellular localizations, metabolic pathways, protein 3D structure, genetic polymorphisms (SNPs, indels and microsatellite repeats), relation with diseases, gene expression profiling, molecular evolutionary features, protein-protein interactions (PPIs) and gene families/groups. This datatype provides access to the 'Transcript' view.
Human Metabolome Database hmdb The Human Metabolome Database (HMDB) is a database containing detailed information about small molecule metabolites found in the human body. It contains or links 1) chemical, 2) clinical and 3) molecular biology/biochemistry data.
HMS LINCS Compound hms.lincs.compound Database contains all publicly available HMS LINCS datasets and information for each dataset about experimental reagents (small molecule perturbagens, cells, antibodies, and proteins) and experimental and data analysis protocols.
Homologous Organ Groups hog Documentation of HOGs (Homologous Organ Groups). Contains links to HOGs download, HOGs ontology, HOGs creation, composition, etc.
Database of Complete Genome Homologous Genes Families hogenom HOGENOM is a database of homologous genes from fully sequenced organisms (bacteria, archaea and eukarya). This collection references phylogenetic trees which can be retrieved using either UniProt accession numbers or HOGENOM tree family identifiers.
Homology Ontology hom This ontology represents concepts related to homology, as well as other concepts used to describe similarity and non-homology.
HOMD Sequence Metainformation homd.seq The Human Oral Microbiome Database (HOMD) provides a site-specific comprehensive database for the more than 600 prokaryote species that are present in the human oral cavity. It contains genomic information based on a curated 16S rRNA gene-based provisional naming scheme, and taxonomic information. This datatype contains genomic sequence information.
HOMD Taxonomy homd.taxon The Human Oral Microbiome Database (HOMD) provides a site-specific comprehensive database for the more than 600 prokaryote species that are present in the human oral cavity. It contains genomic information based on a curated 16S rRNA gene-based provisional naming scheme, and taxonomic information. This datatype contains taxonomic information.
HomoloGene ID homologene HomoloGene is a system for automated detection of homologs among the annotated genes of several completely sequenced eukaryotic genomes.
Homologous Vertebrate Genes Database hovergen HOVERGEN is a database of homologous vertebrate genes that allows one to select sets of homologous genes among vertebrate species, and to visualize multiple alignments and phylogenetic trees.
Human Phenotype Ontology hp The Human Phenotype Ontology (HPO) aims to provide a standardized vocabulary of phenotypic abnormalities encountered in human disease. Each term in the HPO describes a phenotypic abnormality, such as atrial septal defect. The HPO is currently being developed using the medical literature, Orphanet, DECIPHER, and OMIM.
Human Protein Atlas tissue profile information hpa The Human Protein Atlas (HPA) is a publicly available database with high-resolution images showing the spatial distribution of proteins in different normal and cancer human cell lines. Primary access to this collection is through Ensembl Gene identifiers.
Histopathology Ontology hpath An ontology of histopathological morphologies used by pathologists to classify/categorise animal lesions observed histologically during regulatory toxicology studies. The ontology was developed using real data from over 6000 regulatory toxicology studies donated by 13 companies spanning nine species.
Human Proteome Map Peptide hpm.peptide The Human Proteome Map (HPM) portal integrates the peptide sequencing results from the draft map of the human proteome project. The project was based on LC-MS/MS using high-resolution, high-accuracy Fourier transform mass spectrometry. The HPM contains direct evidence of translation of a number of protein products derived from human genes, based on peptide identifications of multiple organs/tissues and cell types from individuals with clinically defined healthy tissues. The HPM portal provides data on individual proteins, as well as on individual peptide spectra. This collection references individual peptides through spectra.
Human Proteome Map Protein hpm.protein The Human Proteome Map (HPM) portal integrates the peptide sequencing results from the draft map of the human proteome project. The project was based on LC-MS/MS using high-resolution, high-accuracy Fourier transform mass spectrometry. The HPM contains direct evidence of translation of a number of protein products derived from human genes, based on peptide identifications of multiple organs/tissues and cell types from individuals with clinically defined healthy tissues. The HPM portal provides data on individual proteins, as well as on individual peptide spectra. This collection references proteins.
Human Protein Reference Database hprd The Human Protein Reference Database (HPRD) represents a centralized platform to visually depict and integrate information pertaining to domain architecture, post-translational modifications, interaction networks and disease association for each protein in the human proteome.
Human Pluripotent Stem Cell Registry hpscreg hPSCreg is a freely accessible global registry for human pluripotent stem cell lines (hPSC-lines).
Human Developmental Stages hsapdv Life cycle stages for Human
Hazardous Substances Data Bank hsdb The Hazardous Substances Data Bank (HSDB) is a toxicology database that focuses on the toxicology of potentially hazardous chemicals. It provides information on human exposure, industrial hygiene, emergency handling procedures, environmental fate, regulatory requirements, nanomaterials, and related areas.
Health Surveillance Ontology hso The Health Surveillance Ontology (HSO) focuses on "surveillance system level data", that is, data outputs from surveillance activities, such as number of samples collected, cases observed, etc. It aims to support One-Health surveillance, covering animal health, public health and food safety surveillance.
Database of homology-derived secondary structure of proteins hssp HSSP (homology-derived structures of proteins) is a derived database merging structural (2-D and 3-D) and sequence information (1-D). For each protein of known 3D structure from the Protein Data Bank, the database has a file with all sequence homologues, properly aligned to the PDB protein.
Hypertension Ontology htn An ontology for representing clinical data about hypertension, intended to support classification of patients according to various diagnostic guidelines
Human Unidentified Gene-Encoded huge The Human Unidentified Gene-Encoded (HUGE) protein database contains results from sequence analysis of human novel large (>4 kb) cDNAs identified in the Kazusa cDNA sequencing project.
Information Artifact Ontology iao An ontology of information entities, originally driven by work by the Ontology of Biomedical Investigation (OBI) digital entity and realizable information entity branch.
International Classification of Diseases icd10 The International Classification of Diseases is the international standard diagnostic classification for all general epidemiological and many health management purposes.
International Classification of Diseases 11th Revision icd11 Diagnostic tool for epidemiology, health management and clinical purposes, maintained by the World Health Organization (WHO). It provides a system of diagnostic codes for classifying diseases, including nuanced classifications of a wide variety of signs, symptoms, abnormal findings, complaints, social circumstances, and external causes of injury or disease.
International Classification of Diseases 9th Revision icd9 ICD-9-CM is the official system of assigning codes to diagnoses and procedures associated with hospital utilization in the United States.
Integrated Canine Data Commons icdc The Integrated Canine Data Commons is one of several repositories within the NCI Cancer Research Data Commons (CRDC), a cloud-based data science infrastructure that provides secure access to a large, comprehensive, and expanding collection of cancer research data. The ICDC was established to further research on human cancers by enabling comparative analysis with canine cancer.
The International Classification of Diseases for Oncology icdo An ontology version of ICD
ICEberg element iceberg.element ICEberg (Integrative and conjugative elements) is a database of integrative and conjugative elements (ICEs) found in bacteria. ICEs are conjugative self-transmissible elements that can integrate into and excise from a host chromosome, and can carry likely virulence determinants, antibiotic-resistant factors and/or genes coding for other beneficial traits. It contains details of ICEs found in representative bacterial species, organised as families. This collection references ICE elements.
ICEberg family iceberg.family ICEberg (Integrative and conjugative elements) is a database of integrative and conjugative elements (ICEs) found in bacteria. ICEs are conjugative self-transmissible elements that can integrate into and excise from a host chromosome, and can carry likely virulence determinants, antibiotic-resistant factors and/or genes coding for other beneficial traits. It contains details of ICEs found in representative bacterial species, organised as families. This collection references ICE families.
Integrative and Conjugative Element Ontology iceo A biological ontology to standardize and integrate Integrative and Conjugative Element (ICE) information and to support computer-assisted reasoning.
Informed Consent Ontology ico The Informed Consent Ontology (ICO) is an ontology for the informed consent and informed consent process in the medical field.
Intrinsically Disordered proteins with Extensive Annotations and Literature ideal IDEAL provides a collection of knowledge on experimentally verified intrinsically disordered proteins. It contains manual annotations by curators on intrinsically disordered regions, interaction regions to other molecules, post-translational modification sites, references and structural domain assignments.
Identifiers.org namespace identifiers.namespace Identifiers.org is an established resolving system that enables the referencing of data for the scientific community, with a current focus on the Life Sciences domain.
Infectious Disease Ontology ido Infectious Disease Ontology holds entities relevant to both biomedical and clinical aspects of most infectious diseases.
The COVID-19 Infectious Disease Ontology idocovid19 The COVID-19 Infectious Disease Ontology (IDO-COVID-19) is an extension of the Infectious Disease Ontology (IDO) and the Virus Infectious Disease Ontology (VIDO). IDO-COVID-19 follows OBO Foundry guidelines, employs the Basic Formal Ontology as its starting point, and covers epidemiology, classification, pathogenesis, and treatment of terms used to represent infection by the SARS-CoV-2 virus strain, and the associated COVID-19 disease.
Dengue Fever Ontology idoden An ontology for dengue fever.
Malaria Ontology idomal An application ontology to cover all aspects of malaria as well as the intervention attempts to control it.
Identifiers.org Ontology idoo Identifiers.org Ontology
Identifiers.org Terms idot Identifiers.org Terms (idot) is an RDF vocabulary providing useful terms for describing datasets.
Image Data Resource idr Image Data Resource (IDR) is an online, public data repository that seeks to store, integrate and serve image datasets from published scientific studies. We have collected and are continuing to receive existing and newly created “reference image” datasets that are valuable resources for a broad community of users, either because they will be frequently accessed and cited or because they can serve as a basis for re-analysis and the development of new computational tools.
Immune Epitope Database iedb The Immune Epitope Database (IEDB) is a freely available resource funded by NIAID. It catalogs experimental data on antibody and T cell epitopes studied in humans, non-human primates, and other animal species in the context of infectious disease, allergy, autoimmunity and transplantation. The IEDB also hosts tools to assist in the prediction and analysis of epitopes.
Event (INOH pathway ontology) iev
International Geo Sample Number igsn IGSN is a globally unique and persistent identifier for material samples and specimens. IGSNs are obtained from IGSN e.V. Agents.
International Histocompatibility Workshop cell lines ihw The International Histocompatibility Working Group provides a comprehensive inventory of HLA reference genes to support worldwide research in immunogenetics. We also offer selected cell lines and DNA from our substantial DNA Bank of more than 1,000 cell lines from selected families, as well as individuals with diverse ethnicity and immunologic characteristics.
International Medical Device Regulators Forum imdrf The International Medical Device Regulators Forum (IMDRF) is a forum of voluntary medical device regulators from around the world who have come together to build on the strong foundational work of the Global Harmonization Task Force on Medical Devices (GHTF), and to accelerate international medical device regulatory harmonization and convergence.
International Molecular Exchange imex The International Molecular Exchange (IMEx) is a consortium of molecular interaction databases which collaborate to share manual curation efforts and provide accessibility to multiple information sources.
Integrated Microbial Genomes Gene img.gene The integrated microbial genomes (IMG) system is a data management, analysis and annotation platform for all publicly available genomes. IMG contains both draft and complete JGI (DoE Joint Genome Institute) microbial genomes integrated with all other publicly available genomes from all three domains of life, together with a large number of plasmids and viruses. This datatype refers to gene information.
Integrated Microbial Genomes Taxon img.taxon The integrated microbial genomes (IMG) system is a data management, analysis and annotation platform for all publicly available genomes. IMG contains both draft and complete JGI (DoE Joint Genome Institute) microbial genomes integrated with all other publicly available genomes from all three domains of life, together with a large number of plasmids and viruses. This datatype refers to taxon information.
IMGT/HLA human major histocompatibility complex sequence database imgt.hla IMGT, the international ImMunoGeneTics project, is a collection of high-quality integrated databases specialising in Immunoglobulins, T cell receptors and the Major Histocompatibility Complex (MHC) of all vertebrate species. IMGT/HLA is a database for sequences of the human MHC, referred to as HLA. It includes all the official sequences for the WHO Nomenclature Committee For Factors of the HLA System. This collection references allele information through the WHO nomenclature.
ImMunoGeneTics database covering immunoglobulins and T-cell receptors imgt.ligm IMGT, the international ImMunoGeneTics project, is a collection of high-quality integrated databases specialising in Immunoglobulins, T cell receptors and the Major Histocompatibility Complex (MHC) of all vertebrate species. IMGT/LIGM is a comprehensive database of fully annotated sequences of Immunoglobulins and T cell receptors from human and other vertebrates.
Molecule role (INOH Protein name/family name ontology) imr
InChI inchi The IUPAC International Chemical Identifier (InChI) is a non-proprietary identifier for chemical substances that can be used in printed and electronic data sources. It is derived solely from a structural representation of that substance, such that a single compound always yields the same identifier.
InChIKey inchikey The IUPAC International Chemical Identifier (InChI, see MIR:00000383) is an identifier for chemical substances, and is derived solely from a structural representation of that substance. Since these can be quite unwieldy, particularly for web use, the InChIKey was developed. These are of a fixed length (27 characters, including two hyphens) and were created as a condensed, more web friendly, digital representation of the InChI.
International Nonproprietary Names inn International Nonproprietary Names (INN) are unique, globally recognized names assigned by the World Health Organization to pharmaceutical substances and active pharmaceutical ingredients.
Interaction Network Ontology ino The Interaction Network Ontology (INO) is an ontology in the domain of interactions and interaction networks. INO represents general and species-neutral types of interactions and interaction networks, and their related elements and relations. INO is a community-driven ontology, aligns with BFO, and is developed by following the OBO Foundry principles.
Nucleotide Sequence Database insdc The International Nucleotide Sequence Database Collaboration (INSDC) consists of a joint effort to collect and disseminate databases containing DNA and RNA sequences.
INSDC CDS insdc.cds The coding sequence or protein identifiers as maintained in INSDC.
Genome assembly database insdc.gca The genome assembly database contains detailed information about genome assemblies for eukaryota, bacteria and archaea. The scope of the genome collections database does not extend to viruses, viroids and bacteriophage.
International Nucleotide Sequence Database Collaboration (INSDC) Run insdc.run An experimental run, served through the ENA.
Sequence Read Archive insdc.sra The Sequence Read Archive (SRA) stores raw sequencing data from next-generation sequencing platforms. Data submitted to SRA is organized using a metadata model consisting of six objects: study, sample, experiment, run, analysis and submission. The SRA study contains high-level information including goals of the study and literature references, and may be linked to the INSDC BioProject database.
IntAct protein interaction database intact IntAct provides a freely available, open source database system and analysis tools for protein interaction data.
IntAct Molecule intact.molecule IntAct provides a freely available, open source database system and analysis tools for protein interaction data. This collection references interactor molecules.
InterLex interlex InterLex is a dynamic lexicon, initially built on the foundation of NeuroLex (PMID: 24009581), of biomedical terms and common data elements designed to help improve the way that biomedical scientists communicate about their data, so that information systems can find data more easily and provide more powerful means of integrating data across distributed resources and datasets. InterLex allows for the association of data fields and data values to common data elements and terminologies enabling the crowdsourcing of data-terminology mappings within and across communities. InterLex provides a stable layer on top of the many other existing terminologies, lexicons, ontologies, and common data element collections and provides a set of inter-lexical and inter-data-lexical mappings.
Protein Domains interpro InterPro is a database of protein families, domains and functional sites in which identifiable features found in known proteins can be applied to unknown protein sequences.
Protein Domains ipr
IRD Segment Sequence ird.segment Influenza Research Database (IRD) contains information related to influenza virus, including genomic sequence, strain, protein, epitope and bibliographic information. The Segment Details page contains descriptive information and annotation data about a particular genomic segment and its encoded product(s).
iRefWeb irefweb iRefWeb is an interface to a relational database containing the latest build of the interaction Reference Index (iRefIndex) which integrates protein interaction data from ten different interaction databases: BioGRID, BIND, CORUM, DIP, HPRD, INTACT, MINT, MPPI, MPACT and OPHID. In addition, iRefWeb associates interactions with the PubMed record from which they are derived.
Insect Resistance Ontology iro
International Standard Book Number isbn The International Standard Book Number (ISBN) is for identifying printed books.
Insertion sequence elements database isfinder ISfinder is a database of bacterial insertion sequences (IS). It assigns IS nomenclature and acts as a repository for ISs. Each IS is annotated with information such as the open reading frame DNA sequence, the sequence of the ends of the element and target sites, its origin and distribution together with a bibliography, where available.
International Standard Name Identifier isni ISNI is the ISO certified global standard number for identifying the millions of contributors to creative works and those active in their distribution, including researchers, inventors, writers, artists, visual creators, performers, producers, publishers, aggregators, and more. It is part of a family of international standard identifiers that includes identifiers of works, recordings, products and right holders in all repertoires, e.g. DOI, ISAN, ISBN, ISRC, ISSN, ISTC, and ISWC. The mission of the ISNI International Authority (ISNI-IA) is to assign to the public name(s) of a researcher, inventor, writer, artist, performer, publisher, etc. a persistent unique identifying number in order to resolve the problem of name ambiguity in search and discovery; and diffuse each assigned ISNI across all repertoires in the global supply chain so that every published work can be unambiguously attributed to its creator wherever that work is described.
International Standard Serial Number issn The International Standard Serial Number (ISSN) is a unique eight-digit number used to identify a print or electronic periodical publication, rather than individual articles or books.
Integrated Taxonomic Information System itis Information system with taxonomic data on plants, animals, fungi, and microbes of North America and the world.
Intelligence Task Ontology ito The Intelligence Task Ontology (ITO) provides a comprehensive map of machine intelligence tasks, as well as broader human intelligence or hybrid human/machine intelligence tasks.
IUPHAR family iuphar.family The IUPHAR Compendium details the molecular, biophysical and pharmacological properties of identified mammalian sodium, calcium and potassium channels, as well as the related cyclic nucleotide-modulated ion channels and the recently described transient receptor potential channels. It includes information on nomenclature systems, and on inter and intra-species molecular structure variation. This collection references families of receptors or subunits.
Guide to Pharmacology Ligand ID iuphar.ligand The IUPHAR Compendium details the molecular, biophysical and pharmacological properties of identified mammalian sodium, calcium and potassium channels, as well as the related cyclic nucleotide-modulated ion channels and the recently described transient receptor potential channels. It includes information on nomenclature systems, and on inter and intra-species molecular structure variation. This collection references ligands.
Guide to Pharmacology Target ID iuphar.receptor The IUPHAR Compendium details the molecular, biophysical and pharmacological properties of identified mammalian sodium, calcium and potassium channels, as well as the related cyclic nucleotide-modulated ion channels and transient receptor potential channels. It includes information on nomenclature systems, and on inter and intra-species molecular structure variation. This collection references individual receptors or subunits.
Jackson Laboratories Strain jax Information about the C57BL/6J. Includes genetic background and disease data.
JAX Mice jaxmice JAX Mice is a catalogue of mouse strains supplied by the Jackson Laboratory.
Japan Consortium for Glycobiology and Glycotechnology Database jcggdb JCGGDB (Japan Consortium for Glycobiology and Glycotechnology DataBase) is a database that aims to integrate all glycan-related data held in various repositories in Japan. This includes databases for large-quantity synthesis of glycogenes and glycans, analysis and detection of glycan structure and glycoprotein, glycan-related differentiation markers, glycan functions, glycan-related diseases and transgenic and knockout animals, etc.
Japan Collection of Microorganisms jcm The Japan Collection of Microorganisms (JCM) collects, catalogues, and distributes cultured microbial strains, restricted to those classified in Risk Group 1 or 2.
Japan Chemical Substance Dictionary jcsd The Japan Chemical Substance Dictionary is an organic compound dictionary database prepared by the Japan Science and Technology Agency (JST).
Digital archive of scholarly articles jstor JSTOR (Journal Storage) is a digital library containing digital versions of historical academic journals, as well as books, pamphlets and current issues of journals. Some public domain content is free to access, while other articles require registration.
JWS Online jws JWS Online is a repository of curated biochemical pathway models, and additionally provides the ability to run simulations of these models in a web browser.
Kaggle kaggle Kaggle is a platform for sharing data, performing reproducible analyses, interactive data analysis tutorials, and machine learning competitions.
Kyoto Encyclopedia of Genes and Genomes kegg Kyoto Encyclopedia of Genes and Genomes (KEGG) is a database resource for understanding high-level functions and utilities of the biological system, such as the cell, the organism and the ecosystem, from molecular-level information, especially large-scale molecular datasets generated by genome sequencing and other high-throughput experimental technologies.
KEGG Compound kegg.compound KEGG compound contains our knowledge on the universe of chemical substances that are relevant to life.
KEGG Disease kegg.disease The KEGG DISEASE database is a collection of disease entries capturing knowledge on genetic and environmental perturbations. Each disease entry contains a list of known genetic factors (disease genes), environmental factors, diagnostic markers, and therapeutic drugs. Diseases are viewed as perturbed states of the molecular system, and drugs as perturbants to the molecular system.
KEGG Drug kegg.drug KEGG DRUG contains chemical structures of drugs and additional information such as therapeutic categories and target molecules.
KEGG Environ kegg.environ KEGG ENVIRON (renamed from EDRUG) is a collection of crude drugs, essential oils, and other health-promoting substances, which are mostly natural products of plants. It will contain environmental substances and other health-damaging substances as well. Each KEGG ENVIRON entry is identified by the E number and is associated with the chemical component, efficacy information, and source species information whenever applicable.
KEGG Enzyme kegg.enzyme KEGG ENZYME is an implementation of the Enzyme Nomenclature (EC number system) produced by the IUBMB/IUPAC Biochemical Nomenclature Committee. KEGG ENZYME is based on the ExplorEnz database at Trinity College Dublin, and is maintained in the KEGG relational database with additional annotation of reaction hierarchy and sequence data links.
KEGG Genes kegg.genes KEGG GENES is a collection of gene catalogs for all complete genomes and some partial genomes, generated from publicly available resources.
KEGG Genome kegg.genome KEGG Genome is a collection of organisms whose genomes have been completely sequenced.
KEGG Glycan kegg.glycan KEGG GLYCAN, a part of the KEGG LIGAND database, is a collection of experimentally determined glycan structures. It contains all unique structures taken from CarbBank, structures entered from recent publications, and structures present in KEGG pathways.
KEGG LIGAND kegg.ligand
KEGG Metagenome kegg.metagenome The KEGG Metagenome Database collects information on environmental samples (ecosystems) of genome sequences for multiple species.
KEGG Module kegg.module KEGG Modules are manually defined functional units used in the annotation and biological interpretation of sequenced genomes. Each module corresponds to a set of 'KEGG Orthology' (MIR:00000116) entries. KEGG Modules can represent pathway, structural, functional or signature modules.
KEGG Orthology kegg.orthology KEGG Orthology (KO) consists of manually defined, generalised ortholog groups that correspond to KEGG pathway nodes and BRITE hierarchy nodes in all organisms.
KEGG Pathways Database kegg.pathway KEGG PATHWAY is a collection of manually drawn pathway maps representing our knowledge on the molecular interaction and reaction networks.
KEGG Reaction kegg.reaction KEGG reaction contains our knowledge on the universe of reactions that are relevant to life.
KNApSAcK ID knapsack KNApSAcK provides information on metabolites and the taxonomic class with which they are associated.
Kidney and Urinary Pathway Ontology kupo
clinical LABoratory Ontology labo LABO is an ontology of informational entities formalizing clinical laboratory tests prescriptions and reporting documents.
Livestock Breed Ontology lbo A vocabulary for cattle, chicken, horse, pig, and sheep breeds.
Global LEI Index lei Established by the Financial Stability Board in June 2014, the Global Legal Entity Identifier Foundation (GLEIF) is tasked to support the implementation and use of the Legal Entity Identifier (LEI). The foundation is backed and overseen by the LEI Regulatory Oversight Committee, representing public authorities from around the globe that have come together to jointly drive forward transparency within the global financial markets. GLEIF is a supra-national not-for-profit organization headquartered in Basel, Switzerland.
LG Chemical Entity Detection Dataset (LGCEDe) lgai.cede LG Chemical Entity Detection Dataset (LGCEDe) is the only openly available dataset with molecular instance-level annotations (i.e. atom-bond level position annotations within an image) for molecular structure images. This dataset was designed to encourage research on detection-based pipelines for Optical Chemical Structure Recognition (OCSR).
Ligand-Gated Ion Channel database lgic The Ligand-Gated Ion Channel database provides nucleic acid and protein sequences of the subunits of ligand-gated ion channels. These transmembrane proteins can exist under different conformations, at least one of which forms a pore through the membrane connecting two neighbouring compartments. The database can be used to generate multiple sequence alignments from selected subunits, and gives the atomic coordinates of subunits, or portions of subunits, where available.
LiceBase licebase Sea lice (Lepeophtheirus salmonis and Caligus species) are the major pathogens of salmon, significantly impacting upon the global salmon farming industry. Lice control is primarily accomplished through chemotherapeutants, though emerging resistance necessitates the development of new treatment methods (biological agents, prophylactics and new drugs). LiceBase is a database for sea lice genomics, providing genome annotation of the Atlantic salmon louse Lepeophtheirus salmonis, a genome browser, and access to related high-throughput genomics data. LiceBase also mines and stores data from related genome sequencing and functional genomics projects.
LigandBook ligandbook Ligandbook is a public repository for force field parameters with a special emphasis on small molecules and known ligands of proteins. It acts as a warehouse for parameter files that are supplied by the community.
LigandBox ligandbox LigandBox is a database of 3D compound structures. Compound information is collected from the catalogues of various commercial suppliers, with approved drugs and biochemical compounds taken from KEGG and PDB databases. Each chemical compound in the database has several 3D conformers with hydrogen atoms and atomic charges, which are ready to be docked into receptors using docking programs. Various physical properties, such as aqueous solubility (LogS) and carcinogenicity have also been calculated to characterize the ADME-Tox properties of the compounds.
Ligand Expo ligandexpo Ligand Expo is a data resource for finding information about small molecules bound to proteins and nucleic acids.
LINCS Cell lincs.cell The Library of Network-Based Cellular Signatures (LINCS) Program aims to create a network-based understanding of biology by cataloging changes in gene expression and other cellular processes that occur when cells are exposed to a variety of perturbing agents. The LINCS cell model system can have the following cell categories: cell lines, primary cells, induced pluripotent stem cells, differentiated cells, and embryonic stem cells. The metadata contains information provided by each LINCS Data and Signature Generation Center (DSGC) and the association with the tissue or organ from which the cells were derived; in many cases the cells are also associated with a disease.
LINCS Data lincs.data The Library of Network-Based Cellular Signatures (LINCS) Program aims to create a network-based understanding of biology by cataloguing changes in gene expression and other cellular processes that occur when cells are exposed to perturbing agents. The data is organized and available as datasets, each including experimental data, metadata and a description of the dataset and assay. The dataset group comprises datasets for the same experiment but with different data level results (data processed to a different level).
LINCS Protein lincs.protein The HMS LINCS Database currently contains information on experimental reagents (small molecule perturbagens, cells, and proteins). It aims to collect and disseminate information relating to the fundamental principles of cellular response in humans to perturbation. This collection references proteins.
LINCS Small Molecule lincs.smallmolecule The Library of Network-Based Cellular Signatures (LINCS) Program aims to create a network-based understanding of biology by cataloging changes in gene expression and other cellular processes that occur when cells are exposed to a variety of perturbing agents. The LINCS small molecule collection is used as perturbagens in LINCS experiments. The small molecule metadata includes substance-specific batch information provided by each LINCS Data and Signature Generation Center (DSGC).
Linguist linguist Registry of programming languages for the Linguist program for detecting and highlighting programming languages.
LipidBank lipidbank LipidBank is an open, publicly free database of natural lipids including fatty acids, glycerolipids, sphingolipids, steroids, and various vitamins.
LIPID MAPS lipidmaps The LIPID MAPS Lipid Classification System comprises eight lipid categories, each with its own subclassification hierarchy. All lipids in the LIPID MAPS Structure Database (LMSD) have been classified using this system and have been assigned LIPID MAPS IDs, which reflect their position in the classification hierarchy.
Lipid Ontology lipro An ontology representation of the LIPIDMAPS nomenclature classification.
LNCipedia lncipedia A comprehensive compendium of human long non-coding RNAs
Loggerhead nesting loggerhead
Logical Observation Identifiers Names and Codes loinc The international standard for identifying health measurements, observations, and documents.
Locus Reference Genomic lrg A Locus Reference Genomic (LRG) is a manually curated record that contains stable genomic, transcript and protein reference sequences for reporting clinically relevant sequence variants. All LRGs are generated and maintained by the NCBI and EMBL-EBI.
Laboratory of Systems Pharmacology Compound ID lspci Internal identifiers from the LSP for ChEMBL compound classes (e.g., combining various salts and ions).
Mouse adult gross anatomy ma A structured controlled vocabulary of the adult anatomy of the mouse (Mus)
Mechanism, Annotation and Classification in Enzymes macie MACiE (Mechanism, Annotation and Classification in Enzymes) is a database of enzyme reaction mechanisms. Each entry in MACiE consists of an overall reaction describing the chemical compounds involved, as well as the species name in which the reaction occurs. The individual reaction stages for each overall reaction are listed with mechanisms, alternative mechanisms, and amino acids involved.
MaizeGDB Locus maizegdb.locus MaizeGDB is the maize research community's central repository for genetics and genomics information.
Mathematical modeling ontology mamo The Mathematical Modelling Ontology (MAMO) is a classification of the types of mathematical models used mostly in the life sciences, their variables, relationships and other relevant features.
Multiple alignment mao
MassBank massbank MassBank is a federated database of reference spectra from different instruments, including high-resolution mass spectra of small metabolites (<3000 Da).
MassIVE massive MassIVE is a community resource developed by the NIH-funded Center for Computational Mass Spectrometry to promote the global, free exchange of mass spectrometry data.
Minimal Anatomical Terminology mat
MatrixDB matrixdb MatrixDB is a freely available database focused on interactions established by extracellular matrix proteins, proteoglycans and polysaccharides
MatrixDB Association matrixdb.association MatrixDB stores experimentally determined interactions involving at least one extracellular biomolecule. It includes mostly protein-protein and protein-glycosaminoglycan interactions, as well as interactions with lipids and cations.
Medical Action Ontology maxo An ontology to represent medically relevant actions, procedures, therapies, interventions, and recommendations.
Cell Line Ontology [derivative] mcc
Microbial Conditions Ontology mco Microbial Conditions Ontology is an ontology...
Medical Data Models mdm The MDM (Medical Data Models) Portal is a meta-data registry for creating, analysing, sharing and reusing medical forms. Electronic forms are central in numerous processes involving data, including the collection of data through electronic health records (EHRs), Electronic Data Capture (EDC), and as case report forms (CRFs) for clinical trials. The MDM Portal provides medical forms in numerous export formats, facilitating the sharing and reuse of medical data models and exchange between information systems.
Medical Dictionary for Regulatory Activities Terminology meddra The Medical Dictionary for Regulatory Activities (MedDRA) was developed by the International Council for Harmonisation of Technical Requirements for Registration of Pharmaceuticals for Human Use (ICH) to provide a standardised medical terminology to facilitate sharing of regulatory information internationally for medical products used by humans. It is used within regulatory processes, safety monitoring, as well as for marketing activities. Products covered by the scope of MedDRA include pharmaceuticals, biologics, vaccines and drug-device combination products. The MedDRA dictionary is organized by System Organ Class (SOC), divided into High-Level Group Terms (HLGT), High-Level Terms (HLT), Preferred Terms (PT) and finally into Lowest Level Terms (LLT).
Human Medical Genetics medgen MedGen is a portal for information about conditions and phenotypes related to Medical Genetics. Terms from multiple sources are aggregated into concepts, each of which is assigned a unique identifier and a preferred name and symbol. The core content of the record may include names, identifiers used by other databases, mode of inheritance, clinical features, and map location of the loci affecting the disorder.
MedlinePlus Health Topics medlineplus MedlinePlus is the National Institutes of Health's Web site for patients and their families and friends. Produced by the National Library of Medicine, it provides information about diseases, conditions, and wellness issues using non-technical terms and language.
MEROPS peptidase database merops The MEROPS database is an information resource for peptidases (also termed proteases, proteinases and proteolytic enzymes) and the proteins that inhibit them.
MEROPS Family merops.family The MEROPS database is an information resource for peptidases (also termed proteases, proteinases and proteolytic enzymes) and the proteins that inhibit them. These are hierarchically classified and assigned to a Family on the basis of statistically significant similarities in amino acid sequence. Families thought to be homologous are grouped together in a Clan. This collection references peptidase families.
MEROPS Inhibitor merops.inhibitor The MEROPS database is an information resource for peptidases (also termed proteases, proteinases and proteolytic enzymes) and the proteins that inhibit them. This collection references inhibitors.
Medical Subject Headings mesh MeSH (Medical Subject Headings) is the National Library of Medicine's controlled vocabulary thesaurus. It consists of sets of terms naming descriptors in a hierarchical structure that permits searching at various levels of specificity. This thesaurus is used by NLM for indexing articles from biomedical journals, cataloguing of books, documents, etc.
MeSH 2012 mesh.2012 MeSH (Medical Subject Headings) is the National Library of Medicine's controlled vocabulary thesaurus. It consists of sets of terms naming descriptors in a hierarchical structure that permits searching at various levels of specificity. This thesaurus is used by NLM for indexing articles from biomedical journals, cataloging of books, documents, etc. This collection references MeSH terms published in 2012.
MeSH 2013 mesh.2013 MeSH (Medical Subject Headings) is the National Library of Medicine's controlled vocabulary thesaurus. It consists of sets of terms naming descriptors in a hierarchical structure that permits searching at various levels of specificity. This thesaurus is used by NLM for indexing articles from biomedical journals, cataloging of books, documents, etc. This collection references MeSH terms published in 2013.
MetaboLights Compound ID metabolights MetaboLights is a database for Metabolomics experiments and derived information. The database is cross-species, cross-technique and covers metabolite structures and their reference spectra as well as their biological roles, locations and concentrations, and experimental data from metabolic experiments. This collection references individual metabolomics studies.
Metabolic Encyclopedia of metabolic and other pathways metacyc.compound MetaCyc is a curated database of experimentally elucidated metabolic pathways from all domains of life. MetaCyc contains 2526 pathways from 2844 different organisms. MetaCyc contains pathways involved in both primary and secondary metabolism, as well as associated metabolites, reactions, enzymes, and genes. The goal of MetaCyc is to catalog the universe of metabolism by storing a representative sample of each experimentally elucidated pathway.
MetaCyc Reaction metacyc.reaction MetaCyc is a curated database of experimentally elucidated metabolic pathways from all domains of life. MetaCyc contains 2526 pathways from 2844 different organisms. MetaCyc contains pathways involved in both primary and secondary metabolism, as well as associated metabolites, reactions, enzymes, and genes. The goal of MetaCyc is to catalog the universe of metabolism by storing a representative sample of each experimentally elucidated pathway.
MetaNetX chemical metanetx.chemical MetaNetX/MNXref integrates various information from genome-scale metabolic network reconstructions such as information on reactions, metabolites and compartments. This information undergoes a reconciliation process to minimise discrepancies between different data sources, and makes the data accessible under a common namespace. This collection references chemical or metabolic components.
MetaNetX compartment metanetx.compartment MetaNetX/MNXref integrates various information from genome-scale metabolic network reconstructions such as information on reactions, metabolites and compartments. This information undergoes a reconciliation process to minimise discrepancies between different data sources, and makes the data accessible under a common namespace. This collection references cellular compartments.
MetaNetX reaction metanetx.reaction MetaNetX/MNXref integrates various information from genome-scale metabolic network reconstructions such as information on reactions, metabolites and compartments. This information undergoes a reconciliation process to minimise discrepancies between different data sources, and makes the data accessible under a common namespace. This collection references reactions.
Metabolite and Tandem Mass Spectrometry Database metlin The METLIN (Metabolite and Tandem Mass Spectrometry) Database is a repository of metabolite information as well as tandem mass spectrometry data, providing public access to its comprehensive MS and MS/MS metabolite data. An annotated list of known metabolites and their mass, chemical formula, and structure are available, with each metabolite linked to external resources for further reference and inquiry.
Metabolome Express mex A public place to process, interpret and share GC/MS metabolomics datasets.
Mental Functioning Ontology mf The Mental Functioning Ontology is an overarching ontology for all aspects of mental functioning.
Mammalian Feeding Muscle Ontology mfmo The Mammalian Feeding Muscle Ontology is an anatomy ontology for the muscles of the head and neck that participate in feeding, swallowing, and other oral-pharyngeal behaviors.
Medaka fish anatomy and development mfo A structured controlled vocabulary of the anatomy and development of the Japanese medaka fish, Oryzias latipes.
Emotion Ontology mfoem An ontology of affective phenomena such as emotions, moods, appraisals and subjective feelings.
Mental Disease Ontology mfomd The Mental Disease Ontology is developed to facilitate representation for all aspects of mental disease. It is an extension of the Ontology for General Medical Science (OGMS) and Mental Functioning Ontology (MF).
Aclame mge ACLAME is a database dedicated to the collection and classification of mobile genetic elements (MGEs) from various sources, comprising all known phage genomes, plasmids and transposons.
Mouse Genome Informatics mgi The Mouse Genome Database (MGD) project includes data on gene characterization, nomenclature, mapping, gene homologies among mammals, sequence links, phenotypes, allelic variants and mutants, and strain data.
MGnify Analysis mgnify.analysis Analyses of microbiome data within MGnify
MGnify Project mgnify.proj MGnify is a resource for the analysis and archiving of microbiome data to help determine the taxonomic diversity and functional & metabolic potential of environmental samples. Users can submit their own data for analysis or freely browse all of the analysed public datasets held within the repository. In addition, users can request analysis of any appropriate dataset within the European Nucleotide Archive (ENA). User-submitted or ENA-derived datasets can also be assembled on request, prior to analysis.
MGnify Sample mgnify.samp The EBI Metagenomics service is an automated pipeline for the analysis and archiving of metagenomic data that aims to provide insights into the phylogenetic diversity as well as the functional and metabolic potential of a sample. Metagenomics is the study of all genomes present in any given environment without the need for prior individual identification or amplification. This collection references samples.
Molecular Interactions Controlled Vocabulary mi The Molecular Interactions (MI) ontology forms a structured controlled vocabulary for the annotation of experiments concerned with protein-protein interactions. MI is developed by the HUPO Proteomics Standards Initiative.
Minimal Information About Anatomy ontology miaa
MIAPA Ontology miapa The MIAPA ontology is intended to be an application ontology for the purpose of semantic annotation of phylogenetic data according to the requirements and recommendations of the Minimum Information for A Phylogenetic Analysis (MIAPA) metadata reporting standard. The ontology leverages (imports) primarily from the CDAO (Comparative Data Analysis Ontology), PROV (W3C Provenance Ontology), and SWO (Software Ontology, which includes the EDAM ontologies) ontologies. It adds some assertions of its own, as well as some classes and individuals that may eventually get pushed down into one of the respective source ontologies. This ontology is maintained at http://github.com/miapa/miapa, and requests for changes or additions should be filed at the issue tracker there. The discussion list is at miapa-discuss@googlegroups.com. Further resources about MIAPA can be found at the project's main page at http://evoio.org/wiki/MIAPA.
Ontology of Prokaryotic Phenotypic and Metabolic Characters micro An ontology of prokaryotic phenotypic and metabolic characters
MicroScope microscope MicroScope is an integrative resource that supports systematic and efficient revision of microbial genome annotation, data management and comparative analysis.
MicrosporidiaDB microsporidia MicrosporidiaDB is one of the databases that can be accessed through the EuPathDB (http://EuPathDB.org; formerly ApiDB) portal, covering eukaryotic pathogens of the genera Cryptosporidium, Giardia, Leishmania, Neospora, Plasmodium, Toxoplasma, Trichomonas and Trypanosoma. While each of these groups is supported by a taxon-specific database built upon the same infrastructure, the EuPathDB portal offers an entry point to all these resources, and the opportunity to leverage orthology for searches across genera.
OMIM ID mim Online Mendelian Inheritance in Man is a catalog of human genes and genetic disorders.
OMIM Phenotypic Series mim.ps A Phenotypic Series is a tabular view of genetic heterogeneity of similar phenotypes across the genome.
MimoDB mimodb MimoDB is a database collecting peptides that have been selected from random peptide libraries based on their ability to bind small compounds, nucleic acids, proteins, cells, tissues and organs. It also stores other information such as the corresponding target, template, library, and structures. As of March 2016, this database was renamed Biopanning Data Bank.
Minimal Viable Identifier minid Minid are identifiers used to provide robust reference to intermediate data generated during the course of a research investigation.
MINID Test minid.test Minid are identifiers used to provide robust reference to intermediate data generated during the course of a research investigation. This is a prefix for referencing identifiers in the minid test namespace.
Molecular Interaction Database mint The Molecular INTeraction database (MINT) stores, in a structured format, information about molecular interactions by extracting experimental details from work published in peer-reviewed journals.
MIPModDB mipmod MIPModDB is a database of comparative protein structure models of the MIP (Major Intrinsic Protein) family of proteins, identified from complete genome sequences. It provides key information on MIPs based on their sequences and structures.
Identifiers.org Registry mir The Identifiers.org registry contains registered namespace and provider prefixes with associated access URIs for a large number of high-quality data collections. These prefixes are used in web resolution of compact identifiers of the form “PREFIX:ACCESSION” or “PROVIDER/PREFIX:ACCESSION” commonly used to specify bioinformatics and other data resources.
miRBase pre-miRNA ID mirbase The miRBase Sequence Database is a searchable database of published miRNA sequences and annotation. The data were previously provided by the miRNA Registry. Each entry in the miRBase Sequence database represents a predicted hairpin portion of a miRNA transcript (termed mir in the database), with information on the location and sequence of the mature miRNA sequence (termed miR).
miRBase Families mirbase.family The miRBase database is a searchable database of published miRNA sequences and annotation. Each entry in the miRBase Sequence database represents a predicted hairpin portion of a miRNA transcript (termed mir in the database), with information on the location and sequence of the mature miRNA sequence (termed miR). Both hairpin and mature sequences are available for searching and browsing, and entries can also be retrieved by name, keyword, references and annotation. All sequence and annotation data are also available for download.
miRBase mature miRNA ID mirbase.mature The miRBase Sequence Database is a searchable database of published miRNA sequences and annotation. This collection refers specifically to the mature miRNA sequence.
mirEX mirex mirEX is a comprehensive platform for comparative analysis of primary microRNA expression data, storing RT–qPCR-based gene expression profiles over seven developmental stages of Arabidopsis. It also provides RNA structural models, publicly available deep sequencing results and experimental procedure details. This collection provides profile information for a single microRNA over all developmental stages.
MIRIAM Registry collection miriam.collection MIRIAM Registry is an online resource created to catalogue collections (Gene Ontology, Taxonomy or PubMed are some examples) and the corresponding resources (physical locations) providing access to those data collections. The Registry provides unique and perennial URIs for each entity of those data collections.
MIRIAM Registry resource miriam.resource MIRIAM Registry is an online resource created to catalogue data types (Gene Ontology, Taxonomy or PubMed are some examples), their URIs and the corresponding resources (or physical locations), whether these are controlled vocabularies or databases.
microRNA Ontology mirnao An application ontology for use with miRNA databases.
miRNEST mirnest miRNEST is a database of animal, plant and virus microRNAs, containing miRNA predictions conducted on Expressed Sequence Tags of animal and plant species.
Mosquito insecticide resistance miro Application ontology for entities related to insecticide resistance in mosquitos
miRTarBase mirtarbase miRTarBase is a database of miRNA-target interactions (MTIs), collected manually from relevant literature, following Natural Language Processing of the text to identify research articles related to functional studies of miRNAs. Generally, the collected MTIs are validated experimentally by reporter assay, western blot, microarray and next-generation sequencing experiments.
miRNA Target Prediction at EMBL mirte This website provides access to our 2003 and 2005 miRNA-Target predictions for Drosophila miRNAs
MLCommons Association mlc MLCommons Association artifacts, including benchmark results, datasets, and saved models.
Molecular Modeling Database mmdb The Molecular Modeling Database (MMDB) is a database of experimentally determined structures obtained from the Protein Data Bank (PDB). Since structures are known for a large fraction of all protein families, structure homologs may facilitate inference of biological function, or the identification of binding or catalytic sites.
Melanoma Molecular Map Project Biomaps mmmp.biomaps A collection of molecular interaction maps and pathways involved in cancer development and progression with a focus on melanoma.
Measurement method ontology mmo A representation of the variety of methods used to make clinical and phenotype measurements.
MarCat mmp.cat MarCat is a gene (protein) catalogue of uncultivable and cultivable marine genes and proteins derived from metagenomics samples.
MarDB mmp.db MarDB includes all sequenced marine microbial genomes regardless of level of completeness.
MarFun mmp.fun MarFun is a manually curated database for marine fungi that is part of the MAR databases.
MarRef mmp.ref MarRef is a manually curated marine microbial reference genome database that contains completely sequenced genomes.
Mutant Mouse Resource and Research Centers mmrrc The MMRRC database is a repository of available mouse stocks and embryonic stem cell line collections.
Multum MediSource Lexicon mmsl The Lexicon is a foundational database with comprehensive drug product and disease nomenclature information. It includes drug names, drug product information, disease names, coding systems such as ICD-9-CM and NDC, generic names, brand names and common abbreviations. A comprehensive list of standard or customized disease names and ICD-9 codes is also included.
Mouse Developmental Stages mmusdv Life cycle stages for Mus musculus
Microarray experimental conditions mo The MGED Ontology (MO) provides terms for annotating all aspects of a microarray experiment from the design of the experiment and array layout, through to the preparation of the biological sample and the protocols used to hybridize the RNA and analyze the data.
MobiDB mobidb MobiDB is a database of protein disorder and mobility annotations.
Protein modification mod The Proteomics Standards Initiative modification ontology (PSI-MOD) aims to define a consensus nomenclature and ontology reconciling, in a hierarchical representation, the complementary descriptions of residue modifications.
ModelDB modeldb ModelDB is a curated, searchable database of published models in the computational neuroscience domain. It accommodates models expressed in textual form, including procedural or declarative languages (e.g. C++, XML dialects) and source code written for any simulation environment.
Molbase molbase Molbase provides compound data information for researchers as well as listing suppliers and price information. It can be searched by keyword or CAS identifier.
MolBase molbase.sheffield An online database of inorganic compounds, MolBase was constructed by Dr Mark Winter of the University of Sheffield with input from undergraduate students.
MolMeDB molmedb MolMeDB is an open chemistry database about interactions of molecules with membranes. We collect information on how chemicals interact with individual membranes either from experiment or from simulations.
Monarch Disease Ontology mondo A semi-automatically constructed ontology that merges in multiple disease resources to yield a coherent merged ontology.
Molecular Process Ontology mop MOP is the molecular process ontology. It contains the molecular processes that underlie the name reaction ontology RXNO, for example cyclization, methylation and demethylation.
Morpheus model repository morpheus The Morpheus model repository is an open-access data resource to store, search and retrieve unpublished and published computational models of spatio-temporal and multicellular biological systems, encoded in the MorpheusML language and readily executable with the Morpheus software.
Mammalian Phenotype Ontology mp The Mammalian Phenotype Ontology (MP) classifies and organises phenotypic information related to the mouse and other mammalian species. This ontology has been applied to mouse phenotype descriptions in various databases allowing comparisons of data from diverse mammalian sources. It can facilitate the identification of appropriate experimental disease models, and aid in the discovery of candidate disease genes and molecular signaling pathways.
Mouse pathology ontology mpath A structured controlled vocabulary of mutant and transgenic mouse pathology phenotypes
Microbial Protein Interaction Database mpid The microbial protein interaction database (MPIDB) provides physical microbial interaction data. The interactions are manually curated from the literature or imported from other databases, and are linked to supporting experimental evidence, as well as evidence based on interaction conservation, protein complex membership, and 3D domain contacts.
Minimum PDDI Information Ontology mpio An ontology of minimum information regarding potential drug-drug interaction information.
MHC Restriction Ontology mro The MHC Restriction Ontology is an application ontology capturing how Major Histocompatibility Complex (MHC) restriction is defined in experiments, spanning exact protein complexes, individual protein chains, serotypes, haplotypes and mutant molecules, as well as evidence for MHC restrictions.
Mass spectrometry ontology ms The PSI-Mass Spectrometry (MS) CV contains all the terms used in the PSI MS-related data standards. The CV contains a logical hierarchical structure to ensure ease of maintenance and the development of software that makes use of complex semantics. The CV contains terms required for a complete description of an MS analysis pipeline used in proteomics, including sample labeling, digestion enzymes, instrumentation parts and parameters, software used for identification and quantification of peptides/proteins and the parameters and scores used to determine their significance.
Molecular Signatures Database msigdb The Molecular Signatures Database (MSigDB) is a collection of annotated gene sets for use with GSEA software.
Metabolomics Standards Initiative Ontology msio An application ontology for supporting description and annotation of mass spectrometry- and NMR spectroscopy-based metabolomics experiments and fluxomics studies.
MultiCellDS multicellds MultiCellDS is a data standard for multicellular simulation, experimental, and clinical data. A digital cell line is a hierarchical organization of quantitative phenotype data for a single biological cell line, including the microenvironmental context of the measurements and essential metadata.
MultiCellDS Digital Cell Line multicellds.cell_line MultiCellDS is a data standard for multicellular simulation, experimental, and clinical data. A digital cell line is a hierarchical organization of quantitative phenotype data for a single biological cell line, including the microenvironmental context of the measurements and essential metadata.
MultiCellDS collection multicellds.collection MultiCellDS is a data standard for multicellular simulation, experimental, and clinical data. A collection groups one or more individual uniquely identified cell lines, snapshots, or collections. Primary uses are time series (collections of snapshots), patient cohorts (collections of cell lines), and studies (collections of time series collections).
MultiCellDS Digital snapshot multicellds.snapshot MultiCellDS is a data standard for multicellular simulation, experimental, and clinical data. A digital snapshot is a single-time output of the microenvironment (including basement membranes and the vascular network), any cells contained within, and essential metadata. Cells may include phenotypic data.
Metabolomics Workbench Project mw.project Metabolomics Workbench stores metabolomics data for small and large studies on cells, tissues and organisms for the Metabolomics Consortium Data Repository and Coordinating Center (DRCC).
Metabolomics Workbench Study mw.study Metabolomics Workbench stores metabolomics data for small and large studies on cells, tissues and organisms for the Metabolomics Consortium Data Repository and Coordinating Center (DRCC).
MycoBrowser leprae myco.lepra Mycobrowser is a resource that provides both in silico generated and manually reviewed information within databases dedicated to the complete genomes of Mycobacterium tuberculosis, Mycobacterium leprae, Mycobacterium marinum and Mycobacterium smegmatis. This collection references Mycobacterium leprae information.
MycoBrowser marinum myco.marinum Mycobrowser is a resource that provides both in silico generated and manually reviewed information within databases dedicated to the complete genomes of Mycobacterium tuberculosis, Mycobacterium leprae, Mycobacterium marinum and Mycobacterium smegmatis. This collection references Mycobacterium marinum information.
MycoBrowser smegmatis myco.smeg Mycobrowser is a resource that provides both in silico generated and manually reviewed information within databases dedicated to the complete genomes of Mycobacterium tuberculosis, Mycobacterium leprae, Mycobacterium marinum and Mycobacterium smegmatis. This collection references Mycobacterium smegmatis information.
MycoBrowser tuberculosis myco.tuber Mycobrowser is a resource that provides both in silico generated and manually reviewed information within databases dedicated to the complete genomes of Mycobacterium tuberculosis, Mycobacterium leprae, Mycobacterium marinum and Mycobacterium smegmatis. This collection references Mycobacterium tuberculosis information.
Fungal Nomenclature and Species Bank mycobank MycoBank is an online database, documenting new mycological names and combinations, eventually combined with descriptions and illustrations.
Universal Spectrum Identifier mzspec The Universal Spectrum Identifier (USI) is a compound identifier that provides an abstract path to refer to a single spectrum generated by a mass spectrometer, and potentially the ion that is thought to have produced it.
NameRXN namerxn The nomenclature used for named reactions in text mining software from NextMove. While it's proprietary, there are a few publications listing parts. 487/1,855 have mappings to the Reaction Ontology (RXNO).
Natural Product-Drug Interaction Research Data Repository napdi The Natural Product-Drug Interaction Research Data Repository, a publicly accessible database where researchers can access scientific results, raw data, and recommended approaches to optimally assess the clinical significance of pharmacokinetic natural product-drug interactions (PK-NPDIs).
Nucleic Acids Phylogenetic Profiling napp NAPP (Nucleic Acids Phylogenetic Profiling) is a clustering method based on conserved noncoding RNA (ncRNA) elements in bacterial genomes. Short intergenic regions from a reference genome are compared with other genomes to identify RNA-rich clusters.
National Academic Research and Collaborations Information System narcis NARCIS provides access to scientific information, including (open access) publications from the repositories of all the Dutch universities, KNAW, NWO and a number of research institutes, which is not referenced in other citation databases.
NASC code nasc The Nottingham Arabidopsis Stock Centre (NASC) provides seed and information resources to the International Arabidopsis Genome Programme and the wider research community.
National Bibliography Number nbn The National Bibliography Number (NBN), is a URN-based publication identifier system employed by a variety of national libraries such as those of Germany, the Netherlands and Switzerland. They are used to identify documents archived in national libraries, in their native format or language, and are typically used for documents which do not have a publisher-assigned identifier.
Neuro Behavior Ontology nbo An ontology of human and animal behaviours and behavioural phenotypes
NITE Biological Resource Center nbrc NITE Biological Resource Center (NBRC) provides a collection of microbial resources, performing taxonomic characterization of individual microorganisms such as bacteria including actinomycetes and archaea, yeasts, fungi, algae, bacteriophages and DNA resources for academic research and industrial applications. A catalogue is maintained which states strain nomenclature, synonyms, and culture and sequence information.
NCATS Drugs ncats.drug The National Center for Advancing Translational Sciences (NCATS) has developed Inxight: Drugs as a comprehensive portal for drug development information. NCATS Inxight: Drugs contains information on ingredients in medicinal products, including:
Entrez Gene ID ncbigene Entrez Gene is the NCBI's database for gene-specific information, focusing on completely sequenced genomes, those with an active research community to contribute gene-specific information, or those that are scheduled for intense sequence analysis.
NCBI Protein ncbiprotein The Protein database is a collection of sequences from several sources, including translations from annotated coding regions in GenBank, RefSeq and TPA, as well as records from SwissProt, PIR, PRF, and PDB.
NCBI organismal classification ncbitaxon The taxonomy contains the relationships between all living forms for which nucleic acid or protein sequence have been determined.
NCI Metathesaurus ncim NCI Metathesaurus (NCIm) is a wide-ranging biomedical terminology database that covers most terminologies used by NCI for clinical care, translational and basic research, and public information and administrative activities. It integrates terms and definitions from different terminologies, including NCI Thesaurus; however, the representation is not identical.
NCI Thesaurus ncit NCI Thesaurus (NCIt) provides reference terminology covering vocabulary for clinical care, translational and basic research, and public information and administrative activities, providing a stable and unique identification code.
Non-Coding RNA Ontology ncro An ontology for non-coding RNA, both of biological origin, and engineered.
National Drug Code ndc The National Drug Code (NDC) is a unique, three-segment number used by the Food and Drug Administration (FDA) to identify drug products for commercial use. This is required by the Drug Listing Act of 1972. The FDA publishes and updates the listed NDC numbers daily.
National Drug Data File nddf FDB MedKnowledge encompasses medications approved by the U.S. Food and Drug Administration, and information on commonly-used over-the-counter and alternative therapy agents such as herbals, nutraceuticals and dietary supplements.
Network Data Exchange ndex The Network Data Exchange (NDEx) is an open-source framework where scientists and organizations can store, share, manipulate, and publish biological network knowledge.
National Drug File - Reference Terminology ndfrt NDF-RT combines the NDF hierarchical drug classification with a multi-category reference model. The categories are: Cellular or Molecular Interactions [MoA]; Chemical Ingredients [Chemical/Ingredient]; Clinical Kinetics [PK]; Diseases, Manifestations or Physiologic States [Disease/Finding]; Dose Forms [Dose Form]; Pharmaceutical Preparations; Physiological Effects [PE]; Therapeutic Categories [TC]; and VA Drug Interactions [VA Drug Interaction].
Neural ElectroMagnetic Ontology nemo This namespace is about Neuroscience Multi-Omic data, specifically focused on data generated from the BRAIN Initiative and related brain research projects.
NeuroLex neurolex The NeuroLex project is a dynamic lexicon of terms used in neuroscience. It is supported by the Neuroscience Information Framework project and incorporates information from the NIF standardised ontology (NIFSTD), and its predecessor, the Biomedical Informatics Research Network Lexicon (BIRNLex).
NeuroMorpho neuromorpho NeuroMorpho.Org is a centrally curated inventory of digitally reconstructed neurons.
NeuroNames neuronames BrainInfo is designed to help you identify structures in the brain. If you provide the name of a structure, BrainInfo will show it and tell you about it.
NeuronDB neurondb NeuronDB provides a dynamically searchable database of three types of neuronal properties: voltage gated conductances, neurotransmitter receptors, and neurotransmitter substances. It contains tools that provide for integration of these properties in a given type of neuron and compartment, and for comparison of properties across different types of neurons and compartments.
NeuroVault Collection neurovault.collection Neurovault is an online repository for statistical maps, parcellations and atlases of the brain. This collection references sets (collections) of images.
NeuroVault Image neurovault.image Neurovault is an online repository for statistical maps, parcellations and atlases of the brain. This collection references individual images.
Nematode Expression Pattern DataBase nextdb NextDb is a database that provides information on the expression pattern map of the 100Mb genome of the nematode Caenorhabditis elegans. This was done through EST analysis and systematic whole mount in situ hybridization. Information available includes 5' and 3' ESTs, and in-situ hybridization images of 11,237 cDNA clones.
nextProt nextprot neXtProt is a resource on human proteins, and includes information such as proteins’ function, subcellular location, expression, interactions and role in diseases.
neXtProt family nextprot.family NeXtProt is a comprehensive human-centric discovery platform, offering its users a seamless integration of and navigation through protein-related data. (Developed by the SIB Swiss Institute of Bioinformatics)
NASA GeneLab ngl NASA's GeneLab gathers spaceflight genomic data, RNA and protein expression, and metabolic profiles, interfaces with existing databases for expanded research, will offer tools to conduct data analysis, and is in the process of creating a place online where scientists, researchers, teachers and students can connect with their peers, share their results, and communicate with NASA.
NIA Mouse cDNA Project niaest A catalog of mouse genes expressed in early embryos, embryonic and adult stem cells, including 250000 ESTs, was assembled by the NIA (National Institute on Aging). This collection represents the name and sequence from individual cDNA clones.
NIF Cell nif_cell Neuronal cell types
NIF Dysfunction nif_dysfunction
NIF Gross Anatomy nif_grossanatomy
NIST Chemistry WebBook nist The NIST Chemistry WebBook provides users with easy access to chemical and physical property data for chemical species through the internet. The data provided in the site are from collections maintained by the NIST Standard Reference Data Program and outside contributors. Data in the WebBook system are organized by chemical species.
National Library of Medicine (NLM) Catalog nlm Bibliographic data for all the journals, books, audiovisuals, computer software, electronic resources and other materials that are in the library's holdings.
NeuroLex Anatomy nlxanat The InterLex project - a core component of SciCrunch and supported by projects such as the Neuroscience Information Framework project (NIF), the NIDDK Information Network (dkNET), and the Open Data Commons for Spinal Cord Injury - is a dynamic lexicon of biomedical terms.
NeuroLex Dysfunction nlxdys The InterLex project - a core component of SciCrunch and supported by projects such as the Neuroscience Information Framework project (NIF), the NIDDK Information Network (dkNET), and the Open Data Commons for Spinal Cord Injury - is a dynamic lexicon of biomedical terms.
National Microbiome Data Collaborative nmdc An initiative to empower the research community to harness microbiome data exploration and discovery through a collaborative integrative data science ecosystem.
NMR-instrument specific component of metabolomics investigations nmr nmrCV is a controlled vocabulary to deliver standardized descriptors for the open mark-up language for NMR raw and spectrum data, sanctioned by the Metabolomics Standards Initiative (MSI).
NMR Shift Database nmrshiftdb2 NMR database for organic structures and their nuclear magnetic resonance (NMR) spectra. It allows for spectrum prediction (13C, 1H and other nuclei) as well as for searching spectra, structures and other properties.
NOMEN - A nomenclatural ontology for biological names nomen NOMEN is a nomenclatural ontology for biological names (not concepts). It encodes the governed rules of nomenclature.
NONCODE v3 noncodev3 NONCODE is a database of expression and functional lncRNA (long noncoding RNA) data obtained from microarray studies. LncRNAs have been shown to play key roles in various biological processes such as imprinting control, circuitry controlling pluripotency and differentiation, immune responses and chromosome dynamics. The collection references NONCODE version 3. This was replaced in 2013 by version 4.
NONCODE v4 Gene noncodev4.gene NONCODE is a database of expression and functional lncRNA (long noncoding RNA) data obtained from microarray studies. LncRNAs have been shown to play key roles in various biological processes such as imprinting control, circuitry controlling pluripotency and differentiation, immune responses and chromosome dynamics. The collection references NONCODE version 4 and relates to gene regions.
NONCODE v4 Transcript noncodev4.rna NONCODE is a database of expression and functional lncRNA (long noncoding RNA) data obtained from microarray studies. LncRNAs have been shown to play key roles in various biological processes such as imprinting control, circuitry controlling pluripotency and differentiation, immune responses and chromosome dynamics. The collection references NONCODE version 4 and relates to individual transcripts.
Nonribosomal Peptides Database norine Norine is a database dedicated to nonribosomal peptides (NRPs). In bacteria and fungi, in addition to the traditional ribosomal proteic biosynthesis, an alternative ribosome-independent pathway called NRP synthesis allows peptide production. The molecules synthesized by NRPS contain a high proportion of nonproteogenic amino acids whose primary structure is not always linear, often being more complex and containing cycles and branchings.
Natural Product Activity and Species Source Database npass Database for integrating species source of natural products & connecting natural products to biological targets via experimentally derived quantitative activity data.
NucleaRDB nuclearbd NucleaRDB is an information system that stores heterogenous data on Nuclear Hormone Receptors (NHRs). It contains data on sequences, ligand binding constants and mutations for NHRs.
Web Annotation Ontology oa The W3C Web Annotation Working Group is chartered to develop a set of specifications for an interoperable, sharable, distributed Web Annotation architecture.
Ontology of Adverse Events oae The Ontology of Adverse Events (OAE) is a biomedical ontology in the domain of adverse events. OAE aims to standardize adverse event annotation, integrate various adverse event data, and support computer-assisted reasoning. OAE is a community-based ontology. Its development follows the OBO Foundry principles. Vaccine adverse events have been used as an initial testing use case. OAE also studies adverse events associated with the administration of drug and nutritional products, the operation of surgeries, and the usage of medical devices, etc.
Ontology of Arthropod Circulatory Systems oarcs OArCS is an ontology describing the Arthropod circulatory system.
Ontology of Biological Attributes oba A collection of biological attributes (traits) covering all kingdoms of life.
Ontology of Biological and Clinical Statistics obcs OBCS stands for the Ontology of Biological and Clinical Statistics. OBCS is an ontology in the domain of biological and clinical statistics. It is aligned with the Basic Formal Ontology (BFO) and the Ontology for Biomedical Investigations (OBI). OBCS imports all possible biostatistics terms in OBI and includes many additional biostatistics terms, some of which were proposed and discussed in the OBI face-to-face workshop in Ann Arbor in 2012.
Ontology for Biomedical Investigations obi The Ontology for Biomedical Investigations (OBI) project is developing an integrated ontology for the description of biological and clinical investigations. The ontology will represent the design of an investigation, the protocols and instrumentation used, the material used, the data generated and the type of analysis performed on it. Currently OBI is being built under the Basic Formal Ontology (BFO).
Ontology for Biobanking obib The Ontology for Biobanking (OBIB) is an ontology for the annotation and modeling of the activities, contents, and administration of a biobank. Biobanks are facilities that store specimens, such as bodily fluids and tissues, typically along with specimen annotation and clinical data. OBIB is based on a subset of the Ontology for Biomedical Investigations (OBI), has the Basic Formal Ontology (BFO) as its upper ontology, and is developed following OBO Foundry principles. The first version of OBIB resulted from the merging of two existing biobank-related ontologies, OMIABIS and biobank ontology.
Internal OBO and PyOBO Relations obo Community development of interoperable ontologies for the biological sciences
OBO in OWL oboinowl This meta-ontology is self-describing. OBO metamodel properties are described using OBO metamodel properties
OpenCitations Corpus occ The OpenCitations Corpus is an open repository of scholarly citation data made available under a Creative Commons public domain dedication (CC0), which provides accurate bibliographic references harvested from the scholarly literature that others may freely build upon, enhance and reuse for any purpose, without restriction under copyright or database law.
Open Citation Identifier oci Each OCI (Open Citation Identifier) has a simple structure: oci:number-number, where “oci:” is the identifier prefix, and is used to identify a citation as a first-class data entity - see https://opencitations.wordpress.com/2018/02/19/citations-as-first-class-data-entities-introduction/ for additional information. OCIs for citations stored within the OpenCitations Corpus are constructed by combining the OpenCitations Corpus local identifiers for the citing and cited bibliographic resources, separating them with a dash. For example, oci:2544384-7295288 is a valid OCI for the citation between two papers stored within the OpenCitations Corpus. OCIs can also be created for bibliographic resources described in an external bibliographic database, if they are similarly identified there by identifiers having a unique numerical part. For example, the OCI for the citation that exists between Wikidata resources Q27931310 and Q22252312 is oci:01027931310-01022252312. OCIs can also be created for bibliographic resources described in external bibliographic databases such as Crossref or DataCite where they are identified by alphanumeric Digital Object Identifiers (DOIs), rather than purely numerical strings.
Ontology Concept Identifiers ocid 'ocid' stands for "Ontology Concept Identifiers", which are 12-digit integers covering IDs in topical ontologies from anatomy to toxicology.
Online Computer Library Center (OCLC) WorldCat oclc The global library cooperative OCLC maintains WorldCat. WorldCat is the world's largest network of library content and services. WorldCat libraries are dedicated to providing access to their resources on the Web, where most people start their search for information.
Open Data for Access and Mining odam Experimental data table management software to make research data accessible and available for reuse with minimal effort on the part of the data provider. Designed to manage experimental data tables in an easy way for users, ODAM provides a model for structuring both data and metadata that facilitates data handling and analysis. It also encourages data dissemination according to FAIR principles by making the data interoperable and reusable by both humans and machines, allowing the dataset to be explored and then extracted in whole or in part as needed.
Open Data Commons for Spinal Cord Injury odc.sci The Open Data Commons for Spinal Cord Injury is a cloud-based community-driven repository to store, share, and publish spinal cord injury research data.
Open Data Commons for Traumatic Brain Injury odc.tbi The Open Data Commons for Traumatic Brain Injury is a cloud-based community-driven repository to store, share, and publish traumatic brain injury research data.
Odor Molecules DataBase odor OdorDB stores information related to odorous compounds, specifically identifying those that have been shown to interact with olfactory receptors
The Ontology of Genes and Genomes ogg OGG is a biological ontology in the area of genes and genomes. OGG uses the Basic Formal Ontology (BFO) as its upper level ontology. This OGG document contains the genes and genomes of a list of selected organisms, including human, two viruses (HIV and influenza virus), and bacteria (B. melitensis strain 16M, E. coli strain K-12 substrain MG1655, M. tuberculosis strain H37Rv, and P. aeruginosa strain PAO1). More OGG information for other organisms (e.g., mouse, zebrafish, fruit fly, yeast, etc.) may be found in other OGG subsets.
Ontology for genetic interval ogi OGI formalized the genomic element by defining an upper class 'genetic interval'. The definition of 'genetic interval' is "the spatial continuous physical entity which contains ordered genomic sets (DNA, RNA, Allele, Marker, etc.) between and including two points (Nucleic Acid Base Residue) on a chromosome or RNA molecule which must have a linear primary sequence structure." Related papers: 1. Yu Lin, Norihiro Sakamoto (2009) “Genome, Gene, Interval and Ontology” Interdisciplinary Ontology Vol.2 - Proceedings of the Second Interdisciplinary Meeting, Tokyo, Feb. 28th- Mar. 1st, 2009. Page(s):25-34 (http://cdb-riken.academia.edu/LinYu/Papers/142399/Genome_Gene_Interval_and_Ontology) 2. Yu Lin, Hiroshi Tarui, Peter Simons (2009) “From Ontology for Genetic Interval(OGI) to Sequence Assembly – Ontology apply to next generation sequencing” Proceeding of the Semantic Web Applications and Tools for Life Science Workshop, Amsterdam, Nov.20th, 2009. (http://ceur-ws.org/Vol-559/Poster2.pdf) 3. Yu Lin, Peter Simons (2010) “DNA sequence from below: A Nominalist Approach” Interdisciplinary Ontology Vol.3 - Proceedings of the Second Interdisciplinary Meeting, Tokyo, Feb. 28th- Mar. 1st, 2010. (http://philpapers.org/rec/LINDSF)
Ontology for General Medical Science ogms The Ontology for General Medical Science (OGMS) is an ontology of entities involved in a clinical encounter. OGMS includes very general terms that are used across medical disciplines, including: 'disease', 'disorder', 'disease course', 'diagnosis', 'patient', and 'healthcare provider'. OGMS uses the Basic Formal Ontology (BFO) as an upper-level ontology. The scope of OGMS is restricted to humans, but many terms can be applied to a variety of organisms. OGMS provides a formal theory of disease that can be further elaborated by specific disease ontologies. This theory is implemented using OWL-DL and OBO Relation Ontology relations and is available in OWL and OBO formats. OGMS is based on the papers Toward an Ontological Treatment of Disease and Diagnosis and On Carcinomas and Other Pathological Entities. The ontology attempts to address some of the issues raised at the Workshop on Ontology of Diseases (Dallas, TX) and the Signs, Symptoms, and Findings Workshop (Milan, Italy). OGMS was formerly called the clinical phenotype ontology. Terms from OGMS hang from the Basic Formal Ontology.
Ontology of Genetic Susceptibility Factor ogsf An application ontology to represent genetic susceptibility to a specific disease, adverse event, or a pathological process.
The Oral Health and Disease Ontology ohd The Oral Health and Disease Ontology was created, initially, to represent the content of dental practice health records.
Ontology of Host-Microbiome Interactions ohmi OHMI is a biomedical ontology that represents the entities and relations in the domain of host-microbiome interactions.
Ontology of Host Pathogen Interactions ohpi OHPI is a biomedical ontology in the area of host-pathogen interactions. OHPI is developed by following the OBO Foundry Principles (e.g., openness and collaboration).
OID Repository oid OIDs provide a persistent identification of objects based on a hierarchical structure of Registration Authorities (RA), where each parent has an object identifier and allocates object identifiers to child nodes.
Medaka Developmental Stages olatdv Life cycle stages for Medaka
Ontology of units of Measure om The OM ontology provides classes, instances, and properties that represent the different concepts used for defining and using measures and units. It includes, for instance, common units such as the SI units meter and kilogram, but also units from other systems of units such as the mile or nautical mile. For many application areas it includes more specific units and quantities, such as the unit of the Hubble constant: km/s/Mpc, or the quantity vaselife. OM defines the complete set of concepts in the domain as distinguished in the textual standards. As a result the ontology can answer a wider range of competency questions than the existing approaches do. The following application areas are supported by OM: Geometry; Mechanics; Thermodynamics; Electromagnetism; Fluid mechanics; Chemical physics; Photometry; Radiometry and Radiobiology; Nuclear physics; Astronomy and Astrophysics; Cosmology; Earth science; Meteorology; Material science; Microbiology; Economics; Information technology; Typography; Shipping; Food engineering; Post-harvest technology; Dynamics of texture and taste; Packaging
OMA Group oma.grp OMA (Orthologous MAtrix) is a database that identifies orthologs among publicly available, complete genome sequences. It identifies orthologous relationships which can be accessed either group-wise, where all group members are orthologous to all other group members, or on a sequence-centric basis, where for a given protein all its orthologs in all other species are displayed. This collection references groupings of orthologs.
OMA HOGs oma.hog Hierarchical orthologous groups predicted by OMA (Orthologous MAtrix) database. Hierarchical orthologous groups are sets of genes that have started diverging from a single common ancestor gene at a certain taxonomic level of reference.
OMA Protein oma.protein OMA (Orthologous MAtrix) is a database that identifies orthologs among publicly available, complete genome sequences. It identifies orthologous relationships which can be accessed either group-wise, where all group members are orthologous to all other group members, or on a sequence-centric basis, where for a given protein all its orthologs in all other species are displayed. This collection references individual protein records.
Online Mendelian Inheritance in Animals omia Online Mendelian Inheritance in Animals is a database of genes, inherited disorders and traits in animal species (other than human and mouse).
Ontologized MIABIS omiabis An ontological version of MIABIS (Minimum Information About BIobank data Sharing)
Ontology for MicroRNA Target omit The purpose of the OMIT ontology is to establish data exchange standards and common data elements in the microRNA (miR) domain. Biologists (cell biologists in particular) and bioinformaticians can make use of OMIT to leverage emerging semantic technologies in knowledge acquisition and discovery for more effective identification of important roles performed by miRs in humans' various diseases and biological processes (usually through miRs' respective target genes).
OBO Metadata Ontology omo An ontology that specifies terms used to annotate ontology terms for all OBO ontologies. The ontology was developed as part of the Information Artifact Ontology (IAO).
Observational Medical Outcomes Partnership omop The OMOP Common Data Model allows for the systematic analysis of disparate observational databases. The concept behind this approach is to transform data contained within those databases into a common format (data model) as well as a common representation (terminologies, vocabularies, coding schemes), and then perform systematic analyses using a library of standard analytic routines that have been written based on the common format.
Ontology of Microbial Phenotypes omp An ontology of phenotypes covering microbes
Ontology of Medically Related Social Entities omrse This ontology covers the domain of social entities that are related to health care, such as demographic information and the roles of various individuals and organizations.
OncoTree oncotree OncoTree is a dynamic and flexible community-driven cancer classification platform encompassing rare and common cancers that provides clinically relevant and appropriately granular cancer classification for clinical decision support systems and oncology research.
Ontology for Nutritional Epidemiology one An ontology to standardize research output of nutritional epidemiologic studies.
Ontology for Nutritional Studies ons The Ontology for Nutritional Studies (ONS) has been developed as part of the ENPADASI European project (http://www.enpadasi.eu/) with the aim of defining a common language and building ontologies for nutritional studies.
Obstetric and Neonatal Ontology ontoneo The Obstetric and Neonatal Ontology is a structured controlled vocabulary to provide a representation of the data from electronic health records (EHRs) involved in the care of the pregnant woman, and of her baby.
Ontology of Organizational Structures of Trauma centers and Trauma systems oostt An ontology built for representing the organizational components of trauma centers and trauma systems.
Ontology of Physics for Biology opb The OPB is a reference ontology of classical physics as applied to the dynamics of biological systems. It is designed to encompass the multiple structural scales (multiscale: atoms to organisms) and multiple physical domains (multidomain: fluid dynamics, chemical kinetics, particle diffusion, etc.) that are encountered in the study and analysis of biological organisms.
Ontology for Parasite LifeCycle opl The Ontology for Parasite Lifecycle (OPL) models the life cycle stage details of various parasites, including Trypanosoma sp., Leishmania major, and Plasmodium sp., etc. In addition to life cycle stages, the ontology also models necessary contextual details, such as host information, vector information, and anatomical location. OPL is based on the Basic Formal Ontology (BFO) and follows the rules set by the OBO Foundry consortium.
Orientations of Proteins in Membranes Database opm The Orientations of Proteins in Membranes (OPM) database provides spatial positions of membrane-bound peptides and proteins of known three-dimensional structure in the lipid bilayer, together with their structural classification, topology and intracellular localization.
Ontology of Precision Medicine and Investigation opmi OPMI is a biomedical ontology in the area of precision medicine and its related investigations. It is community-driven and developed by following the OBO Foundry ontology development principles.
Open Researcher and Contributor ID orcid ORCID (Open Researcher and Contributor ID) is an open, non-profit, community-based effort to create and maintain a registry of unique identifiers for individual researchers. ORCID records hold non-sensitive information such as name, email, organization name, and research activities.
Olfactory Receptor Database ordb The Olfactory Receptor Database (ORDB) is a repository of genomics and proteomics information of olfactory receptors (ORs). It includes a broad range of chemosensory genes and proteins, including, in addition to ORs, the taste papilla receptors (TPRs), vomeronasal organ receptors (VNRs), insect olfactory receptors (IORs), Caenorhabditis elegans chemosensory receptors (CeCRs), and fungal pheromone receptors (FPRs).
OriDB Saccharomyces oridb.sacch OriDB is a database of collated genome-wide mapping studies of confirmed and predicted replication origin sites in Saccharomyces cerevisiae and the fission yeast Schizosaccharomyces pombe. This collection references Saccharomyces cerevisiae.
OriDB Schizosaccharomyces oridb.schizo OriDB is a database of collated genome-wide mapping studies of confirmed and predicted replication origin sites in Saccharomyces cerevisiae and the fission yeast Schizosaccharomyces pombe. This collection references Schizosaccharomyces pombe.
Ontology of RNA Sequencing ornaseq An application ontology designed to annotate next-generation sequencing experiments performed on RNA.
Orphanet orphanet Orphanet is a reference portal for information on rare diseases and orphan drugs. Its aim is to help improve the diagnosis, care and treatment of patients with rare diseases.
Orphanet Rare Disease Ontology orphanet.ordo The Orphanet Rare Disease ontology (ORDO) is a structured vocabulary for rare diseases, capturing relationships between diseases, genes and other relevant features which will form a useful resource for the computational analysis of rare diseases. It integrates a nosology (classification of rare diseases), relationships (gene-disease relations, epidemiological data) and connections with other terminologies (MeSH, UMLS, MedDRA), databases (OMIM, UniProtKB, HGNC, Ensembl, Reactome, IUPHAR, Genatlas) and classifications (ICD10).
Orthology Ontology orth The need for a common ontology for describing orthology information in biological research communities has led to the creation of the Orthology Ontology (ORTH). ORTH ontology is designed to describe sequence homology data available in multiple orthology databases on the Web (e.g., OMA, OrthoDB, HieranoiDB, etc.). By sequence homology data, we mostly mean gene region, gene and protein centric orthology, paralogy, and xenology information. Depending on the database, the homology information is structured in different ways. ORTH ontology accommodates these disparate data structures, namely Hierarchical Orthologous Group (HOG), cluster of homologous sequences and homologous-pairwise relations between sequences. In addition to the specific ORTH terms, this specification includes terms of the imported ontologies (e.g. Semanticscience Integrated Ontology, SIO) which are pertinent for representing the information from various orthology databases in a homogeneous way.
OrthoDB orthodb OrthoDB presents a catalog of eukaryotic orthologous protein-coding genes across vertebrates, arthropods, and fungi. Orthology refers to the last common ancestor of the species under consideration, and thus OrthoDB explicitly delineates orthologs at each radiation along the species phylogeny. The database of orthologs presents available protein descriptors, together with Gene Ontology and InterPro attributes, which serve to provide general descriptive annotations of the orthologous groups
Oryzabase Gene oryzabase.gene Oryzabase provides a view of rice (Oryza sativa) as a model monocot plant by integrating biological data with molecular genomic information. It contains information about rice development and anatomy, rice mutants, and genetic resources, especially for wild varieties of rice. Developmental and anatomical descriptions include in situ gene expression data serving as stage and tissue markers. This collection references gene information.
Oryzabase Mutant oryzabase.mutant Oryzabase provides a view of rice (Oryza sativa) as a model monocot plant by integrating biological data with molecular genomic information. It contains information about rice development and anatomy, rice mutants, and genetic resources, especially for wild varieties of rice. Developmental and anatomical descriptions include in situ gene expression data serving as stage and tissue markers. This collection references mutant strain information.
Oryzabase Reference oryzabase.reference Oryzabase is a comprehensive rice science database established in 2000 by the rice researchers' committee in Japan.
Oryzabase Stage oryzabase.stage Oryzabase provides a view of rice (Oryza sativa) as a model monocot plant by integrating biological data with molecular genomic information. It contains information about rice development and anatomy, rice mutants, and genetic resources, especially for wild varieties of rice. Developmental and anatomical descriptions include in situ gene expression data serving as stage and tissue markers. This collection references development stage information.
Oryzabase Strain oryzabase.strain Oryzabase provides a view of rice (Oryza sativa) as a model monocot plant by integrating biological data with molecular genomic information. It contains information about rice development and anatomy, rice mutants, and genetic resources, especially for wild varieties of rice. Developmental and anatomical descriptions include in situ gene expression data serving as stage and tissue markers. This collection references wild strain information.
Oryza Tag Line otl Oryza Tag Line is a database that was developed to collect information generated from the characterization of rice (Oryza sativa L cv. Nipponbare) insertion lines resulting in potential gene disruptions. It collates morpho-physiological alterations observed during field evaluation, with each insertion line documented through generic passport data including production records, seed stocks and FST information.
Ontology of Vaccine Adverse Events ovae OVAE is a biomedical ontology in the area of vaccine adverse events. OVAE is an extension of the community-based Ontology of Adverse Events (OAE).
OWL Ontology owl Overview of the Web Ontology Language (OWL) which provides an introduction to OWL by informally describing the features of each of the sublanguages.
P3DB Protein p3db.protein Plant Protein Phosphorylation DataBase (P3DB) is a database that provides information on experimentally determined phosphorylation sites in the proteins of various plant species. This collection references plant proteins that contain phosphorylation sites.
P3DB Site p3db.site Plant Protein Phosphorylation DataBase (P3DB) is a database that provides information on experimentally determined phosphorylation sites in the proteins of various plant species. This collection references phosphorylation sites in proteins.
Paleobiology Database paleodb The Paleobiology Database seeks to provide researchers and the public with information about the entire fossil record. It stores global, collection-based occurrence and taxonomic data for marine and terrestrial animals and plants of any geological age, as well as web-based software for statistical analysis of the data.
Panorama Public panorama Panorama is a freely-available, open-source repository server application for targeted mass spectrometry assays that integrates into a Skyline mass spec workflow. It makes links to the Proteomics Exchange when possible.
Protein ANalysis THrough Evolutionary Relationships Classification System panther.family The PANTHER (Protein ANalysis THrough Evolutionary Relationships) Classification System is a resource that classifies genes by their functions, using published scientific experimental evidence and evolutionary relationships to predict function even in the absence of direct experimental evidence. This collection references groups of genes that have been organised as families.
PANTHER Node panther.node The PANTHER (Protein ANalysis THrough Evolutionary Relationships) Classification System is a resource that classifies genes by their functions, using published scientific experimental evidence and evolutionary relationships to predict function even in the absence of direct experimental evidence. The PANTHER tree is a key element of the PANTHER System, representing ‘all’ of the evolutionary events in the gene family. PANTHER nodes represent the evolutionary events, either speciation or duplication, within the tree. PANTHER maintains stable identifiers for these nodes.
PANTHER Pathway panther.pathway The PANTHER (Protein ANalysis THrough Evolutionary Relationships) Classification System is a resource that classifies genes by their functions, using published scientific experimental evidence and evolutionary relationships to predict function even in the absence of direct experimental evidence. The PANTHER Pathway collection references pathway information, primarily for signaling pathways, each with subfamilies and protein sequences mapped to individual pathway components.
PANTHER Pathway Component panther.pthcmp The PANTHER (Protein ANalysis THrough Evolutionary Relationships) Classification System is a resource that classifies genes by their functions, using published scientific experimental evidence and evolutionary relationships to predict function even in the absence of direct experimental evidence. The PANTHER Pathway Component collection references specific classes of molecules that play the same mechanistic role within a pathway, across species. Pathway components may be proteins, genes/DNA, RNA, or simple molecules. Where the identified component is a protein, DNA, or transcribed RNA, it is associated with protein sequences in the PANTHER protein family trees through manual curation.
Plant Anatomy Ontology pao
Protein Alignment organised as Structural Superfamily pass2 The PASS2 database provides alignments of proteins that are related at the superfamily level and characterized by low sequence identity.
PathBank pathbank PathBank is an interactive, visual database containing more than 100 000 machine-readable pathways found in model organisms such as humans, mice, E. coli, yeast, and Arabidopsis thaliana.
Pathway Commons pathwaycommons Pathway Commons is a convenient point of access to biological pathway information collected from public pathway databases, which you can browse or search. It is a collection of publicly available pathways from multiple organisms that provides researchers with convenient access to a comprehensive collection of pathways from multiple sources represented in a common language.
Phenotype And Trait Ontology pato PATO is an ontology of phenotypic qualities, intended for use in a number of applications, primarily defining composite phenotypes and phenotype annotation.
Provenance, Authoring, and Versioning Vocabulary pav PAV is a lightweight ontology for tracking provenance, authorship, and versioning. It specializes the W3C provenance ontology PROV-O in order to describe authorship, curation and digital creation of online resources.
PaxDb Organism paxdb.organism PaxDb is a resource dedicated to integrating information on absolute protein abundance levels across different organisms. Publicly available experimental data are mapped onto a common namespace and, in the case of tandem mass spectrometry data, re-processed using a standardized spectral counting pipeline. Data sets are scored and ranked to assess consistency against externally provided protein-network information. PaxDb provides whole-organism data as well as tissue-resolved data, for numerous proteins. This collection references protein abundance information by species.
PaxDb Protein paxdb.protein PaxDb is a resource dedicated to integrating information on absolute protein abundance levels across different organisms. Publicly available experimental data are mapped onto a common namespace and, in the case of tandem mass spectrometry data, re-processed using a standardized spectral counting pipeline. Data sets are scored and ranked to assess consistency against externally provided protein-network information. PaxDb provides whole-organism data as well as tissue-resolved data, for numerous proteins. This collection references individual protein abundance levels.
Pazar Transcription Factor pazar The PAZAR database unites independently created and maintained data collections of transcription factor and regulatory sequence annotation. It provides information on the sequence and target of individual transcription factors.
Population and Community Ontology pco The Population and Community Ontology (PCO) describes material entities, qualities, and processes related to collections of interacting organisms such as populations and communities. It is taxon neutral, and can be used for any species, including humans. The classes in the PCO are useful for describing evolutionary processes, organismal interactions, and ecological experiments. Practical applications of the PCO include community health care, plant pathology, behavioral studies, sociology, and ecology.
Platynereis stage ontology pd_st
PDB structure ID pdb The Protein Data Bank is the single worldwide archive of structural data of biological macromolecules.
Chemical Component Dictionary pdb-ccd The Chemical Component Dictionary is an external reference file describing all residue and small molecule components found in Protein Data Bank entries. It contains detailed chemical descriptions for standard and modified amino acids/nucleotides, small molecule ligands, and solvent molecules. Each chemical definition includes descriptions of chemical properties such as stereochemical assignments, aromatic bond assignments, idealized coordinates, chemical descriptors (SMILES & InChI), and systematic chemical names.
Protein Data Bank Ligand pdb.ligand The Protein Data Bank is the single worldwide archive of structural data of biological macromolecules. This collection references ligands.
The Prescription of Drugs Ontology pdro An ontology to describe entities related to prescription of drugs
Platynereis Developmental Stages pdumdv Life cycle stages for Platynereis dumerilii
Plant Environment Ontology peco The Plant Environment Ontology is a set of standardized controlled vocabularies to describe various types of treatments given to an individual plant / a population or a cultured tissue and/or cell type sample to evaluate the response on its exposure. It also includes the study types, where the terms can be used to identify the growth study facility. Each growth facility, such as a field study, growth chamber, or greenhouse, is an environment on its own; it may also involve instances of biotic and abiotic environments as supplemental treatments used in these studies.
Protein Ensemble Database ped The Protein Ensemble Database is an open access database for the deposition of structural ensembles, including intrinsically disordered proteins.
Protein Ensemble Database ensemble ped.ensemble The Protein Ensemble Database is an open access database for the deposition of structural ensembles, including intrinsically disordered proteins.
PSI Extended File Format peff
PeptideAtlas peptideatlas The PeptideAtlas Project provides a publicly accessible database of peptides identified in tandem mass spectrometry proteomics studies and software tools.
PeptideAtlas Dataset peptideatlas.dataset Experiment details about PeptideAtlas entries. Each PASS entry provides direct access to the data files submitted to PeptideAtlas.
Peroxibase peroxibase Peroxibase provides access to peroxidase sequences from all kingdoms of life, and provides a series of bioinformatics tools and facilities suitable for analysing these sequences.
Alan Wood's Pesticides pesticides Alan Wood's Pesticides is a compendium of pesticides that contains nomenclature data sheets for more than 1700 different active ingredients.
Pfam ID pfam The Pfam database contains information about protein domains and families. For each entry, a protein sequence alignment and a Hidden Markov Model are stored.
Pfam Clans pfam.clan Higher order grouping of Pfam families
Proteoform Atlas pfr Database that provides a central location for scientists to browse uniquely observed proteoforms and to contribute their own datasets.
Plant Growth and Development Stage pgdso
Polygenic Score Catalog pgs The Polygenic Score (PGS) Catalog is an open database of PGS and the relevant metadata required for accurate application and evaluation.
Progenetix pgx The Progenetix database provides an overview of mutation data in cancer, with a focus on copy number abnormalities (CNV / CNA), for all types of human malignancies. The resource contains genome profiles of more than 130'000 individual samples and represents about 700 cancer types, according to the NCIt "neoplasm" classification. In addition to these genome profiles and associated metadata, the website presents information about thousands of publications referring to cancer genome profiling experiments, and services for mapping cancer classifications and accessing supplementary data through its APIs.
PharmacoDB Cells pharmacodb.cell Web-application assembling the largest in vitro drug screens in a single database, and allowing users to easily query the union of studies released to date. Query by cell line.
PharmacoDB Datasets pharmacodb.dataset Web-application assembling the largest in vitro drug screens in a single database, and allowing users to easily query the union of studies released to date. Query by dataset.
PharmacoDB Tissues pharmacodb.tissue Web-application assembling the largest in vitro drug screens in a single database, and allowing users to easily query the union of studies released to date. Query by tissue.
PharmGKB Disease pharmgkb.disease The PharmGKB database is a central repository for genetic, genomic, molecular and cellular phenotype data and clinical information about people who have participated in pharmacogenomics research studies. The data includes, but is not limited to, clinical and basic pharmacokinetic and pharmacogenomic research in the cardiovascular, pulmonary, cancer, pathways, metabolic and transporter domains.
PharmGKB Drug pharmgkb.drug The PharmGKB database is a central repository for genetic, genomic, molecular and cellular phenotype data and clinical information about people who have participated in pharmacogenomics research studies. The data includes, but is not limited to, clinical and basic pharmacokinetic and pharmacogenomic research in the cardiovascular, pulmonary, cancer, pathways, metabolic and transporter domains.
Pharmacogenetics and Pharmacogenomics Knowledge Base pharmgkb.gene The PharmGKB database is a central repository for genetic, genomic, molecular and cellular phenotype data and clinical information about people who have participated in pharmacogenomics research studies. The data includes, but is not limited to, clinical and basic pharmacokinetic and pharmacogenomic research in the cardiovascular, pulmonary, cancer, pathways, metabolic and transporter domains.
PharmGKB Pathways pharmgkb.pathways The PharmGKB database is a central repository for genetic, genomic, molecular and cellular phenotype data and clinical information about people who have participated in pharmacogenomics research studies. The data includes, but is not limited to, clinical and basic pharmacokinetic and pharmacogenomic research in the cardiovascular, pulmonary, cancer, pathways, metabolic and transporter domains. PharmGKB Pathways are drug centric, gene based, interactive pathways which focus on candidate genes and gene groups and associated genotype and phenotype data of relevance for pharmacogenetic and pharmacogenomic studies.
Phenol-Explorer phenolexplorer Phenol-Explorer is an electronic database on polyphenol content in foods. Polyphenols form a wide group of natural antioxidants present in a large number of foods and beverages. They contribute to food characteristics such as taste, colour or shelf-life. They also participate in the prevention of several major chronic diseases such as cardiovascular diseases, diabetes, cancers, neurodegenerative diseases or osteoporosis.
Pathogen Host Interaction Phenotype Ontology phipo Ontology of species-neutral phenotypes observed in pathogen-host interactions.
PhosphoPoint Kinase phosphopoint.kinase PhosphoPOINT is a database of the human kinase and phospho-protein interactome. It describes the interactions among kinases, their potential substrates and their interacting (phospho)-proteins. It also incorporates gene expression and uses gene ontology (GO) terms to annotate interactions. This collection references kinase information.
PhosphoPoint Phosphoprotein phosphopoint.protein PhosphoPOINT is a database of the human kinase and phospho-protein interactome. It describes the interactions among kinases, their potential substrates and their interacting (phospho)-proteins. It also incorporates gene expression and uses gene ontology (GO) terms to annotate interactions. This collection references phosphoprotein information.
PhosphoSite Protein phosphosite.protein PhosphoSite is a mammalian protein database that provides information about in vivo phosphorylation sites. This datatype refers to protein-level information, providing a list of phosphorylation sites for each protein in the database.
PhosphoSite Residue phosphosite.residue PhosphoSite is a mammalian protein database that provides information about in vivo phosphorylation sites. This datatype refers to residue-level information, providing information about a single modification position in a specific protein sequence.
PhylomeDB phylomedb PhylomeDB is a database of complete phylomes derived for different genomes within a specific taxonomic range. It provides alignments, phylogenetic trees and tree-based orthology predictions for all encoded proteins.
Plant Genome Network phytozome.locus Phytozome is a project to facilitate comparative genomic studies amongst green plants. Families of orthologous and paralogous genes that represent the modern descendants of ancestral gene sets are constructed at key phylogenetic nodes. These families allow easy access to clade specific orthology/paralogy relationships as well as clade specific genes and gene expansions. This collection references locus information.
PicTar pictar
NCI Pathway Interaction Database: Pathway pid.pathway The Pathway Interaction Database is a highly-structured, curated collection of information about known human biomolecular interactions and key cellular processes assembled into signaling pathways. This datatype provides access to pathway information.
Animal Genome Pig QTL pigqtldb The Animal Quantitative Trait Loci (QTL) database (Animal QTLdb) is designed to house publicly all available QTL and single-nucleotide polymorphism/gene association data on livestock animal species. This collection references pig QTLs.
Protein Interaction Network Analysis pina Protein Interaction Network Analysis (PINA) platform is an integrated platform for protein interaction network construction, filtering, analysis, visualization and management. It integrates protein-protein interaction data from six public curated databases and builds a complete, non-redundant protein interaction dataset for six model organisms.
PiroplasmaDB piroplasma PiroplasmaDB is one of the databases that can be accessed through the EuPathDB (http://EuPathDB.org; formerly ApiDB) portal, covering eukaryotic pathogens of the genera Cryptosporidium, Giardia, Leishmania, Neospora, Plasmodium, Toxoplasma, Trichomonas and Trypanosoma. While each of these groups is supported by a taxon-specific database built upon the same infrastructure, the EuPathDB portal offers an entry point to all these resources, and the opportunity to leverage orthology for searches across genera.
PIR Superfamily Classification System pirsf The PIR SuperFamily concept is being used as a guiding principle to provide comprehensive and non-overlapping clustering of UniProtKB sequences into a hierarchical order to reflect their evolutionary relationships.
PK-DB pkdb PK-DB is an open database for pharmacokinetics information from clinical trials as well as pre-clinical research. The focus of PK-DB is to provide high-quality pharmacokinetics data enriched with the required meta-information for computational modeling and data integration.
planaria-ontology plana PLANA, the PLANarian Anatomy Ontology, encompasses the anatomy of developmental stages and adult biotypes of Schmidtea mediterranea.
Planarian Phenotype Ontology planp Planarian Phenotype Ontology is an ontology of phenotypes observed in the planarian Schmidtea mediterranea.
Plant Transcription Factor Database planttfdb The Plant TF database (PlantTFDB) systematically identifies transcription factors for plant species. It includes annotation for identified TFs, including information on expression, regulation, interaction, conserved elements, and phenotype. It also provides curated descriptions and cross-references to other life science databases, as well as identifying evolutionary relationships among identified factors.
PlasmoDB Plasmodium Genome Resource plasmodb PlasmoDB is one of the databases that can be accessed through the EuPathDB (http://EuPathDB.org; formerly ApiDB) portal, covering eukaryotic pathogens of the genera Cryptosporidium, Giardia, Leishmania, Neospora, Plasmodium, Toxoplasma, Trichomonas and Trypanosoma. While each of these groups is supported by a taxon-specific database built upon the same infrastructure, the EuPathDB portal offers an entry point to all these resources, and the opportunity to leverage orthology for searches across genera.
Plasmodium Life Cycle plo
CutDB pmap.cutdb The Proteolysis MAP is a resource for proteolytic networks and pathways. PMAP is comprised of five databases, linked together in one environment. CutDB is a database of individual proteolytic events (cleavage sites).
SubstrateDB pmap.substratedb The Proteolysis MAP is a resource for proteolytic networks and pathways. PMAP is comprised of five databases, linked together in one environment. SubstrateDB contains molecular information on documented protease substrates.
PubMed Central pmc PMC International (PMCI) is a free full-text archive of biomedical and life sciences journal literature. PMCI is a collaborative effort between the U.S. National Institutes of Health and the National Library of Medicine, the publishers whose journal content makes up the PMC archive, and organizations in other countries that share NIH's and NLM's interest in archiving life sciences literature.
Protein Model Database pmdb The Protein Model DataBase (PMDB) is a database that collects manually built three-dimensional protein models, obtained by different structure prediction techniques.
Protein Model Portal pmp The number of known protein sequences exceeds that of experimentally solved protein structures. Homology (or comparative) modeling methods make use of experimental protein structures to build models for evolutionarily related proteins. The Protein Model Portal (PMP) provides a single portal to access these models, which are accessed through their UniProt identifiers.
Physiome Model Repository pmr Resource for the community to store, retrieve, search, reference, and reuse CellML models.
Plant Ontology po The Plant Ontology is a structured vocabulary and database resource that links plant anatomy, morphology and growth and development to plant genomics data.
Pocketome pocketome Pocketome is an encyclopedia of conformational ensembles of all druggable binding sites that can be identified experimentally from co-crystal structures in the Protein Data Bank. Each Pocketome entry corresponds to a small molecule binding site in a protein which has been co-crystallized in complex with at least one drug-like small molecule, and is represented in at least two PDB entries.
PolBase polbase Polbase is a database of DNA polymerases providing information on polymerase protein sequence, target DNA sequence, enzyme structure, sequence mutations and details on polymerase activity.
PomBase systematic ID pombase PomBase is a model organism database established to provide access to molecular data and biological information for the fission yeast Schizosaccharomyces pombe. It encompasses annotation of genomic sequence and features, comprehensive manual literature curation and genome-wide data sets.
Porifera Ontology poro An ontology describing the anatomical structures and characteristics of Porifera (sponges)
Pesticide Properties DataBase ppdb PPDB is a comprehensive source of data on pesticide chemical, physical and biological properties.
Plant Phenology Ontology ppo An ontology for describing the phenology of individual plants and populations of plants, and for integrating plant phenological data across sources and scales.
Europe PMC Preprints ppr Preprints are articles that have not been peer-reviewed, drawn from various preprint servers and open research platforms such as bioRxiv, ChemRxiv, PeerJ Preprints and F1000.
Protein Ontology pr The PRotein Ontology (PRO) has been designed to describe the relationships of proteins and protein evolutionary classes, to delineate the multiple protein forms of a gene locus (ontology for protein forms), and to interconnect existing ontologies.
PRIDE Controlled Vocabulary pride The PRIDE PRoteomics IDEntifications database is a centralized, standards compliant, public data repository that provides protein and peptide identifications together with supporting evidence. This collection references experiments and assays.
PRIDE Project pride.project The PRIDE PRoteomics IDEntifications database is a centralized, standards compliant, public data repository that provides protein and peptide identifications together with supporting evidence. This collection references projects.
PRINTS compendium of protein fingerprints prints PRINTS is a compendium of protein fingerprints. A fingerprint is a group of conserved motifs used to characterise a protein family; its diagnostic power is refined by iterative scanning of a SWISS-PROT/TrEMBL composite. Usually the motifs do not overlap, but are separated along a sequence, though they may be contiguous in 3D-space. Fingerprints can encode protein folds and functionalities more flexibly and powerfully than can single motifs, full diagnostic potency deriving from the mutual context provided by motif neighbours.
Probability Distribution Ontology probonto ProbOnto is an ontology-based knowledge base of probability distributions, featuring uni- and multivariate distributions with their defining functions, characteristics, relationships and reparameterisation formulae. It can be used for annotation of models, facilitating the encoding of distribution-based models, related functions and quantities.
ProDom prodom ProDom is a database of protein domain families generated from the global comparison of all available protein sequences.
ProGlycProt proglyc ProGlycProt (Prokaryotic Glycoprotein) is a repository of bacterial and archaeal glycoproteins with at least one experimentally validated glycosite (glycosylated residue). Each entry in the database is fully cross-referenced and enriched with available published information about source organism, coding gene, protein, glycosites, glycosylation type, attached glycan, associated oligosaccharyl/glycosyl transferases (OSTs/GTs), supporting references, and applicable additional information.
Proteomics data and process provenance propreo
PROSITE documentation ID prosite PROSITE consists of documentation entries describing protein domains, families and functional sites as well as associated patterns and profiles to identify them.
ProtClustDB protclustdb ProtClustDB is a collection of related protein sequences (clusters) consisting of Reference Sequence proteins encoded by complete genomes. This database contains both curated and non-curated clusters.
ProteomicsDB Peptide proteomicsdb.peptide ProteomicsDB is an effort dedicated to expedite the identification of the human proteome and its use across the scientific community. This human proteome data is assembled primarily using information from liquid chromatography tandem-mass-spectrometry (LC-MS/MS) experiments involving human tissues, cell lines and body fluids. Information is accessible for individual proteins, or on the basis of protein coverage on the encoding chromosome, and for peptide components of a protein. This collection provides access to the peptides identified for a given protein.
ProteomicsDB Protein proteomicsdb.protein ProteomicsDB is an effort dedicated to expedite the identification of the human proteome and its use across the scientific community. This human proteome data is assembled primarily using information from liquid chromatography tandem-mass-spectrometry (LC-MS/MS) experiments involving human tissues, cell lines and body fluids. Information is accessible for individual proteins, or on the basis of protein coverage on the encoding chromosome, and for peptide components of a protein. This collection provides access to individual proteins.
ProtoNet Cluster protonet.cluster ProtoNet provides automatic hierarchical classification of protein sequences in the UniProt database, partitioning the protein space into clusters of similar proteins. This collection references cluster information.
ProtoNet ProteinCard protonet.proteincard ProtoNet provides automatic hierarchical classification of protein sequences in the UniProt database, partitioning the protein space into clusters of similar proteins. This collection references protein information.
PROV Namespace prov The namespace name http://www.w3.org/ns/prov# is intended for use with the PROV family of documents that support the interchange of provenance on the web.
Protein Structural Change Database pscdb The PSCDB (Protein Structural Change DataBase) collects information on the relationship between protein structural change and ligand binding. Each entry page provides detailed information about this structural motion.
Performance Summary Display Ontology psdo Performance Summary Display Ontology (PSDO) (pronounced "pseudo" or "sudo") is an application ontology about charts, tables, and graphs that are used to communicate performance information to employees and teams in organizations. PSDO's domain focus is on healthcare organizations that use performance summary displays in clinical dashboards and feedback interventions for healthcare professionals and teams. The displays commonly show information about the quality of care and health outcomes that has been derived from clinical data using performance measures (aka metrics, process indicators, quality measures, etc). PSDO uses Basic Formal Ontology as its upper level ontology. This work is not peer-reviewed.
PseudoGene pseudogene This site contains a comprehensive database of identified pseudogenes, utilities used to find pseudogenes, various publication data sets and a pseudogene knowledgebase.
Pseudomonas Genome Database pseudomonas The Pseudomonas Genome Database is a resource for peer-reviewed, continually updated annotation for all Pseudomonas species. It includes gene and protein sequence information, as well as regulation and predicted function and annotation.
Protein Affinity Reagents psipar Protein Affinity Reagents (PSI-PAR) provides a structured controlled vocabulary for the annotation of experiments concerned with interactions, and interactor production methods. PAR is developed by the HUPO Proteomics Standards Initiative and contains the majority of the terms from the PSI-MI controlled vocabulary, as well as additional terms.
Plant Stress Ontology pso The Plant Stress Ontology describes...
Phenoscape Publication pspub Documentation of the Phenoscape Curation Workflow
NCBI PubChem database of bioassay records pubchem.bioassay PubChem provides information on the biological activities of small molecules. It is a component of NIH's Molecular Libraries Roadmap Initiative. PubChem bioassay archives active compounds and bioassay results.
PubChem CID pubchem.compound PubChem provides information on the biological activities of small molecules. It is a component of NIH's Molecular Libraries Roadmap Initiative. PubChem Compound archives chemical structures and records.
PubChem Substance ID (SID) pubchem.substance PubChem provides information on the biological activities of small molecules. It is a component of NIH's Molecular Libraries Roadmap Initiative. PubChem Substance archives chemical substance records.
Publons Researcher ID publons.researcher Database of researchers to track publications, citation metrics, peer reviews, and journal editing work.
PubMed pubmed PubMed is a service of the U.S. National Library of Medicine that includes citations from MEDLINE and other life science journals for biomedical articles back to the 1950s.
Pathway ontology pw The Pathway Ontology captures information on biological networks, the relationships between networks and the alterations or malfunctioning of such networks within a hierarchical structure. The five main branches of the ontology are: classic metabolic pathways, regulatory, signaling, drug, and disease pathways for complex human conditions.
ProteomeXchange px The ProteomeXchange provides a single point of submission of Mass Spectrometry (MS) proteomics data for the main existing proteomics repositories, and encourages the data exchange between them for optimal data dissemination.
PyPI pypi The Python Package Index (PyPI) is a repository for Python packages.
Animal Genome QTL qtldb The Animal Quantitative Trait Loci (QTL) database (Animal QTLdb) is designed to house publicly all available QTL and single-nucleotide polymorphism/gene association data on livestock animal species. This collection is species-independent.
Quantities, Units, Dimensions, and Types Ontology qudt Ontologies that aim to provide semantic specifications for units of measure, quantity kind, dimensions and data types.
Radiomics Ontology radiomics The Radiomics Ontology aims to cover the radiomics feature domain with a strong focus on first order, shape, textural radiomics features. In addition, in the original version, it includes classes about segmentation algorithms and imaging filters. Due to a recent collaboration with the IBSI (International Biomarkers Standardization Initiative), the ontology has been expanded (v 1.6) and it includes all the entities presented in the IBSI document. Therefore, a broad coverage of not only radiomics features, but also every entity (e.g. software properties, filter properties, features extraction parameters) involved in radiomics computation has been added. In the latest version (v2.0), the ontology URIs have been updated to reflect the codes available in the latest IBSI manual.
RAP-DB Locus rapdb.locus Rice Annotation Project Database (RAP-DB) is a primary rice (Oryza sativa) annotation database established in 2004 upon the completion of the Oryza sativa ssp. japonica cv. Nipponbare genome sequencing by the International Rice Genome Sequencing Project. RAP-DB provides comprehensive resources (e.g. genome annotation, gene expression, DNA markers, genetic diversity, etc.) for biological and agricultural research communities. This collection provides locus information in RAP-DB.
RAP-DB Transcript rapdb.transcript Rice Annotation Project Database (RAP-DB) is a primary rice (Oryza sativa) annotation database established in 2004 upon the completion of the Oryza sativa ssp. japonica cv. Nipponbare genome sequencing by the International Rice Genome Sequencing Project. RAP-DB provides comprehensive resources (e.g. genome annotation, gene expression, DNA markers, genetic diversity, etc.) for biological and agricultural research communities. This collection provides transcript information in RAP-DB.
Rebuilding a Kidney rbk (Re)Building a Kidney is an NIDDK-funded consortium of research projects working to optimize approaches for the isolation, expansion, and differentiation of appropriate kidney cell types and their integration into complex structures that replicate human kidney function.
Radiation Biology Ontology rbo RBO is an ontology for the effects of radiation on biota in terrestrial and space environments.
RIKEN Bioresource Center Cell Bank rcb Collection of many cell lines derived from human and other various animals, preserved by the RIKEN BioResource Research Center.
Resource Description Framework rdf This is the RDF Schema for the RDF vocabulary terms in the RDF Namespace, defined in RDF 1.1 Concepts
RDF Schema rdfs RDF Schema provides a data-modelling vocabulary for RDF data. RDF Schema is an extension of the basic RDF vocabulary.
RGD Disease_Ontology rdo Ontology of diseases that integrates many types of data for Rattus norvegicus, Homo sapiens, Mus musculus and other organisms.
re3data re3data Re3data is a global registry of research data repositories that covers research data repositories from different academic disciplines.
Reactome ID reactome The Reactome project is a collaboration to develop a curated resource of core pathways and reactions in human biology.
Reaxys reaxys Reaxys is a web-based tool for the retrieval of chemistry information and data from published literature, including journals and patents. The information includes chemical compounds, chemical reactions, chemical properties, related bibliographic data, substance data with synthesis planning information, as well as experimental procedures from selected journals and patents. It is licensed by Elsevier.
REBASE restriction enzyme database rebase REBASE is a comprehensive database of information about restriction enzymes, DNA methyltransferases and related proteins involved in the biological process of restriction-modification (R-M). It contains fully referenced information about recognition and cleavage sites, isoschizomers, neoschizomers, commercial availability, methylation sensitivity, crystal and sequence data.
Human Plasma Membrane Receptome Families receptome.family The human receptor families involved in signaling (with the exception of channels) are presented in the Human Plasma Membrane Receptome database.
Reference Sequence Collection refseq The Reference Sequence (RefSeq) collection aims to provide a comprehensive, integrated, non-redundant set of sequences, including genomic DNA, transcript (RNA), and protein products.
RepeatsDB Protein repeatsdb.protein RepeatsDB is a database of annotated tandem repeat protein structures. This collection references protein entries in the database.
RepeatsDB Structure repeatsdb.structure RepeatsDB is a database of annotated tandem repeat protein structures. This collection references structural entries in the database.
REPRODUCE-ME Ontology reproduceme The REPRODUCE-ME ontology is an extension of the PROV-O and the P-Plan ontology to describe a complete path of a scientific experiment. It expresses the REPRODUCE-ME Data Model using the OWL2 Web Ontology Language (OWL2). It provides a set of classes and properties to represent a scientific experiment including its computational and non-computational steps to track the provenance of results. It describes a complete path of a scientific experiment considering the use-case of biological imaging and microscopy experiments, computational experiments, including Jupyter notebooks and scripts. It describes an experiment and its data, agents, activities, plans, steps, variables, instruments, materials, and settings required for its reproducibility.
Protein covalent bond resid The RESID Database of Protein Modifications is a comprehensive collection of annotations and structures for protein modifications including amino-terminal, carboxyl-terminal and peptide chain cross-link post-translational modifications.
Regulation of Transcription Ontology reto Regulation of Transcription
Physico-chemical process rex An ontology of physico-chemical processes, i.e. physico-chemical changes occurring in the course of time.
Regulation of Gene Expression Ontology rexo Regulation of Gene Expression
Rfam database of RNA families rfam The Rfam database is a collection of RNA families, each represented by multiple sequence alignments, consensus secondary structures and covariance models (CMs). The families in Rfam break down into three broad functional classes: non-coding RNA genes, structured cis-regulatory elements and self-splicing RNAs. These functional RNAs often have a conserved secondary structure which may be better preserved than the RNA sequence. The CMs used to describe each family are a slightly more complicated relative of the profile hidden Markov models (HMMs) used by Pfam. CMs can simultaneously model RNA sequence and structure in an elegant and accurate fashion.
Rat Genome Database ID rgd Rat Genome Database seeks to collect, consolidate, and integrate rat genomic and genetic data with curated functional and physiological data and make these data widely available to the scientific community. This collection references genes.
Rat Genome Database qTL rgd.qtl Rat Genome Database seeks to collect, consolidate, and integrate rat genomic and genetic data with curated functional and physiological data and make these data widely available to the scientific community. This collection references quantitative trait loci (qTLs), providing phenotype and disease descriptions, mapping, and strain information as well as links to markers and candidate genes.
Rat Genome Database strain rgd.strain Rat Genome Database seeks to collect, consolidate, and integrate rat genomic and genetic data with curated functional and physiological data and make these data widely available to the scientific community. This collection references strain reports, which include a description of strain origin, disease, phenotype, genetics and immunology.
Rhea, the Annotated Reactions Database rhea Rhea is an expert-curated knowledgebase of chemical and transport reactions of biological interest. Enzyme-catalyzed and spontaneously occurring reactions are curated from peer-reviewed literature and represented in a computationally tractable manner by using the ChEBI (Chemical Entities of Biological Interest) ontology to describe reaction participants. Rhea covers the reactions described by the IUBMB Enzyme Nomenclature as well as many additional reactions and can be used for enzyme annotation, genome-scale metabolic modeling and omics-related analyses. Rhea is the standard for enzyme and transporter annotation in UniProtKB.
Rice Genome Annotation Project ricegap The objective of this project is to provide high quality annotation for the rice genome Oryza sativa ssp. japonica cv. Nipponbare. All genes are annotated with functional annotation including expression data, gene ontologies, and tagged lines.
RiceNetDB Compound ricenetdb.compound RiceNetDB is currently the most comprehensive regulatory database on Oryza sativa based on genome annotation. It is presented at three levels: GEM, PPIs and GRNs to facilitate biomolecular regulatory analysis and gene-metabolite mapping.
RiceNetDB Gene ricenetdb.gene RiceNetDB is currently the most comprehensive regulatory database on Oryza sativa based on genome annotation. It is presented at three levels: GEM, PPIs and GRNs to facilitate biomolecular regulatory analysis and gene-metabolite mapping.
RiceNetDB miRNA ricenetdb.mirna RiceNetDB is currently the most comprehensive regulatory database on Oryza sativa based on genome annotation. It is presented at three levels: GEM, PPIs and GRNs to facilitate biomolecular regulatory analysis and gene-metabolite mapping.
RiceNetDB Protein ricenetdb.protein RiceNetDB is currently the most comprehensive regulatory database on Oryza sativa based on genome annotation. It is presented at three levels: GEM, PPIs and GRNs to facilitate biomolecular regulatory analysis and gene-metabolite mapping.
RiceNetDB Reaction ricenetdb.reaction RiceNetDB is currently the most comprehensive regulatory database on Oryza sativa based on genome annotation. It is presented at three levels: GEM, PPIs and GRNs to facilitate biomolecular regulatory analysis and gene-metabolite mapping.
RNAcentral rnacentral RNAcentral is a public resource that offers integrated access to a comprehensive and up-to-date set of non-coding RNA sequences provided by a collaborating group of Expert Databases.
The RNA Modification Database rnamod A comprehensive listing of post-transcriptionally modified nucleosides from RNA.
RNA Modification Database rnamods The RNA modification database provides a comprehensive listing of post-transcriptionally modified nucleosides from RNA. The database consists of all RNA-derived ribonucleosides of known structure, including those from established sequence positions, as well as those detected or characterized from hydrolysates of RNA.
RNA ontology rnao Controlled vocabulary pertaining to RNA function and based on RNA sequences, secondary and three-dimensional structures.
Relation Ontology ro The OBO Relation Ontology provides consistent and unambiguous formal definitions of the relational expressions used in biomedical ontologies.
Rodent Unidentified Gene-Encoded Large Proteins rouge The Rouge protein database contains results from sequence analysis of novel large (>4 kb) cDNAs identified in the Kazusa cDNA sequencing project.
Research Resource Identification rrid The Research Resource Identification Initiative provides RRIDs to 4 main classes of resources: Antibodies, Cell Lines, Model Organisms, and Databases / Software tools. The initiative works with participating journals to intercept manuscripts in the publication process that use these resources, and allows publication authors to incorporate RRIDs within the methods sections. It also provides resolver services that access curated data from 10 data sources: the antibody registry (a curated catalog of antibodies), the SciCrunch registry (a curated catalog of software tools and databases), and model organism nomenclature authority databases (MGI, FlyBase, WormBase, RGD), as well as various stock centers. These RRIDs are aggregated and can be searched through SciCrunch.
Rat Strain Ontology rs Ontology of rat strains
runBioSimulations runbiosimulations runBioSimulations is a platform for sharing simulation experiments and their results. runBioSimulations enables investigators to use a wide range of simulation tools to execute a wide range of simulations. runBioSimulations permanently saves the results of these simulations, and investigators can share results by sharing URLs, much as files are shared by URL with Dropbox and Google Drive.
Name Reaction Ontology rxno RXNO is the name reaction ontology. It contains more than 500 classes representing organic reactions such as the Diels–Alder cyclization.
RxNorm rxnorm RxNorm provides normalized names for clinical drugs and links its names to many of the drug vocabularies commonly used in pharmacy management and drug interaction software, including those of First Databank, Micromedex, and Gold Standard Drug Database. By providing links between these vocabularies, RxNorm can mediate messages between systems not using the same software and vocabulary.
SABIO-RK Compound sabiork.compound SABIO-RK is a relational database system that contains information about biochemical reactions, their kinetic equations with their parameters, and the experimental conditions under which these parameters were measured. The compound data set provides information regarding the reactions in which a compound participates as substrate, product or modifier (e.g. inhibitor, cofactor), and links to further information.
SABIO-RK EC Record sabiork.ec SABIO-RK is a relational database system that contains information about biochemical reactions, their kinetic equations with their parameters, and the experimental conditions under which these parameters were measured. The EC record provides for a given enzyme classification (EC) the associated list of enzyme-catalysed reactions and their corresponding kinetic data.
SABIO Reaction Kinetics sabiork.kineticrecord SABIO-RK is a relational database system that contains information about biochemical reactions, their kinetic equations with their parameters, and the experimental conditions under which these parameters were measured. The kinetic record data set provides information regarding the kinetic law, measurement conditions, parameter details and other reference information.
SABIO-RK Reaction sabiork.reaction SABIO-RK is a relational database system that contains information about biochemical reactions, their kinetic equations with their parameters, and the experimental conditions under which these parameters were measured. The reaction data set provides information regarding the organism in which a reaction is observed, pathways in which it participates, and links to further information.
Standards and Ontologies for Functional Genomics (SOFG) Anatomy Entry List sael
Salk Institute for Biological Studies Accession salk Scientific research institute for neuroscience, genetics, immunology, plant biology and more.
Subcellular Anatomy Ontology sao
Small Angle Scattering Biological Data Bank sasbdb Small Angle Scattering Biological Data Bank (SASBDB) is a curated repository for small angle X-ray scattering (SAXS) and neutron scattering (SANS) data and derived models. Small angle scattering (SAS) of X-rays and neutrons provides structural information on biological macromolecules in solution at a resolution of 1-2 nm. SASBDB provides freely accessible and downloadable experimental data, which are deposited together with the relevant experimental conditions, sample details, derived models and their fits to the data.
Systems Biology Ontology sbo The goal of the Systems Biology Ontology is to develop controlled vocabularies and ontologies tailored specifically for the kinds of problems being faced in Systems Biology, especially in the context of computational modeling. SBO is a project of the BioModels.net effort.
Sickle Cell Disease Ontology scdo An ontology for the standardization of terminology and integration of knowledge about Sickle Cell Disease.
Selventa Chemicals schem Selventa legacy chemical namespace used with the Biological Expression Language
Selventa Complexes scomp Selventa legacy complex namespace used with the Biological Expression Language
Structural Classification of Proteins scop The SCOP (Structural Classification of Proteins) database is a comprehensive ordering of all proteins of known structure according to their evolutionary, functional and structural relationships. The basic classification unit is the protein domain. Domains are hierarchically classified into species, proteins, families, superfamilies, folds, and classes.
Scopus Researcher Identifier scopus Scopus is the largest abstract and citation database of peer-reviewed literature: scientific journals, books and conference proceedings. Delivering a comprehensive overview of the world's research output in the fields of science, technology, medicine, social sciences, and arts and humanities, Scopus features smart tools to track, analyze and visualize research.
ScerTF scretf ScerTF is a database of position weight matrices (PWMs) for transcription factors in Saccharomyces species. It identifies a single matrix for each TF that best predicts in vivo data, providing metrics related to the performance of that matrix in accurately representing the DNA binding specificity of the annotated transcription factor.
Spectral Database for Organic Compounds sdbs The Spectral Database for Organic Compounds (SDBS) is an integrated spectral database system for organic compounds. It provides access to 6 different types of spectra for each compound, including a mass spectrum (EI-MS), a Fourier transform infrared spectrum (FT-IR), and NMR spectra.
Sustainable Development Goals Interface Ontology sdgio An OBO-compliant ontology representing the entities referenced by the SDGs, their targets, and indicators.
Selventa Diseases sdis Selventa legacy disease namespace used with the Biological Expression Language
SED-ML data format sedml.format Data format that can be used in conjunction with the Simulation Experimental Description Markup Language (SED-ML).
SED-ML model format sedml.language Model format that can be used in conjunction with the Simulation Experimental Description Markup Language (SED-ML).
The SEED seed This cooperative effort, which includes Fellowship for Interpretation of Genomes (FIG), Argonne National Laboratory, and the University of Chicago, focuses on the development of the comparative genomics environment called the SEED. It is a framework to support comparative analysis and annotation of genomes, and the development of curated genomic data (annotation). Curation is performed at the level of subsystems by an expert annotator, across many genomes, and not on a gene by gene basis. This collection references subsystems.
SEED Compound seed.compound This cooperative effort, which includes Fellowship for Interpretation of Genomes (FIG), Argonne National Laboratory, and the University of Chicago, focuses on the development of the comparative genomics environment called the SEED. It is a framework to support comparative analysis and annotation of genomes, and the development of curated genomic data (annotation). Curation is performed at the level of subsystems by an expert annotator, across many genomes, and not on a gene by gene basis. This collection references subsystems.
SEED Reactions seed.reaction ModelSEED is a platform for creating genome-scale metabolic network reconstructions for microbes and plants. As part of the platform, a biochemistry database is managed that contains reactions unique to ModelSEED as well as reactions aggregated from other databases or from manually-curated genome-scale metabolic network reconstructions.
Sample processing and separation techniques sep A structured controlled vocabulary for the annotation of sample processing and separation techniques in scientific experiments.
Scientific Evidence and Provenance Information Ontology sepio An ontology for representing the provenance of scientific claims and the evidence that supports them.
Selventa Families sfam Selventa legacy protein family namespace used with the Biological Expression Language
Saccharomyces Genome Database sgd The Saccharomyces Genome Database (SGD) project collects information and maintains a database of the molecular biology of the yeast Saccharomyces cerevisiae.
Saccharomyces genome database pathways sgd.pathways Curated biochemical pathways for Saccharomyces cerevisiae at Saccharomyces genome database (SGD).
Sol Genomics Network sgn The Sol Genomics Network (SGN) is a database and website dedicated to the genomic information of the nightshade family, which includes species such as tomato, potato, pepper, petunia and eggplant.
Animal Genome Sheep QTL sheepqtldb The Animal Quantitative Trait Loci (QTL) database (Animal QTLdb) is designed to house publicly all available QTL and single-nucleotide polymorphism/gene association data on livestock animal species. This collection references sheep QTLs.
Social Insect Behavior Ontology sibo Social Behavior in insects
SIDER Drug sider.drug SIDER (Side Effect Resource) is a public, computer-readable side effect resource that connects drugs to side effect terms. It aggregates dispersed public information on side effects. This collection references drugs in SIDER.
SIDER Side Effect sider.effect SIDER (Side Effect Resource) is a public, computer-readable side effect resource that connects drugs to side effect terms. It aggregates dispersed public information on side effects. This collection references side effects of drugs as referenced in SIDER.
Signaling Gateway signaling-gateway The Signaling Gateway provides information on mammalian proteins involved in cellular signaling.
Signaling Network Open Resource signor SIGNOR, the SIGnaling Network Open Resource, organizes and stores in a structured format signaling information published in the scientific literature.
SIGNOR Relation signor.relation Identifiers for relationships between proteins and complexes, along with their type and provenance
Semanticscience Integrated Ontology sio The semanticscience integrated ontology (SIO) provides a simple, integrated upper level ontology (types, relations) for consistent knowledge representation across physical, processual and informational entities.
Sequencing Initiative Suomi sisu The Sequencing Initiative Suomi (SISu) project is an international collaboration to harmonize and aggregate whole genome and exome sequence data from Finnish samples, providing data for researchers and clinicians. The SISu project allows for the search of variants to determine their attributes and occurrence in Finnish cohorts, and provides summary data on single nucleotide variants and indels from exomes, sequenced in disease-specific and population genetic studies.
SitEx sitex SitEx is a database containing information on eukaryotic protein functional sites. It stores the amino acid sequence positions in the functional site, in relation to the exon structure of the encoding gene. This can be used to detect the exons involved in shuffling in protein evolution, or to design protein-engineering experiments.
Stemcell Knowledge and Information Portal skip SKIP aims to promote the exchange of information and joint research between researchers by aggregating information on stem cells (iPS cells, iPS cells derived from patients, etc.) to stimulate research on disease and regenerative medicine.
Simple Knowledge Organization System skos SKOS is an area of work developing specifications and standards to support the use of knowledge organization systems (KOS) such as thesauri, classification schemes, subject heading lists and taxonomies within the framework of the Semantic Web
Simple Modular Architecture Research Tool smart The Simple Modular Architecture Research Tool (SMART) is an online tool for the identification and annotation of protein domains, and the analysis of domain architectures.
C. elegans Small Molecule Identifier Database smid SMIDs (Small Molecule Identifiers) represent gene-style identifiers for small molecules newly identified in C. elegans and other nematodes. SMIDs aim to make life easier for describing biogenic small molecules in metabolomic and genomic applications.
Simplified molecular-input line-entry system smiles Documentation of SMILES (Simplified Molecular Input Line Entry System), a line notation (a typographical method using printable characters) for entering and representing molecules and reactions.
Small Molecule Pathway Database smpdb The Small Molecule Pathway Database (SMPDB) contains small molecule pathways found in humans, which are presented visually. All SMPDB pathways include information on the relevant organs, subcellular compartments, protein cofactors, protein locations, metabolite locations, chemical structures and protein quaternary structures. Accompanying data includes detailed descriptions and references, providing an overview of the pathway, condition or processes depicted in each diagram.
Snapshot snap Theoretical explanation of a purely spatial ontology supporting snapshot views of the world at successive instants of time, as part of a modular ontology of the dynamic features of reality.
SNOMED CT (International Edition) snomedct SNOMED CT (Systematized Nomenclature of Medicine -- Clinical Terms) is a systematically organized, computer-processable collection of medical terminology covering most areas of clinical information such as diseases, findings, procedures, microorganisms, pharmaceuticals, etc.
snoRNABase snornabase A comprehensive database of human H/ACA and C/D box snoRNAs.
SNP to Transcription Factor Binding Sites snp2tfbs SNP2TFBS is aimed at studying variations (SNPs/indels) that affect transcription factor binding (TFB) in the human genome.
Sequence types and features ontology so The Sequence Ontology (SO) is a structured controlled vocabulary for the parts of a genomic annotation. It provides a common set of terms and definitions to facilitate the exchange, analysis and management of genomic data.
Suggested Ontology for Pharmacogenomics sopharm
Glycine max Genome Database soybase SoyBase is a repository for curated genetics, genomics and related data resources for soybean.
Span span Theoretical explanation of a purely spatiotemporal ontology of change and process, as part of a modular ontology of the dynamic features of reality.
Spider Ontology spd An ontology for spider comparative biology including anatomical parts (e.g. leg, claw), behavior (e.g. courtship, combing) and products (e.g. silk, web, burrow).
Software Package Data Exchange License spdx The SPDX License List is a list of commonly found licenses and exceptions used in free and open source and other collaborative software or documentation. The purpose of the SPDX License List is to enable easy and efficient identification of such licenses and exceptions in an SPDX document, in source files or elsewhere. The SPDX License List includes a standardized short identifier, full name, vetted license text including matching guidelines markup as appropriate, and a canonical permanent URL for each license and exception.
SPIKE Map spike.map SPIKE (Signaling Pathways Integrated Knowledge Engine) is a repository that can store, organise and allow retrieval of pathway information in a way that will be useful for the research community. The database currently focuses primarily on pathways describing DNA damage response, cell cycle, programmed cell death and hearing related pathways. Pathways are regularly updated, and additional pathways are gradually added. The complete database and the individual maps are freely exportable in several formats. This collection references pathway maps.
Spectra Hash Code splash The spectra hash code (SPLASH) is a unique and non-proprietary identifier for spectra, and is independent of how the spectra were acquired or processed. It can be easily calculated for a wide range of spectra, including mass spectroscopy, infrared spectroscopy, ultraviolet and nuclear magnetic resonance.
Signaling Pathways Project spp The Signaling Pathways Project is an integrated 'omics knowledgebase based upon public, manually curated transcriptomic and cistromic (ChIP-Seq) datasets involving genetic and small molecule manipulations of cellular receptors, enzymes and transcription factors. Our goal is to create a resource where scientists can routinely generate research hypotheses or validate bench data relevant to cellular signaling pathways.
FAIRsharing Subject Ontology srao The FAIRsharing Subject Ontology (SRAO) is an application ontology for the categorization of research disciplines across all research domains, from the humanities to the natural sciences. It utilizes multiple external vocabularies.
Systems Science of Biological Dynamics dataset ssbd.dataset Systems Science of Biological Dynamics database (SSBD:database) is an added-value database for biological dynamics. It provides a rich set of open resources for analyzing quantitative data and microscopy images of biological objects, such as single-molecule, cell, tissue, individual, etc., and software tools for analysis. Quantitative biological data and microscopy images are collected from a variety of species, sources, and methods. These include data obtained from both experiments and computational simulations.
Systems Science of Biological Dynamics project ssbd.project Systems Science of Biological Dynamics database (SSBD:database) is an added-value database for biological dynamics. It provides a rich set of open resources for analyzing quantitative data and microscopy images of biological objects, such as single-molecule, cell, tissue, individual, etc., and software tools for analysis. Quantitative biological data and microscopy images are collected from a variety of species, sources, and methods. These include data obtained from both experiments and computational simulations.
Statistical Torsional Angles Potentials stap STAP (Statistical Torsional Angles Potentials) was developed because, according to several studies, some nuclear magnetic resonance (NMR) structures are of lower quality, less reliable and less suitable for structural analysis than high-resolution X-ray crystallographic structures. The refined NMR solution structures (statistical torsion angle potentials; STAP) in the database are refined from the Protein Data Bank (PDB).
The Statistical Methods Ontology stato STATO is the statistical methods ontology. It contains concepts and properties related to statistical methods, probability distributions and other concepts related to statistical analysis, including relationships to study designs and plots.
Search Tool for Interactions of Chemicals stitch STITCH is a resource to explore known and predicted interactions of chemicals and proteins. Chemicals are linked to other chemicals and proteins by evidence derived from experiments, databases and the literature.
Store DB storedb STOREDB database is a repository for data used by the international radiobiology community, archiving and sharing primary data outputs from research on low dose radiation. It also provides a directory of bioresources and databases for radiobiology projects containing information and materials that investigators are willing to share. STORE supports the creation of a low dose radiation research commons.
Search Tool for Retrieval of Interacting Genes/Proteins string STRING (Search Tool for Retrieval of Interacting Genes/Proteins) is a database of known and predicted protein interactions. The interactions include direct (physical) and indirect (functional) associations; they are derived from four sources: Genomic Context, High-throughput Experiments, (Conserved) Coexpression, and Previous Knowledge. STRING quantitatively integrates interaction data from these sources for a large number of organisms, and transfers information between these organisms where applicable.
Semantic Types Ontology sty UMLS Semantic Network. The Semantic Network consists of (1) a set of broad subject categories, or Semantic Types, that provide a consistent categorization of all concepts represented in the UMLS Metathesaurus, and (2) a set of useful and important relationships, or Semantic Relations, that exist between Semantic Types.
Bacillus subtilis genome sequencing project subtilist SubtiList serves to collate and integrate various aspects of the genomic information from B. subtilis, the paradigm of sporulating Gram-positive bacteria. SubtiList provides a complete dataset of DNA and protein sequences derived from the paradigm strain B. subtilis 168, linked to the relevant annotations and functional assignments.
SubtiWiki subtiwiki SubtiWiki is a scientific wiki for the model bacterium Bacillus subtilis. It provides comprehensive information on all genes and their proteins and RNA products, as well as information related to the current investigation of the gene/protein. Note: Currently, direct access to RNA products is restricted. This is expected to be rectified soon.
SugarBind sugarbind The SugarBind Database captures knowledge of glycan binding of human pathogen lectins and adhesins, where each glycan-protein binding pair is associated with at least one published reference. It provides information on the pathogen agent, the lectin/adhesin involved, and the human glycan ligand. This collection provides information on ligands.
SUPERFAMILY supfam SUPERFAMILY provides structural, functional and evolutionary information for proteins from all completely sequenced genomes, and large sequence collections such as UniProt.
Software Heritage swh Software Heritage is the universal archive of software source code.
SWISS-MODEL Repository swiss-model The SWISS-MODEL Repository is a database of 3D protein structure models generated by the SWISS-MODEL homology-modelling pipeline for UniProtKB protein sequences.
SwissLipids swisslipid SwissLipids is a curated resource that provides information about known lipids, including lipid structure, metabolism, interactions, and subcellular and tissue localization. Information is curated from peer-reviewed literature and referenced using established ontologies, and provided with full provenance and evidence codes for curated assertions.
SwissRegulon swissregulon A database of genome-wide annotations of regulatory sites. It contains annotations for 17 prokaryotes and 3 eukaryotes. The database frontend offers an intuitive interface showing genomic information in a graphical form.
Software ontology swo The Software Ontology (SWO) is a resource for describing software tools, their types, tasks, versions, provenance and associated data. It contains detailed information on licensing and formats as well as software applications themselves, mainly (but not exclusively) for the bioinformatics community.
Symptom Ontology symp The Symptom Ontology has been developed as a standardized ontology for symptoms of human diseases.
Gemina Symptom Ontology Identifier syoid
Toxin and Toxin Target Database t3db Toxin and Toxin Target Database (T3DB) is a bioinformatics resource that combines detailed toxin data with comprehensive toxin target information.
Tick Anatomy Ontology tads The anatomy of the tick, families Ixodidae and Argasidae.
Terminology of Anatomy of Human Embryology tahe
Terminology of Anatomy of Human Histology tahh
TAIR Gene tair.gene The Arabidopsis Information Resource (TAIR) maintains a database of genetic and molecular biology data for the model higher plant Arabidopsis thaliana. This is the reference gene model for a given locus.
The Arabidopsis Information Resource tair.locus The Arabidopsis Information Resource (TAIR) maintains a database of genetic and molecular biology data for the model higher plant Arabidopsis thaliana. The name of a Locus is unique and used by TAIR, TIGR, and MIPS.
TAIR Protein tair.protein The Arabidopsis Information Resource (TAIR) maintains a database of genetic and molecular biology data for the model higher plant Arabidopsis thaliana. This provides protein information for a given gene model and provides links to other sources such as UniProtKB and GenPept
Teleost Anatomy Ontology tao Multispecies fish anatomy ontology. Originally seeded from ZFA, but intended to cover terms relevant to other taxa
TarBase tarbase TarBase stores microRNA (miRNA) information for miRNA–gene interactions, as well as miRNA- and gene-related facts, information specific to the interaction, and the experimental validation methodologies used.
Taxonomic rank vocabulary taxrank A vocabulary of taxonomic ranks (species, family, phylum, etc)
Transporter Classification Database ID tcdb The database details a comprehensive IUBMB approved classification system for membrane transport proteins known as the Transporter Classification (TC) system. The TC system is analogous to the Enzyme Commission (EC) system for classification of enzymes, but incorporates phylogenetic information additionally.
Terminologia Embryologica te The Terminologia Embryologica (TE) is a standardized list of words used in the description of human embryologic and fetal structures. It was produced by the Federative International Committee on Anatomical Terminology on behalf of the International Federation of Associations of Anatomists and has been available on the Internet since 2010. [wikipedia]
Classification of Transcription Factors in Mammalia tfclass TFClass is a classification of eukaryotic transcription factors based on the characteristics of their DNA-binding domains. It comprises four general levels (superclass, class, family, subfamily) and two levels of instantiation (genus and molecular species). Two of them (subfamily and factor species) are optional.
Tetrahymena Genome Database tgd The Tetrahymena Genome Database (TGD) Wiki is a database of information about the Tetrahymena thermophila genome sequence. It provides information curated from the literature about each published gene, including a standardized gene name, a link to the genomic locus, gene product annotations utilizing the Gene Ontology, and links to published literature.
Mosquito gross anatomy ontology tgma A structured controlled vocabulary of the anatomy of mosquitoes.
Terminologia Histologica th The Terminologia Histologica (TH) is the controlled vocabulary for use in cytology and histology. It was intended to replace Nomina Histologica. [wikipedia]
TIGR protein families tigrfam TIGRFAMs is a resource consisting of curated multiple sequence alignments, Hidden Markov Models (HMMs) for protein sequence classification, and associated information designed to support automated annotation of (mostly prokaryotic) proteins.
Tissue List tissuelist The UniProt Tissue List is a controlled vocabulary of terms used to annotate biological tissues. It also contains cross-references to other ontologies where tissue types are specified.
Tohoku University cell line catalog tkg Collection of cell lines by Tohoku University. This includes transplantable animal cell lines, such as Yoshida sarcoma and rat ascites hepatoma (AH series) cell lines as well as human, murine cell lines and hybridoma cells.
Plant Trait Ontology to A controlled vocabulary to describe phenotypic traits in plants.
Tree of Life tol The Tree of Life Web Project (ToL) is a collaborative effort of biologists and nature enthusiasts from around the world. On more than 10,000 World Wide Web pages, the project provides information about biodiversity, the characteristics of different groups of organisms, and their evolutionary history (phylogeny). Each page contains information about a particular group, with pages linked one to another hierarchically, in the form of the evolutionary tree of life. Starting with the root of all Life on Earth and moving out along diverging branches to individual species, the structure of the ToL project thus illustrates the genetic connections between all living things.
Topology Data Bank of Transmembrane Proteins topdb The Topology Data Bank of Transmembrane Proteins (TOPDB) is a collection of transmembrane protein datasets containing experimentally derived topology information. It contains information on transmembrane proteins gathered from the literature and from public databases. Each record in TOPDB also contains information on the given protein sequence, name, organism and cross references to various other databases.
TopFind topfind TopFIND is a database of protein termini, terminus modifications and their proteolytic processing in the species: Homo sapiens, Mus musculus, Arabidopsis thaliana, Saccharomyces cerevisiae and Escherichia coli.
ToxoDB toxoplasma ToxoDB is one of the databases that can be accessed through the EuPathDB (http://EuPathDB.org; formerly ApiDB) portal, covering eukaryotic pathogens of the genera Cryptosporidium, Giardia, Leishmania, Neospora, Plasmodium, Toxoplasma, Trichomonas and Trypanosoma. While each of these groups is supported by a taxon-specific database built upon the same infrastructure, the EuPathDB portal offers an entry point to all these resources, and the opportunity to leverage orthology for searches across genera.
Pathogen Transmission Ontology trans The Pathogen Transmission Ontology describes the transmission methods of human disease pathogens, that is, how a pathogen is transmitted from one host, reservoir, or source to another host. The pathogen transmission may occur either directly or indirectly and may involve animate vectors or inanimate vehicles.
Transport Systems Tracker transyt The Transport Systems Tracker (TranSyT) is a tool to identify transport systems and the compounds carried across membranes.
TreeBASE treebase TreeBASE is a relational database designed to manage and explore information on phylogenetic relationships. It includes phylogenetic trees and data matrices, together with information about the relevant publication, taxa, morphological and sequence-based characters, and published analyses. Data in TreeBASE are exposed to the public if they are used in a publication that is in press or published in a peer-reviewed scientific journal, etc.
TreeFam treefam TreeFam is a database of phylogenetic trees of gene families found in animals. Automatically generated trees are curated to create a resource that presents the accurate evolutionary history of all animal gene families, as well as reliable ortholog and paralog assignments.
TrichDB trichdb TrichDB is one of the databases that can be accessed through the EuPathDB (http://EuPathDB.org; formerly ApiDB) portal, covering eukaryotic pathogens of the genera Cryptosporidium, Giardia, Leishmania, Neospora, Plasmodium, Toxoplasma, Trichomonas and Trypanosoma. While each of these groups is supported by a taxon-specific database built upon the same infrastructure, the EuPathDB portal offers an entry point to all these resources, and the opportunity to leverage orthology for searches across genera.
TriTrypDB tritrypdb TriTrypDB is one of the databases that can be accessed through the EuPathDB (http://EuPathDB.org; formerly ApiDB) portal, covering eukaryotic pathogens of the genera Cryptosporidium, Giardia, Leishmania, Neospora, Plasmodium, Toxoplasma, Trichomonas and Trypanosoma. While each of these groups is supported by a taxon-specific database built upon the same infrastructure, the EuPathDB portal offers an entry point to all these resources, and the opportunity to leverage orthology for searches across genera.
TTD Drug ttd.drug The Therapeutic Target Database (TTD) is designed to provide information about the known therapeutic protein and nucleic acid targets described in the literature, the targeted disease conditions, the pathway information and the corresponding drugs/ligands directed at each of these targets. Cross-links to other databases allow the access to information about the sequence, 3D structure, function, nomenclature, drug/ligand binding properties, drug usage and effects, and related literature for each target.
TTD Target ttd.target The Therapeutic Target Database (TTD) is designed to provide information about the known therapeutic protein and nucleic acid targets described in the literature, the targeted disease conditions, the pathway information and the corresponding drugs/ligands directed at each of these targets. Cross-links to other databases are also introduced to facilitate the access of information about the sequence, 3D structure, function, nomenclature, drug/ligand binding properties, drug usage and effects, and related literature for each target.
Teleost taxonomy ontology tto An ontology covering the taxonomy of teleosts (bony fish)
Toxic Process Ontology txpo Elucidating the mechanism of toxicity is crucial in drug safety evaluations. TOXic Process Ontology (TXPO) systematizes a wide variety of terms involving toxicity courses and processes. The first version of TXPO focuses on liver toxicity. The TXPO contains an is-a hierarchy that is organized into three layers: the top layer contains general terms, mostly derived from the Basic Formal Ontology. The intermediate layer contains biomedical terms in OBO Foundry from UBERON, Cell Ontology, NCBI Taxon, ChEBI, Gene Ontology, PATO, OGG, INOH, HINO, NCIT, DOID and Relation Ontology (RO). The lower layer contains toxicological terms. In applied work, we have developed a prototype of TOXPILOT, a TOXic Process InterpretabLe knOwledge sysTem. TOXPILOT provides visualization maps of the toxic course, which facilitates capturing the comprehensive picture for understanding toxicity mechanisms. A prototype of TOXPILOT is available: https://toxpilot.nibiohn.go.jp
Uber Anatomy Ontology uberon Uberon is an integrated cross-species anatomy ontology representing a variety of entities classified according to traditional anatomical criteria such as structure, function and developmental lineage. The ontology includes comprehensive relationships to taxon-specific anatomical ontologies, allowing integration of functional, phenotype and expression data.
uBio NameBank ubio.namebank NameBank is a "biological name server" focused on storing names and objectively-derived nomenclatural attributes. NameBank is a repository for all recorded names including scientific names, vernacular (or common names), misspelled names, as well as ad-hoc nomenclatural labels that may have limited context.
Uberon Property ubprop
UCSC Genome Browser ucsc The UCSC Genome Browser is an on-line, and downloadable, genome browser hosted by the University of California, Santa Cruz (UCSC). It is an interactive website offering access to genome sequence data from a variety of vertebrate and invertebrate species and major model organisms, integrated with a large collection of aligned annotations.
UM-BBD Compound umbbd.compound The University of Minnesota Biocatalysis/Biodegradation Database (UM-BBD) contains information on microbial biocatalytic reactions and biodegradation pathways for primarily xenobiotic, chemical compounds. The goal of the UM-BBD is to provide information on microbial enzyme-catalyzed reactions that are important for biotechnology. This collection refers to compound information.
EAWAG Biocatalysis/Biodegradation Database umbbd.enzyme The University of Minnesota Biocatalysis/Biodegradation Database (UM-BBD) contains information on microbial biocatalytic reactions and biodegradation pathways for primarily xenobiotic, chemical compounds. The goal of the UM-BBD is to provide information on microbial enzyme-catalyzed reactions that are important for biotechnology. This collection refers to enzyme information.
EAWAG Biocatalysis/Biodegradation Database umbbd.pathway The University of Minnesota Biocatalysis/Biodegradation Database (UM-BBD) contains information on microbial biocatalytic reactions and biodegradation pathways for primarily xenobiotic, chemical compounds. The goal of the UM-BBD is to provide information on microbial enzyme-catalyzed reactions that are important for biotechnology. This collection refers to pathway information.
EAWAG Biocatalysis/Biodegradation Database umbbd.reaction The University of Minnesota Biocatalysis/Biodegradation Database (UM-BBD) contains information on microbial biocatalytic reactions and biodegradation pathways for primarily xenobiotic, chemical compounds. The goal of the UM-BBD is to provide information on microbial enzyme-catalyzed reactions that are important for biotechnology. This collection refers to reaction information.
EAWAG Biocatalysis/Biodegradation Database umbbd.rule The University of Minnesota Biocatalysis/Biodegradation Database (UM-BBD) contains information on microbial biocatalytic reactions and biodegradation pathways for primarily xenobiotic, chemical compounds. The UM-BBD Pathway Prediction System (PPS) predicts microbial catabolic reactions using substructure searching, a rule-base, and atom-to-atom mapping. The PPS recognizes organic functional groups found in a compound and predicts transformations based on biotransformation rules. These rules are based on reactions found in the UM-BBD database. This collection references those rules.
Unified Medical Language System umls The Unified Medical Language System is a repository of biomedical vocabularies. Vocabularies integrated in the UMLS Metathesaurus include the NCBI taxonomy, Gene Ontology, the Medical Subject Headings (MeSH), OMIM and the Digital Anatomist Symbolic Knowledge Base. UMLS concepts are not only inter-related, but may also be linked to external resources such as GenBank.
UniGene unigene A UniGene entry is a set of transcript sequences that appear to come from the same transcription locus (gene or expressed pseudogene), together with information on protein similarities, gene expression, cDNA clone reagents, and genomic location.
Unique Ingredient Identifier unii The purpose of the joint FDA/USP Substance Registration System (SRS) is to support health information technology initiatives by generating unique ingredient identifiers (UNIIs) for substances in drugs, biologics, foods, and devices. The UNII is a non-proprietary, free, unique, unambiguous, non-semantic, alphanumeric identifier based on a substance’s molecular structure and/or descriptive information.
Unimod protein modification database for mass spectrometry unimod Unimod is a public domain database created to provide a community supported, comprehensive database of protein modifications for mass spectrometry applications. That is, accurate and verifiable values, derived from elemental compositions, for the mass differences introduced by all types of natural and artificial modifications. Other important information includes any mass change (neutral loss) that occurs during MS/MS analysis, and site specificity (which residues are susceptible to modification and any constraints on the position of the modification within the protein or peptide).
UniProt Archive uniparc The UniProt Archive (UniParc) is a database containing non-redundant protein sequence information from many sources. Each unique sequence is given a stable and unique identifier (UPI) making it possible to identify the same protein from different source databases.
UniPathway Compound unipathway.compound UniPathway is a manually curated resource of enzyme-catalyzed and spontaneous chemical reactions. It provides a hierarchical representation of metabolic pathways and a controlled vocabulary for pathway annotation in UniProtKB. UniPathway data are cross-linked to existing metabolic resources such as ChEBI/Rhea, KEGG and MetaCyc. This collection references compounds.
UniPathway Reaction unipathway.reaction UniPathway is a manually curated resource of enzyme-catalyzed and spontaneous chemical reactions. It provides a hierarchical representation of metabolic pathways and a controlled vocabulary for pathway annotation in UniProtKB. UniPathway data are cross-linked to existing metabolic resources such as ChEBI/Rhea, KEGG and MetaCyc. This collection references individual reactions.
UniProt protein ID uniprot The UniProt Knowledgebase (UniProtKB) is a comprehensive resource for protein sequence and functional information with extensive cross-references to more than 120 external databases. Besides amino acid sequence and a description, it also provides taxonomic data and citation information.
UniProt Chain uniprot.chain This collection is a subset of UniProtKB that provides a means to reference the proteolytic cleavage products of a precursor protein.
UniProt Diseases uniprot.disease The human diseases in which proteins are involved are described in UniProtKB entries with a controlled vocabulary.
UniProt Isoform uniprot.isoform The UniProt Knowledgebase (UniProtKB) is a comprehensive resource for protein sequence and functional information with extensive cross-references to more than 120 external databases. This collection is a subset of UniProtKB, and provides a means to reference isoform information.
UniProt Keywords uniprot.keyword UniProtKB entries are tagged with keywords that can be used to retrieve particular subsets of entries.
UniProt Subcellular Locations uniprot.location The subcellular locations in which a protein is found are described in UniProtKB entries with a controlled vocabulary, which also includes membrane topology and orientation terms.
UniProt Variants uniprot.var The purpose of the UniProtKB/Swiss-Prot variant pages is: to display the variant related information extracted from UniProtKB/Swiss-Prot, and to provide useful additional information such as the conservation of the modified residues across orthologous species.
UniRef uniref The UniProt Reference Clusters (UniRef) provide clustered sets of sequences from the UniProt Knowledgebase (including isoforms) and selected UniParc records in order to obtain complete coverage of the sequence space at several resolutions while hiding redundant sequences (but not their descriptions) from view.
UniRule unirule Rules are devised and tested by experienced curators using experimental data from manually annotated entries as templates. UniRule rules can annotate protein properties such as the protein name, function, catalytic activity, pathway membership, and subcellular location, along with sequence specific information, such as the positions of post-translational modifications and active sites.
Database of Sequence Tagged Sites unists UniSTS is a comprehensive database of sequence tagged sites (STSs) derived from STS-based maps and other experiments. STSs are defined by PCR primer pairs and are associated with additional information such as genomic position, genes, and sequences.
Molecular database for the identification of fungi unite UNITE is a fungal rDNA internal transcribed spacer (ITS) sequence database. It focuses on high-quality ITS sequences generated from fruiting bodies collected and identified by experts and deposited in public herbaria. Entries may be supplemented with metadata describing locality, habitat, soil, climate, and interacting taxa.
Universal Natural Products Database unpd
Units of measurement ontology uo Ontology of standardized units
Unipathway upa A manually curated resource for the representation and annotation of metabolic pathways
Unified Phenotype Ontology upheno The uPheno ontology integrates multiple phenotype ontologies into a unified cross-species phenotype ontology.
United States Patent and Trademark Office uspto The United States Patent and Trademark Office (USPTO) is the federal agency for granting U.S. patents and registering trademarks. As a mechanism that protects new ideas and investments in innovation and creativity, the USPTO is at the cutting edge of the nation's technological progress and achievement.
ValidatorDB validatordb Database of validation results for ligands and non-standard residues in the Protein Data Bank.
Veterans Administration National Drug File vandf The National Drug File (NDF) is produced by the U.S. Department of Veterans Affairs, Veterans Health Administration (VHA). NDF is a centrally maintained electronic drug list used by the VHA hospitals and clinics. Facilities use the NDF to check drug interactions, to manage orders, and to send outpatient prescriptions to regional automated mail-out pharmacies. NDF includes information on clinical drugs, drug classes, ingredients and National Drug Code (NDC) Directory codes.
Variation Ontology vario The Variation Ontology (VariO) is an ontology for the standardized, systematic description of effects, consequences and mechanisms of variations. It describes the effects of variations at the DNA, RNA and/or protein level.
Integrative database of germ-line V genes from the immunoglobulin loci of human and mouse vbase2 The database VBASE2 provides germ-line sequences of human and mouse immunoglobulin variable (V) genes.
Viral Bioinformatics Resource Center vbrc The VBRC provides bioinformatics resources to support scientific research directed at viruses belonging to the Arenaviridae, Bunyaviridae, Filoviridae, Flaviviridae, Paramyxoviridae, Poxviridae, and Togaviridae families. The Center consists of a relational database and web application that support the data storage, annotation, analysis, and information exchange goals of this work. Each data release contains the complete genomic sequences for all viral pathogens and related strains that are available for species in the above-named families. In addition to sequence data, the VBRC provides a curation for each virus species, resulting in a searchable, comprehensive mini-review of gene function relating genotype to biological phenotype, with special emphasis on pathogenesis.
Bioinformatics Resource Center for Invertebrate Vectors of Human Pathogens vectorbase VectorBase is an NIAID-funded Bioinformatic Resource Center focused on invertebrate vectors of human pathogens. VectorBase annotates and curates vector genomes providing a web accessible integrated resource for the research community. Currently, VectorBase contains genome information for three mosquito species: Aedes aegypti, Anopheles gambiae and Culex quinquefasciatus, a body louse Pediculus humanus and a tick species Ixodes scapularis.
Vertebrate Genome Annotation Database vega A repository for high-quality gene models produced by the manual annotation of vertebrate genomes.
VegBank vegbank VegBank is the vegetation plot database of the Ecological Society of America's Panel on Vegetation Classification. VegBank consists of three linked databases that contain (1) vegetation plot records, (2) vegetation types recognized in the U.S. National Vegetation Classification and other vegetation types submitted by users, and (3) all plant taxa recognized by ITIS/USDA as well as all other plant taxa recorded in plot records. Vegetation records, community types and plant taxa may be submitted to VegBank and may be subsequently searched, viewed, annotated, revised, interpreted, downloaded, and cited.
Virtual Fly Brain vfb An interactive tool for neurobiologists to explore the detailed neuroanatomy, neuron connectivity and gene expression of Drosophila melanogaster.
VFDB Gene vfdb.gene VFDB is a repository of virulence factors (VFs) of pathogenic bacteria. This collection references VF genes.
VFDB Genus vfdb.genus VFDB is a repository of virulence factors (VFs) of pathogenic bacteria. This collection references VF information by Genus.
Vertebrate Gene Nomenclature Committee vgnc The Vertebrate Gene Nomenclature Committee (VGNC) is an extension of the established HGNC (HUGO Gene Nomenclature Committee) project that names human genes. VGNC is responsible for assigning standardized names to genes in vertebrate species that currently lack a nomenclature committee.
Vertebrate Homologous Organ Group Ontology vhog
Virtual International Authority File viaf The VIAF® (Virtual International Authority File) combines multiple name authority files into a single OCLC-hosted name authority service. The goal of the service is to lower the cost and increase the utility of library authority files by matching and linking widely-used authority files and making that information available on the Web.
The Virus Infectious Disease Ontology vido The Virus Infectious Disease Ontology (IDO Virus) is an extension of the Infectious Disease Ontology (IDO). IDO Virus follows OBO Foundry guidelines, employs the Basic Formal Ontology as its starting point, and covers epidemiology, classification, pathogenesis, and treatment of terms used by Virologists, i.e. virus, prion, satellite, viroid, etc.
Virus Pathogen Resource vipr The Virus Pathogen Database and Analysis Resource (ViPR) supports bioinformatics workflows for a broad range of human virus pathogens and other related viruses. It provides access to sequence records, gene and protein annotations, immune epitopes, 3D structures, and host factor data. This collection references viral strain information.
ViralZone viralzone ViralZone is a resource bridging textbook knowledge with genomic and proteomic sequences. It provides fact sheets on all known virus families/genera with easy access to sequence data. A selection of reference strains (RefStrain) provides annotated standards to circumvent the exponential increase of virus sequences. Moreover ViralZone offers a complete set of detailed and accurate virion pictures.
VIRsiRNA virsirna The VIRsiRNA database contains details of siRNA/shRNA which target viral genome regions. It provides efficacy information where available, along with the siRNA sequence, viral target and subtype, and the target genomic region.
Variation Modelling Collaboration vmc
VMH Gene vmhgene The Virtual Metabolic Human (VMH) is a resource that combines human and gut microbiota metabolism with nutrition and disease.
VMH metabolite vmhmetabolite The Virtual Metabolic Human (VMH) is a resource that combines human and gut microbiota metabolism with nutrition and disease.
VMH reaction vmhreaction The Virtual Metabolic Human (VMH) is a resource that combines human and gut microbiota metabolism with nutrition and disease.
Vaccine Ontology vo The Vaccine Ontology (VO) is a biomedical ontology in the domain of vaccine and vaccination. VO aims to standardize vaccine annotation, integrate various vaccine data, and support computer-assisted reasoning. VO supports basic vaccine R&D and clinical vaccine usage. VO is being developed as a community-based ontology with support and collaborations from the vaccine and bio-ontology communities.
Vocabulary of Interlinked Datasets void The Vocabulary of Interlinked Datasets (VoID) is an RDF Schema vocabulary for expressing metadata about RDF datasets. It is intended as a bridge between the publishers and users of RDF data, with applications ranging from data discovery to cataloging and archiving of datasets. This document provides a formal definition of the new RDF classes and properties introduced for VoID. It is a companion to the main specification document for VoID, Describing Linked Datasets with the VoID Vocabulary.
Vertebrate Skeletal Anatomy Ontology vsao Vertebrate skeletal anatomy ontology.
Veterinary Substances DataBase vsdb Veterinary pharmaceuticals are biologically active and potentially persistent substances which are recognised as a continuing threat to environmental quality. Whilst the environmental risk of agricultural pesticides has had considerable attention in recent decades, risk assessments for veterinary pharmaceuticals have only relatively recently begun to be addressed. Risk assessments and risk modelling tend to be inherently data hungry processes and one of the main obstacles to consistent, accurate and efficient assessments is the need for a reliable, quality and comprehensive data source.
Vertebrate trait ontology vt An ontology of traits covering vertebrates
Vertebrate Taxonomy Ontology vto Comprehensive hierarchy of extinct and extant vertebrate taxa.
Veterans Health Administration (VHA) unique identifier vuid The Veterans Health Administration is America’s largest integrated health care system, providing care at 1,293 health care facilities, including 171 medical centers and 1,112 outpatient sites of care of varying complexity (VHA outpatient clinics), serving 9 million enrolled Veterans each year.
ViralZone vz ViralZone is a SIB Swiss Institute of Bioinformatics web-resource for all viral genera and families, providing general molecular and epidemiological information, along with virion and genome figures. Each virus or family page gives easy access to UniProtKB/Swiss-Prot viral protein entries.
WormBase RNAi wb.rnai WormBase is an online bioinformatics database of the biology and genome of the model organism Caenorhabditis elegans and related nematodes. It is used by the C. elegans research community both as an information resource and as a mode to publish and distribute their results. This collection references RNAi experiments, detailing target and phenotypes.
C. elegans Gross Anatomy Ontology wbbt Ontology about the gross anatomy of C. elegans
C. elegans development ontology wbls Ontology about the development and life stages of C. elegans
C. elegans phenotype wbphenotype Ontology about C. elegans and other nematode phenotypes
Web Elements webelements Browser for the periodic table of the elements
WGS84 Geo Positioning wgs84 A vocabulary for representing latitude, longitude and altitude information in the WGS84 geodetic reference datum
Wikidata wikidata Wikidata is a collaboratively edited knowledge base operated by the Wikimedia Foundation. It is intended to provide a common source of certain types of data which can be used by Wikimedia projects such as Wikipedia. Wikidata functions as a document-oriented database, centred on individual items. Items represent topics, for which basic information is stored that identifies each topic.
Wikidata Property wikidata.property Wikidata is a free and open knowledge base that can be read and edited by both humans and machines. Wikidata acts as central storage for the structured data of its Wikimedia sister projects including Wikipedia, Wikivoyage, Wiktionary, Wikisource, and others.
WikiGenes wikigenes WikiGenes is a collaborative knowledge resource for the life sciences, which is based on the general wiki idea but employs specifically developed technology to serve as a rigorous scientific tool. The rationale behind WikiGenes is to provide a platform for the scientific community to collect, communicate and evaluate knowledge about genes, chemicals, diseases and other biomedical concepts in a bottom-up process.
WikiPathways ID wikipathways WikiPathways is a resource providing an open and public collection of pathway maps created and curated by the community in a Wiki like style. All content is under the Creative Commons Attribution 3.0 Unported license.
Wikipedia wikipedia.en Wikipedia is a multilingual, web-based, free-content encyclopedia project based on an openly editable model. It is written collaboratively by largely anonymous Internet volunteers who write without pay.
C. elegans ORFeome cloning project worfdb WOrfDB (Worm ORFeome DataBase) contains data from the cloning of the complete set of predicted protein-encoding Open Reading Frames (ORFs) of Caenorhabditis elegans. This collection describes experimentally defined transcript structures of unverified genes through RACE (Rapid Amplification of cDNA Ends).
WormBase database of nematode biology wormbase WormBase is an online bioinformatics database of the biology and genome of the model organism Caenorhabditis elegans and other nematodes. It is used by the C. elegans research community both as an information resource and as a mode to publish and distribute their results. This collection references WormBase-accessioned entities.
Wormpep wormpep Wormpep contains the predicted proteins from the Caenorhabditis elegans genome sequencing project.
World Register of Marine Species worms The World Register of Marine Species (WoRMS) provides an authoritative and comprehensive list of names of marine organisms. It includes synonyms for valid taxonomic names allowing a more complete interpretation of taxonomic literature. The content of WoRMS is administered by taxonomic experts.
World Wildlife Fund Ecoregion wwf.ecoregion WWF ecoregions are large units of land or water containing a geographically distinct assemblage of species, natural communities, and environmental conditions.
Xenopus Anatomy Ontology xao XAO represents the anatomy and development of the African frogs Xenopus laevis and tropicalis.
Experimental condition ontology xco Conditions under which physiological and morphological measurements are made both in the clinic and in studies involving humans or model organisms.
Xenbase xenbase Xenbase is the model organism database for Xenopus laevis and X. (Silurana) tropicalis. It contains genomic and developmental data and community information for Xenopus research. It includes gene expression patterns that incorporate image data from the literature, large-scale screens and community submissions.
Cross-linker reagents ontology xl A structured controlled vocabulary for cross-linking reagents used with proteomics mass spectrometry.
HUPO-PSI cross-linking and derivatization reagents controlled vocabulary xlmod A structured controlled vocabulary for cross-linking reagents used with proteomics mass spectrometry.
Xenopus Phenotype Ontology xpo XPO represents anatomical, cellular, and gene function phenotypes occurring throughout the development of the African frogs Xenopus laevis and tropicalis.
XML Schema Definition xsd This document describes the XML Schema namespace. It also contains a directory of links to these related resources, using Resource Directory Description Language.
XUO xuo
Yeast Deletion and the Mitochondrial Proteomics Project ydpm The YDPM database serves to support the Yeast Deletion and the Mitochondrial Proteomics Project. The project aims to increase the understanding of mitochondrial function and biogenesis in the context of the cell. In the Deletion Project, strains from the deletion collection were monitored under 9 different media conditions selected for the study of mitochondrial function. The YDPM database contains both the raw data and growth rates calculated for each strain in each media condition.
Yeast Intron Database v4.3 yeastintron The YEast Intron Database (version 4.3) contains information on the spliceosomal introns of the yeast Saccharomyces cerevisiae. It includes expression data that relates to the efficiency of splicing relative to other processes in strains of yeast lacking nonessential splicing factors. The data are displayed on each intron page. This is an updated version of the previous dataset, which can be accessed through [MIR:00000460].
YeTFasCo yetfasco The Yeast Transcription Factor Specificity Compendium (YeTFasCO) is a database of transcription factor specificities for the yeast Saccharomyces cerevisiae in Position Frequency Matrix (PFM) or Position Weight Matrix (PWM) formats.
Yeast Intron Database v3 yid The YEast Intron Database (version 3) contains information on the spliceosomal introns of the yeast Saccharomyces cerevisiae. It includes expression data that relates to the efficiency of splicing relative to other processes in strains of yeast lacking nonessential splicing factors. The data are displayed on each intron page. An updated version of the database is available through [MIR:00000521].
Yeast Metabolome Database ymdb The Yeast Metabolome Database (YMDB) is a manually curated database of small molecule metabolites found in or produced by Saccharomyces cerevisiae (also known as Baker’s yeast and Brewer’s yeast).
YRC PDR yrcpdr The Yeast Resource Center Public Data Repository (YRC PDR) serves as a single point of access for the experimental data produced from many collaborations typically studying Saccharomyces cerevisiae (baker's yeast). The experimental data include large amounts of mass spectrometry results from protein co-purification experiments, yeast two-hybrid interaction experiments, fluorescence microscopy images and protein structure predictions.
Maize gross anatomy zea
Zebrafish Experimental Conditions Ontology zeco Ontology of Zebrafish Experimental Conditions
Zebrafish anatomy and development ontology zfa A structured controlled vocabulary of the anatomy and development of the Zebrafish
ZFIN Gene ID zfin ZFIN serves as the zebrafish model organism database. This collection references all zebrafish biological entities in ZFIN.
Zebrafish developmental stages ontology zfs Developmental stages of the Zebrafish
ZINC is not Commercial zinc ZINC is a free public resource for ligand discovery. The database contains over twenty million commercially available molecules in biologically relevant representations that may be downloaded in popular ready-to-dock formats and subsets. The Web site enables searches by structure, biological activity, physical property, vendor, catalog number, name, and CAS number.
Zebrafish Phenotype Ontology zp The Zebrafish Phenotype Ontology formally defines all phenotypes of the Zebrafish model organism.