This section describes several aspects of how the Bioregistry web application is deployed.
The bioregistry.io domain is registered with Namecheap and costs about $33 per year. It is managed and supported by the INDRA Lab, a part of the Laboratory of Systems Pharmacology and Harvard Program in Therapeutic Science (HiTS) at Harvard Medical School.
The Bioregistry is hosted on an Amazon Elastic Compute Cloud (EC2) instance behind a load balancing service to stay secure and highly available. It is managed and supported by the INDRA Lab, a part of the Laboratory of Systems Pharmacology and Harvard Program in Therapeutic Science (HiTS) at Harvard Medical School.
These are the software and operating system specifications for the currently running instance of the Bioregistry:
A Docker image is automatically built nightly following the update workflow on GitHub Actions and pushed to the biopragmatics/bioregistry DockerHub repository. This image is built on the Python 3.9 alpine base image, which significantly reduces non-essential components. The final compressed image takes up less than 40 MB of disk space and runs inside Docker with about 65 MB of memory at baseline. This could easily fit on a dedicated t4g.nano instance on AWS, which costs about $37/year on-demand or around $20/year reserved.
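The on-demand figure above can be sanity-checked with a little arithmetic. The following is a minimal sketch, not an authoritative price calculation: the helper name `annual_cost` is hypothetical, and the hourly rate of 0.0042 USD for a t4g.nano is an assumption that varies by region and over time.

```shell
#!/bin/bash
# Hypothetical helper: convert an hourly on-demand rate (USD) into a yearly cost.
# The rate passed below (0.0042 USD/hour for t4g.nano) is an assumed value;
# check current AWS pricing for the region in use.
annual_cost() {
  # $1 = hourly rate in USD; awk is used because bash lacks floating-point math
  awk -v rate="$1" 'BEGIN { printf "%.2f\n", rate * 24 * 365 }'
}

annual_cost 0.0042
```

Under that assumed rate, the result is consistent with the roughly $37/year on-demand estimate quoted above.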
The Bioregistry's EC2 instance runs the following script as a cron job. It stops the currently running container, pulls the latest image from this DockerHub repository, and starts a new container. The whole process only takes a few seconds.
    #!/bin/bash
    # /data/services/restart_bioregistry.sh

    # Store the container's hash
    BIOREGISTRY_CONTAINER_ID=$(docker ps --filter "name=bioregistry" -aq)

    # Stop and remove the old container, taking advantage of the fact that it's named specifically
    if [ -n "$BIOREGISTRY_CONTAINER_ID" ]; then
      docker stop $BIOREGISTRY_CONTAINER_ID
      docker rm $BIOREGISTRY_CONTAINER_ID
    fi

    # Pull the latest
    docker pull biopragmatics/bioregistry:latest

    # Run the start script, remove -d to run interactively
    docker run -id --name bioregistry -p 8766:8766 biopragmatics/bioregistry:latest
This script can be put on the EC2 instance and run via SSH with:
    #!/bin/bash
    ssh -i ~/.ssh/<credentials>.pem <user>@<address> 'sh /data/services/restart_bioregistry.sh'
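The cron job mentioned above is not spelled out here, so the following is only a sketch of what a matching crontab entry might look like. The helper name `cron_line`, the schedule (06:30 daily), and the log path are all assumptions chosen for illustration; only the script path comes from the text.

```shell
#!/bin/bash
# Hypothetical helper: print a crontab line that runs the restart script daily.
# The minute/hour values and the log file location are illustrative assumptions.
cron_line() {
  # $1 = minute (0-59), $2 = hour (0-23)
  printf '%s %s * * * /bin/bash /data/services/restart_bioregistry.sh >> /tmp/restart_bioregistry.log 2>&1\n' "$1" "$2"
}

cron_line 30 6
```

The printed line could then be pasted into the editor opened by `crontab -e` on the EC2 instance. Scheduling it some time after the nightly image build would make each restart pick up the newest image.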
The SSL/TLS certificate that allows bioregistry.io to be served over HTTPS is managed through the AWS Certificate Manager.
The deployment of the Bioregistry is currently funded by the DARPA Young Faculty Award W911NF2010255 grant. Due to its low cost, the Laboratory of Systems Pharmacology at Harvard Medical School has additionally committed discretionary (i.e., not tied to a project) funding to continue running the Bioregistry instance at https://bioregistry.io.
The Bioregistry project is also interested in acquiring grants to help ensure its longevity. We give a conservative estimate of around $100-200/year to support domain registration, AWS costs, and personnel time for maintenance. This means that the Bioregistry could cost as little as $1,000 to run for 10 years (which is much longer than most scientific databases/websites are able to persist).
The Bioregistry can be mirrored following these instructions.
The Bioregistry can be deployed using custom content by following these instructions.
Stakeholders in the Bioregistry have been interested in questions including:
These questions do not have easy answers and apply to most databases, software, and web applications in the life sciences. As first steps towards addressing them, we have written explicit, public, and well-defined contribution guidelines, a code of conduct, and project governance.
If you would like to be part of this discussion and/or development of these policies, you can try the following:
Content negotiation was implemented in PR #682 in order to better comply with FAIR-ness evaluations such as the FAIR Enough Evaluation.
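With content negotiation, a client asks for a particular representation of a resource through the HTTP `Accept` header. The sketch below shows the general shape of such a request with `curl`. The helper names `resource_url` and `fetch_as`, the `/registry/<prefix>` path, and the media types in the example calls are assumptions for illustration and are not confirmed by this section; the API documentation should be consulted for what is actually served.

```shell
#!/bin/bash
# Sketch of requesting one record in a chosen representation.
# Assumptions: the /registry/<prefix> path and the media types shown in the
# example calls are illustrative and may not match the deployed service.
BASE_URL="https://bioregistry.io"

# Hypothetical helper: build the URL for a given prefix
resource_url() {
  printf '%s/registry/%s\n' "$BASE_URL" "$1"
}

# Hypothetical helper: $1 = prefix, $2 = media type for the Accept header
fetch_as() {
  curl --silent --show-error --location \
    --header "Accept: $2" \
    "$(resource_url "$1")"
}

resource_url chebi
# Example calls (these need network access):
#   fetch_as chebi application/json
#   fetch_as chebi text/turtle
```

The same URL is used for every representation; only the `Accept` header changes, which is what automated FAIR evaluators typically rely on.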