Docker's powerful functionalities


runs on your machine -> runs on others

  • portable deployment across machines
  • environment needed for your application to run comes with the container
  • 'if it works on your machine, it will work on others too'

runs on your machine -> runs in production


application centric

  • Docker is optimized for deployment of applications, not machines or systems

automated builds

  • use docker build to build an image from a Dockerfile
  • automated builds on Docker Hub after pushing to a git repository
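
A minimal sketch of the docker build step above: the Python snippet writes a small Dockerfile and assembles the matching build command. The base image, the installed package, the tag `mytool:0.1` and the helper names are illustrative assumptions, not part of these notes. The command is only constructed here; running it needs a local Docker installation.

```python
import tempfile
from pathlib import Path

# Hypothetical Dockerfile; base image and package are assumptions for illustration.
DOCKERFILE = """\
FROM ubuntu:22.04
RUN apt-get update && apt-get install -y --no-install-recommends samtools
CMD ["samtools", "--version"]
"""


def write_build_context(directory: Path) -> Path:
    """Write the Dockerfile into `directory` and return its path."""
    path = directory / "Dockerfile"
    path.write_text(DOCKERFILE)
    return path


def build_command(tag: str, context: Path) -> list[str]:
    """Return the argument list for `docker build -t <tag> <context>`."""
    return ["docker", "build", "-t", tag, str(context)]


with tempfile.TemporaryDirectory() as tmp:
    context = Path(tmp)
    dockerfile = write_build_context(context)
    command = build_command("mytool:0.1", context)
    print(" ".join(command))
    # To run the build for real (requires Docker):
    # import subprocess; subprocess.run(command, check=True)
```

Keeping the command as a list of arguments rather than one string avoids shell quoting problems if it is later passed to `subprocess.run`.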

versioning

  • Docker tracks all changes in successive versions of the container
  • it is possible to commit, diff and roll back
  • like git, Docker uses incremental uploads and downloads, so only the diff is sent
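
The incremental transfer idea can be illustrated with a toy model: an image is treated as an ordered list of content-addressed layers, and a push sends only the layers the registry does not have yet. This is a conceptual sketch in Python, not Docker's actual implementation; the layer contents and function names are made up for illustration.

```python
import hashlib


def digest(content: bytes) -> str:
    """Content address of a layer, modelled as a SHA-256 hex digest."""
    return "sha256:" + hashlib.sha256(content).hexdigest()


def layers_to_send(image_layers: list[bytes], remote_digests: set[str]) -> list[str]:
    """Return the digests of layers missing from the remote, in image order."""
    return [d for d in map(digest, image_layers) if d not in remote_digests]


# Two versions of a hypothetical image; v2 reuses both layers of v1.
v1 = [b"base os", b"install aligner 1.0"]
v2 = v1 + [b"add pipeline script"]

remote: set[str] = set()

first_push = layers_to_send(v1, remote)   # nothing on the remote yet
remote.update(first_push)

second_push = layers_to_send(v2, remote)  # only the new layer is missing

print(len(first_push), len(second_push))  # prints: 2 1
```

Because layers are identified by their content, the second push only has to upload the one layer that changed.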

component re-use

  • any image can be used as a base for another image
  • you can build one generic image and then multiple specialized versions of that one
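
As a rough sketch of the generic-then-specialized pattern, the Python snippet below generates Dockerfiles that all start FROM the same base image and add one tool each. The base image name `mylab/bio-base:1.0`, the assumption that it is Debian/Ubuntu based (so `apt-get` is available), and the package choices are hypothetical.

```python
def specialized_dockerfile(base_image: str, packages: list[str]) -> str:
    """Build Dockerfile text for an image derived from `base_image`."""
    lines = [f"FROM {base_image}"]
    if packages:
        # Assumes a Debian/Ubuntu based base image.
        lines.append(
            "RUN apt-get update && apt-get install -y " + " ".join(packages)
        )
    return "\n".join(lines) + "\n"


# One generic base image (hypothetical name), several specialized variants.
BASE = "mylab/bio-base:1.0"
variants = {
    "bwa": specialized_dockerfile(BASE, ["bwa"]),
    "bowtie2": specialized_dockerfile(BASE, ["bowtie2"]),
}

for name, text in variants.items():
    print(f"# Dockerfile.{name}")
    print(text)
```

Every variant shares the base image's layers, so building or pulling a second variant only adds the layers that differ.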

sharing

  • public registry (Docker Hub) with images uploaded by other people
  • you can see the Dockerfile for each image
  • official images maintained by the Docker team

tool ecosystem


Use cases in Bioinformatics

making bioinformatics tools/pipelines easy to deploy on any infrastructure

you start developing a tool on your laptop, then you move it to the cloud (Ubuntu) for some more serious analysis, but after a month you want to put it on the local cluster (CentOS) for production use


making it work for others

you already use git (and maybe GitHub) for version control and collaboration in your team, and you would like to do the same with deployment: to know that if it worked on your machine when you pushed the changes, it will work on your colleague's machine when they pull them


making it work for people who read your paper

it's great if you can point in the paper to the git commit relevant to the publication, but that does not guarantee that readers can install and run the code. With Docker you can point them to a particular commit of the image and be sure they can run it anywhere they need to; you can even make the whole paper reproducible

Example:

Bremges, A., Maus, I., Belmann, P., Eikmeyer, F., Winkler, A., Albersmeier, A., Puhler, A., Schluter, A., Sczyrba, A. (2015). Deeply sequenced metagenome and metatranscriptome of a biogas-producing microbial community from an agricultural production-scale biogas plant. GigaScience 4:33. doi:10.1186/s13742-015-0073-6 (Docker accessible version of the study)
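
One way to point readers at an exact image version is to reference it by content digest (`repository@sha256:...`) rather than by a tag that can later move. The Python sketch below only formats and validates such a reference; the repository name `mylab/paper-pipeline` is hypothetical and the digest is a placeholder computed from a dummy string, not the digest of any real image.

```python
import hashlib
import re

# A sha256 digest is "sha256:" followed by 64 lowercase hex characters.
DIGEST_RE = re.compile(r"sha256:[0-9a-f]{64}")


def pinned_reference(repository: str, digest: str) -> str:
    """Return an image reference pinned to an exact content digest."""
    if not DIGEST_RE.fullmatch(digest):
        raise ValueError(f"not a sha256 digest: {digest!r}")
    return f"{repository}@{digest}"


# Placeholder digest derived from a dummy string; NOT a real image digest.
placeholder = "sha256:" + hashlib.sha256(b"placeholder").hexdigest()

ref = pinned_reference("mylab/paper-pipeline", placeholder)
print("docker pull " + ref)
```

A reference of this form could be quoted in the paper next to the git commit, so the code and its environment are pinned together.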


don't install anything twice

if you're installing a new aligner that you want to try, put it in a container, so that when you need it in the cloud or on the cluster you don't have to install it again
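
A minimal sketch of what running a containerized aligner could look like: the Python snippet assembles a `docker run` command that mounts a host data directory into the container, so the same image can be used on the laptop, in the cloud or on the cluster. The image name `mylab/bwa:0.7.17`, the host path and the file names are hypothetical; the command is only printed, not executed.

```python
import shlex
from pathlib import PurePosixPath


def run_command(image: str, data_dir: str, tool_args: list[str]) -> list[str]:
    """Return the argument list for running a tool from `image` on `data_dir`.

    The host directory is mounted at /data, which is also the working directory.
    """
    if not PurePosixPath(data_dir).is_absolute():
        raise ValueError("data_dir must be an absolute path")
    return [
        "docker", "run",
        "--rm",                    # remove the container when the tool exits
        "-v", f"{data_dir}:/data", # mount host data into the container
        "-w", "/data",             # run the tool inside the mounted directory
        image,
        *tool_args,
    ]


# Hypothetical image and input files.
cmd = run_command(
    "mylab/bwa:0.7.17",
    "/srv/project/reads",
    ["bwa", "mem", "ref.fa", "reads.fq"],
)
print(shlex.join(cmd))
```

Only the mounted host path changes between machines; the image and the tool invocation stay the same.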


performance

Docker containerization has a negligible impact on the execution performance of common genomic pipelines where tasks are generally very time consuming. The minimal performance loss introduced by the Docker engine is offset by the advantages of running an analysis in a self-contained and precisely controlled runtime environment. Docker makes it easy to precisely prototype an environment, maintain all its variations over time and rapidly reproduce any former configuration one may need to re-use. These capacities guarantee consistent results over time and across different computing platforms.

The impact of Docker containers on the performance of genomic pipelines https://dx.doi.org/10.7287/peerj.preprints.1171v2


others on Docker?


other interesting posts


other interesting Docker-based projects:


see this as a slideshow here