We use several environments to support our data science work, including local environments managed with mamba and containerized environments built with Docker.
These environments support a range of activities, including development, testing, deployment, and reproducible analysis.
Each environment has its own set of tools and configurations, and we use them in different ways depending on the task at hand. For example, we might use a local environment for development and testing, and a Docker environment for deployment and reproducibility.
We recommend using mamba for managing local environments. Mamba is a fast, robust, cross-platform package manager for creating and managing conda environments. It is a drop-in replacement for conda and is fully compatible with the conda ecosystem.
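As an illustration, a local environment can be described in an `environment.yml` file. The sketch below is a minimal example under stated assumptions: the environment name `my-analysis` and the package list are hypothetical placeholders, not taken from any of our projects.

```yaml
# environment.yml
# Hypothetical example: the name and packages below are placeholders.
name: my-analysis
channels:
  - conda-forge
dependencies:
  - python=3.11
  - xarray
  - dask
  - jupyterlab
```

With this file in the current directory, `mamba env create -f environment.yml` should create the environment and `mamba activate my-analysis` should activate it. Keeping the definition in a file rather than installing packages ad hoc makes the environment easier to recreate on another machine.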
We use Docker to create and manage containerized environments. These environments are used for deployment, testing, and reproducibility. We publish our Docker images to Quay.io.
We recommend using Docker in the following scenarios:
There are several approaches to creating Docker images. We typically use repo2docker to create Docker images from a GitHub repository and GitHub Actions to build and push the image to Quay.io. This approach lets us automatically build and publish Docker images whenever we push changes to the repository. An example GitHub Actions workflow for building and pushing a Docker image can be found in the carbonplan/argo-docker repository.
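To make the build-and-push flow concrete, here is a rough sketch of what such a workflow could look like. It is not a copy of the workflow in carbonplan/argo-docker. The workflow file name, the `main` branch trigger, the secret names `QUAY_USERNAME` and `QUAY_PASSWORD`, and the image name `quay.io/example-org/example-image` are all assumptions made for illustration.

```yaml
# .github/workflows/build-image.yaml (hypothetical file name)
name: Build and push image

on:
  push:
    branches: [main]  # assumption: images are built from the main branch

jobs:
  build-and-push:
    runs-on: ubuntu-latest
    steps:
      - name: Check out the repository
        uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.11"

      - name: Log in to Quay.io
        uses: docker/login-action@v3
        with:
          registry: quay.io
          username: ${{ secrets.QUAY_USERNAME }}  # hypothetical secret name
          password: ${{ secrets.QUAY_PASSWORD }}  # hypothetical secret name

      - name: Install repo2docker
        run: python -m pip install jupyter-repo2docker

      # Build the image from the repository contents without starting a
      # container (--no-run), then push it to the registry (--push).
      # The image name is a placeholder, tagged with the commit SHA.
      - name: Build and push the image
        run: |
          jupyter-repo2docker \
            --no-run \
            --image-name "quay.io/example-org/example-image:${GITHUB_SHA}" \
            --push \
            .
```

Tagging each image with the commit SHA is one possible design choice. It ties every published image to the exact repository state that produced it, which supports reproducibility.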