Docker Round-Up #562
Description
As part of my work with Mapbox, I am focusing on making the OpenAddresses codebase easier to install and use for developers. Based on recent conversations and research, I've identified a few use cases:
- Getting Machine up and running for direct development of the code (mostly me)
- Using Machine locally for rapid iteration on new data sources (e.g. @trescube)
- Adapting Machine for new types of data, such as parcels
Some of the difficulties installing and using Machine include:
- Compiled requirements such as GDAL 2.1.0 require too much toolchain understanding
- Chef installation scripts assume ownership of an entire Linux machine
- Big differences between how Machine is used locally vs. on AWS can make docs complex
Over the past couple of years, Docker has emerged as a potential way to speed up local installation of Linux software regardless of host OS. This has come up in OA a few times, e.g. #159 and #547. Conversations with a few developers suggest that the presence of a virtual machine configuration such as Vagrant or Docker signals that code is easy to approach and painless to get running. As a stopgap that eases local installation without duplicating the work in our Chef recipes, @jalessio has recommended Vagrant in PR #559. It’s a big improvement over the current situation, and it retains our existing use of Chef under CI and in production.
Docker may be a good long-term solution. We need to validate against the three main environments where Machine code is run:
- Locally on a developer’s Mac, Windows, or Linux computer with continuous edits to the code
- In a CI environment like Travis or CircleCI for ensuring quality of new code changes
- In production on AWS, where we use CloudWatch metrics, Auto Scaling Groups, and other features to maintain a running service that can respond to spikes in demand from OA contributors
If Docker's going to work for us, I think it needs to work cleanly in each of these three environments, successfully replace Chef, and not introduce substantial new pain for developers or contributors. So far, I’ve done a few things:
- Created this Dockerfile for Machine based on Chef recipes
- Tested build times and found that it’s slower than Chef when starting from scratch, but not unreasonable
- Checked build sizes and found that all of Machine and all its dependencies can be packed into a ~400MB Docker image
- Tested using Docker in EC2 User Data scripts and found that it works just fine
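To illustrate the last point, here is a minimal sketch of how an EC2 User Data script that starts a Docker container might be generated. It is not the script I tested: the image name `example/machine:latest`, the `echo` command, and the helper `render_user_data` are all hypothetical placeholders, and it assumes an Ubuntu AMI where Docker is installed from the `docker.io` package.

```python
import shlex


def render_user_data(image, command):
    """Return a shell script that could be passed to EC2 as User Data.

    `image` and `command` are hypothetical placeholders; nothing here is
    taken from the actual Machine configuration.
    """
    # Quote every argument so the generated shell line stays well-formed.
    run_line = " ".join(
        ["docker", "run", "--rm", shlex.quote(image)]
        + [shlex.quote(arg) for arg in command]
    )
    lines = [
        "#!/bin/sh",
        "set -e",
        # Assumption: an Ubuntu AMI, where docker.io is the Docker package.
        "apt-get update",
        "apt-get install -y docker.io",
        "docker pull " + shlex.quote(image),
        run_line,
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_user_data("example/machine:latest", ["echo", "container started"]))
```

The resulting string would be supplied as the instance's User Data at launch, so the same image that runs locally would also run on AWS.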
I expect to work on this over the next few weeks, and I would love feedback and input.