This repository was archived by the owner on May 5, 2022. It is now read-only.

Docker Round-Up #562

@migurski

As part of my work with Mapbox, I am focusing on making the OpenAddresses codebase easier to install and use for developers. Based on recent conversations and research, I've identified a few use cases:

  • Getting Machine up and running for direct development of the code (mostly me)
  • Using Machine locally for rapid iteration on new data sources (e.g. @trescube)
  • Adapting Machine for new types of data, such as parcels

Some of the difficulties installing and using Machine include:

  • Compiled requirements such as GDAL 2.1.0 require too much toolchain understanding
  • Chef installation scripts assume ownership of an entire Linux machine
  • Big differences between how Machine is used locally vs. on AWS can make docs complex

Over the past couple of years, Docker has emerged as a potential way to speed up local installation of Linux software regardless of host OS. This has come up in OA a few times, e.g. #159 and #547. Conversations with a few developers suggest that the presence of a virtual machine or container configuration such as Vagrant or Docker is a signal that code is easy to approach and painless to get running. As a stopgap that eases local installation without duplicating the work in our Chef recipes, @jalessio has recommended Vagrant in PR #559. It’s a big improvement over the current situation, and it retains our existing use of Chef under CI and in production.

Docker may be a good long-term solution. We need to validate against the three main environments where Machine code is run:

  1. Locally on a developer’s Mac, Windows, or Linux computer with continuous edits to the code
  2. In a CI environment like Travis or CircleCI for ensuring quality of new code changes
  3. In production on AWS where we use CloudWatch metrics, Auto Scaling Groups, and other features to maintain a running service that can respond to spikes in demand from OA contributors

If Docker's going to work for us, I think it needs to work cleanly in each of these three environments, successfully replace Chef, and not introduce substantial new pain for developers or contributors. So far, I’ve done a few things:

  • Created this Dockerfile for Machine based on Chef recipes
  • Tested build times and found that it’s slower than Chef when starting from scratch, but not unreasonable
  • Checked build sizes and found that all of Machine and all its dependencies can be packed into a ~400MB Docker image
  • Tested using Docker in EC2 User Data scripts and found that it works just fine
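
For anyone curious what the EC2 User Data test might look like, here is a rough sketch in Python that generates such a script. It is not the script I actually used: the image name, the command, and the assumption of an Ubuntu AMI with the `docker.io` package are all placeholders for illustration, and `build_user_data` is a made-up helper.

```python
import shlex


def build_user_data(image, command):
    """Return a shell script suitable for use as EC2 User Data.

    `image` and `command` are placeholders supplied by the caller; nothing
    here reflects the actual Machine image name or entry point.
    """
    run_args = " ".join(shlex.quote(arg) for arg in command)
    lines = [
        "#!/bin/bash",
        "set -e",
        # Assumes an Ubuntu AMI, where the docker.io package provides Docker.
        "apt-get update -y",
        "apt-get install -y docker.io",
        f"docker pull {shlex.quote(image)}",
        # --rm removes the container once the command exits.
        f"docker run --rm {shlex.quote(image)} {run_args}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical image name and command, for illustration only.
    print(build_user_data("example/machine:latest", ["echo", "hello from Docker"]))
```

The resulting text could be passed as User Data when launching an instance, so that the instance installs Docker, pulls the image, and runs a single command at boot.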

I expect to work on this over the next few weeks, and I would love feedback and input.
