How to use a Gemfile.initial file to speed up Docker builds with an extra cached layer
We recently wrote a blog post about a script that updates version strings in a Dockerfile. We've since identified a better way to achieve the same goals.
A Rails application lists its gem dependencies in two files: Gemfile and Gemfile.lock. Installing these gems can take a long time when you're building a Docker image. To speed this up, you can put these lines near the top of your Dockerfile:
COPY Gemfile Gemfile.lock ./
RUN bundle install
Docker will cache the bundle install step and only re-run it when Gemfile or Gemfile.lock changes.
However, you probably use some core gems that don't change very often, such as Rails, Rake, and Nokogiri. These do get updated from time to time, but probably not as often as you add or update other dependencies. It would be great if you could cache these gems separately, so that you don't have to download them every time your Gemfile.lock changes.
You can achieve this by adding a second Gemfile to your application, called something like Gemfile.initial. This file should contain only the few gems that you would like to cache independently. You need to set explicit versions for these gems; otherwise Bundler might install a different version.
The nice thing about a Gemfile is that it's just a Ruby file, so we can write some code to automate this and parse the versions from the original Gemfile.lock:
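source 'https://rubygems.org'

# Gems that change rarely and are worth caching in their own Docker layer.
INITIAL_CACHED_GEMS = %w[
  rails rake sprockets mysql2 bootsnap nokogiri newrelic_rpm rack-cors
].freeze

# Prefer Gemfile.lock; fall back to Gemfile.initial.lock when it is absent.
lock_file = [
  File.join(__dir__, 'Gemfile.lock'),
  File.join(__dir__, 'Gemfile.initial.lock')
].find { |f| File.file?(f) }
raise 'No Gemfile.lock or Gemfile.initial.lock found' unless lock_file

lockfile_parser = Bundler::LockfileParser.new(Bundler.read_file(lock_file))

INITIAL_CACHED_GEMS.each do |gem_name|
  spec = lockfile_parser.specs.find { |s| s.name == gem_name }
  raise "#{gem_name} is not listed in #{lock_file}" unless spec

  # Pin the exact locked version so both Gemfiles install the same gem.
  gem gem_name, spec.version.to_s
end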
This might not look like any Gemfile you've seen before, but rest assured that it's a valid Gemfile, and you can use it with bundle install. The INITIAL_CACHED_GEMS constant is an array of the gem names that we would like to cache first. The code parses the lock file to find the pinned version of each gem and sets it as an explicit version requirement. That way we can be sure that Gemfile and Gemfile.initial will install exactly the same versions of these gems.
Here's how we can put this all together in an updated Dockerfile:
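A minimal sketch of the relevant Dockerfile lines, based only on the steps described in this post. It assumes that Gemfile.initial.lock is committed next to Gemfile.initial, and that the base image and working directory are set up earlier in the file; the exact layout of your own Dockerfile will likely differ.

```dockerfile
# Layer 1: the rarely-changing core gems.
# Rebuilt only when Gemfile.initial or Gemfile.initial.lock changes.
COPY Gemfile.initial Gemfile.initial.lock ./
RUN bundle install --gemfile Gemfile.initial

# Layer 2: everything else.
# Rebuilt whenever Gemfile or Gemfile.lock changes, but the gems from
# layer 1 are already installed, so only the remaining gems are fetched.
COPY Gemfile Gemfile.lock ./
RUN bundle install
```

Because Gemfile.lock is not present in the image during the first step, Gemfile.initial reads the pinned versions from Gemfile.initial.lock instead.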
Limitations and Next Steps
The main downside to this approach is that you need to remember to run bundle install --gemfile Gemfile.initial to generate a new Gemfile.initial.lock whenever the version of one of the cached gems changes. You could add a CI job that checks for this and fails the build if the file needs to be updated (or even automatically commits and pushes the change to the branch).
We may be able to write a Bundler plugin that takes care of all of these steps. It would be great if there were a Bundler command that could install a subset of your gems and use or generate a separate lock file.
If you have any thoughts or comments, please feel free to send us an email: engineering@docspring.com