Preventing Flaky Tests With A Custom RuboCop Rule

In this blog post, I will talk about how I found and fixed a flaky test in our RSpec test suite, and how I wrote a custom RuboCop rule to prevent it from happening again.

DocSpring uses Stripe to handle credit card processing. We have a large test suite that covers lots of different billing flows and edge cases. We use Stripe's billing customer portal, which allows customers to update their credit card details and view invoices. Stripe hosts this portal for us, so we don't have to build all of this functionality ourselves.

I wrote an integration test that makes sure we can generate a link to this billing portal page. This test was "flaky", which means that it was randomly failing from time to time. Flaky tests can be really annoying, especially when they only fail once every few weeks. It's very difficult to fix something if you don't know how to make it happen consistently.

I figured out that this test failure was order-dependent, which means that it was only failing whenever another test ran before it. We run our RSpec tests in a random order, which helps us find issues like this. The problem was happening because we use the stripe-ruby-mock gem to mock requests to Stripe's API. This mocking behavior is enabled by calling StripeMock.start, and turned off by calling StripeMock.stop As you might have guessed by now, we had a test that wasn't tidying up after itself by calling StripeMock.stop.

The stripe-ruby-mock docs suggest using it like this:

describe MyApp do
  let(:stripe_helper) { StripeMock.create_test_helper }
  before { StripeMock.start }
  after { StripeMock.stop }

I had started with this approach, but it looks like I had accidentally moved the StripeMock.stop call into a nested block. This means that only some of the tests in this spec file were calling StripeMock.stop. If we got unlucky and tests were run in a different order, we'd end up with StripeMock enabled for the rest of our test suite. This didn't actually cause too many problems, apart from this one billing portal test. This test makes a live request to Stripe's API, which we record/replay using VCR.

If you look at our test failure logs in CircleCI above, you'll notice StripeMock complaining that it doesn't know about the post /v1/billing_portal/sessions API endpoint. This is how I figured out the issue, because I realized that StripeMock shouldn't have been running at all.

The solution was to make sure that our tests always call StripeMock.stop whenever they call StripeMock.start. But maybe there's a better way to do this, instead of using separate before and after blocks?

We can actually group these calls together by using an around block, like this:

    around(:each) do |ex|
      StripeMock.start
      ex.run
      StripeMock.stop
    end

This around block will start StripeMock, run the test, and then stop StripeMock once the test is finished. This convention keeps the #start and #stop calls right next to each other, so it helps to prevent us from accidentally removing the #stop call or moving it somewhere else during a refactor.

Now that I've fixed the flaky test, I also want to make sure that it never happens again. I want to prevent myself (and other developers on our team) from calling StripeMock.start in a before block, and only allow it inside an around block.

We use RuboCop, which is a Ruby static code analyzer (a.k.a. linter) and code formatter. RuboCop comes with many cops that can check your code for style, formatting, security issues, code complexity, among many other things. Many gems also have their own set of RuboCop rules. We currently use the RuboCop extensions for:

Rails
Rake
RSpec
Capybara
FactoryBot

I decided to try writing a custom RuboCop rule (or "cop") that would scan our RSpec test files for StripeMock.start. I would like it to warn us whenever we use it in a before block instead of an around block.

Here's the code I wrote for our custom cop (with some help from ChatGPT!) I saved this file at lib/rubocop/cop/stripe_mock_usage.rb

# frozen_string_literal: true

require 'rubocop'

# This cop checks that StripeMock.start and StripeMock.stop are only used within an around block (instead of before/after.)
# This prevents us from accidentally starting StripeMock in a before(:all) or before(:each) block and
# forgetting to stop it, which can cause flaky tests.
module RuboCop
  module Cop
    module RSpec
      class StripeMockUsage < Cop
        MSG = 'Only use `StripeMock.start` and `StripeMock.stop` within an `around(:each)` block.'

        def_node_search :stripe_mock_start?, <<~PATTERN
          (send (const nil? :StripeMock) :start)
        PATTERN

        def_node_search :stripe_mock_stop?, <<~PATTERN
          (send (const nil? :StripeMock) :stop)
        PATTERN

        def on_block(node)
          return unless %i[before after].include? node.send_node.method_name

          # Search within the before/after block for StripeMock.start and StripeMock.stop calls
          node.each_descendant(:send) do |send_node|
            if stripe_mock_start?(send_node) || stripe_mock_stop?(send_node)
              add_offense(send_node, message: MSG)
            end
          end
        end
      end
    end
  end
end

See the "Development" page in the RuboCop docs for more information on writing new cops.

I then enabled this custom cop by adding the following to our .rubocop.yml:

require:
  - ./lib/rubocop/cop/stripe_mock_usage.rb

RSpec/StripeMockUsage:
  Enabled: true
  Include:
    - 'spec/**/*_spec.rb'

Now we get a warning whenever we call StripeMock.start or StripeMock.stop in a before or after block:

New StripeMock warning showing in RuboCop output

It's not a perfect check, but I think this could be a nice reminder. My personal philosophy is that I try to fully stamp out any flaky tests, bugs, or security issues so that they can never happen again. I haven't truly fixed an issue unless I've also added a new test or linting rule.

I enjoyed learning more about the Ruby AST and how RuboCop works internally. I have a few other ideas for cops that might be helpful, so I might be writing some more custom cops in the future. Thanks for reading!

Preventing Flaky Tests With A Custom RuboCop Rule

Further reading:

Start filling out PDFs in minutes

Product

Developers

Company

Legal