Jorge Manrubia

March 12, 2022

Changing critical code paths with scientist


I recently worked on improving the inbound email analysis system in HEY. This system analyzes every email that hits HEY to decide whether it should flag it as spam, bounce it, or warn the user about specific problems such as having a forged sender or containing a virus. In its current form, the system was making it difficult to add some features we wanted, so we decided to rework it before adding new stuff.

Years ago, I had heard of scientist, a library from GitHub, and I thought this would be a good use case for it. Its tagline, "A Ruby library for carefully refactoring critical paths", matched precisely what we needed. HEY ingests millions of emails every day, and the last thing I wanted was to introduce unintended changes here. Those could be as subtle as not showing the right warning for certain invalid senders or as catastrophic as flagging legit emails as spam en masse. Minimizing that possibility was worth some special attention.

The library offers many options, but the idea is simple: you define both the current code and the new code for the path you want to refactor, and scientist executes both. It always returns the result of the current code, so if the two results differ in production, callers still get the existing behavior, and you can record the mismatch via a callback. In other environments, such as development or testing, you can configure it to raise an error when a mismatch happens. This way, you get the best of both worlds: the certainty that production behavior won't change and the confidence that tests are exercising the new code.
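To make the idea concrete, here is a toy version of it in plain Ruby. This is only a sketch of the pattern, not how scientist is implemented: `TinyExperiment`, its `run` method, and the `on_mismatch` callback are hypothetical names made up for illustration, and the real library does more (for example, it randomizes the order in which both paths run).

```ruby
# A toy stand-in for the idea behind scientist, NOT the gem's implementation.
# Every name here (TinyExperiment, run, on_mismatch) is hypothetical.
class TinyExperiment
  def initialize(name, raise_on_mismatches: false, on_mismatch: nil)
    @name = name
    @raise_on_mismatches = raise_on_mismatches
    @on_mismatch = on_mismatch
  end

  # The current code path: its result is always what the caller gets.
  def use(&block)
    @control = block
  end

  # The new code path: it runs, but its result is only compared.
  def try(&block)
    @candidate = block
  end

  def run
    control_value = @control.call

    # A bug in the new code must never reach the caller, so capture errors.
    candidate_value, candidate_error =
      begin
        [@candidate.call, nil]
      rescue StandardError => error
        [nil, error]
      end

    matched = candidate_error.nil? && control_value == candidate_value
    unless matched
      # Strict mode for tests; in production we only report.
      raise "#{@name} mismatch" if @raise_on_mismatches
      @on_mismatch&.call(@name, control_value, candidate_value, candidate_error)
    end

    control_value
  end
end
```

With `raise_on_mismatches: false`, a disagreement between the two blocks only triggers the callback and the caller still receives the old result. With `raise_on_mismatches: true`, the same disagreement fails loudly, which is what you want in a test suite.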

This is how we used scientist in one of the analysis pipelines:

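def reject_inbound_email?
  science "inbound-email" do |experiment|
    # Ruby 3.1+ shorthand for `inbound_email_id: inbound_email_id`. The parentheses
    # keep the parser from reading the next line as the hash value.
    experiment.context(inbound_email_id:)
    experiment.use { old_reject_inbound_email? } # current code: its result is returned
    experiment.try { new_reject_inbound_email? } # new code: its result is only compared
  end
end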

And this is the configuration we used. It reports mismatches to Sentry and raises an error when a mismatch happens during testing.

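class ScientistExperiment
  include Scientist::Experiment

  def enabled?
    true # assumption: the original body is not shown here; this always runs the experiment
  end

  # Called after every run; report to Sentry only when the results differ.
  def publish(result)
    unless result.matched?
      Sentry.with_scope do |scope|
        Sentry.capture_message "#{name} mismatch"
      end
    end
  end
end

# Fail loudly in the test suite instead of only reporting.
ScientistExperiment.raise_on_mismatches = true if Rails.env.test?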

The library won't work in every situation. You need to make sure that invoking both paths is feasible, which is not always the case: for example, when the code has side effects that must not happen twice. Also, it relies on both paths returning values you can compare easily. Within these constraints, using scientist adds enormous confidence when deploying these refactors. If things go south and the new code has a bug, it won't cause trouble, but you will still know about it. This approach looks obvious in hindsight, but I had never thought of it until I used this library.
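On the "values you can compare easily" point: when both paths return equivalent data in slightly different shapes, one way around it is to reduce each result to a canonical form before comparing. scientist lets an experiment supply its own comparison through `experiment.compare { |control, candidate| ... }`; the plain-Ruby sketch below only illustrates the kind of logic such a block could hold. The sample data and the `normalize` and `same_warnings?` helpers are hypothetical, not taken from HEY.

```ruby
# Hypothetical results from the old and new code paths: the same warnings,
# but in a different order and with different key and value types.
OLD_RESULT = [{ "code" => "forged_sender" }, { "code" => "virus" }].freeze
NEW_RESULT = [{ code: :virus }, { code: :forged_sender }].freeze

# Reduce a result to a sorted list of warning codes as strings.
def normalize(warnings)
  warnings.map { |warning| (warning[:code] || warning["code"]).to_s }.sort
end

# Compare two results by meaning instead of by exact structure.
def same_warnings?(control, candidate)
  normalize(control) == normalize(candidate)
end

puts OLD_RESULT == NEW_RESULT                # a plain == reports a mismatch
puts same_warnings?(OLD_RESULT, NEW_RESULT)  # the normalized comparison matches
```

The trade-off is that every normalization step is a place where a real difference could hide, so it pays to keep the canonical form as strict as the refactor allows.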

Looking back, scientist worked wonderfully. The new system is live, and we have already removed the old code after making sure the new code worked as intended. Scientist registered only one minor mismatch, which we could quickly fix, but more importantly, it made these critical deploys way calmer. What's the price of that? If we had a similar need in the future, I would use it again.


About Jorge Manrubia

A programmer who writes about software development and many other topics. I work at 37signals.