Visual Regression Testing with BackstopJS

[ quality  automation  testing  backstopjs  ]

I first came across BackstopJS when I started as a QA Engineer in March. My team had a Drupal 7 project that was suffering from pretty regular visual regressions when we deployed. The visual regressions on the project were consistent only in their frustration - sometimes a deploy to the dev environment would be fine, but staging and prod would have regressions; or dev would have regressions, but then staging wouldn’t. At the time, the site had about 60 individual pages and several integrations, and it was pretty tedious to check for regressions manually.

I needed to find a tool to help us avoid deploying code that would cause visual regressions. After doing some research, I decided to use BackstopJS, an npm package created for visual regression testing. It uses configuration files to target specific URLs and CSS selectors, and takes both reference and test screenshots to compare against each other. BackstopJS also runs a report that shows the screenshot comparison, and includes information like how much the comparison failed by.

BackstopJS sets up directories to house the reference screenshots, test screenshots, and regression reports. These folders are automatically re-generated each time the associated command is run (e.g. backstop test to begin generating the test screenshots), and the path for each is customizable - so you can choose the setup that best fits your project. A visual regression report is automatically generated, saved, and can be accessed later; which is great for sharing the results with our PMs and developers, as well as with our clients.

Figuring Out a Process

When I first set up BackstopJS, I created a config file for each of our four environments, so we could check each environment as our deploys progressed up the chain. I used a naming convention that would make sense across projects, e.g. local_backstop.json, dev_backstop.json, staging_backstop.json, and prod_backstop.json. Each environment has its own database and varying states of code deploys, but we tend to use prod as the “source of truth” for adherence to quality and features, so I used the prod site for the page and CSS selectors in my config files.

Because the history of this project was so fraught with regressions, I went through the site and included every URL in the BackstopJS configuration file. I was also determined to use a variety of selectors for each page to ensure coverage (something that’s especially annoying in Drupal). It ended up that each of the 60 URLs I was testing had about 8-10 CSS selectors; and I was testing on both desktop and mobile viewports. This means that I was using BackstopJS to create about 1200 reference screenshots, plus the same number of testing screenshots, on each of the four environments. It was one of those tasks where you start it running, then go eat lunch and play a few rounds of Towerfall (see http://xkcd.com/303/).

Having so many URLs and selectors wasn’t working as well as I wanted. I chose some selectors originally because they covered several sections within a page, but it turned out that BackstopJS would only hit the first instance of the selector, instead of using it multiple times.

Time was also an issue. It took FOREVER to run each test. BackstopJS 1.x version included a command to make sure the npm server didn’t time out; but if I forgot to run that command before I started a test, the server would die out by the time the report was ready to run, and I wouldn’t be able to view the report - making the entire test process a useless task. And when I did remember to keep the npm server open, I also had to remember to stop it after the report finished. If I didn’t stop and then restart the npm server for each test, the next config file would serve the report from the previous test, and I would again have wasted that time.

I took a step back, and thought about what BackstopJS reported on, and how it compared the reference and test screenshots. With a low mismatch threshold (I had mine set to 0.1%), BackstopJS will catch extremely nuanced differences, in everything from margins to image placement to font weights. And in spite of a somewhat cumbersome process, BackstopJS had already helped us catch and mitigate regressions before they made it to prod. It was obviously a good tool, but there was less of a need to be so finicky and precise about what it was checking.

Iterating My Practice

I decided to scale down. I used a much smaller sampling of URLS in my config files - many pages used the same layout, so a regression showing in one or two of those should be enough to catch regressions in the similar pages. I also changed my viewports sizes for mobile and desktop. Instead of using specific WxH for each, I chose the width to match the device, and just gave each one a height value of 3000px. Now it didn’t matter what selectors I chose, or how far down on the page they were, since each screenshot would cover 3000px worth of content. So I also changed my selectors, and only used body and footer for each URL. I’ve gone from about 1200 screenshots per environment, down to about 160!

Now that I’ve got my configuration down to a repeatable and logical process, I’ll start setting up most of our projects with visual regression tests. We currently do support for about 15 Drupal projects, and BackstopJS is a great smoke test for our monthly security updates. For these support projects, I set up symlinks to the Drupal files directory from the BackstopJS test folder. This means that none of the screenshots get committed to the repo, so we avoid unnecessary repo bloat. The configuration files are committed to the repo as part of a tests/backstop directory, so any developer has access to them, and can run visual regression tests locally against their feature branch. The process for this would be:

  1. Check out the dev branch, or whichever branch they want to use as source of truth
  2. Navigate to <project root>/tests/backstop and run the command backstop reference --configPath=local_backstop.json
  3. When the reference screenshots are done, check out the feature branch and run any necessary updates, such as feature reverts, database updates, and CSS compiling
  4. Making sure they’re back in <project root>/tests/backstop, run backstop test --configPath=local_backstop.json

The practice is similar for running BackstopJS against deploys up to our various environments. If someone was deploying to production, the steps they’d need to follow would be:

  1. Navigate back to <project root>/tests/backstop, where the config files are located
  2. Run backstop reference --configPath=prod_backstop.json, to get the pre-deploy comparisons (The branch they’re on doesn’t matter now, since the config file is referencing the prod URL, not their local site)
  3. Once the references have finished running, deploy the code and make any necessary changes, like indexing Solr or clearing caches
  4. Run backstop test --configPath=prod_backstop.json

For our green field projects, the BackstopJS references will get updated at the end of every sprint, after we deploy the sprint’s work up to prod. For support projects, the references will get updated after our weekly deploy day. This ensures that I’ll always have a fresh set of references to test against as needed.

Setting up BackstopJS on a Project

The npm package page has nicely detailed documentation on setting up and using BackstopJS, but I’ll run through the steps here as well on the assumption that it’s always nice to have another tutorial. With the recent upgrade to 2.x, BackstopJS can be installed globally, and this is what I’d recommend. Some of these steps are customized for my personal workflow, so I’ll try to note where your process could differ. Below these steps, I’ve included a config file example as well.

  1. Make sure you have node and npm installed on your computer, then run npm install -g backstopjs
  2. Navigate to your project root and create a tests/backstop directory. NOTE: It’s my preference to keep all tests within a tests directory, so you don’t clutter up the project root with them.
  3. From the backstopjs folder that you’ve just created, run backstop genConfig. This creates the default backstop.json config file, and the backstop_data folder (which will eventually house the folders containing the screenshots and reports).
  4. Rename backstop.json to prod_backstop.json. NOTE: This is to support a logical naming convention within the project, and so other developers can easily identify which config file to use for each environment.
  5. Edit the config file to set the values for id, define your viewport sizes, and set the paths for saving the screenshots and reports.
  6. If you’re going to be using symlinks instead of saving everything directly in the project repo, this is the time to set that up. Don’t forget that the path for saving files needs to point to the original source, not the symlink locations; and the path should start at the level where your config files are.
  7. Create all of your scenarios for the config file. Use logical names for the label, separating with hyphens or underscore instead of spaces, so you can use the label name for incremental updates later. I’d also recommend adding the SSL Casper flags (seen in the example below).

At this point, you’ve installed BackstopJS globally through npm, and generated and customized your first config file. With my personal workflow, the next step would be to make 3 copies of prod_backstop.json and rename them for each environment (staging_backstop.json, dev_backstop.json, and local_backstop.json). If you’re using SublimeText, ‘command+D’ is your friend for quickly modifying all of the environment-specific names in the new config files. Below is an example of a BackstopJS config file for a prod environment.

{
  "id": "project_prod_config",
  "viewports": [
   {
// Really long height to let BackstopJS capture the full page without worrying about selectors
      "name": "desktop",
      "width": 1300,
      "height": 3000
    },
    {
      "name": "mobile",
      "width": 400,
      "height": 3000
    }
  ],
  "scenarios": [
    {
// The label can be used as a filter to run incremental reference or test screenshots
      "label": "item0-projectProd-homepage",
      "url": "https://project.com/",
      "hideSelectors": [],
      "removeSelectors": [],
      "selectors": [
        "#content"
      ],
      "delay": 500,
// The low mismatch number means that BackstopJS will catch extremely small visual changes
      "misMatchThreshold" : 0.1
    },
    {
      "label": "item1-projectProd-suitesLanding",
      "url": "https://project.com/suites",
      "hideSelectors": [],
      "removeSelectors": [],
      "selectors": [
        "#content"
      ],
      "delay": 500,
      "misMatchThreshold" : 0.1
    },
 ],
  "paths": {
// When using symlinks, the paths for saving the images needed to point to the actual file path
// Not to the symlinks
// The point of reference is from this config file's location.
    "bitmaps_reference": "../../webroot/sites/default/files/backstopjs_tests/prod_reference",
    "bitmaps_test": "../../webroot/sites/default/files/backstopjs_tests/prod_test",
    "html_report": ../../"webroot/sites/default/files/backstopjs_tests/prod_html_report",
    "ci_report": "../../webroot/sites/default/files/backstopjs_tests/prod_ci_report"
  },
  "casperFlags": [
// Without this flag, BackstopJS had trouble accessing the local site because of https notices
// This tells BackstopJS to ignore it
    "--ignore-ssl-errors=true",
    "--ssl-protocol=any"
  ],
  "engine": "phantomjs",
// With the 2.x upgrade, reporting in your CLI happens by default
  "report": ["browser"],
  "cliExitOnFail": false,
// This can be changed to true for extremely verbose CLI reporting
  "debug": false
}

BackstopJS Benefits

I would definitely recommend BackstopJS for any developer or company looking to add to their QA toolkit, for a few reasons:

  • With the recent upgrade from 1.x to 2.x, BackstopJS is extremely quick to setup and be testing within minutes
  • The configuration preferences are easy to customize per project, including where the references, tests, and reports get saved
  • You can incrementally update reference and test screenshots by adding a filter flag to your commands, which means you can easily keep your regression tests up to date as you build out a new website or add content to an existing site
  • The reports are saved to their own folder, so you can access the browser report at any time by running the backstop openReport command. This is especially helpful for being able to show the visual regression results to a product owner or client.
  • The package owner is very responsive to talking about and fixing open issues, and takes pride in making sure the product he’s created remains useful.

Visual regression testing has already benefited Metal Toad and our clients. The Drupal 7 project that was plagued with visual regressions? We’ve run into visual regressions a few more times - and with BackstopJS in place, we’ve caught it every single time. It’s also helped catch an issue on another project where the subnav menu was disappearing, when a seemingly unrelated ticket caused problems with the site’s menus. Without the regression tests, that bug would have been deployed all the way up to production. Recently, we inherited a new support project that needed security and module updates after going without them for about 6 months, and the regression tests caught missing content as a result of some of those updates. Whether used as a basic smoke test, or more detailed testing to prevent visual regressions, BackstopJS is a very useful tool, and has quickly become an important part of our commitment to code quality.

Have you used BackstopJS for visual regression tests? How has it helped your team? If there’s another framework that you’ve tried, what pros and cons have you discovered?

Written on November 3, 2016