Opting out of A/B Tests while Running your Automated Tests

At Laurel & Wolf, we extensively introduce & test new features as part of A/B or split testing. Basically we can split traffic coming to a particular page into different groups and then show those groups different variations of a page or object to see if those changes lead to different behavior. Most commonly this is used to test if small (or large) changes can drive some statistically significant positive change in conversion of package or eCommerce sales. The challenge when running automated UI tests and A/B tests together is that your tests are no longer deterministic (having one dependable state).

Different sites conduct A/B tests differently depending on their systems. With our system we’re able to bucket users into groups once the land on the page which then either sets the AB Test and its variant options in local storage or as a cookie. Here are a few ways we successfully dealt with the non-determinism of A/B Tests.

Opting out of A/B Tests (by taking the control variant):

  1. Opt into the Control variant by appending query params to the route (e.g. www.kenst.com?nameOfExperiment=NameOfControlVariant). Most frameworks offer the ability to check for a query param to see if it should force a variant for an experiment and ours was no exception. For the tests we knew were impacted by A/B tests we could simply append to the base_url the query params for the experiment name and the control variant to avoid changing anything. This method didn’t require us to really care about where the experiment + variant were set (either as a cookie or localStorage) but really only worked for a single A/B test for a given test.
  2. Opt into the Control variant by setting localStorage. We often accomplished this by simply running some JS on the page that would set our localStorage (e.g.  @driver.execute_script("localStorage.setItem('NameOfExperiment', 'NameofControlVariant')")). Depending on where the A/B test sat within a test and/or if there were more than one, this was often the easiest way to opt out assuming the A/B test was set it localStorage.
  3. Opt into the Control variant by setting a cookie. Like the example for localStorage we often accomplished setting the cookie in the same way using JS (e.g.@driver.manage.add_cookie(name: 'NameOfExperiment', value: 'NameofControlVariant'). Again this was another simple way of allowing us to opt out of an experiment or even multiple experiments when necessary.

I know there are a few other ways to approach this. Alister Scott mentions a few ways here and Dave Haeffner mentions how to opt out of an Optimizely A/B test by setting a cookie here. How do you deal with your A/B tests? Do you opt-out and/or go for the control route as well?

Oh and if this article worked for you please consider sharing it or buying me coffee!

A typical day of Testing (circa 2018)

Recently I found myself repeatedly describing how I approach my testing role in a “typical day” and afterwards I thought it would be fun to capture some things I said to see how this might evolve over time:


  • At Laurel & Wolf our engineering team works in 2 week sprints and we try to deploy to production on a daily basis.
  • We work within a GitHub workflow which essentially means our changes happen in feature branches, when those branches are ready they become Pull Requests (PRs), those PRs automatically run against our CI server, then get code reviewed and are ready for final testing by myself and our other QA.
  • All of our JIRA stories are attached to at least one PR depending on the repository / systems being updated.
  • Everyone on the engineering team has a fully loaded MacBook Pro with Docker containers running our development environment for ease of use.


  • I’ll come in and if not already working on something, I’ll take the highest priority item in the QA column of our tracking system JIRA. (I try to evenly share the responsibility and heavy lifting with my fellow tester although I do love to jump into big risky changes.)
  • Review the GitHub PR to see /how big/ the code changes are and what files were impacted. You never know when this will come into play and at the very least it’s another source of input for what I might test. I will also do some basic code review looking for unexpected file changes or obvious challenges.
    • Checkout the branch locally and pull in the latest changes and get everything running.
  • If the story is large and/or complex, I’ll open up Xmind to visually model the system and changes mentioned in the story.
    • Import the acceptance criteria into as requirements. (An important input to testing but not the sole oracle).
    • Pull in relevant mockups or other sources cited either in the story, subtasks or other supporting documentation.
    • Use a few testing mnemonics like SFDIPOT and MIDTESTD from the Heuristic Test Strategy Model to come up with test ideas.
    • Pull in relevant catalogs of data, cheat sheets and previous mindmaps that might contain useful test ideas I might mix in. (When I do this I often update old test strategies with new ideas!)
    • Brainstorm relevant test techniques to apply based on the collected information and outline those tests or functional coverage areas at a high level. Depending on the technique (or what I brainstormed using mnemonics) I might select the data I’m going to use, make a table or matrix, or functional outline but it depends on the situation. All of these can be included or linked to the map.
  • If the story is small or less complex, I’ll use a notebook / moleskine for this same purpose but on a smaller scale. Frankly I also use the notebook in conjunction with the mind map as well.
    • I’ll list out any relevant questions I have that I want to answer during my exploration. Note any follow up questions during my testing.
    • I don’t typically time-box sessions but I almost always have a charter or two written down in my notebook or mindmap.
  • Start exploring the system and the changes outlined in the story (or outline in my mindmap) while generating more test ideas and marking down what things matched up with the acceptance criteria and my modeled system. Anything that doesn’t match I follow up on. Depending on the change I might:
    • Watch web requests in my browser and local logs to make sure requests are succeeding and the data or requests are what I expect it to be.
    • Inspect a page in DevTools for the browser or in React or Redux.
    • Reference the codebase, kick off workers to make something happen, manipulate or generate data, reference third party systems, etc.
    • Backend or API changes are usually tested in isolation and then with the features they are built to work with and then as a part of a larger system.
    • Look for testability improvements and hopefully address time during this time
    • Add new JIRA stories for potential automation coverage
    • I repeat this process until I’m able to find less bugs, less questions and/or I’m able to generate less and less test ideas that seem valuable. And/or I may stop testing after a certain amount of time depending on the desired ship date (unless there is something blocking) or a set time box.
  • When bugs are found and a few things can happen:
    • If they are small, I’ll probably try to fix them myself otherwise
    • I’ll let the developer know directly (slack message or PR comment) so they can begin working as I continuing to test
    • or I’ll pair with them so we can work through hard to reproduce issues.
    • If they are non-blocking issues I’ll file a bug report for addressing later
  • Run our e2e tests to try to catch known regression and make sure we didn’t break any of the tests.
    • This doesn’t happen often, our tests are fairly low maintenance.
  • Once changes are promoted to Staging I’ll do some smaller amounts of testing including testing on mobile devices. We do some of this automatically with our e2e tests.


  • Pickup a JIRA story for automation coverage and start writing new tests.
  • Investigate or triage a bug coming from our design community (an internal team that works directly with our interior designers).

circa 2018

I’m probably forgetting a ton of things but this is roughly what a “typical day” looks like for me. I expect this to evolve over time as data analysis becomes more important and as our automation suites grow.

What does your typical day look like?

Practice using Selenium Now!

Have you ever wanted to learn a little bit about Selenium WebDriver but didn’t know where to start?

Turns out there are some good tips / tutorials online for practicing writing Selenium in Ruby. One of those is a newsletter called Elemental Selenium that has something like 70 tips. You can sign up for the newsletter if you want but what I found valuable was to look at several of these tips, write them out (don’t copy + paste) and make sure you understand what they do. Turns out when you do this and you commit them to a repo, you can reference back to them when you come across similar problems in the future.

Simply stated [highlight]the goal is to:[/highlight]

  1. Read through the Elemental Selenium tips and then write (don’t copy + paste) the code yourself.
  2. Try running the tests locally and see how things work.
  3. Once you’ve written a few tests, refactor those example tests so they become more DRY (don’t repeat yourself). Create page objects.
  4. Commit these to your own repo on GitHub.

Putting your code on GitHub will have the benefit of showing you can write (at least) basic selenium automation. Although this code may not be your “best foot forward” given how new you’ll be, it is a starting point. As you learn more and make improvements, your code will reflect this.

These tips are (hopefully) grouped correctly but within each group there may be some variance. See if you can do one or two per day (some will be easier than others). If you see something interesting and want to jump to it immediately, go for it.

Beginning to Intermediate:

Intermediate to Advanced:

I’ve recommend a few people try this exercise because I found it valuable. Am I missing anything else? Has anyone else done something similar but in a different language or tool?

Installing GeckoDriver on macOS

Overview of naming conventions

  • GeckoDriver is the library you need to download to be able to use Selenium WebDriver with Firefox. These are the Selenium Bindings.
  • Marionette is the protocol which Firefox uses to communicate with GeckoDriver. Installed by default with Firefox.
  • FirefoxDriver is the former name of GeckoDriver.

Ways to install GeckoDriver:

  1. The easiest way to install GeckoDriver is to use a package manager like brew or npm such as npm install geckodriver. This method requires you some package manager installed but you probably should anyways.
  2. Run Firefox and GeckoDriver in a container using Docker. Simply download the combined container, start it and point your code at the right address. I’ve written about how to do this using Chrome, should be very similar to do Firefox.
  3. Specify it in your Selenium setup code. If you go this route, you can include ChromeDriver as well.
  4. Download the driver and add its location to your System PATH. These instructions are for Chrome but should work for GeckoDriver as well.

Once this is done, it should work like nothing has changed. The big advantage is you’ll now be able to use Firefox browsers newer than 46!

Selenium-WebDriver 2.53.x not working with Firefox 47 and beyond

The problem

I’m used to running selenium tests against Firefox locally (OS X Yosemite and now MacOS Sierra) both from the command line using RSpec and when using a REPL like IRB or Pry. I don’t use FF often so when I started having issues I couldn’t remember how long it had been since things were working. The problem was pretty obvious. The browser would launch, wouldn’t accept any commands or respond to Selenium and then error out with the message:

Selenium::WebDriver::Error::WebDriverError: unable to obtain stable firefox connection in 60 seconds ( from /usr/local/rvm/gems/ruby-2.1.2/gems/selenium-webdriver-2.53.0/lib/selenium/webdriver/firefox/launcher.rb:90:in `connect_until_stable’

This occurred for Selenium-WebDriver versions 2.53.0 and 2.53.4. It also seemed to occur for Firefox versions 47, 48, and 49.

The solution

Downgrade to Firefox 45.0 Extended Service Release (ESR).

I’m not the first one to post about the upcoming changes and lack of support for Firefox 47+. I probably deserve the error message for not paying more attention to the upcoming changes and will certainly look forward to implementing the new MarionetteDriver.

Installing SafariDriver on macOS

Safari + WebDriver aka SafariDriver comes included in Safari 10 which means as long as you have Safari 10 and later versions you can point your tests at Safari and run them without installing anything else.

Safari now provides native support for the WebDriver API. Starting with Safari 10 on OS X El Capitan and macOS Sierra, Safari comes bundled with a new driver implementation that’s maintained by the Web Developer Experience team at Apple. Safari’s driver is launchable via the /usr/bin/safaridriver executable, and most client libraries provided by Selenium will automatically launch the driver this way without further configuration. – WebKit Announcement 

That’s right, no brew installs and no system PATH setups!

Additional References:

Debugging Selenium code with IRB

Occasionally something will change in our system under test that breaks a Selenium test (or two). Most of the time we can walk through the failure, make some tweaks and run it again – repeating the process until it passes. Depending on how long it’s been since we last worked with the code, or how deeply buried the code is, it may not be enough to fix the test and we may have to tackle one or more of the underlying methods we used to build the code.

In these situations it can be helpful to debug our tests using an interactive prompt or REPL. In the case of Ruby we can use irb or Interactive Ruby to manually step through the Selenium actions, watching and learning. Here’s the general format for working with Selenium in irb; it’s very similar to how we code it:

  1. Launch interactive ruby: irb
  2. First, import the Selenium library: require 'selenium-webdriver'
  3. Second, create an instance of Selenium and launch the browser: driver = Selenium::WebDriver.for :firefox
    1. If you’ve got Chrome installed with an updated PATH, you can also swap firefox for chrome.
  4. Third, start entering your Selenium commands: driver.get 'http://www.google.com'

Here’s a simple example using irb:

In this example I’m bringing up Chrome as my browser and navigating to Google. I’m finding the search query box, typing my last name ‘Kenst’ and clicking enter. Thus an example of searching Google for my last name!

Technically once the browser is up we can navigate to whatever page we need without typing ALL of the individual commands. This is really valuable in those instances when you need to login, then navigate several pages before getting to the place you can debug. In other words do all the setup outside of irb, directly in the browser. Once you are in the proper location step through your code one command at a time (and lines with => show the responses to our commands, if any). Those responses will help us debug our tests by confirming what elements Selenium is picking up and what our methods are returning (if anything).

Additional References:

Review of The Selenium GuideBook: Ruby Edition

tl;dr If you’ve ever wanted to learn Selenium but didn’t know where to start, The Selenium GuideBook is the place (doesn’t matter which edition you use, it’ll be good).


Learning Selenium

The challenge of trying to learn Selenium WebDriver (simply referred to as Selenium) or any relatively new technology is that there is either too much disparate information (on the web), the material is out of date (many of the books I found) or of poor quality. With Selenium too many of the tutorials available to beginners had you using the Selenium IDE, which is a really poor option for writing maintainable and reusable test code. (See my previous post’s video for more on this.) I’ve walked out on conference workshops that sought to teach people to use the Selenium IDE to start their automation efforts. It wasn’t for me. I was going to do it right, I just had to figure out what that meant and where to start.

From the start I knew I wanted to learn about GUI test automation and more specifically Selenium WebDriver. I had tried WATIR (Web Application Testing in Ruby) and a few other tools but Selenium was open source and quickly becoming the go-to standard for automating human interaction with web browsers. It was and is the only choice.

Naturally I went searching the web for some tutorial or examples when I stumbled across several tutorials including Sauce Lab’s boot camp written by someone named Dave Haeffner. After struggling through the Bootcamp series (and finding some bugs in the process) I found Dave also produced a tip series called Elemental Selenium. I signed up for the Ruby weekly newsletter tips and went through many of the tips. Satisfied that Dave was worth learning from (good quality, relevant code examples) I decided it was time to try his book The Selenium GuideBook. I knew going into it, I was going to be the person maintaining the test suite and since I was more or less comfortable with Ruby I was happy The Selenium GuideBook came in that language!

Book Options

There are a few packages (book, code examples, videos, etc. ) for the language of your choice. As I said above I was more or less comfortable with Ruby so I ended up getting the “Ruby Edition, Just The Book” package. If I was doing this over today I probably would have done the “Cheat Sheets + Book” package and for JavaScript instead of Ruby.

The package itself contains a lot of great information and a number of materials:

  • The Selenium GuideBook; the Ruby edition is roughly 100 pages
  • Ruby code examples broken out by chapter
  • Elemental Selenium Tips eBook in Ruby
  • The Automated Visual Testing Guidebook

The first time I went through the book and code examples, it seemed redundant having different code for each chapter and example. It was only after I had gone through the chapters and examples for a second time, trying to apply and understand the differences that I began to understand the relative importance of seeing the code change chapter by chapter, example by example. The code examples all target an open source application Dave created called The Internet. It seems simple enough but many of the books and materials I went through either tried using Google or some badly written / hurried example.

The Book

Despite being less than 100 pages the Selenium GuideBook covers:

  • Defining a test strategy for your automation goals
  • Programming basics
    • Using page objects and base page objects to keep code clean
  • Locator strategies and tools
    • Relying on good locators seemed like a smart way to design tests. I wanted to avoid any record and playback tools and the poor locator strategies often employed.
  • How to write good acceptance tests
  • Writing Re-usable test code
  • Running on browsers locally and then in the cloud
  • Running tests in parallel
  • Taking your newly built tests, tagging them and getting them loaded into a CI server.

The whole package literally. In the preface Dave says the book is not full and comprehensive, it’s more of a distilled and actionable guide. The book really shows you how to start thinking about and building a test suite. It’s up to the user to take what they learned here and apply it to their application. That’s the fun part of the Elemental Selenium tips as well.

Applying the Book and Examples

After I had gone through all of the chapters and examples once, I went back through the relevant to me chapters and examples doing the following:

  1. Start with some very simple login tests.
    1. The book starts out this way as well. Writing tests that are not DRY or re-usable. but eventually get that way.
  2. Continuing through the code examples, getting a little more complicated and applying it to my own application.
    1. As I built out tests and start to see commonly repeated patterns, abstract out repeated code into relevant page objects. Eventually getting to a base page object.

In hindsight the hardest part of applying the book was trying to understand and apply a locator strategy within our application. While The Internet target application is great, it’s also a bit simplistic. Good for teaching, hard for bridging the sample application to the target application. Our single page .NET application was far more complicated and it took several attempts before I understood how my own strategy would work.

The transfer problem is always difficult. I mean how do you take what you learned and apply it to a slightly more sophisticated problem? It’s a difficult problem, not really a criticism of this book. It’s worth noting that whenever I had questions about what was written in the book, found a bug or two, or got stuck I could email Dave and within a week or so get a helpful response back.

Mission Accomplished!

With the lessons in this book and the Elemental Selenium Tips I was, through some focused time and lots of iterations, able to get a fairly good set of automated acceptance tests running against our application.

In other words, I highly recommend you buy the book. It’s slightly more expensive than similar books about Selenium but it’s far more effective. You are also directly supporting the author and with free updates and email tech support I think its well worth the cost.

Don’t believe me? Watch this video of Dave giving an overview of the GuideBook:

Additional References:

  1. http://www.tjmaher.com/2015/06/spotlight-on-dave-haeffner.html
  2. http://knowledge-anxiety.blogspot.com/2015/06/the-selenium-guidebook-and-thoughts-on.html
  3. http://davehaeffner.com/works/
  4. https://seleniumguidebook.com/

The Promise and Failure of Record and Playback

I came across the below video of Bret Pettichord’s keynote presentation to the Selenium Conference in 2011 called “Science and Stories and Test Automation”. Much of the talk covers his experience with Test Automation, specifically the promise and failure of record and playback over the last 20 years (I think). Just this historical perspective makes the video worth watching.

Pettichord, best known in the testing community as one of the authors of Lessons Learned in Software Testing, towards the end of the keynote calls out the Selenium community for falling victim to popularity like the commercial market did on Record and Playback with the Selenium IDE (their version of record and playback). Instead Pettichord says doing what we know is right should triumph in the long run which, in the case of Test Automation, means creating maintainable and useful automation code. Not building a product (like Selenium IDE) to make it easier for people to use but that doesn’t actually work.


Additional References:

Running Rspec acceptance tests in TeamCity

At work we use TeamCity as our CI service to automate the build and deployment of our software to a number of pre-production environments for testing and evaluation. Since we’re already bottling up all the build and deployment steps for our software, I figured we could piggy back on this process and kick off a simple login test. It seems faster and easier to have an automated test tell us this, than to wait until someone stumbles across it. After all who cares if a server has the latest code if you can’t login to use it?

Note: I’m calling the test that attempts to login to our system a sanity test. It could just as easily described it as a smoke test.

The strategy looked something like:

  • Make sure tests are segmented (at least one ‘sanity’ test)
  • Hook up tests to Jenkins as a proof of concept
  • Once the configuration works in Jenkins (and I’ve figured out any additional pre-reqs), reconfigure tests in TeamCity to run “sanity tests” (which is a tag)
  • If sanity tests prove stable, add additional tests or segments

Segmenting tests is a great way to run certain parts of the test suite at a time. Initially this would be a single login test since logging in is a pre-cursor to anything else we’d want to do in the app. For our test framework RSpec this was done by specifying a ‘sanity’ tag.

There didn’t appear to be any guidelines or instructions on the interwebs on how you might configure TeamCity to run RSpec tests but I found them for another CI, Jenkins. Unlike TeamCity, Jenkins is super easy to set up and configure: download the war file from the Jenkins homepage, launch it from the terminal and create a job! Our test code is written in ruby which means I can kick off our tests from the command line using a rake file. Once a job was created and the command line details were properly sorted, I was running tests with the click of a button! (Reporting and other improvements took a little longer).  Note, we don’t use any special RSpec runner for this, just a regular old command line, although we do have a Gemfile with all relevant depencies listed.

Configuring TeamCity

Since I couldn’t find any guidelines on how to configure TeamCity to run RSpec accpetance tests, I’m hoping this helps. We already had the server running so this assumes all you need to do is add your tests to an existing service. After some trial and error here’s how we got it to work:

  1. Created a new build configuration to run the sanity tests
  2. Added version control settings for the automation repo
  3. Within the build configuration added 3 steps:
    1. Install Bundler. This is a command line step that runs a custom script when the previous step finishes. Basically handles the configuration information for Sauce Labs (our Selenium grid provider) and the first pre-req.
    2. Bundle Install. Also a command line step running a custom script. Second pre-req.
    3. Run Tests. Final command line step using rake and my test configuration settings to run the actual tests
  4. Added a build trigger to launch the sanity tests after a successful completion of the previous step (deploy to server)

After this was all put in, I manually triggered the configuration execution to see how well the process worked. There were quite a few hiccups along the way. One of the more interesting problems was finding out the TeamCity agent machine had outdated versions of Ruby and Ruby gems. The version of Ruby gems was so out of date it couldn’t be upgraded, it had to be re-installed which is never much fun on an RDP session.

Once the execution went well I triggered a failure. When the tests fail they print “tests failed” to the build log. Unfortunately the server didn’t seem to understand when a failure occurred so I went back and added a specific “failure condition” (another configuration option) looking for the word “tests failed” which, if found, would mark the test as a failure. Simple enough!

What’s next?

We’ve been running this sanity test for a few months now and it’s quite valuable to know when an environment is in an unusable state and yet I think visibility is still a challenge. Although the failures report directly into a slack channel I’m not sure how quickly the failures are noticed and/or if the information reported in the failed test is useful.

A few articles I’ve read suggest using the CI server to launch integration tests instead of the UI level acceptance tests we are running. I think what we are doing is valuable and I’d like to expand it. I wonder what additional sanity or segments of tests do we add to this process? Are there more or better ways to do what we’re doing now? Please share your experiences!