Regression testing isn’t only about repetition

Often when I’m chatting with someone about their regression testing strategy, there’s an assumption that regression is all about repeating the same tests. This is problematic because it ignores an important thing testers tend to be good at: focusing on risk. A better way to think of regression testing is that it can be applied in two different ways: procedurally and risk-focused.

Procedural Regression Tests

When I speak of procedural I mean a sequence of actions or steps followed in regular order. As I said above, this seems to be the primary way people think about regression testing: repetition of the same tests. This extends to the way we think about automating tests as well.

Procedural regression testing can be quite valuable (so far as any single technique can be). The most valuable procedural regression tests are unit tests wired into our CI system and run regularly. In this way they become a predictable detector of change, which is often why we run regression tests in the first place. (Funnily enough, automated UI tests are some of the most common procedural regression tests but aren’t the best detectors of change.)

The big problem with procedural regression tests is that once an application has passed a test, there is a very low probability of that test finding another bug.

Risk-Focused Regression Tests

When I speak of risk-focused I mean testing for the same risks (ways the application might fail) while changing up the individual tests we run. We might create new tests, combine previous tests, or alter underlying data or infrastructure to yield new and interesting results.

To increase the probability of finding new bugs we test for side effects of the change(s) rather than going for repetition. The most valuable risk-focused regression tests are typically done by the individual testers (or developers) who know how to alter their behavior with each pass through the system.
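One cheap way to put this into practice is to hold the risk fixed but randomize the concrete inputs around it on every pass, logging the seed so any failure can be reproduced. A minimal Ruby sketch (the 100-character field limit is invented for illustration):

```ruby
# Risk under test stays fixed (overflowing a length-limited field);
# the concrete inputs change on every regression pass.
MAX_LENGTH = 100 # hypothetical limit on the field under test

seed = Integer(ENV.fetch('SEED', Random.new_seed))
rng  = Random.new(seed)
puts "re-run with SEED=#{seed} to reproduce this pass"

# Sample lengths hovering around the boundary instead of
# repeating the same value every time.
lengths = Array.new(5) { rng.rand((MAX_LENGTH - 2)..(MAX_LENGTH + 2)) }
inputs  = lengths.map { |len| 'x' * len }
```

Re-running with the printed seed repeats an earlier pass exactly, so the variation costs nothing in reproducibility.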

A Combined Approach

Thinking about regression testing in terms of procedural and risk-focused approaches allows us to see two complementary techniques that can yield value at different times in our projects. It also gives testers an escape from the burden that comes with repetition while still allowing us to meet our goals.


BBST Domain Testing – An Experience Report

The Domain Testing Workbook

In late January of 2014, after the Workshop on Teaching Software Testing (WTST) at Florida Institute of Technology, Dr. Cem Kaner and Dr. Rebecca Fiedler put together a 5-day pilot course to beta test a new Black Box Software Testing (BBST) course called Domain Testing. I was one of ten participants to try it out.

Two series of BBST

  • The first BBST series (Foundations, Bug Advocacy and Test Design) came from research funded by the National Science Foundation and Dr. Kaner. Part of the agreement with the NSF was to make the materials open source while also creating a way to teach the materials online with high standards. The Association for Software Testing (AST) was the lab for the initial classes and they continue to teach them to this day.
  • BBST Domain Testing represents the next step in a second BBST series focusing on specific test design techniques (Domain, Scenario and Specification-based testing). Unlike the first series, the NSF didn’t fund the development of these classes, so the materials won’t be open sourced. They also won’t be taught through AST; instead they will come directly from Dr. Cem Kaner’s corporate training firm, Kaner Fiedler & Associates (KFA for short).

What is domain testing?

Although domain testing sounds like it relates to domain knowledge, it is an umbrella term for equivalence class analysis (partitioning) and boundary testing. Domain testing is a sampling strategy in which the possible values of a variable (anything that can change) are divided into subsets of values that are in some way equivalent (equivalence class analysis). Tests are then designed to use only one or two values from each subset, along with boundary and extreme values that increase the likelihood of exposing bugs.
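As a concrete sketch of that sampling idea in Ruby (the 1..100 range is an invented example, not one from the workbook): partition the values into classes, then test only class representatives and boundaries.

```ruby
# Domain testing sketch for a hypothetical integer field accepting 1..100.
VALID = (1..100)

# Equivalence class analysis: divide possible values into subsets
# the program should treat the same way.
def classify(value)
  return :too_small if value < VALID.min
  return :too_large if value > VALID.max
  :valid
end

# Sampling: one or two representatives per class, plus the boundary
# values on either side, where off-by-one bugs tend to cluster.
SAMPLES = {
  too_small: [VALID.min - 1],         # 0
  valid:     [VALID.min, VALID.max],  # 1, 100
  too_large: [VALID.max + 1]          # 101
}
```

Four or five well-chosen values stand in for the hundred-plus possible ones, which is the efficiency the technique is after.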

Why use domain testing?

Designing tests is about making choices. Choices such as:

  • How many tests do we want to run (from an impossibly large set of potential tests)?
  • How powerful are the tests we have chosen?
  • What do we hope to learn from the tests we’ll run?

When used appropriately, domain testing can increase our efficiency by helping us run fewer redundant tests, and increase our effectiveness by helping us find bugs with powerful tests.

The Workbook

The Domain Testing Workbook (available on Amazon or directly from Context Driven Press) was published for those interested in self-study, but also with the intention of using it in a class. The book contains, among other things, a schema (a list of ideas or a step-by-step sequence of questions to answer) and examples ranging from easy, classic problems to increasingly difficult ones. During class I used the book frequently for background context, explanations of concepts and examples, details of the Schema, and the many ideas sitting in the book’s multiple test catalogs.

The Class

Prior to the class, my experience applying domain testing consisted of trying to overflow input and output fields. The problem with the way I was doing things, besides being ad hoc, was I didn’t utilize the power of domain testing as a sampling technique. That’s what inspired me to take the class.

The pilot class was in-person for five full days at Florida Institute of Technology’s campus. Dr. Kaner delivered lectures in the morning and afternoon, followed by exercises, and concluded each day with homework for additional practice. For example, on our first day we learned how to characterize variables and worked through classic examples that originally appeared in the book Testing Computer Software.

Our first assignment was to demonstrate we were comfortable doing a variable tour (variables being the basis for domain testing) of our sample application, FM Starting Point (a FileMaker Pro template), using Xmind. After the tour it was time to classify a smaller set of variables (a dozen) based on their data types, determine their primary dimensions, and decide what benefit, if any, would come from doing domain testing. Finally, that information went into a classical table.

On our second day we pushed further. We continued to classify variables and build classical tables, only with more complexity than before. Continuing with the Schema, we turned to creating a risk equivalence table. Risk equivalence tables are essentially risk-based versions of the classical table, but you explicitly state the risks (or failures) your tests hope to expose and thereby describe why your test designs are powerful. The challenging and time-consuming aspect of creating risk or classical tables is coming up with failure modes for tests – luckily the workbook has catalogs you can use for inspiration.

Our third day continued our practice of variable analysis and creating classical and risk-based equivalence tables, but added pairwise testing of independent variables using the ACTS tool (Microsoft’s PICT tool, which we used in Test Design, was also an option). Our final days in the class focused on putting together the individual things we learned from the Schema, culminating in a capstone project.

The capstone project let us put the skill and understanding we gained from the class into action by choosing a piece of an application and working through each step of the eighteen-step schema. It was a difficult assignment but certainly the most fun part of the class. I made a lot of mistakes as I went through the steps without the assignments as a guide, but the capstone let me figure out what I like and don’t like and in what order to address aspects of the schema (I didn’t like going in logical order), so going forward I’ll be more efficient and effective.

Future Classes

The pilot was a bit different from how BBST Domain Testing courses will be handled in the future. Besides the obvious in-person versus online transition, there will likely be a wide range of example applications to practice domain testing on.

At the time of the pilot (January of 2014) the class had two finished sets of real world example videos – QuickBooks (accounting software) and Electric Quilt 7 (design software). There were plans to add more real world examples, including a sewing machine (not your grandmother’s sewing machine but a modern one, to serve as an example of an embedded device), a video game (for a glassbox approach), an investing application and a database application.

Taking the class changed the way I look at testing

I barely remembered the introduction to domain testing from BBST Test Design after that class was over – I just didn’t get it. Things went by so fast, and at such a high level, that it was difficult to understand how to apply the technique properly. After the class was over I couldn’t apply it to my work. This time things are different. After BBST Domain Testing, using the Schema and the workbook, I get domain testing.

When I look at an application today, I have a strong sense of where the strategy can be applied and where it shouldn’t be. That makes me more confident in my abilities (not overconfident – I’m still a novice), gives me lots of new and interesting ideas, and gives me a place to start practicing. More importantly, I can actually see myself applying the technique to the software I test on a regular basis.

When Becky and Cem asked what I thought of the class I said:

it wasn’t as easy as I thought, but it was fun!

References

  1. Kaner, Padmanabhan & Hoffman. The Domain Testing Workbook. Context-Driven Press, 2013.
  2. Kaner, Cem. Teaching Domain Testing: A Status Report.

This article was originally published in the February 2014 edition of Testing Circus Magazine. This was such a great learning opportunity I wanted to highlight it by slightly updating and reposting it.

How To Run Your Selenium Tests with Headless Chrome

The Problem

If you want to run your tests headlessly on a Continuous Integration (CI) server, you’ll quickly realize that you can’t with an out-of-the-box setup, since there is no display output for the browser to launch in. You could use a third-party library like Xvfb (tip 38) or PhantomJS (tip 46), but those can be hard to install and aren’t guaranteed to be supported in the long run (as happened with PhantomJS).

A Solution

Enter Headless Chrome. Headless is simply a mode you can put Chrome into that allows it to run without displaying anything on screen while still getting you the same great results. This is a better option than faking headlessness, such as running Chrome in a Docker container where the container actually uses Xvfb.

Starting with Chrome 59 (Chrome 60 for Windows) we can simply pass Chrome a few configuration options to enable headless mode.

An Example in Ruby

Before we start make sure you’ve at least got the latest version of Chrome installed along with the latest version of ChromeDriver.

Let’s create a simple Selenium script (the example is posted below).

  1. We will pull in the requisite libraries and then create our setup method, where we pass Chrome our headless options as command line arguments. The first add_argument, ‘--headless’, allows us to run Chrome in headless mode. The second argument is, according to Google, temporarily required to work around a few known bugs. The third argument is optional but gives us the ability to debug our application in another browser if we need to (using localhost:9222).
  2. Now let’s finish our test by creating our teardown and run methods:
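Putting both steps together, a sketch of the whole script might look like this (the target URL and title check are placeholders, and it assumes the selenium-webdriver gem with a matching ChromeDriver on your PATH; set RUN_BROWSER=1 to actually launch Chrome):

```ruby
# headless_chrome.rb -- a sketch of the setup, run, and teardown described above.
# Assumes: selenium-webdriver gem, Chrome 59+, matching ChromeDriver.

CHROME_ARGS = [
  '--headless',                   # run Chrome without a visible window
  '--disable-gpu',                # Google's temporary bug workaround at the time
  '--remote-debugging-port=9222'  # optional: debug the session at localhost:9222
].freeze

def setup
  require 'selenium-webdriver'
  options = Selenium::WebDriver::Chrome::Options.new
  CHROME_ARGS.each { |arg| options.add_argument(arg) }
  @driver = Selenium::WebDriver.for :chrome, options: options
end

def teardown
  @driver&.quit
end

def run
  setup
  yield
ensure
  teardown
end

# Guarded so the file can be loaded without launching a browser.
if __FILE__ == $PROGRAM_NAME && ENV['RUN_BROWSER']
  run do
    @driver.get 'https://example.com' # placeholder page
    raise 'wrong page' unless @driver.title.include?('Example')
    @driver.save_screenshot 'headless.png'
  end
end
```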

Here we are loading a page, asserting on the title (to make sure we are in the right place), and taking a screenshot to confirm our headless setup is working correctly.


Expected Behavior

When we save our file and run it (e.g. ruby headless_chrome.rb) here is what will happen:

  1. A Chrome instance launches headlessly in the background (no browser window appears)
  2. Test runs and captures a screenshot
  3. Browser closes

Outro

Hopefully this tip has helped you get your tests running smoothly locally or on your CI Server. Happy Testing!

Additional References

This was originally written for and posted on the Elemental Selenium newsletter. I liked it so much I decided to cross post it with some updates.


What are Quicktests and when are they used?

What are Quicktests?

Tests that don’t cost much to design, are based on some estimated idea of how the system could fail (risk-based), and don’t take much prior knowledge to apply are often called quicktests (sometimes stylized as quick tests, or even called attacks).

When are Quicktests used?

When I’m about to test a new application, the first part of my strategy is often to start with a few quicktests. I might choose to test around a particular interface, around error handling, or even around boundary cases. Similarly, when I’m about to test a new feature I can take a strategy where I look for common places where bugs might exist based on past experience (keeping an internal list of common bugs is a very good idea).

Boundaries are a good quicktest example: if there’s an input field on a billing or contact form, we might decide to try testing some boundaries. Try to figure out the upper limit, the lower limit, enter no information, try some weird characters, etc. It turns out you don’t really need to know much about the program to be effective with this approach, and there are some handy tools that can help you.
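That kind of boundary quicktest can live in a small reusable list of canned values; here’s a sketch in Ruby (the 255-character limit is just a guess at a typical field size):

```ruby
# Cheap, reusable quicktest inputs for a text field.
# MAX is an assumed limit; adjust it once you learn the real one.
MAX = 255

QUICKTEST_INPUTS = [
  '',                  # no information at all
  ' ',                 # whitespace only
  'a' * MAX,           # right at the assumed upper limit
  'a' * (MAX + 1),     # just past it
  "名前<script>'--",   # weird characters: unicode, markup, quotes
].freeze
```

Paste each value into the field and watch what happens; no knowledge of the program’s internals is required.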

The vast majority of us (developers, testers) use quicktests on a daily basis. And why wouldn’t we? They’re great if they work and if they don’t you can switch to something else. It’s not until these tests run out of steam that we’ll either need to switch focus to a new failure type or start testing the product in a deeper way. Hopefully by then we’ve gained some knowledge about the product and built a strategy around where we think additional valuable failures are so we can make better decisions about where / what to test.

More Quicktest Examples:

  • Interface tests
  • Boundaries
  • Initial States
  • Error Handling
  • Happy paths
  • Variable tours
  • Blink Tests
  • And many more… I’ve collected a few dozen examples and put them on this GitHub list.

When should Quicktests NOT be used?

Not every test is a quicktest, nor should it be. While boundaries are a good quicktest example, applying domain tests (specifically equivalence class partitioning) is not. In order to partition our fields (sometimes called variables) to develop our equivalence classes, we need to know the underlying data types of our variables (we might not know for sure, but we can make educated guesses). We need to know each variable’s primary purpose and what other variables might be dependent upon it. All of these things take time and effort to understand and apply. Such tests are still risk-based, but they require more knowledge of the underlying system and are more costly to design.

It’s important to understand this concept of quicktests for a few reasons:

  • Test Strategy. Depending on the context of our work our test strategy should probably consist of quick and deeper tests.
  • Tool Selection. Really great tools like BugMagnet help with quicktests but not deeper boundary + equivalence class tests.
  • Creating Examples. It’s hard to find good examples of deeply applied test techniques. Most are only quicktests.

As I look around the web for the available resources on teaching test design many of the examples we have of particular test types or techniques revolve around showing these quicktests. Hey put in some boundaries. You are done! Yay. (facepalm) Starting with inexpensive tests optimized for a common type of bug is a great start but not a great ending. Here’s to better endings!

 


Selecting a few Platform Configuration Tests

I’ve been developing a GUI acceptance test suite to increase the speed of specific types of feedback about our software releases. In addition to my local environment I’ve been using Sauce Labs to extend our platform coverage (mostly browsers and operating systems) and to speed up our tests by running more of them in parallel.

This is pretty similar to what I consider traditional configuration testing – making sure your software can run in various configurations with variables such as RAM, OS, video card, etc. Except on the web the variables are a little different, and I’m more concerned with browser differences than, say, operating system differences. Nevertheless, with over 700 browser and OS platforms at Sauce Labs, I still need to decide which configurations I start with and which configurations I add over time in the hope of finding failures.

Data

I figured the best place to start was with our current users and since the only “hard” data we had comes from Google Analytics, I pulled up the details of two variables (OS and Browser). Using TestingConference.org as a replacement for my company’s data, our most commonly used platform configurations include:

  • Browsers:
    • Chrome 47 & 48,
    • Firefox 43 & 44,
    • Safari 8 & 9, and
    • Internet Explorer 10 & 11
  • Operating Systems:
    • Windows 7,
    • Windows 8.1,
    • Windows 10,
    • Mac OSX 10.11 and
    • Mac OSX 10.10

Setting aside a few constraints (IE only runs on Windows, Safari only runs on Mac), testing browsers and operating systems in combination could potentially yield up to 40 different configurations (8 browsers x 5 operating systems). Also, we designed our application to be responsive, and at some point we’ll probably want to test a few browser resolutions. If we’ve got 40 initial configurations and 3 different browser resolutions, that could potentially yield up to 120 different configurations. Realistically, even for an automated suite with one or two functional tests per configuration, that’s too many tests to run.
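To make the constraint arithmetic concrete, here’s a quick brute-force filter in Ruby. It only enumerates the valid browser/OS combinations; it is not the covering-array algorithm a tool like ACTS uses:

```ruby
browsers = %w[Chrome47 Chrome48 Firefox43 Firefox44 Safari8 Safari9 IE10 IE11]
oses     = ['Windows 7', 'Windows 8.1', 'Windows 10', 'OS X 10.10', 'OS X 10.11']

# Constraints: IE only runs on Windows; Safari only runs on OS X.
valid = browsers.product(oses).reject do |browser, os|
  (browser.start_with?('IE') && !os.start_with?('Windows')) ||
    (browser.start_with?('Safari') && !os.start_with?('OS X'))
end

puts "#{valid.size} valid pairs out of #{browsers.size * oses.size} raw combinations"
```

From a constrained space like this, a combinatorial tool then selects a smaller covering subset.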

Reduce the number of tests

We can start by adding in those constraints I mentioned above and then focusing on just the variables we want to ensure we cover. To get a better picture of the number of possible configuration tests I used the ACTS (Automated Combinatorial Testing for Software) tool and based my constraints on what Sauce Labs has available today for configurations. After I added the OSes and browsers it looked something like this:

ACTS Window

If I wanted every browser and operating system in combination to be covered (all-pairs) then according to ACTS there aren’t 40 different configurations, just 24 configuration options. That’s more manageable but still too many to start with. If I focus on my original concerns of covering just the browsers, I get back to a more manageable set of configuration options:

Test  OS           Browser
1     Windows8.1   Chrome47
2     Windows10    Chrome48
3     OSX10.11     Firefox43
4     Windows7     Firefox44
5     OSX10.10     Safari8
6     OSX10.11     Safari9
7     Windows10    IE10
8     Windows7     IE11

Eight configuration options are way more manageable and less time consuming than 24, considering each of those configurations will run a few dozen functional tests as well.

A Good Start

Selecting the configurations is probably the easiest part of configuration testing (and the only thing I’ve shown), but I’ve found it’s worth thinking through. (The harder part is designing the automated acceptance tests to produce useful failures.) Using ACTS at this point may seem like overkill when we could have just selected the browsers from the beginning, but it didn’t take much time and should make it easier in the future when we want to add more variables or change the values of our existing variables.

The Apple Watch won’t change Testing

Probably. The Apple Watch won’t change Testing, probably.

Last month uTest announced a contest to win an Apple Watch. All you had to do was provide a response to this post:

In just a paragraph, describe how the Apple Watch will or will not change software testing as we know it today, or how testers are testing.

42mm Apple Watch

While I probably should have clarified the rules a bit (how do you define a paragraph?), I responded with:

If software testing is an empirical, technical investigation of a product for the stakeholders and a new product is introduced, that doesn’t change what software testing is or why we test. It might add some new dimensions to the product (the watch being an extension of the phone, tactile touch interface, etc.) that as testers we have to consider. It might change the importance or risk of some of those dimensions (change in interface, platform, etc.). Or it might change which test techniques we apply and how we apply them (think of stress testing a watch or computer aided testing / automation) but it probably won’t change testing.

I feel like elaborating. (tl;dr skip to the last paragraph)

As I was trying to formulate an answer I was thinking about how a test strategy might change between two similar or complementary devices – an iPhone and an Apple Watch. The differences might suggest what changes were necessary to the model I was using. That model looked something like the Heuristic Test Strategy Model, and the changes I noticed were within product elements and test techniques.

For example we might see a difference between the iPhone and the Apple Watch in:

  • Operations. I imagine the environment and general use cases are a bit different and more extreme for a watch. I’m extremely careful about damaging my phone, but I seem to always strike my arms and wrists against doors and walls without knowing it. The Fitbit I wear around my non-dominant wrist speaks to this extreme or disfavored use.
  • Platforms. The hardware and software are different. The OS that runs on the watch is new (watchOS), in addition to the apps. What level of support does the iPhone provide (it’s a required external companion) to the Apple Watch?
  • Interfaces. The user interface seems like the most obvious difference given the small display and the crown doubling as the home button. What about the new charging interface, or how data is exchanged between the watch and the phone?

Those are just a few dimensions of the Apple Watch I could think of in the thirty or so minutes I took. How many more am I forgetting? (We should examine as many of those dimensions as possible when testing).

Then I started looking at the top Test Techniques listed in the HTSM. How many of them can we apply to testing mobile devices like an iPhone and now the Apple Watch?

  • Function Testing
  • Domain Testing
  • Stress Testing
  • Flow Testing
  • Scenario Testing
  • Claims Testing
  • User Testing
  • Risk Testing
  • Automated Testing

All of them! The challenge might be in applying these techniques. I’ve heard mobile GUI automation like Appium has come a long way in a short time but still has problems and isn’t at the level of Selenium WebDriver. My own experience with mobile simulators suggests they are much less reliable than their desktop counterparts.

After going through this brief exercise I came away thinking this model was still just as applicable. Although the business case is still being made amongst consumers and without knowing any specific project factors, my testing thought process remained much the same. This isn’t to say there isn’t a need for a specialty of testers who are really good at applying test techniques to mobile devices; only that the mental model is still the same. Testing as a whole isn’t really changing.

Why It Matters

I’ve always found joy in exploring new products and the impact they have on our lives. Although it’s assumed that when you work in technology you are a technophile – someone who loves technology, who lives and breathes gadgets and software – that’s not always the case. I find it just as interesting how quickly I abandon certain things as how much they stick with me and how much I use them.

I’m still evaluating the Apple Watch. As progress slows on the development of smartphones I’m starting to question the relentless upgrade process – waiting instead for things I feel are worth spending the money on. The Apple Watch falls into this same category. I don’t typically wear watches, but as I said before, I do wear a Fitbit. I like knowing how active or inactive I am. Whether I’m ready for a relentless upgrade cycle of expensive watches over the next 5 years, in addition to whatever new phones come out, is an entirely different story.

As for the uTest May contest, I came in third place. Thanks to uTest and everyone who voted! Maybe if I win a few more contests I’ll be able to justify getting my own Apple Watch? Even though it probably won’t change testing much, how can I say no?

When to use a Gemfile

I’ve been building a GUI acceptance test automation suite locally in Ruby using the RSpec framework.  When it was time to get the tests running remotely on Sauce Labs, I ran into the following error:

RSpec::Core::ExampleGroup::WrongScopeError: `example` is not available from within an example (e.g. an `it` block) or from constructs that run in the scope of an example (e.g. `before`, `let`, etc). It is only available on an example group (e.g. a `describe` or `context` block).
occurred at /usr/local/rvm/gems/ruby-2.1.2/gems/rspec-core-3.2.2/lib/rspec/core/example_group.rb:642:in `method_missing'

It took a few minutes debugging before I spotted the error:

../gems/ruby-2.1.2/gems/rspec-core-3.2.2/lib/rspec/core/..

Source of the problem: my remote tests were using a different version of RSpec than I was locally. Solution: create a Gemfile to specify the version of RSpec I’m using.

Since I didn’t realize I needed a Gemfile my question was, in general, when should someone use a Gemfile? According to the manual, a Gemfile

describes the gem dependencies required to execute associated Ruby code. Place the Gemfile in the root of the directory containing the associated code.

For example, in this context, I would place a Gemfile in any folder from which I specifically want to run tests. In my case that meant a few specific locations:

  • At the root of the folder – where I run the whole suite of tests
  • In the /spec/ folder – where I typically run tests at an individual level

At a minimum I’d specify:

  • A Global Source
  • Each Gem I use locally that Sauce Labs will need to use

In the end it might look something like this:
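Here’s a minimal sketch of that Gemfile (the gems and pinned version are illustrative; pin whatever you actually run locally):

```ruby
# Gemfile -- placed at the root (and in /spec/) as described above
source 'https://rubygems.org'   # the global source

gem 'rspec', '3.2.0'            # match the RSpec version used locally
gem 'selenium-webdriver'        # needed to drive browsers on Sauce Labs
```

With this in place, Sauce Labs (or any remote runner) resolves the same gem versions you use locally.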

Test, adapt, and re-test.

Getting better at Domain Testing

In late January of 2014, after the Workshop on Teaching Software Testing (WTST) at Florida Institute of Technology, Dr. Cem Kaner and Dr. Rebecca Fiedler put together a 5-day pilot course to beta test a new Black Box Software Testing (BBST) course called Domain Testing. I was one of ten participants to try it out.

Then they announced a follow-up pilot class for BBST Domain Testing that would be online. I applied to be part of the class and was accepted. This was like an expanded version of the in-person pilot course (less intense than 5 full days) but more similar to what you might expect from a BBST course. From day one, the course focused on diving deeper into the test technique of Domain Testing and working through various aspects of the Domain Testing schema until we got to a final capstone project where we put everything together.

I’m happy to share that I’ve officially completed the course!

The Domain Testing Workbook is available

Cem Kaner, Sowmya Padmanabhan and Doug Hoffman have a new book called The Domain Testing Workbook. I’d highly recommend picking up a copy or at least adding it to your reading list! This book is not just a deep dive into one test technique; it represents collective thinking about what software testing is today.

Domain Testing

BBST Test Design was my formal introduction to Domain Testing aka boundary and equivalence class analysis. Domain Testing is often cited as the most popular (or one of the most popular) test techniques in use today. For its part the Test Design course spends a whole week, a full lecture series and at least one assignment introducing and practicing this technique. If you’d like an introduction I recommend the first lecture from the fifth week in the Test Design series in which Cem introduces Domain Testing:

(For more information see the Testing Education Website or YouTube.)

I got the chance to do some reviewing of the workbook and enjoyed how it elaborated on the material I was learning in Test Design. Yet it went further by offering layers of detailed examples to work through, and it is set up to allow the reader to skip around to whatever section or chapter they find interesting.

In an email, Cem described the book:

My impression is that no author has explored their full process for using domain testing. They stop after describing what they see as the most important points. So everyone describes a different part of the elephant and we see a collection of incomplete and often contradictory analyses. Sowmya, Doug and I are trying to describe the whole elephant. (We aren’t describing The One True Elephant of Domain Testing, just the one WE ride when we hunt domain bugs.) That makes it possible for us to solve many types of problems using one coherent mental structure.

Cem Kaner

I’m excited for this book to be turned into a class on domain testing (and hopefully open sourced) with the rest of BBST. In the meantime, pick up the book from Amazon and let me know what you think!

How do you handle regression testing?

Matt Heusser sent the context-driven-testing email group a series of questions about handling regression testing. Specifically he asked:

How do your teams handle regression testing? That is, testing for features /after/ the ‘new feature’ testing is done. Do you exploratory test the system?  Do you have a standard way to do it? Super-high level mission set?  Session based test management? Do you automate everything and get a green light? Do you have a set of missions, directives, ‘test cases’, etc that you pass off to a lower-skill/cost group? (or your own group)? Do you run all of those or some risk-adjusted fraction of them?  Do they get old over time? Or something else? I’m curious what your teams are actually doing.  Feel free to reply with /context/ – not just what but why – and how it’s working for you. 🙂

Matthew Heusser

My Response:

I worked for a small software company where I was the only tester among (usually) 3-4 developers. Our waterfall-ish development cycles were a month+ in development with the same time given to testing. After the new features were tested, if we had time to do regression testing it was done through exploratory testing at a sort of high level. I don’t think I ever wrote out a “standard way” to do it but it fit into my normal process of trying to anticipate where changes (new features, bug fixes) might have affected the product. If I had understood SBTM at the time I would have used it.

We never got around to automating regression testing. Part of that has to do with time constraints (small company, I wear multiple hats, could have managed my time better, etc.). Other reasons involve not really knowing how to approach designing a regression suite. I’ve used Selenium IDE in the past, but automating parts of our web application’s GUI isn’t possible without changes.

When I’ve had test contractors, we used a set of missions to guide the group so everyone could hit parts of the application using their own understanding and skill (although this was our approach, I don’t think it was necessarily an understood approach =).) All of our testing, regression or otherwise, is done in some sort of risk-based fashion.

I have a lot to learn

Mostly I feel like I don’t know enough about automated testing and how to change my testing approach to include an appropriate amount of automation. It seems reasonable to assume some automated regression tests could help provide assurance that the build isn’t changing for the worse (at least in the areas you’ve checked). Although I continue to commit time to learning more about testing, I haven’t committed much time to learning about automation, and I believe that’s to my detriment. I guess I know where to focus more.