Thursday, February 20, 2014

Learning to use exploratory testing in your organisation ...

We have a certain view of test practices which have been passed to us from the waterfall project roots we've all had some experience of.  In many ways, these methods are familiar, especially to non-testers, and from this familiarity we feel a measure of comfort and security which sadly can often be unwarranted.

When veterans of waterfall such as myself are first faced with some of the concepts of exploratory testing our reactions can range from shock to abject horror.  The initial reaction, though, is almost always a negative one.

This article will help to define exploratory testing, deconstruct what we do in scripted waterfall delivery testing, and talk about how we provide the same essential value with exploratory testing.  It will cover our approach to using exploratory testing as part of a system of testing, although of course other approaches can vary.  This is how we've found we get not only the best results, but also the best buy-in, turning our detractors into our allies.

Back when I did a series of articles on Building A Testing Culture for Testing Circus based on conversations I was having with other software testers, a major change and indeed the future of software testing was seen to be in the hands of increased use of exploratory testing.  And yet the idea isn't new - it was first coined by Cem Kaner in the 80s.  So what is it, and why is it becoming increasingly applicable to the way we're running software projects?

A few homegrown definitions

Let’s start by making a few definitions before we discover further!  These are the definitions I'm comfortable with, but you may find definitions will vary.

A test script is a set of instructions and expectations for actions on a system which a user then confirms when executing testing.

Historically I've seen many a project where such scripting has meant we’d define every action of every button press, and detail every response.

Typically in waterfall, we would have a good amount of time to test … or rather we would ask for it, and whether we get it or not would be another tale.  

A product definition would come out of requirements and design, and whilst everyone starts to program against these, testers start to produce plans of what is to be tested, coupled with defining scripts for the new functionality based on these definitions.

Once done these plans and scripts are ideally put out for review – if you’re lucky you'd get feedback on them, which means you expand or reduce your coverage to include the areas which are key to the product.

When we run our scripts, life typically couldn't be simpler: all we do is go through, try each step, and tick or cross as we go to “prove” we've tested.

The problem with scripts is that they’re very brittle,

  • they require a lot of time upfront (before the tester has access to a product) to scope to the level of detail discussed above.
  • they are fallible, in that just as developers are not immune to misinterpreting the product definition, neither are testers.  And that means large amounts of rework.
  • the requirements and design need to be static during the scripting and execution phase.  Otherwise again, you have a lot of rework, and are always “running at a loss” because you have to update your scripts before you’re allowed to test.
  • they depend on the tester's imagination to test a product before they've actually seen a working version of it.

Unfortunately, talk to most testers today and they’ll tell you these are the areas where they’re under the greatest duress.  The two biggest pressures they feel revolve around timescales being compressed and requirements being modified as a product is being delivered.

Some testers have a desire to “lock down the requirements” to allow them to script in peace.  But obviously that involves to some extent locking out the customer, and although it’s something that we might feel is a great idea, it’s important to have an engaged customer who feels comfortable that they've not been locked out of the build process to have a successful project.

So testers have to be careful about not wanting to have a model for testing which works brilliantly in theory, but breaks down because of real and pragmatic forces on their project.

Exploratory testing has multiple benefits – one of the greatest being that it doesn't lock down your testing into the forms of tests you can imagine before you've even had “first contact” with a prototype version of software.

Exploratory testing is a method of exploring software without following a script.  There are many ways to perform it – the best parallel I have found is with scientific investigation.  You set out with an intention, and you try experiments which touch the area, devising new experiments as you go along and discover.
With exploratory testing there is more value in noting what you've actually done over recording your intentions.  Compare this with scripted testing, where you put the effort in ahead of time, and all you record during your testing is either a tick for a pass, or cross for a fail!

A test session is a high level set of behaviour that the tester should exercise when executing testing.  Often this can be a list of planned scenarios we think would be good to exercise.  But unlike a script, it's a set of things to try in testing, without getting bogged down early on with the step-by-step instructions of how that will be done.  It also is more a set of suggestions, with room for additional ideas to be added.

So what is exploratory testing?

There are many definitions out there for exploratory testing, and I'm going to add my own understanding of it.

Exploratory testing to my team is about

  • building up an understanding first of what the core values and behaviour of a system are (often through some form of Oracle)
  • using that understanding to try out strategic behaviour in the system to determine whether what we witness is unexpected


An oracle is a guide for understanding how the system’s supposed to behave.  An obvious one we all know is the requirements document, which functions as an important oracle.  However it’s not the only oracle there is.

Recently when we were making changes to our registration page, we took a look at the registration pages on Gmail, Hotmail, Facebook and Twitter.  Many of these had some of the features we used in our registration page, so they gave us something to play with and build familiarity from.

Especially if you’re producing a system that’s got a broad customer base, most users don’t have the advantage of reading a whole stack of requirements when they use your system.  Your product has to make sense, especially given similar products in the market.

This ability for a tester to look at a page and ask “does it make sense” is an important freedom that’s required in exploratory testing.  Sometimes saying “as per requirement” isn't enough, and we have to push further.

Recently we put together a system to allow a user to manage their own password reset when they’d forgotten their password.  All the required behaviour was put in front of a lot of people who signed off on what they wanted.  The requirement we had read that the email the user would receive would say “your account has been unlocked, and you can no login”, just as per requirement.  My tester dared to suggest that “you can now login” would perhaps make more sense, going beyond just taking the requirements to be the holy oracle of all truth, and using a bit of common sense.

Somehow that typo had got through a lot of people - they did indeed want "you can now login" - but then, that's the nature of testing.  I've seen much larger slips than that pass by uncommented before now ...


You’re testing a login page, the requirement says, “When the user provides the correct password and username, the user is logged in”.

Within exploratory testing, it’s an expectation that the tester will be able to expand on this, using their experience with similar models.

This is a list from a tester using an oracle-based understanding and going beyond just the requirements to plan testing.  They'll typically expect,

  • If I give an incorrect password for a username, I won’t be logged in
  • If I give a correct password for the wrong username, I won’t be logged in
  • On other systems the username I use isn't case sensitive – should it be here?
  • On other systems the password I provide is case sensitive – should it be here?
  • What should happen when I try to log in incorrectly too many times?  Does the account get locked or should it be locked for a period of time?
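
The expectations above can be sketched as executable checks.  This is only a minimal sketch in Python: the `login` function, the `ACCOUNTS` data and the lockout threshold are hypothetical stand-ins invented for illustration, and the answers it bakes in (usernames ignore case, passwords do not, three failures lock the account) are assumptions a tester would confirm with the business, not facts about any real system.

```python
# Hypothetical stand-in for the system under test. A real login would be
# exercised through its UI or API. All names and rules here are assumptions.
ACCOUNTS = {"alice": "S3cret!"}   # username (stored lower case) -> password
MAX_ATTEMPTS = 3                  # assumed lockout threshold

failed_attempts = {}


def login(username, password):
    """Return True when the user is logged in, False otherwise."""
    key = username.lower()  # assumption: usernames are not case sensitive
    if failed_attempts.get(key, 0) >= MAX_ATTEMPTS:
        return False  # assumption: a locked account stays locked
    if ACCOUNTS.get(key) == password:  # passwords are compared case sensitively
        failed_attempts[key] = 0
        return True
    failed_attempts[key] = failed_attempts.get(key, 0) + 1
    return False


def run_login_checks():
    """Try each test idea from the list above and record what was observed."""
    failed_attempts.clear()  # start every run from a clean state
    results = {}
    results["correct credentials log in"] = login("alice", "S3cret!")
    results["wrong password rejected"] = not login("alice", "wrong")
    results["wrong username rejected"] = not login("bob", "S3cret!")
    results["username ignores case"] = login("ALICE", "S3cret!")
    results["password respects case"] = not login("alice", "s3cret!")
    # Too many failures: is the account locked even for the right password?
    for _ in range(MAX_ATTEMPTS):
        login("alice", "wrong")
    results["locked after repeated failures"] = not login("alice", "S3cret!")
    return results


for idea, as_expected in run_login_checks().items():
    print(("ok      " if as_expected else "SURPRISE"), idea)
```

Any line marked SURPRISE is not automatically a defect.  It is a prompt to go back to the oracle and ask which behaviour was actually wanted.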

Skill Based Testing

This ability to think beyond just a one sentence requirement is part of the inherent skill which is a core need for exploratory testers.  It calls for,

  • An understanding of what the system's supposed to do.  Not just functionality, but the “business problem” it’s trying to address.
  • An understanding of how the system has worked thus far is helpful
  • An understanding of similar behaviour in other products

Exploratory testing is thus often referred to as “skills based testing”.

An argument often used in support of scripted testing over exploratory testing is that “if you have your testing all scripted up – then anyone can do it”.  Indeed, it’s possible to provide such scripts to almost anyone, and they can probably follow them.

The problem tends to be that different people will interpret scripts differently.  Most inexperienced testers will tend to follow an instruction, and if what they see on screen matches, they'll just tick a box, without thinking outside of it (and hence would be quite happy with "you can no log in" example above because it was what the requirement said).

Let's explore this with another example ...

Test script

Click the link to take you to the login screen.
There are two fields for entry,

  • Your account
  • Password

The Test System

Here are 3 screens for the above script ... do they pass or fail?

Can you be sure that every tester would pick up on that from the script?  If so would it be a simple cosmetic defect, yes?

Sadly some people will combat this by trying to turn this into a version of the evil wish game, where you go overboard on explaining what you expect, so there's no room for ambiguity.  Hence an obsessed scripted tester might try and write the expectations as,

Click the link to take you to the login screen.
There are two fields for entry,

  • Your account
  • Password
The text on the display is 
  • all in Calibri font 11
  • all UPPER CASE
  • black colouring is used for all text

Yes sirree - that will stop those 3 problems above occurring again, but it won't stop other issues.  Especially if it turns out the root cause for the problems is a disgruntled programmer who's tired of reading your scripts, and currently serving out their notice period until they move to another company where they do exploratory testing ...

"But our scripts are our training!"

This is a common objection to exploratory testing.  If you have detailed scripts, anyone can follow them, learn the system and be a tester right?

Most people I have spoken with have found that it’s far easier to have a handbook or guidebook to a product, which details at a light level how to do basic flows for the system, than to have a high level of detail throughout your testing.

The counterargument to handbooks is that “anyone can follow a script”, but as discussed previously, to build up familiarity with the system and its values you will always need some training time.  If you just throw people at the “your account” problem you’ll get differing results.  You need people to be observant and in tune with the system, and that doesn't tend to happen when you throw people blind at a system with only scripts to guide them.

The bottom line is that there are no shortcuts to training and supporting people when they’re new testers on their system.  If you’re going to train them badly, you’re going to get bad testing. 

One of the few exclusions we've found to this is areas which are technically very difficult - there are in our test system some unusual test exceptions we like to run, and to set up these scenarios is complex, and completely unintuitive (includes editing a cookie, turning off part of the system etc).  This is one of the few areas where we've voted to keep and maintain scripts (for now), because we see real value in the area, and with maintaining so few other test scripts, we can do a good job, without it chewing up too much time.  

But we've found we certainly don't need test scripts to tell us to login, to register an account etc - when the bottom line is our end users only get the instructions on the screen to guide them.  Why should a one off user not need a script to register, and yet testers like ourselves who are using the system every day require a script to remind us that if we enter the right combination of username and password, we're logged in?

Debunking the myths

A common complaint about exploratory testing is that it’s an ad-hoc chaotic bug bash chase.  You tell a team to “get exploratory testing”, and they’ll run around like the Keystone Cops, chasing the same defect, whilst leaving large areas of functionality untouched.

This is some people's view of exploratory testing

With such a view of exploratory testing, it’s no wonder a lot of business owners see it as a very risky strategy.  However, it’s also misleading.

Whilst such ad-hoc testing can be referred to as exploratory testing – it’s not what many people’s experience of exploratory testing is like.

Just because exploratory testing doesn't involve huge amounts of pre-scripting, doesn't mean that exploratory testing is devoid of ANY form of pre-preparation and planning.

You will often hear the words “session based” or “testing charter” being referred to – these are a way of looking at the whole system and finding areas it’s worth investigating and testing.  The idea is that if you cover your sessions, you will cover the key functionality of the system as a whole.

Drawing up a map of sessions

Creating a map of sessions is actually a little harder than you’d first think.  It involves having a good understanding of your product and its key business areas.

Let’s pretend we’re working for Amazon, and you want to derive a number of sessions.  Two key sessions jump out at you right away,

  • Be able as a customer to search through products and add items to the basket
  • Be able to complete my payment transaction for my basket items

Beyond that, you’ll probably want the following additional sessions for the customer experience,

  • Be able to create a new account
  • Be able to view and modify the details of an existing user account
  • Give feedback on items you ordered, including raising a dispute

Finally, obviously not everything can be user driven, so there are probably,

  • Warehouse user to give dispatch email when shipment sent
  • Helpdesk admin user to review issues with accounts – including ability to close accounts and give refunds

The trick at this point is to brainstorm to find the high level themes to the product.  Ideally each session has a set of actions at a high level that really encapsulates what you’re trying to achieve.

For more on mind mapping (I tend to be very much list driven in my thinking), I recommend Aaron Hodder's notes on the subject.

Fleshing out the planned test session

For many skilled testers, just giving them the brief “be able to create a new account” will be enough, but perhaps a lot of detail came out of your brainstorming that you want to capture and ensure is there as a guideline for testing.

Let’s take the “be able to create a new account”, here are some obvious things you’d expect,

  • Can create an account
  • Email must not have previously been used
  • Checking of email not dependent on case entered
  • Password
      o   Must be provided
      o   Has a minimum/maximum length
      o   Must have a minimum of 2 special characters
  • Sends email receipt

A session can be provided in any form you find useful – mind mapped, bullet point list, even put into an Excel sheet as a traceable test matrix (if you absolutely must).  Whatever you find the most useful.

Such bullet pointed lists should provide a framework for testing, but there should always be room for testers to explore beyond these points – they’re guidelines only.
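
If a plain list feels too loose, the same session can be held as a small data structure.  The sketch below is one possible shape in Python, not a prescribed format: the `Session` class and its method names are made up for illustration, and the extra idea added at the end is a hypothetical example of something a tester might think of mid-session.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    """A test session: a theme plus suggested ideas, not step-by-step scripts."""
    theme: str
    ideas: list = field(default_factory=list)

    def add_idea(self, idea):
        """Testers can add ideas as they explore; the list is only a guideline."""
        if idea not in self.ideas:
            self.ideas.append(idea)

    def as_checklist(self):
        """Render the session as a plain bullet list for review."""
        lines = [self.theme]
        lines += ["  - " + idea for idea in self.ideas]
        return "\n".join(lines)


create_account = Session(
    theme="Be able to create a new account",
    ideas=[
        "Can create an account",
        "Email must not have been used before",
        "Email check does not depend on case",
        "Password must be provided",
        "Password has a minimum/maximum length",
        "Password needs at least 2 special characters",
        "Sends email receipt",
    ],
)

# An idea found while testing (hypothetical, not from the original session).
create_account.add_idea("What happens if the email receipt cannot be delivered?")

print(create_account.as_checklist())
```

The point of `add_idea` is that the list stays open: nothing in the structure stops a tester going beyond what was brainstormed up front.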

Keeping it small

Typically not all test sessions are equal – the test sessions for “adding items to basket”, if you've got over 10,000 items to choose from (and no, you’re not going to test all of them) is obviously a bigger task than creating an account.  However if some sessions are too big, you might want to split them up – so you might want to for instance split “adding items to basket” into,

  • Searching for items
  • Retrieving details for items
  • Adding items to basket

But really breaking it down is more an “art” than a set of rules, as you’d expect.

Test Ideas For Your Session

Okay, so you have a session, which includes some notes on things to test.  Hopefully with your experience in the project, and test intuition you have some ideas for things to try out in this area.

A common aid testers use to explore and test is heuristics, which are rules of thumb for generating test ideas.  Elisabeth Hendrickson has a whole list on her Heuristic Cheat Sheet.

But here are a few of those which should jump out at you,

  • Boundary tests – above, on the limit, below
  • Try entering too much/too little data
  • Use invalid dates
  • Use special characters within fields
  • Can a record be created, retrieved, updated, deleted

Using heuristics, a simple statement like “can I create an account” can be expanded to be as large/small as needed according to the time you have.
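
As a small illustration of the boundary heuristic, the sketch below generates passwords just below, on, and just above a minimum and a maximum length.  The function names and the limits of 8 and 20 characters are assumptions chosen for the example, and the check only looks at length, not at any other password rule.

```python
def boundary_values(limit):
    """Return the values just below, on, and just above a limit."""
    return [limit - 1, limit, limit + 1]


def password_length_cases(min_len, max_len):
    """Build sample passwords around the minimum and maximum lengths.

    Each case is (password, expected_valid), where expected_valid only
    reflects the length rule, not special-character or other rules.
    """
    cases = []
    for limit in (min_len, max_len):
        for length in boundary_values(limit):
            if length >= 0:  # a negative length makes no sense
                valid = min_len <= length <= max_len
                cases.append(("x" * length, valid))
    return cases


# Assumed limits of 8 and 20 characters, purely for illustration.
for password, should_be_accepted in password_length_cases(8, 20):
    print(len(password), "accept" if should_be_accepted else "reject")
```

With more time you would widen the net (empty input, very long input, special characters); with less, the six cases around the two limits are a reasonable minimum.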

Working through a session

In running and recording test sessions we've gone right back to University.  For those of us who did science classes, an important part of our lab experience was our lab book.  We used this to keep notes and results of experiments as we tried them out.

They weren't wonderfully neat, and we’d often change and vary our method as we went along.  But we could refer back to them when we needed to consult with other students or our tutor.

For our group, these notes are an important deliverable as we move away from scripted testing.  Typically we’d deliver our scripts to show what we intended to test.  Now when required we deliver our notes instead showing what we tested (over intended to test).  Not every exploratory testing group needs to do this, but to us, it’s an important stage in building the trust.

Obviously a key part of this involves “just how much detail to put in your notes”.  And this is a good point.  The main thing you’re trying to achieve is to try out as many scenarios as possible on your system under test.

We've found a table in simple Microsoft Word is the most comfortable to use (you can make rows bigger as you need to put more details in).  Basically you put in your notes what your intentions are.

We couple this with using a tool to record our actions when we test.  Each row in our notes table covers about a 10-15 minute block of testing we've performed.  We use qTrace/qTest to record our actions and the system responses during this time.

This allows us to use our notes as a high level index into the recordings of our test sessions.  Obviously things like usable names for our recordings help – we go by date executed, but anything will do.
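
To show the idea of notes acting as an index, here is a minimal sketch in Python.  It does not use or imitate any qTrace/qTest interface: the `NoteRow` class, the naming scheme for recordings and the sample rows are all invented for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class NoteRow:
    """One row of the notes table: roughly 10-15 minutes of testing."""
    intention: str   # what we set out to try
    observed: str    # what we actually did and saw
    recording: str   # name of the recording that holds the detail


def recording_name(day, sequence):
    """Name a recording by execution date plus a running number (assumed scheme)."""
    return f"{day.isoformat()}_session_{sequence:02d}"


def find_recordings(rows, keyword):
    """Use the notes as an index: which recordings mention this keyword?"""
    keyword = keyword.lower()
    return [
        row.recording
        for row in rows
        if keyword in row.intention.lower() or keyword in row.observed.lower()
    ]


# Made-up sample rows, not taken from a real session log.
day = date(2014, 2, 20)
notes = [
    NoteRow("Create account with a new email",
            "Account created, receipt sent",
            recording_name(day, 1)),
    NoteRow("Reuse the same email in upper case",
            "Rejected as already used",
            recording_name(day, 2)),
]

print(find_recordings(notes, "upper case"))
```

The notes stay short and human, while the button-by-button detail lives in whatever recording each row points at.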

A screenshot from a typical session log

It should be noted of course that not all exploratory testing needs to be recorded to this detail.  

For my group, as we move away from scripted testing, this approach allows us to build up trust in the new system.  For the bulk of this documentation – screenshotting, and a button-by-button trace of what we do – we find qTrace invaluable in absorbing the load for us (it would slow us down like crazy otherwise).

Of course, any test we find ourselves doing again and again is a candidate for automation – and even here, if we have a suitable flow recorded, we can give a qTrace log to our developers to automate the actions for us.

Closing a session

When a tester feels that they've come to the end of a session, they take their notes to another tester, and discuss what they've tried, and any problems they've encountered.  The peer reviewer might suggest adding a few more tests, or if there have been a lot of bugs found, it might be worth extending the session a bit.

This peer reviewing is essential, and we encourage it to lead to some peer testing.  Basically no matter what your experience, there’s no idea that you can have that can’t be enhanced by getting the input from another individual who understands your field.

This is why for us, the method we've used for exploratory testing really confounds the standard criticism that “when you write scripts, they’re always reviewed” (even this is rapidly becoming the unicorn of software testing, with many scripts being run in a V0.1 draft state). 

By reviewing both when we plan our test sessions (through brainstorming) and when we close a single session, we’re ensuring there’s a “second level of authentication” of what we do.  Sometimes we wonder if our process is too heavy, but right now an important thing is that the reviewing, being a verbal dialogue, is fairly lightweight (and has a fast feedback loop), and both testers get something out of it.  Being able to just verbally justify the testing you've done to another tester is a key skill that we all need to develop.

Peer Testing

We try and use some peer testing sessions whenever we can.  I've found it invaluable to pair up especially with people outside the test team – because this is when the testing you perform can take “another set of values”. 

Back at Kiwibank, I would often pair up with internal admin end users within the bank to showcase and trial new systems.  I would have the understanding of the new functionality, they would have a comprehensive understanding of the business area and day-to-day mechanics of the system's target area.  It would mean together we could try out and find things we’d not individually think to do.  Oh, and having two pairs of eyes always helped.

Pairing up with a business analyst, programmer or customer is always useful, it allows you to talk about your testing approach, as well as get from them how they view the system.  There might be important things the business analyst was told at a meeting which to your eyes doesn't seem that key.  All this helps.

It’s important as well, to try and “give up” on control, to step aside from the machine, and “let them drive and explore a while”.  We often pair up on a tester's machine, and etiquette dictates that if it’s your machine, you have to drive.  It’s something we need to let go of, with a lot of the other neuroses we've picked up around testing …


I don’t usually include reference to my blog pieces, but this piece has been a real group effort, and our understanding in this area has evolved thanks to the interaction and thought leadership from a number of individuals beyond myself.

Instrumental amongst these have been

  • The Rapid Software Testing course from James Bach which really focuses on the weaknesses of scripting and adoption of exploratory testing, and is a road-to-Damascus, life-changing course I would recommend to any tester.
  • The book Explore It by Elisabeth Hendrickson
  • The many verbal jousting sessions I've enjoyed at WeTest
  • My engaged team of testers at Datacom


  1. This just became recommended reading for anyone working on my team.
    It will tell me a lot based on:
    - whether they raise at least one point for objection or discussion
    - whether they look for recording tools to start using
    - whether they look up more information on exploratory testing, oracles, tools, RST, etc
    - whether they read all the way to the end :-)

    Great work.


    1. Cheers Kim!

      And thanks for promoting through Twitter.

  2. Test Scripts is usually referred as Test Cases right? - But which is the one to call for Front end - Functional testing?

    I am confused with the definition: Test Scenarios with yours.

    test session: An uninterrupted period of time spent in executing tests. In exploratory testing, each test session is focused on a charter, but testers can also explore new opportunities or issues during a session. The tester creates and executes test cases on the fly and records their progress. See also exploratory testing.

    "Typo Error / Cosmetic Error" - which should be used
    "usre driven - should be user driven"

    Thanks, qTestExplorer - will try this one.

    1. Hey Srinivas - thanks for your comments.

      Yes indeed - when we talk about setting up test sessions or charters, this of course can become its own spin-off topic - both Elisabeth Hendrickson's Explore It and James Bach's RST spend more time going into this, and indeed I may devote a future topic to it.

      To me the purpose of planning a session or charter is to help plan out at a high level those areas which you want to cover. Now in a way there are many ways to do this, but the most important thing to me is this - am I getting input in this from other testers and non-testers? If not, then I have to say however clever my approach is, I'm missing something fundamentally important. My system for organising this is only as good as the input I get from others.

      So a session plan can be pretty much anything - but it needs to be whatever is most useful to you and your team, esp for getting feedback. In my current team, with us being agile, our sessions are derived from user stories and include areas of functionality which we think could be affected etc. We start from the user stories and mindmap our test ideas from there.

      I've used a different approach of collecting system behaviour, and turned them into sessions "registration", "buying", "account amendment", "customer browsing", "helpdesk admin" for instance as broad banners for what our product does etc.