TestSheepNZ: February 2016

Friday, February 19, 2016

Metrics and our love of numbers

I recently went swimming for the first time in several years. As someone who used to live for and love swimming - this seems ridiculous. I even have an open pool only 1km away!

In my 20s I'd typically swim 2-3 times a week, covering 64 lengths at my local pool in Burton, which would add up to almost exactly one mile. However, I felt a bit disheartened with my recent swim. I by no means expected to be back to my peak ability and speed, but all the same, a target of 40 lengths seemed quite adequate. However, having completed 30 lengths, I just didn't feel like I had another 10 lengths in me, so crawled defeated from the pool.

It was only when I returned a week later that I noticed something ... my local pool in Wainuiomata is 50m in length. My pool at Meadowside Fitness Centre in Burton-on-Trent was 25m. My 30 lengths had actually covered 1500m, nearly the same distance as my old mile - it also explained why each length seemed to take me twice the time!

On my second visit I didn't make the same mistake. I made sure to complete 32 lengths of the pool, or 1600m (about one mile). What interested me was that somehow 32 lengths didn't feel quite as much of an accomplishment as when I used to do 64 lengths in Burton. Even though it's the same distance.
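Here's the arithmetic as a quick sketch, for anyone who wants to see the count and the measurement side by side. The function name and the metres-per-mile constant are my own additions for illustration; the pool lengths and length counts are the ones from the story.

```python
METRES_PER_MILE = 1609.344  # international mile, added for comparison


def distance_swum(lengths: int, pool_length_m: int) -> int:
    """Turn a count of lengths into a distance in metres."""
    return lengths * pool_length_m


burton = distance_swum(64, 25)       # 64 lengths of a 25m pool
wainuiomata = distance_swum(32, 50)  # 32 lengths of a 50m pool

print(burton, wainuiomata)                 # 1600 1600
print(round(burton / METRES_PER_MILE, 2))  # 0.99
```

The count halves, the measurement doesn't move. Only the unit attached to the number tells you that.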

Why is that? Well it comes down to our blindness when it comes to numbers vs measurement - the subject of today's article.

Money on offer ...

We like to think numbers are a cold, rational, and scientific thing. But believe me, the way we act on them isn't - when it comes to numbers, we have an irrational attraction where the bigger the number, the better. For example - if I were to be feeling generous, and said to you dear reader "I'm going to give you some money ... would you prefer 10 New Zealand dollars ... or 1000 Japanese Yen?", what would you go for?

Pretty much everyone will feel the attraction of the Yen figure, simply because it's a much bigger number. Because I'm asking you, and I've a reputation as a bit of a trickster, you might well be suspicious. But the larger number is appealing purely emotionally.

In actual fact, the two amounts according to my Google exchange rate calculator are almost equal (currently slightly favouring the Yen). There's certainly not the gulf of difference between the two currencies on offer that your gut-feeling would have you believe.

Context matters ... but damn the bigger number is more appealing.

Numbers at work

Okay ... time for a work question. Your manager asks you how many tests you've executed today ... which would you rather tell them - that you've executed 2 tests? Or 12? *

Which number will make your manager happier? Well, the higher one of course!

But you knew that, which is why you split the 2 tests you were going to do into 6 parts each. Again the numbers lose their value if we perceive them as just numbers, and lose their context.

So once again, our emotional desire for large numbers gets in the way of our reasoning.

A world of numbers

We live in a world of numbers - we like to measure and share these numbers. And sometimes it plays to our vanity a little too much.

For example Runkeeper - anyone know someone who keeps posting to Facebook every run they do, including distance, speed, time?

I once used a run tracker, but never dared to post the details because I'd taken about 45 minutes to move just a couple of kilometres. Mind you, it was up this ...

But the data shown on Facebook wouldn't have included the elevation covered. And even so, again because we want bigger numbers, we're more attracted to running/walking a long distance than a shorter distance in more challenging circumstances ... because the shorter distance looks and feels like we've done less.

What's the remedy?

One thing I find quite sad, and sometimes a bit alarming is that a few of the managers and business owners I've worked with who are most "focused on the numbers", are also the people who will sometimes talk with a note of pride about "never really getting maths ... and I dropped science and maths as soon as I could at 16".

I have no problems with managers who aren't good at maths - to me the core part of being a good manager isn't about applying mathematics (but for goodness sake delegate mathematical analysis if you try). But if you're going to make "interpretation of numbers" the core of your management style, I think it's fair to expect a higher bar of understanding than "dropped out as soon as I could".

Would you want a doctor who flunked then dropped biology? Would you want a programmer who "read a few books when I was 15 ... but haven't looked since"?

A key part of science in the post-16 curriculum in the UK is "what makes a good graph?". Which typically revolves around ideas of "units", "title" and "error" ...

Units is about "what are we measuring". Always with a question of "is it appropriate?".
Title is about "what is this a graph of?". It describes what you're looking at, gives context and helps you understand what you're looking at.
Error - how accurate is my measurement? Is it so inaccurate that I can't really take much prediction from what I'm measuring? [Hint - this is frequently the case in IT, we typically choose things which are easy to measure, but that aren't really useful to measure]

Without these concepts we're tracking numbers, but we're not really sure what we're actually counting - for more information take a look at this previous blog post.

Next time - I'm going to cover the core of what a test report is about, and give some suggestions.

Denial 104: "But we spent a lot of money on the license ... and customising it!"

Previously we looked at the psychological effect of denial, and what drives it and makes it tick. Today we look at an example of denial at work within testing - especially others' perspectives of it. Ladies and gentlemen, I give you exhibit C ...

"But we spent a lot of money on the tool license ... as well as customising it!"

Again here there is a base assumption that spending a lot of money equals "this has to be really suitable for us". When we looked at peer pressure previously, I tackled some of this - especially the myth that a test management tool means "reporting for free" ... I of course automatically get nervous when people offer me anything for free, mainly because of this guy ...

Chitty Chitty Bang Bang's child catcher - he lured kids with promises of lollipops "for free". I'm sure he now makes a living trapping unwary IT projects in enterprise license agreements.

As we discussed, test tools generally trap you into working in a particular way. If it matches how you want to work, that's great. However if it doesn't align to how you need to operate on your project, then you're stuck with a difficult and clunky methodology that will constrain the way you work, and so has an effect on your team which is difficult to measure. Except for the fact that your team hate using the system.

But wait ... there's something worse than an "out of the box" product which partly addresses your needs. And that's a product that's been "tailored".

Naturally, I have a ridiculous (yet true) example from my past. Many years ago we had to use Microsoft Team Foundation Server (TFS), and some industrious manager had decided that to get the most use out of the reporting from the system, they would replace the out-of-the-box defect lifecycle below with something that would be more granular, and would allow for better in-detail status reporting.

So out went the simplicity of ...

Difficulty setting: Easy

In came the following hierarchy of states before a defect could be closed ...

Proposed
Defect confirmed
Assigned to developer team leader
Assigned to developer
Developer working on
Code fixed
Ready for unit testing
Assigned to unit tester
Completed unit testing
Ready for system testing
Assigned to system tester
Completed system testing
Ready for pre-prod testing
Assigned to pre-prod tester
Completed pre-prod testing
Ready for deployment
Deployed
Checked in deployment
Verified
Closed

Each state above was mandatory by the way (no cheating and skipping states you naughty tester). It wasn't too bad for tracking a production incident, but for a defect in a project which wasn't in production, it had way too many states - most of which made no sense for what you were doing. Just to make it worse, every time you moved between states, you had to save and a comment was mandatory (so people could capture even more detail).

It meant that it took ten minutes just to close a defect that you'd already tested. No wonder there was a bit of a rebellion - testers wouldn't use TFS to track defects because it wasn't suitable. Yes, no-one could use it easily, but heck, it gave us great audit trails! [Our pressure eventually won out, and that lifecycle was considerably simplified, by the way]
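To put a rough number on that overhead, here's a minimal sketch. The state names are the ones from the tailored lifecycle above; the three-state "simple" lifecycle is a made-up comparison (not the actual TFS default), and the function is my own illustration of "one save plus one mandatory comment per move".

```python
TAILORED_STATES = [
    "Proposed", "Defect confirmed", "Assigned to developer team leader",
    "Assigned to developer", "Developer working on", "Code fixed",
    "Ready for unit testing", "Assigned to unit tester",
    "Completed unit testing", "Ready for system testing",
    "Assigned to system tester", "Completed system testing",
    "Ready for pre-prod testing", "Assigned to pre-prod tester",
    "Completed pre-prod testing", "Ready for deployment", "Deployed",
    "Checked in deployment", "Verified", "Closed",
]

# Hypothetical simple lifecycle, purely for comparison.
SIMPLE_STATES = ["Active", "Resolved", "Closed"]


def saves_to_close(states, current=None):
    """Count the moves (each a save plus a comment) left before the last state.

    Assumes every state is mandatory and visited in order.
    """
    start = 0 if current is None else states.index(current)
    return len(states) - 1 - start


print(saves_to_close(TAILORED_STATES))                                # 19
print(saves_to_close(TAILORED_STATES, "Completed pre-prod testing"))  # 5
print(saves_to_close(SIMPLE_STATES))                                  # 2
```

Nineteen saves and nineteen comments to walk one defect from raised to closed. Even a defect that has finished its testing still has five to go.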

I have seen similar comedy-of-errors played out when,

an industrious manager introduces more states into a test/defect tracking system
the same manager then finds they have to chase/bribe everyone to update the defect tracking system. Why? Because they're finding it just too awkward to use. It's not at all helpful.

The denial in this case comes in two places ...

The test team needs to slave their process to the tool, rather than the tool making things easier for the tester. The more money was spent on the tool, the more justification that it's the tool's approach that's right over the testers'. Throw in a few choice phrases like "enterprise standard" and "best practice" to add to the smokescreen of why the team has to bend to the tool, and not vice versa.
More customisation = better. Our tailored defect flow above fitted one purpose fairly well, but was lousy for anything else. I see a lot of projects which go out to try and customise their product to the nth degree - usually powered by "but what if ..." scenarios. The problem is the process which typically comes out looks more like a map of the London Underground. And whilst that might cover a few choice scenarios really well, what typically occurs is even the simple stuff is nightmarish. The bottom line is no-one wants to read a manual to understand how defects flow.

Difficulty setting: "Is Government involved?"

Ironically I see the same temptation with sprint task boards to "overcomplicate them", because they're too simple. Believe me, simple can be beautiful! Because needless complexity is almost always unusable.

My rule of thumb is if someone is pressuring you to have a state "just in case", resist at all costs. And if you find a state which you frequently skip on the way to another state, think about removing it - that state is not telling you anything!
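If your tracker lets you export the states each defect actually passed through, that rule of thumb is easy to check. This is only a sketch under that assumption: the state names and histories below are invented, and `skip_rates` is my own helper, not a feature of any tool.

```python
from collections import Counter


def skip_rates(all_states, histories):
    """For each state, the fraction of defects that never visited it."""
    if not histories:
        return {state: 0.0 for state in all_states}
    visits = Counter()
    for history in histories:
        visits.update(set(history))  # count each state once per defect
    return {state: 1 - visits[state] / len(histories) for state in all_states}


# Invented example data: four defects and the states each one went through.
states = ["New", "Triaged", "In progress", "Fixed", "Closed"]
histories = [
    ["New", "In progress", "Fixed", "Closed"],
    ["New", "Triaged", "In progress", "Fixed", "Closed"],
    ["New", "In progress", "Fixed", "Closed"],
    ["New", "In progress", "Fixed", "Closed"],
]

rates = skip_rates(states, histories)
print(rates["Triaged"])  # 0.75
print(rates["Fixed"])    # 0.0
```

A state that three defects in four step straight over is a candidate for removal.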

Thursday, February 4, 2016

Denial 103: "But we spent a lot of money developing our automation"

Previously we looked at the psychological effect of denial, and what drives it and makes it tick. Today we look at an example of denial at work within testing. Ladies and gentlemen, I give you exhibit B ...

"But we spent a lot of money developing our automation"

I know the lure for a lot of people when it comes to automation is,

the tool is cheap (free is best yes?)
it's quick to produce automation scripts
it can run scenarios faster than a manual tester can

In a way this trap shares a lot of ground with the "testers spent a lot of time scripting" fallacy we discussed last time.

If it's quick for you to create scripts in, it's easy to try a short one-week demo of the product, and choose it as your desired automation strategy. However, six months down the line, your manager is wondering when the automation magic will kick in asking "where's my ROI?".

Now, as much as I hate that term, they've got a point - by now you have a large library of automated scripts. But they constantly need running and correcting (that wasn't in the brochure). They constantly fail, mainly in small but irritating ways, and typically the failures require a modification to the automation script rather than revealing problems in the software under test that they're supposed to check.

And any time there's a major change to the system, large numbers of scripts need modification. Heck, they all needed some modification when the login screen was modified!

This is because when you evaluated the tool and the method, you went with cost and how easy scripts were to make (hey, record and playback, how much simpler could it be?). What you missed was maintainability. This was covered in a WeTest Workshop from way back, and dealt with under "Automation TLC". But an article by Messrs Bach and Bolton which has just been released covers some similar ground. [As a hint, I've found that if you don't know what the term "code reuse" is and you're writing large numbers of automation scripts, maybe you shouldn't ... ask a developer instead].

The denial mindset here (much as with manual scripts) is that you've invested a lot of time and effort into automation scripting. At some point that really should start paying off. Right now it's not going any faster than the tests the manual testers used to run. And although the automation occasionally finds problems in new software builds, about 80-90% of the time it finds a problem in the script itself which needs changing. And typically if it's a problem in one script, it's a problem in many scripts.
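For anyone wondering what "code reuse" looks like in an automation script, here's a minimal sketch. Everything in it is hypothetical: `FakeBrowser` is a stand-in that just records steps (not a real driver), and the field names, button names and checks are invented for illustration.

```python
class FakeBrowser:
    """Stand-in for a real UI driver; it only records what it is asked to do."""

    def __init__(self):
        self.steps = []

    def type_into(self, field, text):
        self.steps.append(("type", field, text))

    def click(self, button):
        self.steps.append(("click", button))


def log_in(browser, user, password):
    """The one place that knows what the login screen looks like."""
    browser.type_into("username", user)
    browser.type_into("password", password)
    browser.click("Sign in")


def check_view_balance(browser):
    log_in(browser, "alice", "secret")  # reused, not re-recorded
    browser.click("View balance")


def check_update_address(browser):
    log_in(browser, "alice", "secret")  # reused, not re-recorded
    browser.click("Update address")


browser = FakeBrowser()
check_view_balance(browser)
print(len(browser.steps))  # 4
```

When the login screen changes, `log_in` is the only thing that needs fixing. With record and playback, those three login steps are typically copied into every script, so every script breaks.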

The problem is, if you've not chosen a tool and built up your automation scripts with maintainability in mind, it will NEVER pay off.

Your scripts are too brittle, and they will continue to break in multiple places from small changes. The more scripts you have, the bigger your maintenance bill of stuff that needs changing. The hard thing is, it's probably easier to start from scratch with a new tool, than to try and build in your maintainability with what you already have.

So we chose a bad tool ... and implemented badly as well. Is this denial? If you're continuing to fix it every time it breaks, then yes, it is.

What we have here is what I like to call "technical testing debt", and it shares attributes with other testing debt. This debt keeps rearing its head - you don't have the time or backing to go back and deal with the fundamental issue. So you band aid it, and then band aid it again, and then again.

And because you're addressing the problem so often there will be a perception problem, that surely that technical debt is decreasing with each occurrence you fix. Right? The technical debt is causing you to sink in time and resources - and therefore the sunk costs fallacy makes people say "as you're spending time on it, it must be decreasing".

No - not at all. Every time it's being fixed, the fix for it is only for those occurrences where it breaks - the places where it surfaces. To really do justice to the problem, you have to pretty much come to a full stop, and do a serious reworking of the approach to the problem, going much, much deeper. That's something that's very hard to get backing for - especially if someone thinks you're addressing that technical debt piecemeal "because it keeps cropping up as a problem".

The good news - not every automation process goes like that. But ask around: everyone I know has a tale of an automation suite which went this way. The trick is to know that if you're fixing breaks in your scripts more often than finding problems in your software under test, you need to ask if you're in a state of denial, and you need to address some fundamental problems in your automation strategy.
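One way to keep yourself honest is to tally, for each automation failure, whether the fix went into the script or into the software under test. A rough sketch, assuming you record that cause by hand: the labels "script" and "product" and the tally below are invented for illustration.

```python
from collections import Counter


def script_fault_ratio(failure_causes):
    """Fraction of automation failures that were fixed in the script itself."""
    counts = Counter(failure_causes)
    total = sum(counts.values())
    return counts["script"] / total if total else 0.0


# Made-up tally: 17 failures fixed in scripts, 3 real product defects.
causes = ["script"] * 17 + ["product"] * 3
print(script_fault_ratio(causes))  # 0.85
```

If that ratio sits up in the 80-90% range month after month, the scripts are checking themselves more than they're checking the product.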

Post-script

Sometimes after I launch a blog entry, someone gives me a link so good, I need to add it to my article. So thank you Simon for recommending this article.

Wednesday, February 3, 2016

Denial 102: "But we spent a lot of money writing test cases!"

Previously we looked at the psychological effect of denial, and what drives it and makes it tick. Today we look at an example of denial at work within testing. Ladies and gentlemen, I give you exhibit A ...

"But we spent a lot of money writing test cases"

I talked a little about this phenomenon in a post way back in 2014. In my career, I've had to revisit old testing schemes several times... heck, let's just call it what it is "grave robbing dead/dormant projects".

Prior to every excavation of archives, I will have heard the legend "well we were late going into testing and the testers spent a lot of time writing scripts". This usually comes from non-testers, and gives the indication that as far as they're concerned I am sitting on a tester's gold mine of material. It has to be valuable, because we spent a lot of money on it. [Its value to me has to be equivalent to the cost sunk into it]

Here's what I hear in that sentence ... "well we were late going into testing" in my ears means that there were a lot of problems. So many that a basic version of the software could not be made to work enough for testing to start. That says there were fundamental issues on this project. Could one of them be that no-one really knew the specifics of what they were building? And if that is really the case, what's the chance that testers magically had a level of clairvoyance which eluded the developers?

And then there's the part that goes "and the testers spent a lot of time writing scripts", which roughly translates to ...

I kid you not - during my career I've seen some managers try to send any test contractors on leave for a month until it's ready for testing "to save costs". So sadly if you're a contractor and you want to be paid, there's a benefit to looking busy.

As I've mentioned in my previous article, a quick attempt to correlate the execution logs with the test scripts often shows me whole reams of planned testing which was dumped and never run.
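That correlation doesn't need anything fancy. As a sketch, assuming you can pull a list of script IDs from the test repository and a list of executed IDs from the logs (the IDs and the function name here are invented):

```python
def never_run(scripted_ids, executed_ids):
    """Scripts that were written but never show up in the execution logs."""
    return sorted(set(scripted_ids) - set(executed_ids))


# Invented example data.
scripted = ["TC-001", "TC-002", "TC-003", "TC-004", "TC-005"]
executed = ["TC-001", "TC-004"]

print(never_run(scripted, executed))  # ['TC-002', 'TC-003', 'TC-005']
```

The list that comes back is the "planned but dumped" pile, and usually the start of a conversation about why.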

New Zealand is such a small place, which means you might have worked on that project. You might have left the company. You might have left the city. BUT I WILL END UP FINDING YOU!

It frequently happens - informally I might add - and I'm glad it does, because it allows me to dig deeper to find out what really went on and find out another side of a project's story. And it typically points towards a problematic project - from vague outline to changing scope - rather than a tester who just wrote scripts they had no intention of running.

Sooner or later though, it's my job to burst people's bubble: they're not sitting on a goldmine of test scripts that are ready to run. I'll take a scan through, and try and use what's been written to test the system to see how much use it'll be. Invariably though it's the results or execution logs which tell me a lot more about what was done than a folder filled with test scripts.

This aligns with James Christie's discussion around how ISO 29119 impacts testing. That standard focuses more on plans and scripts as auditable artifacts. James Christie argues that an audit always wants to focus first and foremost on what you actually did (over what you planned to do). And he has an excellent series of articles on that starting with part one of "Audit And Agile" here.

What I've learned with bursting bubbles is to do it gently. Always work out ahead of time if anything is salvageable, and have your counter offer ready (so we're going to exploratory test instead maybe?). That person who thought the testing was pre-written was probably hoping that your test effort would be minimal because you'd be able to build your effort from past work. Let them know the degree you can capitalise on - heck if it just gives you a good set of starting ideas that's something significant.

Even if you have a good group of test cases written which match the execution log you have, you'll still have to spend time learning the system and its intricacies. For instance there might be a line going "expire the account", and it takes some time to find they have a test tool written which will age an account to its expiry date, but you need to find where they kept that tool. Almost always a lot of verbal knowledge will have gone, or the details are there, just buried in a heap of other details.

Also remember having too much documentation, as above, can be as much of a bane as none at all. Because you have to spend a lot of time going through it all before deciding if it's any help or a hindrance. And I don't know about you, but I'm a bit of a slow reader.

Tuesday, February 2, 2016

Denial 101: Something I find hard to believe ...

Previously we looked at the effect of peer pressure when we try and adopt the new idea that's "all the rage". Today as promised we're going to look at the impact of denial, an effect I find very much paired with the "group delusion".

As originally planned, I would be diving right in and exploring the places where we find denial as a very real effect within IT departments. However as I wrote and expanded my ideas, I found a little too much material - so I ran it by a friend who said it really deserved to be split up over several posts ... so change of plan!

Although I talked a little about denial in my piece on the psychology of The Force Awakens (and believe me have I seen that piece validated in recent weeks), I want to spend this first post looking further down the rabbit hole understanding more about it.

Like any of the effects we've been talking about, it's easy to feel a little smug and superior as we talk about it. As if these issues are something that "happen to other people". But denial is such a powerful effect on us as human beings because it works on our emotions (like many of the psychological effects we're looking at). We'd love to think that we use rational thinking to control our emotions, but often it works the other way around - our thinking tends to be slaved to rationalise the emotional outcome we want.

We are all led astray in our thinking when emotions are entangled. The point of this series of articles is to shed a bit of light on common traps we fall into, so we can be a little wiser - maybe asking ourselves a little "is this a MacGuffin effect?".

In a nutshell, denial is the rejection of facts because of an emotional reaction. I like how this is covered within the Your Deceptive Mind chapter on denial where it says that people commonly fall into a denial trap and will start with their desired (emotional) outcome, and use it to systematically reject any evidence which does not support that outcome. This form of reasoning increasingly requires the presence of conspiracies to support their model of thinking.

And indeed a really good example of this is the piece I wrote in 2014 where we tried to convince someone who believed in a flat earth that the Earth was round. In that piece they initially respond to evidence with vague science, saying "the Earth is round ... but flat ... like a plate".

Then it's suggested they could try Skyping someone in another part of the world to see if it's night there whilst it's day where the sceptic is. However, they're convinced the other party will be part of the conspiracy, and so dismiss the experiment.

And of course there's the line which several people have said is their favourite, "When someone flies from London to Auckland via America, it's okay, because that route exists. But if they go via Asia, then the pilot flies around Africa for a bit until the passengers get disorientated and it gets dark. He then heads via America, to New Zealand, stopping off at a secret replica of Singapore they've built deep in the Andes". More conspiracy.

So what leads so many people to go down that path? Whenever we make any kind of decision, we essentially make an investment of time, money, ego, and pride into that decision - we're obviously committed to that decision "coming out alright".

But sometimes pride and ego will mean we just stick to that decision, even in the face of new and glaring evidence. We want to be seen to be someone consistent, not someone who is ever wrong.

So rather than rectify our decision, we'll choose to undermine any contrary evidence just like our flat earther. This is known as the "escalation of commitment" - where people continue to justify commitment of time and money based on an initial decision and "we've already invested in this course of action". The phrase "throwing good money after bad" is of course one which perfectly sums up this behaviour.

Here's an everyday example for those of you who can remember driving before the days of SatNav. A couple of friends, Thelma and Louise, are driving to Mexico City. They're supposed to be stopping at the Grand Canyon as they do so, and they passed a sign that said they'd see it in 10 miles ... but that was 15 miles ago.

Thelma wants to go back as she thinks they've missed a turning.

Louise says she has an excellent sense of direction, and is sure they'll find the Grand Canyon soon enough.

Thelma says she's seen a sign saying they're heading towards a place called Bitter Springs, which means they're heading in the opposite way to both the Grand Canyon and Mexico.

Louise is sure that Thelma is just reading the map wrong and is not about to turn around now. This way has to end up in Mexico eventually.

Right?

A couple having an argument about directions, and someone who just won't turn back, because "we've come this far". Sound at all familiar?

Our flat earther has invested time and pride into his worldview, and because conceding that worldview means conceding his pride, he refuses to. Even to the point of turning down a "round the world cruise" he won on the lottery, because it means admitting being wrong.

So - the question is, where does escalation of commitment happen in testing? And I'm afraid to say, it happens everywhere! Pretty much anywhere you've sunk time and money into doing something, there is a state of mind that wants to continue doing that course of action ... because it has to pay off eventually, right? Oh dear God please, it has to pay off!!!

Other everyday examples of denial and escalation of effort you might want to think about ...

A friend pointed out the Vietnam war followed this behaviour all too chillingly. It started out as a small involvement of US forces. But as it went badly, more and more forces were brought in from the American side, because "we've come this far". You see something similar in the "one big push" mentality in The Great War, particularly in The Battle Of The Somme. I talked about the Battle Of The Somme, and "sticking with the plan" in the face of changing evidence here back in 2013.
"I read about this amazing diet last week. I mean, I'd tried a few other diets over the years, but this one actually works. I read it in a magazine." You can substitute the word "diet" with any piece of revolutionary fitness equipment which "is like having a gym in your own home", and has so changed the world, it's not available in shops ... only to order over the phone on a midnight infomercial slot. [It's like the gyms are conspiring to keep your membership]
The right wing American politician whose response to this month's school shooting death toll is "the victims and their families will be in my prayers". Just like they were last month - except this time they'll pray really hard.
The Aztecs used human sacrifice to appease the god of rains. If there were no rains, then obviously they'd not sacrificed enough people. I talked about this here in 2014.
Variants on this meme when politicians have committed to a policy, despite it not producing the results they promised ...

If any of those points made you squirm uncomfortably, then well done, you're waking up.

[By the way - I've taken a few shots at the right wing there, and I'm an out and out socialist at heart. But critical thinking still applies to me - especially if someone is sharing on social media a "news item" which aligns with my beliefs. I often am somewhat suspicious if it falls into the camp of "I knew it", and start checking up on it. To be honest such fake stories annoy the heck out of me, because they undermine my political stance, and make it just way too easy for friends who have different opinions to "score cheap points" over me.]