Hi, thanks for reading this post 🙂 and the previous ones as well; if you are reading the third part, you have probably gone through them. If not, I suggest reading them in the correct order: the trailer 😉, the first part, the second part, and then the current one. As I promised last week, the mini-series of posts about Root Cause Analysis theory finishes today. So let’s go.
Cause and Effect – the body of the Root Cause Analysis.
In order to fix all the Root Causes, you first need to find them. Therefore the RCA team creates a complete cause and effect diagram over a series of iterative meetings.
Creating the cause and effect diagram is like drawing a tree: you start with the trunk (the Problem), and by asking WHY? and getting answers to those questions, you add more and more branches. You continue until each branch reaches its ending.
Usually at least two causes (sometimes more) combine into one effect, and they need to occur simultaneously.
In other words, think of it in terms of a forest fire. To start any fire, three ingredients are necessary: fuel, oxygen and a source of heat. There is usually lots of wood (fuel) and oxygen in the forest, but until a spark appears there is no fire, because one ingredient is missing. Until all three are present at the same time and place, you are safe.
The failure is an effect of multiple combined causes. Each of those causes is also an effect of other causes. And so on, and so on.
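For readers who think in code, the fire example can be sketched in Python. This is only an illustration, assuming that an effect occurs when all of its direct causes are present at once. The `Cause` class, its fields and the `forest_fire` helper are hypothetical names invented for this sketch, not part of any RCA tool.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Cause:
    """A node in a cause and effect diagram (hypothetical structure)."""
    name: str
    present: bool = False                 # only meaningful for leaves
    causes: list[Cause] = field(default_factory=list)  # direct causes

    def occurs(self) -> bool:
        # A leaf occurs if it is present. An effect occurs only when
        # ALL of its direct causes occur at the same time.
        if not self.causes:
            return self.present
        return all(cause.occurs() for cause in self.causes)


def forest_fire(spark: bool) -> Cause:
    """Build the fire example: fuel and oxygen are there, the spark may not be."""
    return Cause("fire", causes=[
        Cause("fuel (wood)", present=True),
        Cause("oxygen", present=True),
        Cause("source of heat (spark)", present=spark),
    ])


print(forest_fire(spark=False).occurs())  # False
print(forest_fire(spark=True).occurs())   # True
```

Because each cause can itself hold a list of causes, the same structure nests as deep as the analysis goes.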
Maintaining the parent-child relationship
One of the crucial aspects of creating the cause and effect diagram in Root Cause Analysis is HOW you create it. The rule of thumb is to create it in the parent-child way. It might seem obvious, but fresh RCA Drivers and RCA teams tend to overlook its importance.
What is a parent-child relationship? It is a DIRECT connection between cause and effect, without skipping “obvious” steps.
In this case: my neighbour did not break his arm because he slipped on a slippery sidewalk. He broke his arm because he fell on the sidewalk. And he fell because he slipped on the slippery sidewalk. This difference might seem subtle, however during real-life analysis it can sometimes be the game-changer.
In this example, going straight to the ‘grandchild’ Causes 1 and 1′ would mean omitting Cause 2 and Cause n with their predecessors. The parent-child relationship of causes and effects is crucial to finding all Root Causes. Omitting one or more steps creates a risk of overlooking one or more of them.
How can I recognize that I am done?
There are four possible endings for a branch. They vary in frequency and consequences.
- No data available
Well, that is sad, but it occasionally happens. It means that there is no data at all (e.g. the one person who knew something left the company, logs are missing, etc.), not that we do not want to dig for it.
- You meet Desired Condition
No-brainer: if it was tested, it is as it should be. No more work with this one. But if the fault was not found in tests (and should have been), you can dig on the other branch.
- Another Root Cause Analysis needed
This usually means that there is another fault to analyze, OR the case is too big to analyze in the current setup.
- You found a Root Cause!
You need to end each branch with one of those four options, and when you do, the analysis is done.
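The "am I done?" check can be sketched in Python as well. This is a minimal illustration, assuming the diagram is stored as a tree of WHY? questions. The `Ending` enum, the `Branch` class and the `open_branches` function are hypothetical names, and the ending assigned at the bottom is picked only to show the mechanics.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Ending(Enum):
    """The four ways a branch can end."""
    NO_DATA = "No data available"
    DESIRED_CONDITION = "Desired Condition met"
    ANOTHER_RCA = "Another Root Cause Analysis needed"
    ROOT_CAUSE = "Root Cause found"


@dataclass
class Branch:
    question: str
    ending: Optional[Ending] = None       # set only on the last node of a branch
    children: list[Branch] = field(default_factory=list)


def open_branches(node: Branch) -> list[str]:
    """Return the questions at the tips of branches that have no ending yet."""
    if not node.children:
        return [] if node.ending is not None else [node.question]
    still_open: list[str] = []
    for child in node.children:
        still_open.extend(open_branches(child))
    return still_open


# The neighbour example, one direct step at a time.
arm = Branch("Why did my neighbour break his arm?", children=[
    Branch("Why did he fall on the sidewalk?", children=[
        Branch("Why did he slip on the sidewalk?"),  # no ending yet
    ]),
])
print(open_branches(arm))  # ['Why did he slip on the sidewalk?']

# Illustrative only: pretend this branch ended with "No data available".
arm.children[0].children[0].ending = Ending.NO_DATA
print(open_branches(arm))  # []
```

An empty list means every branch has one of the four endings, so the analysis is done.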
I have all Root CauseS identified. Now what?
Now it is “just” enough to formulate adequate improvements and start implementing them. What does “adequate” mean?
- First of all, S.M.A.R.T.
I am aware that this acronym gives many people shivers, however improvements that are Simple (Yes!), Measurable (works/does not work), Attainable (sorry, no Moon shots this time), Relevant (fixes what it is supposed to fix) and Time Bound (well, the value of an improvement for 2023 releases that is still deep in the backlog in 2024 is arguable) do their job.
- Company dependent.
Preferably stuff within reach of the lowest possible level of people in the company. They will see its value. And it will not require tons of discussions and months of waiting. Hence if you try to look in your neighbour’s garden for improvement, look again.
- Actually fixing identified Root Causes (boy, this one is sometimes tricky!)
A great new shiny Tool that does not work is not a fix.
- Together fixing ALL identified Root Causes
If not, Murphy’s Laws are awaiting their turn to strike 🙂
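The last point, covering ALL Root Causes, is easy to check mechanically. Below is a minimal Python sketch, assuming each improvement is recorded together with the Root Causes it actually fixes. The function name and the placeholder labels ("RC1", "improvement A", and so on) are hypothetical.

```python
from __future__ import annotations


def unfixed_root_causes(
    root_causes: set[str],
    improvements: dict[str, set[str]],
) -> set[str]:
    """Return the Root Causes that no improvement addresses."""
    # Collect everything that at least one improvement fixes.
    addressed = set().union(*improvements.values())
    return root_causes - addressed


# Placeholder data, for illustration only.
root_causes = {"RC1", "RC2", "RC3"}
improvements = {
    "improvement A": {"RC1"},
    "improvement B": {"RC2"},
}
print(sorted(unfixed_root_causes(root_causes, improvements)))  # ['RC3']
```

Anything left in the result is a Root Cause still waiting for Murphy.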
Two sorts of improvements – prevention and detection
Keeping in mind the often blurry balance between CoQ and CoPQ (Cost of Quality vs Cost of Poor Quality), we can do some things to safeguard ourselves.
- Prevention
Actions designed to not let the faults happen in the first place.
- Detection
If the prevention was not enough, now it is time to find the fault.
You can sometimes combine these two 😉
One more thing that could have been mentioned earlier, however placing it here increases the probability of it sticking. We are searching for FACTS. Not opinions, not convictions, not beliefs, not assumptions. Facts. Emails, testimonials, logs, tickets, you name it.
If you hear “most probably we tested it”, it certainly should ring a bell. Ask for confirmation that the test was actually performed. If you hear “it is impossible that (…)”, there is quite a big probability that it is what actually happened.
Final words
My description of the Root Cause Analysis method ends here. In this part I described the structure of Cause & Effect analysis, the importance of the parent-child relationship in analysis and the endings of the analysis branches. Additionally, I touched upon the relevance of RCA improvements, prevention and detection, and the importance of using facts throughout RCA.
That’s the theory. However, I am aware that it is best to learn something using examples.
So there will be a final part: I will give you an example that will connect all the dots. I will share with you my analysis of the crash of the NASA Mars satellite, the one that is infamous for mixed-up units, metric and imperial. That being said, see you next time.
Stay tuned,
Marcin