Root Cause Analysis
If I have an unwanted situation which consumes resources and
tends to recur, it may be worth figuring out what is really
causing it and removing that cause so the situation does not
occur again. This is generally referred to as Root Cause Analysis:
finding the real cause of the problem and dealing with it rather
than simply continuing to deal with the symptoms.
This raises several questions:
- How does one determine which situations are candidates for
root cause analysis?
- How does one figure out what the root cause is?
- Does the removal of the cause entail less resource expenditure
than it takes to continue to deal with the symptom?
Determining Candidates
In normal chaotic organizational environments it is often quite
difficult to find candidates for root cause analysis because the
situations which repeat are either distributed over time so one
doesn't realize they are actually recurring, or the situation
happens to different people so there isn't an awareness of the
recurring nature of the situation. When an organization is using
an automated problem resolution support system, such as SolutionBuilder,
it is very easy to determine which situations are recurring with
what frequency. Every time a solution is used its frequency counter
gets updated, so all one has to do is run reports against the
system to determine which solutions are being used with what frequency.
Those situations which are recurring with the greatest frequency
and consume the greatest amount of resource to rectify are the
candidates for root cause analysis.
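
The ranking described above can be sketched in a few lines of
Python. This is a minimal illustration, not SolutionBuilder's
actual report: the record fields (times_used, minutes_per_use),
the rank_candidates helper, and the sample figures are all
assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class SolutionRecord:
    """One solution entry. Field names are hypothetical, not SolutionBuilder's schema."""
    name: str
    times_used: int         # frequency counter, bumped each time the solution is used
    minutes_per_use: float  # assumed average effort to apply the solution once


def rank_candidates(records, top_n=3):
    """Order records by total effort consumed (frequency x effort per use), highest first."""
    by_total_effort = sorted(
        records,
        key=lambda r: r.times_used * r.minutes_per_use,
        reverse=True,
    )
    return by_total_effort[:top_n]


# Made-up sample data for illustration only.
records = [
    SolutionRecord("Expedite late order", times_used=120, minutes_per_use=45),
    SolutionRecord("Reset user password", times_used=400, minutes_per_use=3),
    SolutionRecord("Clean oil off plant floor", times_used=30, minutes_per_use=60),
]

for r in rank_candidates(records):
    print(f"{r.name}: {r.times_used * r.minutes_per_use:.0f} minutes total")
```

Note that the most frequently used solution (the password reset)
is not the top candidate here. Frequency alone is not enough;
it is frequency combined with the resource consumed each time
that identifies a candidate.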
Finding the Root Cause
Most situations which arise within an organizational context
have multiple approaches to resolution. These different approaches
generally require different levels of resource expenditure to
execute. And, due to the immediacy which exists in most organizational
situations there is a tendency to opt for the solution which is
the most expedient in terms of dealing with the situation. In
doing this the tendency is generally to treat the symptom rather
than the underlying fundamental problem that is actually responsible
for the situation occurring. Yet, in taking the most expeditious
approach and dealing with the symptom, rather than the cause,
what is generally ensured is that the situation will, in time,
return and need to be dealt with again.
Consider the specific example of expediting customer orders
in an order fulfillment process. The organization has a well defined
process for accepting, processing, and shipping customer orders.
When a customer calls and complains about not getting their order
the most normal response is to expedite. This means that someone
personally tracks down this customer's order, assigns it a #1
priority, and ensures it gets shipped ahead of everything else.
What isn't realized, until sometime later on, if at all, is that
in expediting this order one or more other orders were delayed
because the process was disrupted to get this customer's order
out the door. What it all comes down to is that expediting orders
simply ensures that more orders will have to be expedited later.
In systems terms this is a typical "Fixes
that Fail" structure which evolves into an "Addiction" structure where
the organization becomes addicted to expediting to deal with customer
order complaints.
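
The reinforcing nature of expediting can be seen in a toy simulation.
This is only a sketch under invented assumptions, not a model of
any real fulfillment process: every expedited order is assumed
to cost one extra slot of normal shipping capacity, and half of
the delayed orders are assumed to produce a complaint that gets
expedited the following week. The function and parameter names
are hypothetical.

```python
def simulate_expediting(weeks, initial_expedites, arrivals=100, capacity=100,
                        slots_lost_per_expedite=1, complaint_rate=0.5):
    """Return the number of orders needing expediting after each week."""
    backlog = 0
    expedites = initial_expedites
    history = []
    for _ in range(weeks):
        # Expediting disrupts the normal process, so fewer orders ship overall.
        shipped = max(capacity - slots_lost_per_expedite * expedites, 0)
        # Orders that did not ship on time carry over as backlog.
        backlog = max(backlog + arrivals - shipped, 0)
        # Some delayed customers complain; those orders are expedited next week.
        expedites = int(backlog * complaint_rate)
        history.append(expedites)
    return history


print(simulate_expediting(weeks=6, initial_expedites=0))
print(simulate_expediting(weeks=6, initial_expedites=5))
```

With no expediting the process stays in balance. With a handful
of expedited orders in the first week, the number of orders
needing expediting grows week after week, which is the "Fixes
that Fail" pattern in miniature.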
The appropriate response to this situation is to figure out
why the order was in need of expediting in the first place. Yet
this is seldom done because the task assigned to the expediter
was, "get the order shipped!" and that's as far as the
thought processes and investigation are apt to go.
To find root causes there is really only one question that's
relevant, "What can we learn from this situation?" Research
has repeatedly proven that unwanted situations within organizations
are about 95% related to process problems and only 5% related
to personnel problems. Yet, most organizations spend far more
time looking for culprits than causes and because of this misdirected
effort seldom really gain the benefit they could gain from understanding
the foundation of the unwanted situation. Consider the following
two scenarios.
Scenario # 1
The Plant Manager walked into the plant and found oil on the
floor. He called the Foreman over and told him to have maintenance
clean up the oil. The next day while the Plant Manager was in
the same area of the plant he found oil on the floor again and
he subsequently raked the Foreman over the coals for not following
his directions from the day before. His parting words were to
either get the oil cleaned up or he'd find someone that would.
Scenario # 2
The Plant Manager walked into the plant and found oil on the
floor. He called the Foreman over and asked him why there was
oil on the floor. The Foreman indicated that it was due to a
leaky gasket in the pipe joint above. The Plant Manager then
asked when the gasket had been replaced and the Foreman responded
that Maintenance had installed 4 gaskets over the past few weeks
and each one seemed to leak. The Foreman also indicated
that Maintenance had been talking to Purchasing about the gaskets
because it seemed they were all bad. The Plant Manager then went
to talk with Purchasing about the situation with the gaskets.
The Purchasing Manager indicated that they had in fact received
a bad batch of gaskets from the supplier. The Purchasing Manager
also indicated that they had been trying for the past 2 months
to get the supplier to make good on the last order of
5,000 gaskets that all seemed to be bad. The Plant Manager then
asked the Purchasing Manager why they had purchased from this
supplier if they were so disreputable and the Purchasing Manager
said because they were the lowest bidder when quotes were received
from various suppliers. The Plant Manager then asked the Purchasing
Manager why they went with the lowest bidder and he indicated
that was the direction he had received from the VP of Finance.
The Plant Manager then went to talk to the VP of Finance about
the situation. When the Plant Manager asked the VP of Finance
why Purchasing had been directed to always take the lowest bidder
the VP of Finance said, "Because you indicated that we had
to be as cost conscious as possible, and purchasing from
the lowest bidder saves us lots of money!" The Plant Manager was
horrified when he realized that he was the reason there was oil
on the plant floor. Bingo!
You may find Scenario # 2 somewhat funny, and laugh at the
situation. It would be better if the situation made you weep because
it is all too often true, in numerous variations on the same theme:
everyone in the organization doing their best to do the right
things, and everything ends up screwed up. The root cause of this
whole situation is local optimization with no global thought involved.
Scenario # 2 also provides a good example of how one should
proceed to do root cause analysis. One simply has to continue
to ask "Why?" until the pattern completes and the cause
of the difficulty in the situation becomes rather obvious.
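
The repeated "Why?" of Scenario # 2 can be written down as a
simple chain walk. The sketch below is illustrative only: the
trace_root_cause helper and the dictionary of answers are
assumptions, with the answers paraphrased from the scenario.

```python
def trace_root_cause(symptom, why):
    """Follow recorded answers to "Why?" from a symptom as far as they go.

    `why` maps each observation to the answer given when someone asked
    "Why?" about it. The walk stops when no further answer is recorded,
    or when an answer repeats one already in the chain.
    """
    chain = [symptom]
    while chain[-1] in why and why[chain[-1]] not in chain:
        chain.append(why[chain[-1]])
    return chain


# Answers paraphrased from Scenario # 2.
answers = {
    "Oil on the plant floor": "Leaky gasket in the pipe joint above",
    "Leaky gasket in the pipe joint above": "Bad batch of gaskets from the supplier",
    "Bad batch of gaskets from the supplier": "Supplier was chosen as the lowest bidder",
    "Supplier was chosen as the lowest bidder": "VP of Finance directed Purchasing to take the lowest bidder",
    "VP of Finance directed Purchasing to take the lowest bidder": "Plant Manager asked everyone to be as cost conscious as possible",
}

for depth, step in enumerate(trace_root_cause("Oil on the plant floor", answers)):
    print("  " * depth + step)
```

In Scenario # 1 the chain would contain only the first entry,
because nobody asked "Why?" even once.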
To Resolve or Not To Resolve
Once the root cause is determined, the next question is
whether it costs more to remove the root cause or simply continue
to treat the symptoms. This is often not an easy determination.
Even though it may be relatively easy to estimate the cost to
remove the root cause it is generally very difficult to assess
the cost of treating the symptom. This difficulty arises because
the cost of the symptom is generally wrapped up in some number
of customer and employee satisfaction factors in addition to the
resource costs associated with just treating the symptom.
Consider a situation where it is determined that it will cost
$100,000 to remove the root cause of a problem and only 5 minutes
for someone to resolve the situation when the customer calls with
the problem. Initially one might perceive that the cost of removing
the root cause is far larger than the cost of treating the symptom.
Yet suppose that this symptom is such that when it arises it so
infuriates the customer that they swear they will never buy another
product from you, and will go out of their way for the next year
to tell everyone they meet what a terrible company you are to
do business with. How do you estimate the lost business cost associated
with this situation? And if you think this is a bizarre case,
it is not, for I was personally on an "I hate Midas Muffler"
campaign for over two years because they screwed up the brakes
on my car. In that two years I managed to reach several thousand
people because I preached "I hate Midas Muffler" in
my TQM classes, and continued to use them as an excellent bad
example.
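
The comparison can be made concrete with a rough calculation.
In the sketch below only the $100,000 removal cost and the 5
minutes per call come from the example above. The number of
occurrences per year, the hourly labor rate, and the lost
business per occurrence are invented figures, and the lost
business figure is exactly the one that is hard to estimate
in practice.

```python
def annual_symptom_cost(occurrences_per_year, minutes_per_fix, hourly_rate,
                        lost_business_per_occurrence=0.0):
    """Yearly cost of continuing to treat the symptom."""
    labor = occurrences_per_year * minutes_per_fix * hourly_rate / 60
    lost_business = occurrences_per_year * lost_business_per_occurrence
    return labor + lost_business


def payback_years(removal_cost, annual_cost):
    """Years of symptom cost needed to equal the one-time removal cost."""
    if annual_cost <= 0:
        return float("inf")
    return removal_cost / annual_cost


REMOVAL_COST = 100_000   # from the example in the text
MINUTES_PER_FIX = 5      # from the example in the text
OCCURRENCES = 200        # assumed calls per year
HOURLY_RATE = 60.0       # assumed loaded labor rate, dollars per hour
LOST_BUSINESS = 2_500.0  # assumed lost business per infuriated customer

labor_only = annual_symptom_cost(OCCURRENCES, MINUTES_PER_FIX, HOURLY_RATE)
with_lost = annual_symptom_cost(OCCURRENCES, MINUTES_PER_FIX, HOURLY_RATE,
                                LOST_BUSINESS)

print(f"Labor only: ${labor_only:,.0f}/year, "
      f"payback in {payback_years(REMOVAL_COST, labor_only):.1f} years")
print(f"With lost business: ${with_lost:,.0f}/year, "
      f"payback in {payback_years(REMOVAL_COST, with_lost):.1f} years")
```

Counting labor alone, removing the root cause looks like a poor
investment. Once even a modest lost business figure is included,
the conclusion reverses, which is why the estimate of the symptom's
true cost matters so much.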
Postscript
Is "Root Cause Analysis" really an appropriate phrase?
In this apparently endlessly interconnected world, everything
seems to influence so many other things. Seeking the "Root
Cause" is an endless exercise because no matter how deep
you go there's always at least one more cause you can look for.
Might "Actionable Cause Analysis" be more appropriate?
I think I'm looking for a cause that I can act on that will provide
long term relief from the symptoms, without causing more problems
that I have to deal with tomorrow.
Acknowledgement
Scenarios from NASA's Mishap Investigation Training by Faith Chandler
(https://secureworkgroups.grc.nasa.gov/mi)
Copyright © 2004 Gene Bellinger