"A journey in the realm of systems"

The Way of Systems

Root Cause Analysis


If I have an unwanted situation which consumes resources and tends to recur, it may be beneficial to figure out what is really causing the situation and remove that cause so the situation does not occur again. This is generally referred to as Root Cause Analysis: finding the real cause of the problem and dealing with it, rather than simply continuing to deal with the symptoms.

This raises several questions:

  • How does one determine which situations are candidates for root cause analysis?
  • How does one figure out what the root cause is?
  • Does the removal of the cause entail less resource expenditure than it takes to continue to deal with the symptom?

Determining Candidates

In normal chaotic organizational environments it is often quite difficult to find candidates for root cause analysis because the situations which repeat are either distributed over time, so one doesn't realize they are actually recurring, or they happen to different people, so there isn't an awareness of their recurring nature. When an organization is using an automated problem resolution support system, such as SolutionBuilder, it is very easy to determine which situations are recurring and how often. Every time a solution is used its frequency counter gets updated, so all one has to do is run reports against the system to determine which solutions are being used with what frequency. Those situations which recur with the greatest frequency and consume the greatest amount of resource to rectify are the candidates for root cause analysis.
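The report described above might be sketched as follows. This is only an illustration: the record layout, the field names, and the sample figures are assumptions made for the example, not SolutionBuilder's actual data or report format. It ranks recurring situations by frequency multiplied by the effort each occurrence takes to rectify.

```python
from dataclasses import dataclass

@dataclass
class SolutionRecord:
    # Hypothetical record layout; not taken from any real tool.
    name: str
    times_used: int          # frequency counter, bumped each time the fix is applied
    minutes_per_use: float   # average effort to apply the fix once

def total_minutes(record):
    """Total resource consumed by a recurring situation."""
    return record.times_used * record.minutes_per_use

def rank_candidates(records, top_n=3):
    """Return the situations that consume the most total resource, largest first."""
    return sorted(records, key=total_minutes, reverse=True)[:top_n]

# Sample data invented for illustration only.
records = [
    SolutionRecord("Reset user password", times_used=300, minutes_per_use=5),
    SolutionRecord("Expedite late order", times_used=120, minutes_per_use=45),
    SolutionRecord("Replace leaking gasket", times_used=4, minutes_per_use=90),
]

for record in rank_candidates(records):
    print(f"{record.name}: {total_minutes(record):.0f} minutes in total")
```

Note that the most frequent situation is not necessarily the best candidate. In this invented data the password reset happens most often, yet expediting consumes far more total effort.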

Finding the Root Cause

Most situations which arise within an organizational context have multiple approaches to resolution. These different approaches generally require different levels of resource expenditure to execute. And, due to the immediacy which exists in most organizational situations there is a tendency to opt for the solution which is the most expedient in terms of dealing with the situation. In doing this the tendency is generally to treat the symptom rather than the underlying fundamental problem that is actually responsible for the situation occurring. Yet, in taking the most expeditious approach and dealing with the symptom, rather than the cause, what is generally ensured is that the situation will, in time, return and need to be dealt with again.

Consider the specific example of expediting customer orders in an order fulfillment process. The organization has a well defined process for accepting, processing, and shipping customer orders. When a customer calls and complains about not getting their order the usual response is to expedite. This means that someone personally tracks down this customer's order, assigns it a #1 priority, and ensures it gets shipped ahead of everything else. What isn't realized, until sometime later on, if at all, is that in expediting this order one or more other orders were delayed because the process was disrupted to get this customer's order out the door. What it all comes down to is that expediting orders simply ensures that more orders will have to be expedited later. In systems terms this is a typical "Fixes that Fail" structure which evolves into an "Addiction" structure where the organization becomes addicted to expediting to deal with customer order complaints.

The appropriate response to this situation is to figure out why the order was in need of expediting in the first place. Yet this is seldom done because the task assigned to the expediter was, "get the order shipped!" and that's as far as the thought processes and investigation are apt to go.

To find root causes there is really only one question that's relevant, "What can we learn from this situation?" Research has repeatedly proven that unwanted situations within organizations are about 95% related to process problems and only 5% related to personnel problems. Yet most organizations spend far more time looking for culprits than causes, and because of this misdirected effort they seldom gain the benefit they could from understanding the foundation of the unwanted situation. Consider the following two scenarios.

Scenario # 1

The Plant Manager walked into the plant and found oil on the floor. He called the Foreman over and told him to have maintenance clean up the oil. The next day while the Plant Manager was in the same area of the plant he found oil on the floor again and he subsequently raked the Foreman over the coals for not following his directions from the day before. His parting words were to either get the oil cleaned up or he'd find someone that would.

Scenario # 2

The Plant Manager walked into the plant and found oil on the floor. He called the Foreman over and asked him why there was oil on the floor. The Foreman indicated that it was due to a leaky gasket in the pipe joint above. The Plant Manager then asked when the gasket had been replaced and the Foreman responded that Maintenance had installed 4 gaskets over the past few weeks and each one seemed to leak. The Foreman also indicated that Maintenance had been talking to Purchasing about the gaskets because it seemed they were all bad. The Plant Manager then went to talk with Purchasing about the situation with the gaskets. The Purchasing Manager indicated that they had in fact received a bad batch of gaskets from the supplier. The Purchasing Manager also indicated that they had been trying for the past 2 months to get the supplier to make good on the last order of 5,000 gaskets that all seemed to be bad. The Plant Manager then asked the Purchasing Manager why they had purchased from this supplier if it was so disreputable, and the Purchasing Manager said it was because the supplier was the lowest bidder when quotes were received from various suppliers. The Plant Manager then asked the Purchasing Manager why they went with the lowest bidder and he indicated that was the direction he had received from the VP of Finance. The Plant Manager then went to talk to the VP of Finance about the situation. When the Plant Manager asked the VP of Finance why Purchasing had been directed to always take the lowest bidder the VP of Finance said, "Because you indicated that we had to be as cost conscious as possible, and purchasing from the lowest bidder saves us lots of money!" The Plant Manager was horrified when he realized that he was the reason there was oil on the plant floor. Bingo!

You may find scenario # 2 somewhat funny, and laugh at the situation. It would be better if the situation made you weep, because it is all too often true, in numerous variations on the same theme. Everyone in the organization is doing their best to do the right things, and everything ends up screwed up. The root cause of this whole situation is local optimization with no global thought involved.

Scenario # 2 also provides a good example of how one should proceed to do root cause analysis. One simply has to continue to ask "Why?" until the pattern completes and the cause of the difficulty in the situation becomes rather obvious.
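The repeated "Why?" can be pictured as following a chain of answers from the symptom back to its origin. The sketch below paraphrases the answers from Scenario # 2. The dictionary layout and the function name are assumptions made for illustration, not an established method or tool.

```python
# Each key is an observation; its value is the answer given when someone asked "Why?".
# The entries paraphrase Scenario # 2.
why_answers = {
    "oil on the floor": "leaky gasket in the pipe joint",
    "leaky gasket in the pipe joint": "bad batch of gaskets from the supplier",
    "bad batch of gaskets from the supplier": "Purchasing chose the lowest bidder",
    "Purchasing chose the lowest bidder": "VP of Finance directed lowest-bid purchasing",
    "VP of Finance directed lowest-bid purchasing": "Plant Manager asked for maximum cost consciousness",
}

def ask_why(symptom, answers, max_questions=10):
    """Follow "Why?" answers from a symptom until nobody has a further answer."""
    chain = [symptom]
    while chain[-1] in answers and len(chain) <= max_questions:
        next_cause = answers[chain[-1]]
        if next_cause in chain:  # stop if the explanations start going in circles
            break
        chain.append(next_cause)
    return chain

chain = ask_why("oil on the floor", why_answers)
for depth, step in enumerate(chain):
    print("  " * depth + step)
```

The last entry in the chain is simply the point where the recorded answers run out. Whether that point is a cause worth acting on is still a judgment call.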

To Resolve or Not To Resolve

Once the root cause is determined, one has to decide whether it costs more to remove the root cause or to simply continue treating the symptoms. This is often not an easy determination. Even though it may be relatively easy to estimate the cost to remove the root cause, it is generally very difficult to assess the cost of treating the symptom. This difficulty arises because the cost of the symptom is generally wrapped up in some number of customer and employee satisfaction factors in addition to the resource costs associated with just treating the symptom.

Consider a situation where it is determined that it will cost $100,000 to remove the root cause of a problem and only 5 minutes for someone to resolve the situation when the customer calls with the problem. Initially one might perceive that the cost of removing the root cause is far larger than the cost of treating the symptom. Yet suppose that this symptom is such that when it arises it so infuriates the customer that they swear they will never buy another product from you, and will go out of their way for the next year to tell everyone they meet what a terrible company you are to do business with. How do you estimate the lost business cost associated with this situation? And if you think this is a bizarre case, it is not, for I was personally on an "I hate Midas Muffler" campaign for over two years because they screwed up the brakes on my car. In those two years I managed to reach several thousand people because I preached "I hate Midas Muffler" in my TQM classes, and continued to use them as an excellent bad example.
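A rough back-of-the-envelope version of this comparison might look like the sketch below. Only the $100,000 and the 5 minutes come from the example above. The call volume, labor rate, share of customers lost, and value of a lost customer are invented purely to show how the hidden cost can dominate, and should be replaced with one's own estimates.

```python
def annual_symptom_cost(calls_per_year, minutes_per_call, labor_rate_per_hour,
                        lost_customer_rate, value_per_lost_customer):
    """Split the yearly cost of treating the symptom into labor and lost business."""
    labor = calls_per_year * (minutes_per_call / 60) * labor_rate_per_hour
    lost_business = calls_per_year * lost_customer_rate * value_per_lost_customer
    return labor, lost_business

ROOT_CAUSE_FIX_COST = 100_000  # from the example in the text
MINUTES_PER_CALL = 5           # from the example in the text

# Every figure below is an illustrative assumption, not data.
labor, lost_business = annual_symptom_cost(
    calls_per_year=1_200,
    minutes_per_call=MINUTES_PER_CALL,
    labor_rate_per_hour=30.0,
    lost_customer_rate=0.10,        # share of callers who never buy again
    value_per_lost_customer=500.0,  # yearly revenue lost per departed customer
)
payback_years = ROOT_CAUSE_FIX_COST / (labor + lost_business)

print(f"Labor to treat the symptom: ${labor:,.0f} per year")
print(f"Lost business:              ${lost_business:,.0f} per year")
print(f"Payback on the fix:         {payback_years:.1f} years")
```

Under these assumed figures the visible labor cost is tiny compared with the lost business, and the sketch still leaves out the word-of-mouth damage described above, which is the hardest part to estimate.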


Is "Root Cause Analysis" really an appropriate phrase? In this apparently endlessly interconnected world, everything seems to influence so many other things. Seeking the "Root Cause" is an endless exercise because no matter how deep you go there's always at least one more cause you can look for. Might "Actionable Cause Analysis" be more appropriate? I think I'm looking for a cause that I can act on that will provide long term relief from the symptoms, without causing more problems that I have to deal with tomorrow.


Scenarios from NASA's Mishap Investigation Training by Faith Chandler (

Copyright © 2004 Gene Bellinger