Defining Failures and Being Consistent

To identify when maintenance is required, we first need to define failure. The traditional view was that the older equipment gets, the more likely it is to fail. Under the old definition, failure was the point at which the equipment broke down and was no longer operational.

However, studies have shown that the majority of failures are not age related. Under the new definition of failure, all equipment starts to wear as soon as it enters service, whether it is installed new or restored to as-new condition through repair. Equipment will eventually reach a point where it fails to meet the operating requirement. This failure point is not necessarily predictable: it could come early on or after years of use.

If the equipment has no capability at all, it is in a totally failed, or breakdown, state. If it has some capability but is not meeting the desired level of performance, it is said to be in a functionally failed or partially failed state.
By inspecting equipment condition on a regular basis, you can track early signs or indicators of a partial or functional failure long before the equipment breaks down. Finding these indicators allows maintenance to be targeted more accurately. Looking for indicators of failure is called conducting a condition inspection.

Let’s use an example. We have a pump that is required to supply between 100 and 130 gallons of water to the process. If it supplies any less than 100 gallons, the process will not operate properly. In the past, we defined failure as the point when the pump broke and no longer pumped any water at all. But most failures do not occur instantly.

To track potential failures, we use indicators such as tolerances, gauge readings, or other visible physical signs that equipment condition is deteriorating. Since the failure point is not necessarily related to age, indicators must be monitored on a regular basis.

Let’s use a gauge reading as our indicator. The gauge shows that the pump is only pumping 105 gallons. Since this is at the low end of what the pump is required to do, it is considered a potential failure, or point P on the PF curve. If the deterioration is not corrected, it will continue until the pump is supplying less than 100 gallons of water. The pump is still working, but not at the desired performance level: it has a functional failure. This is today’s definition of failure, the point where the asset fails to perform its intended function.
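The pump example can be sketched as a small classification routine. This is only an illustration: the 100-gallon minimum comes from the example above, while the 10-gallon warning margin that marks the potential failure zone, and all of the names, are assumptions chosen so that a reading of 105 lands at point P.

```python
from enum import Enum


class PumpState(Enum):
    HEALTHY = "healthy"
    POTENTIAL_FAILURE = "potential failure (P)"
    FUNCTIONAL_FAILURE = "functional failure (F)"
    TOTAL_FAILURE = "total failure (breakdown)"


REQUIRED_MIN_FLOW = 100.0  # gallons, from the pump example
WARNING_MARGIN = 10.0      # gallons, assumed: how close to the minimum counts as "low end"


def classify_pump(flow: float) -> PumpState:
    """Map a gauge reading (gallons) to a failure state."""
    if flow <= 0:
        # No capability at all: the old definition of failure.
        return PumpState.TOTAL_FAILURE
    if flow < REQUIRED_MIN_FLOW:
        # Still pumping, but below what the process needs.
        return PumpState.FUNCTIONAL_FAILURE
    if flow <= REQUIRED_MIN_FLOW + WARNING_MARGIN:
        # Meets the requirement, but only just: an indicator worth acting on.
        return PumpState.POTENTIAL_FAILURE
    return PumpState.HEALTHY


for reading in (125, 105, 95, 0):
    print(reading, "->", classify_pump(reading).value)
```

In practice the warning margin would be set per asset, based on how quickly the equipment is expected to deteriorate and how much lead time the maintenance work needs.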

The amount of time that elapses between the detection of a potential failure and its deterioration to functional failure is known as the PF interval. If you define inspection tasks properly, you can detect a potential failure long before functional failure occurs and perform the corrective maintenance work when it will least impact operations.
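As a rough sketch, an observed PF interval can be read off a history of condition inspections. The function name, the thresholds, and the inspection log below are all hypothetical and made up for illustration. Note that the result can only be as precise as the spacing between inspections.

```python
from typing import List, Optional, Tuple


def pf_interval(
    readings: List[Tuple[int, float]],
    p_threshold: float,
    f_threshold: float,
) -> Optional[int]:
    """Days between first detection of P and first detection of F.

    readings: (day, flow) pairs sorted by day.
    Returns None if P or F was never observed.
    """
    p_day = None
    for day, flow in readings:
        if p_day is None and flow <= p_threshold:
            p_day = day  # first reading in the potential failure zone
        if p_day is not None and flow < f_threshold:
            return day - p_day  # first reading below the required minimum
    return None


# Hypothetical inspection log: (day of inspection, gauge reading in gallons).
inspection_log = [(0, 128), (30, 124), (60, 117), (90, 109), (120, 104), (150, 98)]

interval = pf_interval(inspection_log, p_threshold=110, f_threshold=100)
print("Observed PF interval (days):", interval)
```

The inspection frequency then has to be shorter than the PF interval, otherwise a potential failure can progress to functional failure between two inspections without ever being seen.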

Remember that if the potential failure (P) is not detected, the equipment will continue to deteriorate until the point where it reaches functional failure (F). Once enough condition inspection data has been defined, you can calculate the PF Interval and plan maintenance activities.

2 Responses to “Defining Failures and Being Consistent”

  1. Jim Becker says:

    There are other considerations. If the flow is controlled, then the failure is when the process reading for the flow is not at the set point. Then the calculation of SP-PV results in the Delta, which should be compared to the limit. If the set point is low, like 105 gpm, then a Delta of 5 is not as indicative of a pump problem as when the set point is 130 and the Delta is 20. One can also develop a curve for the control valve position versus flow. When the valve position is much higher than normal for the same flow rate, that indicates a pump problem.

  2. I agree with everything the author said as this is pretty much right out of John Moubray’s book. One exception is the last point: “Once enough condition inspection data has been defined, you can calculate the PF Interval and plan maintenance activities.”
    If we look at the Resnikoff Conundrum (p252 in John’s book):
    Many believe that it is not possible to develop a viable maintenance program without extensive data about failures but if we are collecting lots of data about failures it must be because we are not preventing them. Therefore large quantities of failure data must be evidence of the failure of our preventive maintenance programs; especially if the failures have significant consequences. So successful maintenance must be about preventing the collection of the information that some people think we need in order to decide what preventive maintenance we ought to be doing!
    The Resnikoff Corollary states:
    Really successful physical asset management is much more about anticipating and/or preventing failures (which matter) than it is about counting them. Score cards only tell you what you did, not what you should be doing.
    Therefore we need to define the P-F interval without data and refine as we get data, if we can. If we are very successful in developing our maintenance program we will not have enough data to refine the program. This is actually a good sign.
    Paul Lanthier P.Eng.
    Director, The Aladon Network
