Asset Performance Management is the New Pinnacle of the Maintenance Maturity Pyramid

Let’s begin with a favorite quote from Terrence O’Hanlon, CEO of Reliability Web: “If reliability and asset performance are your objectives, here’s the bad news: The best practices for maintenance do not work. In fact, they never did! The planned maintenance domain is not stable.”

In the "pole position," long asserted as the pinnacle, is reliability-centered maintenance (RCM) — managing maintenance through criticality, risk and effective strategy. What does criticality mean in maintenance circles, and how should we assess it? Through opinion or in quantitative ways? The currently accepted maintenance maturity chart shows RCM as the highest level of maturity.

Terrence is right; it’s time for a change. Current maintenance techniques are clearly not working, especially RCM, which is too expensive for the minimal results it achieves, and machines still break down without warning. We must replace RCM with simpler, more cost-effective and value-based tools.

New technologies in asset performance management (APM) are defining new and more effective techniques for maintenance and reliability. Maintenance execution, including RCM efforts, has always been a mysterious doctrine enshrined in opinion and preferences rather than a true science. Based on fact, science is a systematic and logical approach developed through testing, analysis, and measurable results.

Machine learning is one such science-based technology, with methods and theory tailored to specific application domains. It uses “training” data to learn patterns of past behavior; applied to new, unseen data, those patterns yield predictions of future events along with their probabilities. Not all machine learning is created equal, but some technology companies have refined its use in a truly disruptive way, performing condition-based monitoring with far greater efficacy than previous monitoring techniques.

Such techniques are so effective that you no longer need to spend time and effort determining asset criticality with specious and expensive methods such as RCM’s risk priority numbers (RPN). An RPN is a simplistic and unreliable multiplication of three equally weighted ordinals: severity (of the potential failure effect), likelihood of occurrence and likelihood of detection. Each ordinal is determined by subjective opinion, and statistically the product is guaranteed to have a wildly skewed distribution, all while the method pretends that each ordinal is equally important to criticality. (See “Problems With Risk Priority Numbers: Avoiding More Numerical Jabberwocky” by Dr. D. J. Wheeler to learn more.)
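To make the arithmetic concrete, here is a minimal Python sketch that enumerates every possible RPN. It assumes each of the three ordinals is scored on a 1 to 10 scale, which is a common convention but is not stated above; the function and variable names are illustrative only.

```python
from itertools import product
from statistics import mean, median

SCALE = range(1, 11)  # ASSUMPTION: each ordinal is scored from 1 to 10


def rpn(severity: int, occurrence: int, detection: int) -> int:
    """Risk priority number: the plain product of three ordinal scores."""
    return severity * occurrence * detection


# Every possible combination of the three scores (10 * 10 * 10 of them).
all_rpns = [rpn(s, o, d) for s, o, d in product(SCALE, repeat=3)]
distinct = sorted(set(all_rpns))

print("combinations:", len(all_rpns))
print("distinct RPN values:", len(distinct))
print("mean:", mean(all_rpns), "median:", median(all_rpns))

# Very different situations collapse onto the same score.
catastrophic_but_rare = rpn(severity=10, occurrence=1, detection=1)
trivial_but_hidden = rpn(severity=1, occurrence=1, detection=10)
print("same score:", catastrophic_but_rare == trivial_but_hidden)
```

Under that assumed scale, the scores crowd toward the low end of the 1 to 1,000 range (the median sits below the mean), and a catastrophic but rare failure can receive the same number as a trivial but hard-to-detect one.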

Why Risk Priority Numbers Don’t Work for APM

There are two issues. First, criticality and risk are not just about the mechanical integrity of a single machine and its singular effect on the process. The machine is one cog in a whole manufacturing system: if it breaks, the whole system changes. Consequently, criticality also includes the integrity of a machine as a component of the whole system, considering comprehensive risk to operational performance, production impact, feed availability and selection, plans and schedules for manufacturing, inventories and delivery commitments. Diverse situations occur and may need to be considered individually in real time. There is not one plan that works for all scenarios.

Then there's the second issue: even without such an unnatural RCM-RPN exercise, chances are your maintenance manager already knows which critical assets sit at the top of the asset performance optimization effort. A related point is that it’s not just the big critical assets that present risk. An unspared small pump on a fractionator recycle loop may present as much of a complete unit breakdown risk as the large wet gas compressor. It is an environment where lots of small bad things happen, none of which are individually fatal, but the knock-on effects and sum of them can add up to a slow demise, resulting in catastrophe.
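A rough way to see why small unspared items matter is a textbook series/parallel availability calculation. The sketch below is only an illustration: it assumes failures are independent, and every availability figure is hypothetical rather than taken from a real plant.

```python
from math import prod


def series_availability(availabilities):
    """Availability of a unit that needs every listed item running at once."""
    return prod(availabilities)


def spared_availability(single: float, copies: int = 2) -> float:
    """Availability of an item with identical installed spares (any one suffices)."""
    return 1 - (1 - single) ** copies


# HYPOTHETICAL figures for illustration only.
compressor = 0.98
small_pump = 0.98

# The unit stops if either the compressor or the unspared pump stops.
unspared_unit = series_availability([compressor, small_pump])

# Same unit, but the small pump has one installed spare.
spared_unit = series_availability([compressor, spared_availability(small_pump)])

# Fifty individually "minor" items at 99% each, all required at the same time.
many_small_items = series_availability([0.99] * 50)

print(f"unspared pump: {unspared_unit:.4f}")
print(f"spared pump:   {spared_unit:.4f}")
print(f"50 small items in series: {many_small_items:.3f}")
```

In this simplified model the unspared pump costs the unit exactly as much availability as the compressor does, and fifty small items that each look healthy on their own leave the unit available only about 60% of the time.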

Enter APM 4.0

APM 4.0 is about a total asset optimization program that takes in more than simplistic criticality scores. More effectively, it considers total risk ranking: first to safety, health and the environment, and second to mechanical and operational reliability, including product quantity, quality and cost.

New APM 4.0 monitoring uses industrialized machine learning that’s inexpensive, simple and fast to implement, readily scaling to blanket all the assets in the organization with the same safety, environmental and breakdown protection. You do not need to restrict implementation to expensive critical equipment: you can start with those assets and move quickly to protect the whole site. State-of-the-art condition-based monitoring renders traditional criticality-driven asset performance management redundant.

I propose a new APM 4.0 maturity pyramid with significant changes to the top level, based on state-of-the-art technology implementations:

Operational Reliability

To achieve APM 4.0, products and tools must assess more than mechanical reliability. Operational reliability considers the full operating context of an asset, taking into account the perturbations and deviations in operating conditions that cause machine deterioration and process quality distress.

A new concept in APM is the idea of product quality leakage through inadequate process operations. Your APM 4.0 toolset will understand how process variations can provide uneven quality and processing times and prescribe specific corrective action. Similarly, prescriptive maintenance will advise specific operator changes to avoid process-induced equipment deterioration. Remember, you can’t separate the machine from the process or the process from its constituent machines!

Total System Risk and Reliability

APM 4.0 brings two other issues that are equally important to the asset lifecycle and the bottom-line plant performance: operational reliability and total system risk. In this regard, traditional maintenance strategies do not cut it.

Total system reliability considers how different actions on one machine affect the total bottom line. Earlier warnings of pending failures allow additional time to plan. That means orderly, safe equipment shutdowns without environmental incidents. But what is the quantitative difference? You could shut down immediately, in 20 days, 40 days, or run to failure. The analysis goes beyond specific assets to evaluate the entire ecosystem spanning the business, design, operations, maintenance and logistics.

An APM 4.0 solution will identify and assess all possible ways to minimize lost production and revenue under the imminent failure circumstances. Perhaps the right answer is to run 20 days, directing excess intermediate product to tankage to offset product shortfalls when the equipment goes out of service. Only data-driven analysis with quantitative results can help you choose the correct action.
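As a rough illustration of what such a quantitative comparison could look like, the Python sketch below ranks the shutdown options by expected loss. Everything in it is a hypothetical assumption: the margin, outage lengths, failure probabilities and tankage rate are invented for the example, and the cost model is deliberately crude (for instance, it ignores stored product when the failure is unplanned).

```python
from dataclasses import dataclass

# HYPOTHETICAL economics, for illustration only.
DAILY_MARGIN = 100_000              # lost margin per uncovered outage day
PLANNED_OUTAGE_DAYS = 10            # outage length for an orderly shutdown
UNPLANNED_OUTAGE_DAYS = 25          # outage length after a breakdown
UNPLANNED_REPAIR_PENALTY = 1_500_000
BUFFER_DAYS_PER_RUN_DAY = 0.25      # outage days covered by product sent to tankage


@dataclass
class Option:
    name: str
    run_days: int               # days of continued operation before the planned stop
    p_fail_before_stop: float   # assumed chance of failure before that stop


def expected_loss(opt: Option) -> float:
    """Probability-weighted loss of a planned stop versus an unplanned failure."""
    buffer_days = min(opt.run_days * BUFFER_DAYS_PER_RUN_DAY, PLANNED_OUTAGE_DAYS)
    planned = (PLANNED_OUTAGE_DAYS - buffer_days) * DAILY_MARGIN
    unplanned = UNPLANNED_OUTAGE_DAYS * DAILY_MARGIN + UNPLANNED_REPAIR_PENALTY
    p = opt.p_fail_before_stop
    return (1 - p) * planned + p * unplanned


options = [
    Option("shut down immediately", run_days=0, p_fail_before_stop=0.0),
    Option("shut down in 20 days", run_days=20, p_fail_before_stop=0.05),
    Option("shut down in 40 days", run_days=40, p_fail_before_stop=0.5),
    Option("run to failure", run_days=60, p_fail_before_stop=1.0),
]

best = min(options, key=expected_loss)
for opt in sorted(options, key=expected_loss):
    print(f"{opt.name:<24} expected loss: {expected_loss(opt):>12,.0f}")
print("lowest expected loss:", best.name)
```

With these made-up inputs, running 20 more days while building inventory comes out ahead of both an immediate stop and a longer run. Different numbers would produce a different answer, which is the point of doing the analysis with real data.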

Application Interoperability

APM 4.0 cannot be truly effective without full application interoperability: the applications used to manage maintenance execution work processes and strategies must intercommunicate with sources of data, mechanical and operational advice — with limited human intervention. Imagine a large piece of equipment is predicted to fail in 40 days, and a risk and cost process analysis has determined the optimal time to shut it down for service and repair is in 20 days. Intercommunications can determine which other minor pieces of equipment will also be shut down at the same time and assess their maintenance health and history. This data informs a judgment as to which equipment should be serviced at the same time, which may push out or shorten the next turnaround.
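The bundling step in that scenario can be sketched in a few lines of Python. This is a hypothetical illustration, not a description of any product's interface: the asset tags, the health score (0.0 for failed, 1.0 for as new), the thresholds and the `bundle_for_outage` function are all invented, and in practice the inputs would come from the monitoring and maintenance systems exchanging data.

```python
from dataclasses import dataclass


@dataclass
class Asset:
    tag: str
    offline_with: str        # major asset whose outage also takes this one offline
    health: float            # ASSUMED score: 0.0 (failed) to 1.0 (as new)
    days_since_service: int


def bundle_for_outage(major_tag, assets, health_limit=0.7, service_interval=365):
    """Return tags of minor assets worth servicing while major_tag is down.

    An asset qualifies if it goes offline with the major asset anyway and is
    either in poor health or overdue for service. Worst health comes first.
    """
    picked = [
        a for a in assets
        if a.offline_with == major_tag
        and (a.health < health_limit or a.days_since_service > service_interval)
    ]
    return [a.tag for a in sorted(picked, key=lambda a: a.health)]


# HYPOTHETICAL site data for illustration only.
site = [
    Asset("P-201", offline_with="K-101", health=0.55, days_since_service=200),
    Asset("E-310", offline_with="K-101", health=0.90, days_since_service=400),
    Asset("P-202", offline_with="K-101", health=0.95, days_since_service=100),
    Asset("P-405", offline_with="K-205", health=0.30, days_since_service=500),
]

work_list = bundle_for_outage("K-101", site)
print(work_list)
```

Here the degraded pump and the overdue exchanger join the compressor outage, the healthy pump is left alone, and the asset on another train is ignored even though its health is poor, because it would not be offline anyway.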

In summary, I believe APM 4.0 succeeds RCM as the pinnacle of the maintenance maturity pyramid. It is simpler and easier to implement, without intense effort and costs to provide blanket coverage for a whole plant. APM 4.0 covers more bases: operational and mechanical reliability and integrity, true prescriptive advice for maintenance and operations tasks, plus total system risks and costs when determining maintenance and operations activities. In tomorrow's blog, I’ll explain how APM 4.0 and the new maintenance maturity pyramid require a new definition of predictive maintenance.

For more information, read "Maximize Safety, Sustainability and Productivity by Turning Unplanned Downtime Into Planned Downtime."

Comments

Tony Poole Posted on September 18, 2020

The RCM approach developed when "systems" were less complex and asset management was about the component cost. Great perspective Mike.

AspenTech Blog