Availability (Tilgjengelighet)


Availability in practice (and this time, unfortunately, in English)

The instant a critical machine is not running when it is supposed to, the bottom line takes a hit, directly or indirectly. Quantifying this is what the availability metric is for, which makes it a metric to take seriously. A good metric should reflect the value we are aiming for, and although it is often forgotten, true value lies in delivering what the customer needs, whether that customer is internal or external. That is exactly what availability does: it measures to what degree equipment is available to an internal customer when needed. The consequences of equipment not being available when needed are enormous. On critical equipment they are usually measured in thousands of dollars per hour, but other negative effects on long-term economic sustainability are always present too, in one or more of these forms: direct customer impact, quality, safety, environment, and other ripple effects that cause hidden costs throughout the organization.

We all think we know what availability is and therefore do not bother with a rigorous definition. Availability then ends up meaning whatever each individual believes it to mean, which leads to confusion and to situations where no one really trusts the measure and it loses its power. This article describes both how to understand availability in a productive way and how to use it in practice.

Availability definition

The exact wording of the definition varies slightly, but it can always be expressed as something like “the ability of equipment to be available when required”. In practice it is measured as a percentage.

The confusion around the metric and the difficulties in implementing it suggest that this is not an easy definition to understand and apply. There are three key things to understand:

  1. Its purpose.
  2. The meaning of “required” in the definition.
  3. Its practical implementation and use.

Purpose

The purpose is not merely to measure to what degree a piece of equipment is available for operations when required. The purpose is to know the potential that can be realized by improving maintenance and the way the equipment is operated, and to make decisions based on this information.

Everything below 100 % is simply a gap that maintenance and operations should aim to close. The value of the metric lies in a clear-cut, non-negotiable responsibility, which is a great starting point for prioritization and action.

Required

“Required” is central to a correct understanding, but it is not self-explanatory, and this is where it is easy to get into trouble. To understand “required” properly, it is useful to keep in mind the contractual context between the seller and the buyer of the equipment.

The seller has sold equipment with performance specifications. But these specifications are only valid if both the stated and the implied requirements for equipment support are met. Buying the equipment alone gives us nothing. Among other things, it needs a specified amount of power, maintenance, and operating personnel. It happens all the time that equipment is bought, commissioned, and paid for, and then fails to meet the expected performance during the operations phase. The buyer has simply underestimated the stated and implied requirements for operating and maintaining the purchased equipment. The buyer might complain to the seller, but often to no avail, because there are usually plenty of causes to point at outside the equipment itself: causes upstream or downstream of the equipment, variations in raw materials that fall outside the design specifications, poor quality in maintenance or operating activities, or an unsuitable operating environment.

One of the critical requirements from the seller is that some predetermined maintenance must be carried out to achieve the performance specifications. It would therefore be unreasonable to require the equipment to be running while this stated or implied maintenance is being executed. Hence the time spent on this type of maintenance should not be counted as “required” time for the equipment to be available for operation.

Failing to dimension this needed maintenance correctly will result in time and energy wasted on futile arguments with the seller, or in a maintenance organization wasting resources trying to lift reliability above what the inherent reliability of the design allows.

Practical use

If only it were as simple as automatically recording running time and doing some calculations. The complicating factor is the required time. It is not enough to know when something is not running; the cause must also be known, at least at a high level. For example, if downtime is registered every time there is a lunch break, this time must be categorized as non-required time. Evaluating equipment performance based on the length of lunch breaks would not be fair to the equipment.

Another non-operating time that cannot count as downtime affecting the availability metric is predetermined (periodic) maintenance. As explained earlier, this time is mainly determined by the design, and any grievances about cost, time, etc. belong with the project organization and the decision makers behind the specific equipment, not with the maintenance organization or operations, who have little opportunity to optimize design-related issues.

The bar below is a timeline: green is running time and orange is non-running time. No causes are given, so it is not possible to calculate availability from these registrations.

In another timeline (below), green is running time and orange is non-running time caused by failures. Since availability is the proportion of green time to overall time, it is easily calculated from the given failure times and the overall time:

Availability = (30 - 0,25 - 0,5 - 0,25) / 30 = 29 / 30 = 0,967 = 97 %

If the same equipment also has a maintenance plan with a scheduled 6-month maintenance interval, the non-running time doubles, but availability goes down only slightly, because predetermined maintenance is not counted as required time (see the bar below for an illustration). In this case the availability is:

Availability = (30 - 1 - 0,25 - 0,5 - 0,25) / (30 - 1) = 28 / 29 = 0,966 = 97 %
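
The two calculations above can be reproduced with a small function. This is a minimal sketch: the function name and parameter names are chosen for illustration only, the durations are the ones read off the timelines, and the code uses decimal points where the text uses decimal commas.

```python
def availability(total_time, failure_times, non_required_times=()):
    """Share of the required time in which the equipment was not down due to failure.

    total_time, failure_times and non_required_times must use the same time unit.
    """
    # Non-required time (e.g. predetermined maintenance) is removed from the
    # denominator, so it does not count against the equipment.
    required_time = total_time - sum(non_required_times)
    if required_time <= 0:
        raise ValueError("required time must be positive")
    return (required_time - sum(failure_times)) / required_time


failures = [0.25, 0.5, 0.25]

# Failures only: 29 / 30
print(round(availability(30, failures), 3))       # 0.967
# With 1 time unit of predetermined maintenance: 28 / 29
print(round(availability(30, failures, [1]), 3))  # 0.966
```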

A slightly more realistic example is illustrated below. Even though it is still too simple compared to the real world, things already start to get messy:

A few might have a dedicated IT tool for monitoring and timing stops that can juggle all the different causes, and even then manual input is most likely needed, in the form of an operator coding the stops. With a tool like this, availability could be calculated:

Availability = (30-0,1-1-1-0,25-0,5-0,25-0,1-2) / (30-0,1-1-1-0,1-2) = 24,8 / 25,8 = 0,96 = 96 %

Again, availability is affected relatively little, even though the non-running time has more than doubled.

The diagram below is borrowed from murbox.com (no affiliation) and may give a more realistic picture of running time (green) and the different stop codes during two shifts.
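
A rough sketch of what such a tool does with operator-coded stops is shown here. The stop codes (`break`, `planned_maintenance`, `no_orders`, `failure`) are hypothetical labels invented for this illustration, since the article does not name the cause behind each stop; only the durations come from the calculation above.

```python
# Hypothetical stop codes: the cause labels are assumptions, not taken from the article.
NON_REQUIRED = {"break", "planned_maintenance", "no_orders"}
FAILURE = {"failure"}


def availability_from_stops(total_time, stops):
    """Compute availability from (stop_code, duration) records."""
    known = NON_REQUIRED | FAILURE
    uncoded = [code for code, _ in stops if code not in known]
    if uncoded:
        # A stop without a usable cause cannot be classified, so refuse to guess.
        raise ValueError(f"stops without a known code: {uncoded}")
    non_required = sum(d for code, d in stops if code in NON_REQUIRED)
    failure = sum(d for code, d in stops if code in FAILURE)
    required = total_time - non_required
    return (required - failure) / required


stops = [
    ("break", 0.1), ("planned_maintenance", 1), ("no_orders", 1),
    ("failure", 0.25), ("failure", 0.5), ("failure", 0.25),
    ("break", 0.1), ("no_orders", 2),
]
print(round(availability_from_stops(30, stops), 2))  # 0.96
```

Raising an error on uncoded stops reflects the point made earlier: without a cause, a stop cannot be placed inside or outside the required time.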

A simpler way

Most do not have dedicated IT tools for this kind of monitoring. A solution that works well enough in most cases is simply to record failure durations and define all other time as required time. As the examples above demonstrate, availability is fairly robust against even big changes in non-operating time, as long as the failure times are registered. We are then back to the timeline below, with the calculation:

Availability = (30-0,25-0,5-0,25) / 30 = 29/30 = 0,967 = 97 %
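
To get a feel for how much the simplification costs, the simplified figure can be compared with the detailed one from the messier example. This is only a sketch using the numbers from the timelines; the variable names are illustrative.

```python
total_time = 30
failure_time = 0.25 + 0.5 + 0.25           # failure durations from the timelines
non_required_time = 0.1 + 1 + 1 + 0.1 + 2  # the other stops in the messier example

# Simplified: everything that is not a failure counts as required time.
simplified = (total_time - failure_time) / total_time
# Detailed: non-required stops are removed from the required time first.
required_time = total_time - non_required_time
detailed = (required_time - failure_time) / required_time

print(f"simplified: {simplified:.3f}")             # 0.967
print(f"detailed:   {detailed:.3f}")               # 0.961
print(f"difference: {simplified - detailed:.3f}")  # 0.005
```

The simplified value can only overstate availability, because the same failure time is divided by a larger required time. Here the overstatement is about half a percentage point.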

Given that the purpose is to know the potential and to do something about it, it can be argued that the availability metric should usually be applied only to an exclusive few pieces of equipment, where there is both the commitment and the strength to follow up and make improvements. Luckily, the nature of these selected few is typically conducive to simplifications like the one mentioned. And should there be cases where the simplification is not meaningful, the thousands of dollars per hour typically lost in these cases would probably make it worthwhile to establish a system that provides the correct values.

It is not just about consistency – the degree of correctness matters. Resources in maintenance departments are always too scarce for the expressed needs since there is no cap on maintenance demand. That means the numbers must be trusted or they will surely be ignored.

How far back?

A relevant question will always be how far back in time we should go when analyzing the data. Availability may have cyclical tendencies tied to the calendar year because of ambient temperature changes, vacation periods with less skilled personnel on the job, etc. If that is the case, it may be useful to plot a monthly availability metric over at least a year.
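
A monthly series of this kind can be built directly from a failure log. The sketch below uses the simplified method, where all calendar time counts as required time, and books each failure in the month in which it started. Both of these are assumptions made for the illustration, and the log entries are invented.

```python
import calendar
from collections import defaultdict
from datetime import date

# Hypothetical failure log: (date the failure started, downtime in hours).
failure_log = [
    (date(2023, 1, 12), 6.0),
    (date(2023, 1, 27), 3.5),
    (date(2023, 2, 3), 12.0),
    (date(2023, 4, 19), 2.0),
]


def monthly_availability(log, year):
    """Return {month: availability} for one calendar year."""
    downtime = defaultdict(float)
    for day, hours in log:
        if day.year == year:
            downtime[day.month] += hours
    result = {}
    for month in range(1, 13):
        # All calendar hours in the month are treated as required time.
        hours_in_month = calendar.monthrange(year, month)[1] * 24
        result[month] = (hours_in_month - downtime[month]) / hours_in_month
    return result


for month, value in monthly_availability(failure_log, 2023).items():
    print(f"{calendar.month_abbr[month]}: {value:.1%}")
```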

It is possible to go too far back. The aim is usually to gain knowledge relevant to the present situation, and past data quickly becomes irrelevant or misleading. Analysis like this requires a human brain able to think about how the data is used. Endless variations between equipment and applications make cookbook recipes problematic, so some critical thinking needs to be involved.

Comparing data points

Every time a metric is computed, it will differ from its previous value. It will be either lower or higher, even if nothing has changed in practice. That is why two data points alone should never be compared to each other. KPIs are random variables, so their values change even when the underlying process generating them stays the same. This applies to all KPIs, yet it is widely ignored in organizations and is easily one of the biggest causes of waste in any enterprise.

Instead, the availability metric must be plotted on a timeline to distinguish real changes from “white noise” by observing variation. Plotting monthly values in a diagram works well in practice (human brains do not deal well with values in tables).

Plot the dots!
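
The article does not prescribe a specific method for separating real changes from noise. One common technique, swapped in here as an assumption, is the individuals (XmR) chart, which places limits at the mean plus and minus 2.66 times the average moving range. The sketch below computes those limits for invented monthly values and flags points that fall outside them.

```python
from statistics import mean


def natural_process_limits(values):
    """Limits of an individuals (XmR) chart: mean +/- 2.66 * average moving range."""
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    centre = mean(values)
    spread = 2.66 * mean(moving_ranges)
    return centre - spread, centre, centre + spread


# Hypothetical monthly availability values in percent.
monthly = [96.8, 97.4, 96.1, 97.0, 96.5, 97.2, 96.9, 97.3, 96.4, 97.1, 96.7, 92.0]

# Use the first 11 months as the baseline, then check every month against it.
lower, centre, upper = natural_process_limits(monthly[:-1])
for month, value in enumerate(monthly, start=1):
    flag = "signal" if not lower <= value <= upper else ""
    print(f"{month:2d} {value:5.1f} {flag}")
```

In this invented series the first eleven months move up and down inside the limits and would not justify any conclusion, while the last month falls below the lower limit and is worth investigating.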

Method justification

It is possible to derive mathematical expressions that define availability, but it is important to recognize that purely mathematical models crumble when facing a complex and ever-changing manufacturing environment.

The definition and calculation used in this article do not rely on any model and are therefore robust against violations of the assumptions such a model would demand. This is an important point, because complex and ever-changing manufacturing environments eat assumptions for breakfast.

The results obtained from the demonstrated methods will be so-called “point estimates” of the probability of the equipment being available at a single point in time when it is required. This is both a theoretically valid and a practical definition of availability which does not violate the definition given in the beginning of the article.

Next steps

Knowing an availability value means knowing the improvement potential in maintenance and in the way of operating. This should lead to actions. But knowing a specific value of an availability metric says nothing about what these actions should be. No one will be impressed, or any wiser, by being told that the solution is to reduce the number of failures or the repair time.

The next step is therefore knowing the what and why. What are the specific failures and why are they happening? Not small questions! It then becomes important for this work to be justified properly. And how would we know without measuring?

Not least, how would we truly know that the actions following from the what and the why are working? We need to know in order to learn, because without learning there will be no improvement. If learning is not the issue, then the question becomes why the necessary actions were not taken before. The options are then narrowed down dangerously close to being either stupid or lazy.

If these registrations succeed, one is left with data that may later turn out to be a gold mine, since it opens up a range of lifetime calculations. Lifetime calculations are used in optimization, but they are mostly for organizations that already perform at a high level in operations and maintenance. Recommended reading on optimization in maintenance is Maintenance, replacement and reliability.

A master's thesis in Norwegian on the topic can be downloaded here.
