Mining Equipment Maintenance Fundamental Concepts
Old Underground Coal Mines
Modern Longwall Mine
The Effect of Mechanisation on Productivity
[Figure: Tonnes/Man-Year Produced in Queensland Coal Mines, plotted by year; vertical axis from 0 to 25000]
Maintenance Costs • 30-50% of the operating costs • Annual bill is about $10 billion • Equally significant is the cost of lost production when the machine is down • Every 1% improvement in equipment availability or productivity improves the company profits by up to 3.5%
Objective Function for Maximising Company profits The Annual Production for a given investment is a suitable objective function
Total Annual Production • Is this the total annual production? Tonnes/Hour x Total Hours • A more realistic prediction is Tonnes/Hour x (Total Hours - Downtime) • The following is probably more illuminating: Tonnes/Hour x (Total Hours - Scheduled Downtime - Breakdown Maintenance)
Total Hours of Possible Production
Subtract Planned Maintenance
Subtract Breakdowns This is the time available for production. Obviously, you want to maximise this time. That is why the ratio of this time to the total time is a very important KPI.
Availability
$$A = \frac{TH - PM - BM}{TH}$$
TH: Total Hours; PM: Planned Maintenance; BM: Breakdown Maintenance
Availability
$$A = \frac{TH - PM - BM}{TH} = \frac{\text{Operating Time}}{\text{Total Time}}$$
Availability is also expressed in terms of MTBF and MTTR as
$$A = \frac{MTBF}{MTBF + MTTR}$$
This is the textbook definition of AVAILABILITY.
Availability With N the number of failures: Operating Time = N x MTBF, Downtime = N x MTTR, Total Time = N x (MTBF + MTTR)
$$A = \frac{\text{Operating Time}}{\text{Total Time}} = \frac{N \times MTBF}{N \times (MTBF + MTTR)}$$
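The two forms of availability can be checked against each other numerically. This is a minimal sketch: the function names and the 72-cycle machine are assumptions made for illustration, and planned maintenance is set to zero so that the hours-based form and the MTBF/MTTR form describe the same situation.

```python
def availability_from_hours(total_hours, planned_maintenance, breakdown_maintenance):
    """A = (TH - PM - BM) / TH, all arguments in hours."""
    operating_time = total_hours - planned_maintenance - breakdown_maintenance
    return operating_time / total_hours


def availability_from_mtbf(mtbf, mttr):
    """Textbook form: A = MTBF / (MTBF + MTTR)."""
    return mtbf / (mtbf + mttr)


# Hypothetical machine: 72 failure-repair cycles, MTBF = 100 h, MTTR = 20 h,
# and (for this comparison only) no planned maintenance.
n_cycles, mtbf, mttr = 72, 100.0, 20.0
total_hours = n_cycles * (mtbf + mttr)   # N x (MTBF + MTTR)
breakdown_hours = n_cycles * mttr        # N x MTTR

a_hours = availability_from_hours(total_hours, 0.0, breakdown_hours)
a_mtbf = availability_from_mtbf(mtbf, mttr)
print(f"A from hours: {a_hours:.4f}, A from MTBF/MTTR: {a_mtbf:.4f}")
```

Both routes give the same number because N cancels in the ratio.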
Maximise the Return to the Company The aim for the company is to maximise the return from its productive assets. Let us keep it simple and express this as maximising the production through the year. To achieve this, (a) The machine availability must be high (b) The machine must be producing at a high rate when it is operating The solution is a compromise between the two. We will see this in an example.
Optimum Bucket Size The recommended suspended load for your dragline is 150 tonnes. A competitor put a larger bucket on a similar unit and they are running it at a suspended load of 170 tonnes. Your Manager wants you to install an even bigger bucket, taking your suspended load up to 210 tonnes. You know that the safe static load for this dragline is 250 tonnes. -Does a larger bucket necessarily mean higher production? -Why not increase it to 250 tonnes if this is the safe load? -How would you determine the optimum bucket size?
Question
$$A = \frac{TH - PM - BM}{TH}$$
Present Production Po = 9000 t/h. Total Mine Operating Time in a Year = 8640 h. There is a 12-hour planned maintenance shift every month. Breakdown maintenance downtime is expected to vary with the production rate as
$$BM = 336 \times R^3, \quad \text{where } R = \frac{P}{P_o}$$
P is the production rate with the bigger bucket. Find R that maximises the annual production.
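One way to work the question is sketched below. It assumes that planned maintenance totals 12 shifts x 12 h = 144 h per year and that annual production is P x (TH - PM - BM); both are readings of the problem statement, not something the slides spell out. The closed-form optimum comes from setting the derivative of R x (TH - PM - 336 R^3) to zero, and a coarse grid search is included as a cross-check.

```python
PO = 9000.0            # present production rate, t/h
TOTAL_HOURS = 8640.0   # total mine operating time per year, h
PM_HOURS = 12.0 * 12   # assumption: twelve monthly 12-hour shifts


def breakdown_hours(r):
    """Breakdown maintenance downtime BM = 336 * R^3, in hours per year."""
    return 336.0 * r ** 3


def annual_production(r):
    """Tonnes per year at relative production rate R = P / Po."""
    return PO * r * (TOTAL_HOURS - PM_HOURS - breakdown_hours(r))


# d/dR [R * (T - 336 R^3)] = T - 4 * 336 * R^3 = 0, with T = TH - PM
r_opt = ((TOTAL_HOURS - PM_HOURS) / (4 * 336.0)) ** (1.0 / 3.0)

# Cross-check with a grid search over R = 1.000, 1.001, ..., 2.500
candidates = [1.0 + 0.001 * i for i in range(1501)]
r_grid = max(candidates, key=annual_production)

print(f"R (closed form) = {r_opt:.3f}, R (grid) = {r_grid:.3f}")
print(f"Production gain over R = 1: {annual_production(r_opt) / annual_production(1.0):.2f}x")
```

The result depends entirely on the assumed cubic growth of breakdown downtime; the model says nothing about structural limits such as the safe static load.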
Solution
[Figure: lost production due to breakdown maintenance]
Optimum Point
Dependence on MTBF
Maintenance • Maintenance must be considered in the context of asset utilisation • Mining is an asset-rich industry • Optimum utilisation of these assets is the only way a company will stay competitive • This is a task for both production and maintenance engineers.
Some Basic Concepts
Failure Loss of ability of an item to perform its required function
Failure Broken teeth on shearer downdrive gear
Failure Cost associated with this failure: •Lost production when the machine was down •Replacement gear •Maintenance labour
Failure Rate, MTBF, MTTR
[Figure: timeline in hours (0, 100, 200) showing alternating MTBF intervals and repair (MTTR) intervals, with each failure marked]
MTBF = Mean Time Between Failures (100 h)
Failure Rate = Number of failures per unit time (0.01 h^-1)
MTTR = Mean Time To Repair (20 h)
Historical Records Failures occur randomly. The repair time is also not constant. How do we find MTBF and MTTR?
If we treat failure as a random event, then we can use the well-established tools of probability and statistics to model the uptime, downtime and availability for our equipment.
“Random Failure”? All nature is but art, unknown to thee All chance, direction thou canst not see All discord, harmony not understood
Alexander Pope, Essay on Man
Poisson Process Poisson distribution is commonly used in forecasting to represent the number of occurrences of a specific event in a given continuous interval. •Ships arriving at a dock on a given day •Traffic accidents on the SE freeway in a month •Mad cow disease breakouts in the world in one year •Typos per page in a long report typed by Hal Gurgenci •Cable shovel failures in one day of operation
Poisson Distribution
$$p(x) = \frac{e^{-\lambda t} (\lambda t)^x}{x!}$$
This is the probability distribution of the Poisson random variable X representing the number of outcomes occurring in a given time interval t. The parameter $\lambda$ is the average number of outcomes per unit time.
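The Poisson formula is easy to evaluate directly. The sketch below uses only the standard library; the function name and the rate of 2 failures per day are illustrative assumptions.

```python
import math


def poisson_pmf(x, rate, t):
    """Probability of exactly x events in an interval t, given an average rate per unit time."""
    mean = rate * t
    return math.exp(-mean) * mean ** x / math.factorial(x)


# Hypothetical cable shovel averaging 2 failures per day, observed for one day.
for x in range(5):
    print(f"P(X = {x}) = {poisson_pmf(x, rate=2.0, t=1.0):.4f}")
```

The probabilities over all x sum to one, which is a quick sanity check on any rate and interval.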
Reliability Assume failure events follow a Poisson distribution. What is the probability of having NO FAILURES in a given time interval t? This can be found by substituting x=0 in the Poisson distribution function:
$$p(0) = \frac{e^{-\lambda t} (\lambda t)^0}{0!} = e^{-\lambda t}$$
This is referred to as the survival probability or the reliability
Reliability
$$R(t) = e^{-\lambda t}$$
Reliability is the probability that a product will operate throughout a specified period without failure –when maintained in accordance with the manufacturer's instructions; and –when not subjected to the environmental or operational stresses beyond limits stipulated by the manufacturer
The value of e In 1898, a Polish statistician, Ladislaus Bortkiewicz, investigated Prussian army fatalities caused by horse kicks. According to army reports, the rate was about one fatality every 1.64 years. He collected 200 reports, each covering one year, and 109 of them recorded no deaths at all. Can you estimate the value of e using the above data?
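A possible route to the answer, using only the figures quoted in the exercise: the Poisson probability of zero events in one year is e^(-rate x 1), and the observed fraction of reports with no deaths estimates that probability. Solving for e gives the sketch below; the variable names are assumptions.

```python
import math

rate_per_year = 1.0 / 1.64          # one fatality every 1.64 years
reports_total = 200
reports_with_no_deaths = 109

# p(0) = e^(-rate * 1 year)  =>  e = p(0)^(-1 / rate)
p_zero_observed = reports_with_no_deaths / reports_total
e_estimate = p_zero_observed ** (-1.0 / rate_per_year)

print(f"Estimated e = {e_estimate:.3f} (math.e = {math.e:.3f})")
```

The estimate lands within about half a percent of the true value, which says as much about how well the data follow a Poisson process as about e itself.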
Example A major piece of equipment fails twice a day on average. Consider its reliability over a period of a month. What is the probability of a failure at any time during that month?
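Reliability $R(t) = e^{-2t}$ [Figure: plot of $R(t) = e^{-2t}$, t in days]
Zoom In $R(t) = e^{-2t}$ [Figure: the same plot, zoomed in]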
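The example can be evaluated numerically. The sketch below assumes a 30-day month and measures t in days; `math.expm1` is used so that the failure probability stays accurate for very short intervals.

```python
import math

RATE_PER_DAY = 2.0  # the equipment fails twice a day on average


def reliability(t_days, rate=RATE_PER_DAY):
    """Probability of no failure in t_days: R(t) = exp(-rate * t)."""
    return math.exp(-rate * t_days)


def failure_probability(t_days, rate=RATE_PER_DAY):
    """Probability of at least one failure in t_days: 1 - R(t)."""
    return -math.expm1(-rate * t_days)


for t in (0.1, 0.5, 1.0, 30.0):  # 30.0 = assumed length of a month in days
    print(f"t = {t:>4} d: R = {reliability(t):.3e}, F = {failure_probability(t):.6f}")
```

Over a month the reliability is vanishingly small, so a failure is a practical certainty; the interesting behaviour is all within the first day or two, which is why the plot is zoomed in.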
Failure Probability If the reliability, R(t), is the probability to survive through time t, then the probability of failing through that period is 1 – R(t) or
$$F(t) = 1 - R(t) = 1 - e^{-\lambda t}$$
Let us plot this.
Failure Probability This chart gives the probability of failure between time 0 and time t. In other words, it is the Cumulative Distribution Function for failures. How would we find the probability density function, or p.d.f., which is sometimes useful? We differentiate the cumulative distribution function.
Failure Distribution Function Cumulative Distribution Function
$$F(t) = 1 - R(t) = 1 - e^{-\lambda t}$$
The time-derivative of the c.d.f. gives the Probability Density Function or p.d.f.
$$f(t) = \frac{dF}{dt} = \lambda e^{-\lambda t}$$
This form of p.d.f. is called the Exponential Distribution. It represents the case when the hazard rate or failure rate, $\lambda$, is constant over time.
Failure p.d.f. The probability of failure through a unit time interval is given by
$$p = \int_t^{t+1} f(t)\,dt$$
Let us zoom into the p.d.f. The probability of failure through a unit time interval is given by the area under the curve
$$p = \int_t^{t+1} f(t)\,dt \approx f\left(t + \tfrac{1}{2}\right) \times 1 \approx f\left(t + \tfrac{1}{2}\right) \approx f(t)$$
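How good are these approximations? A small sketch for the exponential p.d.f. follows, assuming a failure rate of 0.01 per hour (MTBF = 100 h) and an arbitrary starting time of 50 h. The exact integral has a closed form, R(t) - R(t+1), so the midpoint value and f(t) can be compared against it.

```python
import math

FAILURE_RATE = 0.01  # per hour, i.e. MTBF = 100 h (assumed for illustration)


def pdf(t, rate=FAILURE_RATE):
    """Exponential failure p.d.f.: f(t) = rate * exp(-rate * t)."""
    return rate * math.exp(-rate * t)


def exact_unit_interval_probability(t, rate=FAILURE_RATE):
    """Integral of f from t to t + 1, which equals R(t) - R(t + 1)."""
    return math.exp(-rate * t) - math.exp(-rate * (t + 1.0))


t = 50.0  # hours, arbitrary starting time
exact = exact_unit_interval_probability(t)
midpoint = pdf(t + 0.5)   # area approximated by the mid-interval height
left = pdf(t)             # cruder approximation: height at the start

print(f"exact = {exact:.6f}, f(t + 1/2) = {midpoint:.6f}, f(t) = {left:.6f}")
```

When the unit interval is short compared with the MTBF, all three values agree to within about one percent.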
Hazard Rate The hazard rate is the conditional probability of failure in a small time interval (t, t+dt). It is conditional on there being no failure until t:
$$h(t) = \frac{f(t)}{R(t)}$$
For exponential failure distribution, the hazard rate is constant:
$$h(t) = \frac{f(t)}{R(t)} = \frac{\lambda e^{-\lambda t}}{e^{-\lambda t}} = \lambda$$
Constant Hazard Rate The exponential distribution corresponds to a constant failure rate a.k.a. constant hazard rate
Is $\lambda$ always constant? Failure p.d.f.
$$f(t) = \lambda e^{-\lambda t}$$
The constant $\lambda$ is the failure rate ($\lambda$ = 1/MTBF). So far, we treated $\lambda$ as constant.
Reliability function
$$R(t) = e^{-\lambda t}$$
This is called the exponential distribution. Is this always true? Let us first give an example.
Uniformly Increasing Rate Assume the failure rate increases according to the formula 0.1t, where t is measured in days. It starts from zero and, at the end of the month, the hazard rate is 3 failures per day. How do we generate the reliability function for this component?
Reliability with varying hazard rate (m = 0.1) In a sample of N, dN will fail in a time interval dt:
$$dN = -N m t\,dt \;\Rightarrow\; \frac{dN}{N} = -m t\,dt \;\Rightarrow\; \ln N = -m\frac{t^2}{2} + \ln C$$
$$N = C e^{-mt^2/2} = N_o e^{-mt^2/2}$$
where $N_o$ is the value of N at t = 0. Then the reliability function is
$$R(t) = \frac{N}{N_o} = e^{-mt^2/2}$$
This corresponds to a Weibull distribution.
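The derivation can be verified numerically. The sketch below integrates the hazard rate 0.1t with a midpoint rule, compares the result with the closed form exp(-m t^2 / 2), and checks that it matches a Weibull reliability with shape factor 2 and scale factor sqrt(2/m). That parameter mapping follows from equating exponents; the function names are assumptions.

```python
import math

M = 0.1  # slope of the hazard rate, failures per day per day


def hazard(t):
    """Linearly increasing hazard rate h(t) = m * t."""
    return M * t


def reliability_closed_form(t):
    """R(t) = exp(-m t^2 / 2), from the derivation."""
    return math.exp(-M * t * t / 2.0)


def reliability_numeric(t, steps=10000):
    """R(t) = exp(-integral of h from 0 to t), using the midpoint rule."""
    dt = t / steps
    integral = sum(hazard((i + 0.5) * dt) for i in range(steps)) * dt
    return math.exp(-integral)


def weibull_reliability(t, shape, scale):
    """Weibull reliability exp(-(t / scale)^shape)."""
    return math.exp(-((t / scale) ** shape))


SCALE = math.sqrt(2.0 / M)  # makes (t / scale)^2 equal to m t^2 / 2

t = 3.0  # days
print(reliability_closed_form(t), reliability_numeric(t), weibull_reliability(t, 2.0, SCALE))
```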
Weibull Reliability
$$R = e^{-\left(t/\eta\right)^{\beta}}$$
R: the probability of surviving through time t
$\beta$: shape factor
$\eta$: scale factor
Weibull Distribution Curves
c.d.f.
$$F(t) = 1 - R(t) = 1 - e^{-(t/\eta)^{\beta}}$$
p.d.f.
$$f(t) = \frac{dF}{dt} = \frac{\beta}{\eta^{\beta}}\, t^{\beta-1} e^{-(t/\eta)^{\beta}}$$
Hazard Rate
$$h(t) = \frac{f(t)}{R(t)} = \frac{\beta}{\eta^{\beta}}\, t^{\beta-1}$$
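The three Weibull functions can be written out in a few lines. In this sketch the shape factor is called `beta` and the scale factor `eta`; those names, and the 100-hour scale used in the demonstration, are assumptions. The loop shows how the hazard rate falls, stays flat or rises depending on the shape factor.

```python
import math


def weibull_reliability(t, beta, eta):
    """R(t) = exp(-(t / eta)^beta)."""
    return math.exp(-((t / eta) ** beta))


def weibull_pdf(t, beta, eta):
    """f(t) = (beta / eta^beta) * t^(beta - 1) * exp(-(t / eta)^beta), for t > 0."""
    return (beta / eta ** beta) * t ** (beta - 1.0) * weibull_reliability(t, beta, eta)


def weibull_hazard(t, beta, eta):
    """h(t) = f(t) / R(t) = (beta / eta^beta) * t^(beta - 1), for t > 0."""
    return (beta / eta ** beta) * t ** (beta - 1.0)


ETA = 100.0  # hypothetical scale factor, hours
for beta in (0.5, 1.0, 2.0):
    h10, h20 = weibull_hazard(10.0, beta, ETA), weibull_hazard(20.0, beta, ETA)
    print(f"beta = {beta}: h(10 h) = {h10:.5f}, h(20 h) = {h20:.5f}")
```

With a shape factor of 1 the hazard is constant at 1/eta, which recovers the exponential distribution.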
Weibull curves with different hazard functions
$$R = e^{-(t/\eta)^{\beta}}, \qquad h(t) = \frac{\beta}{\eta^{\beta}}\, t^{\beta-1}$$
Bathtub Curve
[Figure: failure rate $\lambda$ against life in operation (hours), with three regions labelled Infant Mortality, Useful Working Life and Falling Apart, and the Optimum Point to Replace marked]
Actual Failure Patterns These curves are the failure patterns observed on aircraft components in a study completed in 1978 by Nowlan and Heap.
[Figure: failure pattern curves labelled 4%, 2%, 5%, 7% and 82% (68% of this with infant mortality)]
This shows that only 4% of the components go through a bathtub curve.
RELIABILITY OF SYSTEMS Series Systems Parallel Systems (Redundancy)
Series Systems A series system is a chain of components. When one of these parts fails, the entire system fails.
Series Systems
[Figure: components A, B and C connected in series]
$$R = R_A R_B R_C$$
Parallel Systems A parallel system fails only when every individual component has failed. The system failure probability is then the product of the individual failure probabilities (1 - R).
[Figure: components A, B and C connected in parallel]
$$R = 1 - (1 - R_A)(1 - R_B)(1 - R_C)$$
Most mining machinery systems are series systems. In other words, the failure of one component fails the entire system. The redundancy in mining can be provided by having multiple systems, eg spare trucks or shovels.
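The difference between the two arrangements is easy to see with numbers. The sketch below assumes independent component failures and uses three hypothetical components, each with a reliability of 0.9; the function names are illustrative.

```python
import math


def series_reliability(reliabilities):
    """Series system: every component must survive, so R = product of R_i."""
    return math.prod(reliabilities)


def parallel_reliability(reliabilities):
    """Parallel system: fails only if all components fail, so R = 1 - product of (1 - R_i)."""
    return 1.0 - math.prod(1.0 - r for r in reliabilities)


components = [0.9, 0.9, 0.9]  # hypothetical component reliabilities
print(f"Series:   {series_reliability(components):.3f}")
print(f"Parallel: {parallel_reliability(components):.3f}")
```

Three good components in series give a noticeably worse system, while the same three in parallel give a far better one, which is the case for spare trucks or shovels.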
Managing Reliability • Optimum utilisation of its capital investment in equipment is essential for company profits • Equipment reliability plays a major role in this • Therefore, managing reliability is a core business for a mining company • This is a task for both production and maintenance engineers. In the rest of this presentation, we will talk about the maintenance function.
Maintenance Function • Preventive Maintenance – Prevent failures by performing a set of maintenance tasks at periodic intervals • Service • Inspection • Replacement
• Corrective Maintenance – Repair after a failure to bring the machine back to an operating state
• Which one delivers higher overall system availability?
Corrective Maintenance • We assume that corrective maintenance brings the system to an “as new” state. • It then has no effect on system reliability. • Its impact on system availability is measured by the Mean Time To Repair (MTTR).
Mean Time To Repair • Fault Identification – What caused the failure? What needs to be repaired?
• Set-up time – Find and bring the right person to the job – Actual repair
• Logistic delays – Waiting for the spare part
• Restart time – Time spent to bring the system back to normal operation after the fault is repaired
How to minimise MTTR • Identify the failed components quickly. This is achieved by experienced operators, on-line fault detection tools • For frequent failures have the repair crew with the right skills on standby • Ditto for the frequently failing spare parts • Design the equipment and the operating procedure to minimise re-starting time
PM Trade-Offs • The cost of failure – MTTR – The cost of the repair and the replacement
• The cash cost of the planned maintenance action (salaries, consumables, etc) • The opportunity cost (lost production)
Preventive Maintenance Issues • Service – Effect on System Reliability
• Inspection – P-F time (between potential and actual system failure)
• Replacement – Failure distribution curves – Effect on System Reliability
Would inspections help?
[Figure: timeline marking the time when we can recognise “Potential Failure”, the P-F Interval and the Failure Time]
Scheduled inspections help when • Potential failure condition is clearly defined • The P-F interval is consistent • It is practical to inspect at intervals less than the P-F interval • The P-F interval is long enough to implement corrective maintenance action
Scheduled replacements help when • The component breakdown has costly consequences (eg chain of failures, distance from the workshop, etc) • The dominant failure mode is age-related with the hazard rate consistently increasing above an acceptable value at around the set replacement period
Weibull curves with different hazard functions
$$R = e^{-(t/\eta)^{\beta}}, \qquad h(t) = \frac{\beta}{\eta^{\beta}}\, t^{\beta-1}$$
Periodic replacements - 1 ($\beta$ = 0.5)
Decreasing Hazard Rate Scheduled replacement increases failure probability
Periodic replacements - 2 ($\beta$ = 1)
Constant Hazard Rate Scheduled replacement has no effect on failure probability
Periodic replacements - 3 ($\beta$ = 2)
Increasing Hazard Rate Scheduled replacement decreases failure probability
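These three cases can be reproduced with a short calculation. The sketch compares the probability of failure in the next 10 hours for a component kept in service after 100 hours against that of a freshly installed one, using a Weibull model with shape factors 0.5, 1 and 2. The scale factor, age and look-ahead window are hypothetical values, and the conditional probability 1 - R(age + window) / R(age) is the standard definition rather than something taken from the slides.

```python
import math


def weibull_reliability(t, beta, eta):
    """R(t) = exp(-(t / eta)^beta)."""
    return math.exp(-((t / eta) ** beta))


def failure_probability_in_window(age, window, beta, eta):
    """P(fail within `window` | survived to `age`) = 1 - R(age + window) / R(age)."""
    return 1.0 - weibull_reliability(age + window, beta, eta) / weibull_reliability(age, beta, eta)


ETA = 100.0     # hypothetical scale factor, hours
AGE = 100.0     # hypothetical age at the scheduled replacement, hours
WINDOW = 10.0   # look-ahead period, hours

for beta in (0.5, 1.0, 2.0):
    keep = failure_probability_in_window(AGE, WINDOW, beta, ETA)
    replace = failure_probability_in_window(0.0, WINDOW, beta, ETA)
    print(f"beta = {beta}: keep old part {keep:.4f}, fit new part {replace:.4f}")
```

Replacement only pays off in the last case, where the hazard rate grows with age.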
Reliability Research A significant part of the academic and research community continues to develop increasingly complex mathematical models of engineering systems and their expected modes of failure under various loading assumptions. While the intellectual rigour of these studies and the effort that goes into them cannot be ignored, their application to real manufacturing and mining processes has been limited, primarily by the lack of data needed to support these models.
Industry Data Analysis and Representation Tools
Pareto Analysis • Pareto Principle: “Significant few and insignificant many” • In any application, a large part of the failures is due to a small number of causes • A Pareto plot helps to identify the most significant causes • The benefit comes only from attending to the significant issues
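The ranking behind a Pareto plot takes only a few lines. In the sketch below the downtime figures are invented purely for illustration and are not data from the longwall study; the function sorts the causes by contribution and accumulates their share of the total.

```python
# Hypothetical downtime hours per cause (NOT data from the slides).
downtime_hours = {
    "Shearer": 120.0,
    "AFC": 80.0,
    "BSL": 30.0,
    "Roof supports": 15.0,
    "Pumps": 5.0,
}


def pareto_table(data):
    """Return (cause, value, cumulative %) rows, largest contributor first."""
    total = sum(data.values())
    rows, running = [], 0.0
    for cause, value in sorted(data.items(), key=lambda kv: kv[1], reverse=True):
        running += value
        rows.append((cause, value, 100.0 * running / total))
    return rows


rows = pareto_table(downtime_hours)
for cause, value, cumulative in rows:
    print(f"{cause:<14} {value:>6.1f} h  {cumulative:>5.1f}%")
```

In this made-up data set the first two causes account for 80% of the downtime, the "significant few" that the principle refers to.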
Pareto Chart
[Figure: Pareto Analysis for Longwall Face Equipment Failures; the Shearer Drive Shaft is labelled]
Scatter Plot Diagrams • A scatter plot is a logarithmic plot of MTTR against the number of failures N. • Since the total downtime associated with each failure type is N x MTTR, constant downtime curves appear as straight lines on logarithmic axes.
$$MTTR = \frac{\text{Downtime}}{N} \;\Rightarrow\; \log MTTR = C - \log N, \quad \text{where } C = \log(\text{Downtime})$$
Longwall Scatter Plot
Lines of Constant Downtime
Reliability Analysis • Pareto Analysis and Scatter Plots are good tools to identify the reliability sinks in the equipment • The next step is to calculate the failure probability distribution curves for all critical components. • The MTTR statistics may also be required if MTTR is not reasonably constant for each item • This step requires high quality data
Features of High Quality Data • Large enough set to have at least 4-5 failure events for each target failure mode • Cover a sufficiently long time period to eliminate local effects • Uniform operating conditions over this period • Accuracy free of collector’s bias
Failure History Example Suppose that we have the failure log for a component as 180, 216, 930, 990, 1300 and 1850 hours. Estimate the MTBF assuming an exponential probability distribution.
Failure Data (example)
Failure times (h): 180, 216, 930, 990, 1300, 1850
Times between failures (h): 36, 714, 60, 310, 550
$$MTBF = \frac{36 + 714 + 60 + 310 + 550}{5} = 334$$
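The same calculation in code, as a minimal sketch: the times between failures are the differences between consecutive entries in the failure log, and the MTBF is their mean. Variable names are illustrative.

```python
failure_times_h = [180, 216, 930, 990, 1300, 1850]  # failure log from the example

# Time between failures: difference between consecutive failure times
tbf_h = [later - earlier for earlier, later in zip(failure_times_h, failure_times_h[1:])]

mtbf_h = sum(tbf_h) / len(tbf_h)
failure_rate_per_h = 1.0 / mtbf_h  # valid under the exponential assumption

print(f"TBF = {tbf_h}, MTBF = {mtbf_h:.0f} h, failure rate = {failure_rate_per_h:.5f} per h")
```

Note that the first failure at 180 h is used only as a starting point; the period before it is not counted as a time between failures.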
Curve Fitting The failure log was 180, 216, 930, 990, 1300 and 1850 hours. The TBF array is TBF = {36, 714, 60, 310, 550}. Sorted in ascending order:
i:    1    2    3    4    5
TBF:  36   60   310  550  714
Fi:   20%  40%  60%  80%  100%
This is cumulative probability distribution data. We can then compare it against exponential or Weibull distributions. For example, use the following form to try an exponential distribution:
$$F(t) = 1 - e^{-\lambda t} \;\Rightarrow\; \ln\left(\frac{1}{1 - F(t)}\right) = \lambda t$$
Estimate the Failure Distribution
TBF, h:           36   60   310  550  714
Rough Estimate:   20%  40%  60%  80%  100%
Median Estimate:  10%  30%  50%  70%  90%
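Both columns of estimates can be generated from the rank of each observation. The formulas below, i/n for the rough estimate and (i - 0.5)/n for the median estimate, are inferred from the tabulated percentages rather than stated in the slides, so treat them as an assumption.

```python
tbf_hours = sorted([36, 714, 60, 310, 550])  # times between failures, ascending
n = len(tbf_hours)

# Rank-based estimates of the cumulative failure probability F at each TBF
rough_estimate = [i / n for i in range(1, n + 1)]           # reaches 100% at the last point
median_estimate = [(i - 0.5) / n for i in range(1, n + 1)]  # stays below 100%

for t, rough, median in zip(tbf_hours, rough_estimate, median_estimate):
    print(f"{t:>4} h  rough {rough:>4.0%}  median {median:>4.0%}")
```

The median form avoids F = 100% at the last point, which matters because ln(1/(1 - F)) is undefined there.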
Exponential Fit
$$F(t) = 1 - e^{-\lambda t} = 1 - e^{-t/334}$$
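An alternative to taking the failure rate as 1/MTBF is to fit it to the linearised data. The sketch below is one possible approach, not the method used in the slides: it regresses ln(1/(1 - F)) on t through the origin by least squares, using the median estimates so that F never reaches 100%.

```python
import math

tbf_hours = [36, 60, 310, 550, 714]        # sorted times between failures
median_estimate = [0.1, 0.3, 0.5, 0.7, 0.9]  # cumulative failure probability at each TBF

# Linearised exponential model: y = ln(1 / (1 - F)) = rate * t
y = [math.log(1.0 / (1.0 - f)) for f in median_estimate]

# Least-squares slope for a line through the origin: sum(t * y) / sum(t^2)
fitted_rate = sum(t * yi for t, yi in zip(tbf_hours, y)) / sum(t * t for t in tbf_hours)
fitted_mtbf = 1.0 / fitted_rate

print(f"Fitted failure rate = {fitted_rate:.5f} per h, implied MTBF = {fitted_mtbf:.0f} h")
```

The fitted MTBF will generally differ somewhat from the simple average of 334 h, since the regression weights the longer intervals more heavily. With only five points neither figure should be read as precise.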
Longwall Equipment Failure Probability Distribution Functions for some Critical Items
AFC Blockage/Overload (2002 only) [Figures: exponential fit, with axis $\ln\left(\frac{1}{1 - F(t)}\right)$, and Weibull fit]
AFC Chain Failures [Figures: exponential fit, with axis $\ln\left(\frac{1}{1 - F(t)}\right)$, and Weibull fit]
BSL Chain Failures [Figures: exponential fit, with axis $\ln\left(\frac{1}{1 - F(t)}\right)$, and Weibull fit]
Shearer Cable Failures [Figures: exponential fit, with axis $\ln\left(\frac{1}{1 - F(t)}\right)$, and Weibull fit]
What is to be done? • Increase overall availability – Minimise the time spent on PM – Decrease number of breakdowns • More effective PM • Condition monitoring with long enough P-F time • Engineering changes – Design changes – Changes to the operating procedure
– Decrease MTTR
• Do all this while maximising the profit • Learn lessons for next equipment purchase