SPE 171517_Estimating Probability of Failure _2014_Final

SPE 171517-MS

Estimating Probability of Failure for Drilling Tools with Life Prediction

K. Carter-Journet, A. Kale, D. Zhang, E. Pradeep, T. Falgout, and L. Heuermann-Kuehn, Baker Hughes

Copyright 2014, Society of Petroleum Engineers

This paper was prepared for presentation at the SPE Asia Pacific Oil & Gas Conference and Exhibition held in Adelaide, Australia, 14–16 October 2014.

This paper was selected for presentation by an SPE program committee following review of information contained in an abstract submitted by the author(s). Contents of the paper have not been reviewed by the Society of Petroleum Engineers and are subject to correction by the author(s). The material does not necessarily reflect any position of the Society of Petroleum Engineers, its officers, or members. Electronic reproduction, distribution, or storage of any part of this paper without the written consent of the Society of Petroleum Engineers is prohibited. Permission to reproduce in print is restricted to an abstract of not more than 300 words; illustrations may not be copied. The abstract must contain conspicuous acknowledgment of SPE copyright.

Abstract

Drilling tools are subject to numerous operational parameters such as revolutions per minute (RPM), vibration (lateral, stick-slip and axial), pressure, torque and temperature. These parameters can greatly fatigue even the most robust tool depending on where and how the tool is operated. Lifetime prediction methodologies offer an affordable and statistically grounded way to estimate the probability of failure (risk) of drilling tools. Understanding the potential risk is vital to ensuring reliability, performing the most efficient maintenance on the equipment and improving drilling performance. Sophisticated risk-modeling techniques reduce uncertainty in drilling operations by making use of readily available operational field data, thus eliminating the need for costly laboratory experiments.

Blind spots in the decision-making process are eliminated by proactively identifying precursors to costly failures in the field. The approach also enables preemptive guidance during maintenance periods for parts that might otherwise have been overlooked under strictly procedural maintenance. Statistical models that relate the operating environment to component life are derived from field component failure data, and introduce a fresh way to boost drilling tool efficiency. A Bayesian-based model selection technique is also developed which incorporates operating environment variables after each successful drilling run to dynamically select the model that gives the best survival probability, ensuring maximum utilization of a component while avoiding failure and improving the overall reliability of the tool in the field. The implementation of lifetime prediction methodologies also leads to lowered life-cycle and maintenance costs, reduced risk and improved operational performance. The paper presents the methodology used to estimate the probability of failure of drilling tools and further illustrates how to reach risk-informed decisions.

Introduction

Optimum drilling services minimize the non-productive time (NPT) experienced from tool degradation and/or failures. This objective of reliability starts with innovative tool design and spreads to the primary areas of application engineering, maintenance and well site execution. A universal approach for greater project efficiency, with minimized risk¹, is necessary as the oil and gas industry seeks unconventional sources to meet increasing demands.

1 Risk, for the purpose of this paper, refers to the uncertainty in drilling tool performance at the component level and/or as a whole. Risk centers on predicting the probability of failures that can lead to severe damage to the tool and/or the inability to perform the run or function to the best advantage. The consequences can be technical, safety, cost, or schedule related. The ability to quantify and understand risk provides a foundation for proactive risk management throughout the drilling tool’s lifetime.

Almost every product and service is designed to reduce costs, lessen risk or increase productivity during activities related to hydrocarbon extraction, further advancing reservoir performance. Consistent methodologies, which provide preemptive guidance for optimizing drilling parameters and reducing the probability of failures in the field, are necessary. These types of methodologies are especially important when analyzing electrical component anomalies. For instance, the reliability of electronic printed circuit board assemblies (PCBAs) in the bottomhole assembly (BHA) is vital to the success of any drilling operation. PCBAs are multi-scale devices (encased in electronic packaging) comprising multiple components; the individual components vary in geometric dimensions and composition and are not easily assessable without disassembling a tool. Electronic packaging can also be subject to thermal expansion mismatch, accelerated corrosion, dendrite growth, metal whiskers, solder fatigue and outgassing, all of which can lead to failure. Understanding the risk and amount of consumed life of PCBAs prior to deploying a drilling tool into the field improves reliability and overall drilling performance. The ever-present need for more flexibility in drilling regimes, greater reliability of drilling tools and higher rates of penetration puts further



strain on the drilling tool’s electrical components. High-performance drilling tools must operate in harsher environments: at higher temperatures (often beyond 150°C), vibration levels (exceeding 15 g) and pressures (30 kpsi or more), along horizontal paths (rather than conventional vertical boreholes), at increasing depths and through abrasive formations, because more readily accessible oil and gas reserves were depleted long ago (Figure 1).

These demanding conditions can often influence companies’ decisions to operate drilling tools beyond their design specifications. The trend also leads to higher maintenance costs and more frequent system downtime. However, instead of over-maintaining drilling tools, companies must target enhancing system performance. For example, focus should be on preventing failure and reducing system downtime, meeting customer demands, reducing maintenance costs and ensuring equipment reliability.

Failure in the field extends a planned drilling program beyond the scheduled time frame, adding unnecessary cost. Consequently, the capability to estimate the probability of failure (PoF) for drilling tools by using lifetime prediction methodologies introduces an alternative way to avert expensive downhole tool failures and supports the success of a drilling operation by indicating the overall risk.

Background: Lifetime Prediction Methodologies

In lifetime prediction analysis and reliability engineering, the output of the analysis is always an estimate (ReliaSoft, 2005). The true value of the probability of failure (PoF), probability of success (reliability), mean life, parameters of a distribution or any other applicable parameter is never truly known. In fact, these values will (likely) remain unknown for all practical purposes. However, through the use of lifetime prediction analysis, engineers are able to use operational field data to determine the PoF for parts, components and systems. Understanding the PoF is useful in determining whether drilling tools can be used in harsh or benign environments for a desired length of time without failure. Lifetime prediction includes an assortment of statistical techniques, ranging from best-fit modeling to machine learning and text/data mining, that analyze historical and current data to make predictions about the future (forecasting). Current lifetime prediction techniques often require test data obtained throughout extended periods that approach the actual life of a part; this type of testing can be costly and time-consuming. An alternative to laboratory testing is to obtain, catalog and statistically analyze operational field data using predictive analytics.

To meet the growing demand for more reliable drilling tools, there is mounting interest in the area of health prognostics for electronic components by using physics-based models, operational field data, design and qualification testing data and in-service inspection data. The field data can be used to build a part/tool profile² to evaluate operational fitness based on the historical usage of an entire population of the same part/tool. This is very similar to the way a doctor compares the results of an individual’s blood test against a specified range from a larger population to determine if the values are within an acceptable range. The determination factors into which risk category (low, medium or high) the individual belongs in. A risk-informed decision is then made on whether any corrective action is required. In the case of a part/tool, the risk indicates the fitness of the part/tool to operate optimally in the next run (meaning a recommendation is made on whether to proceed with using the tool as is, perform some level of maintenance or entirely replace/retire it).

2 A profile would describe events for a part, from manufacture through end of life. Natural or induced factors (performance or environmental) would be included. All associated failures (confirmed or unconfirmed) would be documented.

Figure 1: Illustration of Drilling System (labels: Rig, Drill Pipe, Drilling Tool, Abrasive Formation)

Identifying precursors to failure and quantifying the associated risk in real time is challenging because it is not realistic to take measurements during drilling; for example, PCBAs are built into the tool and require disassembly to perform tests and measurements. Therefore, using algorithms as a diagnostic tool to detect anomalies is a fast and practical approach.

Table 1³ provides further details on contemporary methodologies used for lifetime prediction.

Table 1: Lifetime Prediction Methodologies

Columns: Methodology; Description; (+/-); FD; RD; TD; R&M; SP; Comp; RC; SS

Measurement of failure precursors: The first process, measuring failure precursors as an indicator of impending failure, is established on the hypothesis that a degraded circuit board produces a significantly different signature from that of a defect-free board.

Detecting anomalies using fuses/sensors: The second technique of research in electronics prognostics and health management (PHM) uses sacrificial circuits like fuses, canaries, circuit breakers and self-diagnostic sensors for detecting whether the device is operating outside of its design limits. Sacrificial circuits are widely used in consumer electronics products and appliances.

Physics-based modeling: The third approach for life prediction uses modeling and simulation to relate the fundamental physical and chemical behavior of materials to the action of the surrounding environment and applied loads. Typically for electronics, the physics-of-failure-based modeling process starts by exposing the product to highly accelerated life tests (HALT) and highly accelerated stress tests (HAST) to find the significant mode(s) and root cause(s) of failure.

Field data-driven analytics and statistical modeling: The fourth methodology gained momentum because of the availability of large volumes of data and the limitations of data-agnostic methods.

Proposed Lifetime Prediction Methodology

This paper introduces a methodology to estimate the lifetime of drilling electronics using operational field data, drilling dynamics and historical maintenance information. Reliability analyses on specific drilling parameters and Bayesian statistics are combined in a probabilistic framework. Parameter estimation is used to calibrate statistical equations to field data, and probabilistic analysis is used to obtain the likelihood of failure. Model parameters are represented as random variables, each with a probability distribution. The methodology takes into account that drilling electronics in downhole conditions can have varied failure modes, and each failure mode can be caused by the interaction of multiple variables, either independently or interdependently. Several candidate models were developed to account for the inability to model each failure mode of a component in the field. Bayesian updating further improves the model results by updating prior probability estimates to produce a posterior probability estimate established upon operational run history updates for individual part numbers (PNs) within a drilling tool. The inclusion of Bayesian updating adds precision to dynamically selecting more accurate failure models for a selected part as a function of usage. Sophisticated risk-modeling techniques can reduce uncertainties in drilling operations for oil and gas companies by quantitatively identifying the risk.
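The Bayesian updating step can be illustrated with a minimal sketch. It assumes each candidate model is a two-parameter Weibull distribution and that the evidence from a run that ended without failure is the conditional probability of surviving that run. The function names, the candidate parameters and the run hours below are hypothetical and chosen only for illustration; the paper's models also include operating environment variables, which are omitted here.

```python
import math

def weibull_survival(t, shape, scale):
    """Probability that a part survives beyond t hours under a Weibull model."""
    return math.exp(-((t / scale) ** shape))

def update_model_weights(weights, models, hours_before, hours_after):
    """Bayes update of candidate-model weights after a run without failure.

    The likelihood of each model is the conditional probability of surviving
    to hours_after given that the part had already survived to hours_before.
    """
    likelihoods = [
        weibull_survival(hours_after, *m) / weibull_survival(hours_before, *m)
        for m in models
    ]
    unnormalized = [w * l for w, l in zip(weights, likelihoods)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]

# Hypothetical candidate models as (shape, scale in hours) pairs.
candidate_models = [(1.5, 800.0), (2.0, 1200.0), (1.0, 2000.0)]
prior = [1 / 3, 1 / 3, 1 / 3]

# Invented cumulative hours of one part across three successful runs.
cumulative_hours = [0.0, 150.0, 320.0, 500.0]

posterior = prior
for start, end in zip(cumulative_hours, cumulative_hours[1:]):
    posterior = update_model_weights(posterior, candidate_models, start, end)

# Index of the candidate model with the largest posterior weight.
best_index = max(range(len(posterior)), key=posterior.__getitem__)
```

In this sketch the weights shift toward whichever candidate assigns the highest probability to the survival history actually observed, which is one possible reading of selecting a model as a function of usage.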

3 Fully darkened circles denote complete correlation with method.


The essential information necessary to estimate the probability of failure is entrenched in the historical life cycle data normally found within a company’s Failure Reporting Analysis and Corrective Action System (FRACAS). FRACAS data is important because it reveals when/how components fail, provides detail on the material properties, loads (electrical and mechanical), material response, the physics of failure and corresponding corrective actions (upgrade or revision). The data is fed directly into a life prediction model that is used to assist in the Risk-Informed Decision Making (RIDM⁴) process, maintain drilling tools and increase reliability (Figure 2).

Operational Field Data Requirements

Foremost, oil and gas companies must have the ability to predict business outcomes and make risk-informed decisions that enable them to surpass their competitors. This is heavily contingent on how successful these companies are at harnessing the available data. In the case of the proposed lifetime prediction approach, historical data is leveraged to forecast future performance of drilling tools down to the part level. However, this cannot be done effectively unless these companies understand the data available.

Building field data-driven models for lifetime prediction of electronic assemblies in drilling operations is challenging for two reasons. First, not all of the factors impacting component life can be measured in real time. Second, the data that can be measured has errors and noise because of limitations of the measurement system and human factors. Challenges for historical data of PCBAs (Figure 3) include a variable operating environment, incomplete information on failures and operating history, and statistical variation in components (manufacturing defects, material properties, etc.).

4 The primary objective of RIDM is to provide the decision maker with the necessary risk information to make a choice that has the most potential for successfully meeting objectives (e.g., completing a drilling mission without failure and within the specified timetable).

Figure 2: Proposed Life Prediction Methodology. Flowchart boxes: Gather Operational Field Data (Failures and Suspensions), drawing on environment, run and failure data and on repair and maintenance data; Select/Screen Desired Data (filter missing/incomplete data; consider percentage of missing data and size of dataset; data must have high fidelity and quality); Develop Life Prediction Model, comprising Select Appropriate Lifetime Distribution to Fit Data (Weibull, Lognormal, etc.), Variable Selection (Drilling Hours, Temperature and Vibration) and Train Data & Create Model (Outlier Detection, Weighting Factors, Bayesian Updating and Best Fit Models); Create Independent Models for each Part Number (PN) in a Drilling Tool; Test Model (Model is Valid / Model is Invalid); Utilize Models to Evaluate Probability of Failure (Generate Results and Plots); Lifetime Prediction (Risk and Remaining Useful Life); Utilize Probability of Failure in Risk-Informed Decision Making Process (RIDM). Outcomes: Optimize Drilling Performance; Improve Reliability and Reduce Risk; Reduce Maintenance Costs.


There are some basic requirements for integrating historical data with predictive models. The first requirement is that the operational field data used in lifetime prediction must be plentiful⁵ and the second requirement is that the parts must be serialized⁶. This may require that the data is assembled and formatted in a way that provides the necessary data fields for each PN. Failure (parts that are no longer operational) and suspension (parts that have either been scrapped or have not failed) data must be included in the dataset for each PN. Basic information for a PN must also include the job number, run and incident information for any failures.
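A minimal sketch of how serialized run records might be screened before modeling is given below. The record layout, the field names and the 30% cut-off for missing fields are assumptions made for illustration, not the authors' data schema; the sample records are invented.

```python
from dataclasses import dataclass, fields
from typing import Optional

@dataclass
class RunRecord:
    """One row of run history for a serialized part (illustrative fields only)."""
    part_number: str
    serial_number: str
    revision: str
    job_number: Optional[str]
    drilling_hours: Optional[float]
    avg_temperature_c: Optional[float]
    avg_vibration_g: Optional[float]
    failed: bool  # True = failure, False = suspension

def missing_fraction(record):
    """Fraction of fields on a record that are empty (None)."""
    values = [getattr(record, f.name) for f in fields(record)]
    return sum(v is None for v in values) / len(values)

def screen(records, max_missing=0.30):
    """Keep records whose share of missing fields does not exceed max_missing."""
    return [r for r in records if missing_fraction(r) <= max_missing]

def split_failures_and_suspensions(records):
    """Separate failed parts from suspended (not failed or scrapped) parts."""
    failures = [r for r in records if r.failed]
    suspensions = [r for r in records if not r.failed]
    return failures, suspensions

# Invented sample history for one part number.
records = [
    RunRecord("PN-100", "SN-001", "A", "J-1", 420.0, 135.0, 4.2, True),
    RunRecord("PN-100", "SN-002", "A", "J-2", None, 128.0, 3.1, False),
    RunRecord("PN-100", "SN-003", "B", None, None, None, 2.5, False),
]

usable = screen(records)
failures, suspensions = split_failures_and_suspensions(usable)
```

Keeping failures and suspensions as separate lists matters because suspended parts still carry information: they are known to have survived at least their accumulated hours.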

Table 2 presents some data fields that are helpful in developing a predictive lifetime model (list is not all-encompassing; other fields may also be applicable):

Table 2: Potential Drilling Tool Data Fields of Interest

Data Field Name: Field Description

Part Number (PN): Referencing identifier for a group of parts that share common design.
Serial Number (SN): Unique identifier of a single piece with associated history.
Revision (Rev): Tracking method for non-functionality related changes associated with a PN.
Upgrade History (per Revision): Whether or not the PN-SN-Revision has multiple combinations.
Repair History (per Revision): Whether or not this PN-SN-Revision has any repair activity.
Last Repair Date: The last repair activity date during the reported time frame.
Scrap: Whether or not the most recent component has finally been scrapped.
Scrap Reason: The scrap reason if the most recent component has finally been scrapped.
Last Maintenance Level: Most recent level of maintenance performed on this PN-SN-Rev.
Last Maintenance Location: Geographical location where the most recent maintenance activity was completed.
Product Description: The top level assembly display name.
Last Job Location: Geographical location where the tool was most recently operated.
Last Job Number: Referencing (SAP) number used to link with a customer’s well name.
Drilling Hours: The length of time the BHA/bit was actively making hole.
Circulation Hours: The length of time drilling fluid was pumped through the BHA.
Distance Drilled: The total distance drilled by a PN-SN in the reported time frame.
Average Temperature: The average of average temperatures in the reported time frame.
Average RPM: The average of average revolutions per minute (RPM) in the reported time frame.
Min Depth In: The minimum of Depth-In in the reported time frame.
Average Flow Rate: The average of average flow rate in the reported time frame.
Incident Date: The incident date of the last failure in the reported time frame.
Incident Description: Description of events that led to the incident in the reported time frame.
Root Cause Description: The root cause of an incident in the reported time frame.
Failure Mode: The failure mode of the last failure in the reported time frame.
Average Vibration: The average vibration (axial, lateral, and stick-slip).

5 There must be sufficient failure data for a part to be adequately modelled. Since there is some pre-screening of the data prior to building a model, there must be at least a year’s worth of data to use in the process.

6 Each part has a part number (PN) and a serial number (SN) that uniquely identifies it and its applicable history. One PN may have multiple SNs and multiple revisions.

Figure 3: PCBA Images

Model Development

Models are developed at the part number level first (each revision of a part is evaluated separately). Next, all of the parts that are installed in a specific drilling tool are grouped together to provide the overall risk of that tool. Therefore, each part’s contribution to the overall condition of the drilling tool can be assessed. Comprehensive data for the entire history of the part is required to analyze the relationship between operating environment and life.

A typical time-to-failure model comprises a life distribution function to incorporate the statistical scatter in failure time and a life characteristics function (Appendix A) that describes a general relation between failure time and stress levels (Kale et al., 2014). Weibull, lognormal and exponential distributions are considered in this methodology for each part’s model. The life characteristic can be any life measure such as the mean, median or hazard rate that represents a bulk property of the distribution. The life characteristic is expressed as a function of stress (as shown in Appendix A). The unknown parameters of the composite model are determined by tuning the model equation to fit field data using the Iterative Maximum Likelihood Estimation technique.
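As an illustration of fitting a life distribution to field data that contains both failures and suspensions, the sketch below fits a two-parameter Weibull distribution by maximum likelihood, treating suspensions as right-censored observations. It is a simplified stand-in and not the authors' composite model: the stress-dependent life characteristic of Appendix A is left out, the shape parameter is found by plain bisection, and the function names and sample hours are hypothetical.

```python
import math

def fit_weibull(failure_hours, suspension_hours, lo=0.05, hi=20.0, tol=1e-10):
    """Maximum-likelihood Weibull fit with right-censored (suspended) units.

    Returns (shape, scale). The shape solves the profile-likelihood equation
    by bisection; the scale then follows in closed form.
    """
    all_hours = list(failure_hours) + list(suspension_hours)
    r = len(failure_hours)  # number of observed failures
    mean_log_fail = sum(math.log(t) for t in failure_hours) / r

    def score(shape):
        # Derivative condition of the log-likelihood; increasing in shape.
        powered = [t ** shape for t in all_hours]
        weighted = sum(p * math.log(t) for p, t in zip(powered, all_hours))
        return weighted / sum(powered) - 1.0 / shape - mean_log_fail

    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    shape = 0.5 * (lo + hi)
    scale = (sum(t ** shape for t in all_hours) / r) ** (1.0 / shape)
    return shape, scale

def probability_of_failure(hours, shape, scale):
    """Weibull cumulative probability of failure by the given hours."""
    return 1.0 - math.exp(-((hours / scale) ** shape))

# Invented drilling hours at failure and at suspension for one part number.
failures = [310.0, 450.0, 520.0, 610.0, 780.0]
suspensions = [200.0, 400.0, 650.0, 900.0]

shape, scale = fit_weibull(failures, suspensions)
pof_at_500_hours = probability_of_failure(500.0, shape, scale)
```

The bracket for the shape parameter (0.05 to 20) is an assumption that suits data of this kind; a production fit would check that the score changes sign across the bracket before iterating.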

Optimizing Problem

One of the main focuses of this paper is to optimize the allocation of assets by incorporating operational constraints on life and reliability of the individual components that make up the tool. A case study is presented describing a scenario where two assets in a maintenance shop are awaiting overhaul. Furthermore, there are a fixed number of spare parts that can be used as replacements and stringent threshold reliability constraints that each tool must meet after a maintenance action. The optimizing problem is to maximize the reliability of the two assets by swapping the existing parts between the two tools and the additional spare parts. Which parts to swap is calculated using a constrained linear programming algorithm. A linear programming problem may be defined as the problem of maximizing or minimizing a linear function subject to linear constraints. The constraints may be equalities or inequalities.

$$\max_{x} \; C^{T} x \quad \text{subject to} \quad A x \ge B \qquad (1)$$

where $x$ represents the vector of variables whose optimum values are to be determined, $C$ is a vector containing the sensitivity of the objective function with respect to each unknown $x$, $B$ is the vector of known coefficients representing constraint bounds, $A$ is a matrix of coefficients containing the sensitivity of the constraints with respect to each $x$, and the superscript $T$ stands for matrix transpose. The unknown variable $x_i$ represents the system reliability of the $i$-th asset. For typical assets used in drilling systems, it is fair to assume that failure of a single component leads to service failure; consequently, the reliability of an asset is modeled using the series system (Equation 2).

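$$x_i = \prod_{j=1}^{n} R_{ij} \qquad (2)$$

where $R_{ij}$ is the reliability of the $j$-th subcomponent of the $i$-th asset.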

The number of subcomponents in the asset is represented by n. The example used in this paper shows the application of the linear programming method to optimize the allocation of individual subcomponents between two assets awaiting maintenance. When considering typical assets (such as PCBAs) used in drilling, the vector C = {1, 1}, since the reliability of each asset has to be maximized, and A = {1, 1}. The example also incorporates a typical scenario where an asset is deployed in more critical jobs (for example, an award-based contract). In this scenario, the selected drilling tool will need to have a higher threshold for reliability than the other. The best parts from the lower reliability drilling tool will be swapped to the tool with higher reliability. The overall asset optimization problem can be summarized as:

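$$\max \; \prod_{j=1}^{n} R_{1j} + \prod_{j=1}^{n} R_{2j} \qquad (3)$$

$$\text{subject to} \quad \prod_{j=1}^{n} R_{1j} \ge B_1, \qquad \prod_{j=1}^{n} R_{2j} \ge B_2$$

with $R_{ij}$ the reliability of the $j$-th subcomponent installed in asset $i$ and $B_i$ the threshold reliability required of asset $i$.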

The optimization problem is solved using the simplex method in MS Excel. Theoretical discussion of the simplex LP method can be found in Dantzig et al. (1997) and Murty et al. (1983).
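For a small number of parts, the allocation problem can also be explored by brute force. The sketch below enumerates every way of assigning one part per slot to each of two tools and keeps the feasible assignment with the largest sum of series-system reliabilities. It is an exhaustive search, not the simplex method used in the paper; the part labels, reliabilities and thresholds are hypothetical.

```python
from itertools import permutations, product
from math import prod

def best_allocation(pools, thresholds):
    """Exhaustively assign one part per slot to each of two tools.

    pools[s] maps a part label to its predicted reliability for slot s
    (existing parts from both tools plus any spares for that slot).
    thresholds = (minimum reliability of tool 1, minimum reliability of tool 2).
    Returns (objective, tool1_parts, tool2_parts), or None if infeasible.
    """
    # For every slot, each ordered pair of distinct parts is a candidate choice.
    per_slot_choices = [list(permutations(pool.items(), 2)) for pool in pools]
    best = None
    for combo in product(*per_slot_choices):
        tool1 = [pair[0] for pair in combo]
        tool2 = [pair[1] for pair in combo]
        # Series system: tool reliability is the product of part reliabilities.
        r1 = prod(rel for _, rel in tool1)
        r2 = prod(rel for _, rel in tool2)
        if r1 < thresholds[0] or r2 < thresholds[1]:
            continue
        if best is None or r1 + r2 > best[0]:
            best = (r1 + r2,
                    [name for name, _ in tool1],
                    [name for name, _ in tool2])
    return best

# Hypothetical pools: A* came from tool 1, B* from tool 2, S* are spares.
pools = [
    {"A1": 0.95, "B1": 0.90, "S1": 0.97},
    {"A2": 0.85, "B2": 0.92},
    {"A3": 0.93, "B3": 0.88, "S3": 0.96},
]
thresholds = (0.80, 0.70)  # tool 1 is assumed to serve the more critical job

result = best_allocation(pools, thresholds)
objective, tool1_parts, tool2_parts = result
```

The search grows quickly with the number of slots and spares, so it is only practical for small cases like the two-asset example; it is mainly useful as a cross-check on a solver result.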

Case Study 1: Utilizing Lifetime Prediction for Identifying Risk

Drilling tool parts are maintained or replaced depending on how many circulating hours they have been exposed to. Those circulating hours may be within the range of the mean time between failures (MTBF⁷) or may exceed it. Predicting the time to failure of PCBAs within a drilling tool prior to deployment to the field is the example used in this case study. The diagnosis is similar to taking a vehicle in for maintenance. The vehicle has sensors that reveal data on its current condition depending on how it is used. The results of the screening indicate the type of maintenance required so the vehicle can continue to perform optimally and achieve the mission (operate without failure when the owner is driving).

Using the location of the upcoming run, a desired risk threshold (for example 50%, or half of the calculated operating life) can be set to determine whether the parts have consumed more than the risk threshold.

Figure 4: Drilling Tool


Baseline Case: Initial Drilling Tool Assessment

A drilling tool was analyzed in April 2014 prior to deployment. In the case study, the risk threshold⁸ is set at 50%, which is a conservative setting (the expected life of each part is displayed graphically and interpreted in Figure 5).

Figure 6 shows the prediction range for actual drilling hours⁹ (DrillHrs) and a fuel gauge chart that enables the user to see the percentage of life consumed.
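The comparison of consumed life against a job-dependent risk threshold can be sketched as follows. The sketch assumes that life consumed is simply actual drilling hours divided by the predicted operating life; the function names, part labels and hours are hypothetical.

```python
def life_consumed(actual_drilling_hours, predicted_life_hours):
    """Fraction of the predicted operating life already used (capped at 1.0)."""
    if predicted_life_hours <= 0:
        raise ValueError("predicted life must be positive")
    return min(actual_drilling_hours / predicted_life_hours, 1.0)

def exceeds_threshold(actual_drilling_hours, predicted_life_hours,
                      risk_threshold=0.50):
    """True when the part has consumed more life than the job will tolerate."""
    consumed = life_consumed(actual_drilling_hours, predicted_life_hours)
    return consumed > risk_threshold

# Invented parts: label -> (actual drilling hours, predicted life in hours).
parts = {
    "PCBA-A": (300.0, 1200.0),
    "PCBA-B": (610.0, 900.0),
}

# A conservative 50% threshold versus a more tolerant 75% threshold.
flagged_at_50 = [name for name, (used, life) in parts.items()
                 if exceeds_threshold(used, life, 0.50)]
flagged_at_75 = [name for name, (used, life) in parts.items()
                 if exceeds_threshold(used, life, 0.75)]
```

Running the same parts against two thresholds shows how a part that is flagged for a critical job may still be acceptable where greater risk is tolerated.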

7 MTBF is the predicted elapsed time between inherent failures of a system during operation.

8 The risk threshold is the amount of risk that a job is willing to incur. For instance, in the case of an award-driven contract where risk is less tolerable, electronics with greater than 50% life consumed are considered risky. However, for a contract where the drilling conditions are more benign, greater risk in the range of 75% could be more acceptable.

9 Actual drilling hours are represented by the diamond.

Figure 5: Data Interpretation


Further guidance on replacing parts at a maintenance cycle can also be extracted from the results, depending on how the data is flagged (in this case, risk greater than 50%), as shown in Table 3.

Table 3: Predicted Life of Electronic Parts in Drilling Tool Results – April 2014

Results Interpretation Key

Low Risk: 0.0 – 0.25
Medium Risk: 0.25 – 0.50
High Risk: >0.50
Uncertainty in prediction due to missing data (>30%)
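The interpretation key can be expressed as a small lookup. The following is a minimal sketch, not the authors' implementation: the function name `classify_risk` is hypothetical, and because the key does not say which band owns the boundary values, the sketch assumes 0.25 counts as low and 0.50 as medium.

```python
def classify_risk(risk, missing_fraction=0.0):
    """Map a predicted probability of failure to the bands of the key.

    Assumption (not stated in the paper): a risk of exactly 0.25 is "Low"
    and a risk of exactly 0.50 is "Medium".
    Returns (band, uncertain), where uncertain is True when more than 30%
    of the run data is missing.
    """
    if not 0.0 <= risk <= 1.0:
        raise ValueError("risk must be between 0 and 1")
    if risk > 0.50:
        band = "High"
    elif risk > 0.25:
        band = "Medium"
    else:
        band = "Low"
    uncertain = missing_fraction > 0.30
    return band, uncertain


# Risk and missing-data values taken from three rows of Table 3.
for serial, risk, missing in [("xxxx-7", 0.81, 0.13),
                              ("xxxx-9", 0.38, 0.13),
                              ("xxxx-3", 0.11, 0.06)]:
    band, uncertain = classify_risk(risk, missing)
    print(serial, band, "uncertain" if uncertain else "ok")
```

A job with a different tolerance, such as the 75% threshold mentioned in footnote 8, would need its own band limits; those are left out here because the paper gives only the 50% key.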

A maintenance center can identify that one part (xxxx-7) requires replacement and can apply lower levels of maintenance to the other electronics (with low or medium risk) or leave them undisturbed, which avoids failures induced by human error or process escapes. In this case, serial number xxxx-7 can be examined more closely by reviewing the predictions for each run. Four parts are identified as medium risk and can either be assigned a certain level of maintenance or be left untouched, depending on the associated risk. In addition, the maintenance facility can incorporate the risk values into a sparing forecast for the parts shown in Table 3. Historical data and lifetime prediction risk values can then be used as indicators of future demand. As a result, the accuracy of the sparing forecast is improved and additional cost savings are generated.

Figure 7 shows the run history of the part flagged as high risk. The diamonds represent the actual drilling time of the part. Although the drilling hours do not reach the upper limit of the confidence bound, there is still uncertainty. A recommendation is made for the repair (see footnote 10), if applicable, or replacement of the part before the next run.

10 Repair is more likely to be considered at the assembly level. PCBAs are more likely to be replaced.

Table 3 data:

| Part Number | Serial Number | Last Job No | Cumulative Temperature (C) | Cumulative StickSlip | Cumulative Lateral (g_RMS) | DrillHrs [h] | Worst Case Life | 25Q | Predicted Mean Life | 75Q | Best Case Life | Risk | Part Description | Comments |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | xxxx-1 | 10000 | 50 (L0) | 0.35 (L1) | 1.39 (L2) | 23.33 | 301.10 | 428.47 | 518.63 | 622.46 | 834.54 | 0.00 | PCBA | (1) Missing data for 1 run (12%) |
| 2 | xxxx-2 | 10000 | 50 (L0) | 0.38 (L1) | 1.5 (L2) | 19.83 | 132.53 | 236.27 | 307.20 | 377.36 | 505.40 | 0.00 | BATTERY | |
| 3 | xxxx-3 | 10000 | 91.57 (L1) | 0.32 (L1) | 1.39 (L2) | 332.25 | 278.65 | 451.23 | 601.87 | 731.80 | 997.62 | 0.11 | PCBA | (1) Missing data for 1 run (6%) |
| 4 | xxxx-4 | 10000 | 50 (L0) | 0.38 (L1) | 1.5 (L2) | 19.83 | 373.68 | 587.34 | 744.87 | 937.90 | 1390.76 | 0.00 | PCBA | |
| 5 | xxxx-5 | 10000 | 50 (L0) | 0.32 (L1) | 1.34 (L2) | 216.99 | 216.73 | 389.55 | 534.26 | 739.48 | 1229.68 | 0.03 | TRANSDUCER | (1) Missing data for 2 runs (11%) |
| 6 | xxxx-6 | 10000 | 50 (L0) | 0.35 (L1) | 1.39 (L2) | 23.33 | 124.35 | 204.65 | 266.52 | 318.90 | 418.51 | 0.00 | BATTERY ASSY | (1) Missing data for 1 run (12%) |
| 7 | xxxx-7 | 10000 | 66.5 (L0) | 0.29 (L1) | 1.15 (L2) | 692.17 | 378.20 | 470.53 | 543.91 | 647.85 | 922.11 | 0.81 | PCBA | (1) Missing data for 3 runs (13%) |
| 8 | xxxx-8 | 10000 | 66.5 (L0) | 0.29 (L1) | 1.15 (L2) | 692.17 | 547.85 | 776.58 | 953.76 | 1183.45 | 1750.25 | 0.15 | MAGNETOMETER | (1) Missing data for 3 runs (13%) |
| 9 | xxxx-9 | 10000 | 66.5 (L0) | 0.29 (L1) | 1.15 (L2) | 692.17 | 414.99 | 617.51 | 761.79 | 940.06 | 1285.00 | 0.38 | ACCELEROMETER | (1) Missing data for 3 runs (13%) |
| 10 | xxxx-10 | 10000 | 66.5 (L0) | 0.29 (L1) | 1.15 (L2) | 692.17 | 528.26 | 916.09 | 1224.80 | 1612.98 | 2303.18 | 0.12 | PCBA | (1) Missing data for 3 runs (13%) |
| 11 | xxxx-11 | 10000 | 66.5 (L0) | 0.29 (L1) | 1.15 (L2) | 692.17 | 253.96 | 517.32 | 736.48 | 1008.65 | 1805.98 | 0.44 | PCBA | (1) Missing data for 3 runs (13%) |
| 12 | xxxx-12 | 10000 | 91.57 (L1) | 0.32 (L1) | 1.39 (L2) | 332.25 | 223.98 | 480.48 | 685.26 | 969.48 | 1686.29 | 0.12 | SENSOR | (1) Missing data for 1 run (6%) |
| 13 | xxxx-13 | 10000 | 50 (L0) | 0.38 (L1) | 1.5 (L2) | 19.83 | 266.49 | 581.45 | 923.16 | 1315.53 | 2178.79 | 0.00 | POWER SUPPLY | |
| 14 | xxxx-14 | 10000 | 91.57 (L1) | 0.32 (L1) | 1.39 (L2) | 332.25 | 223.98 | 480.48 | 685.26 | 969.48 | 1686.29 | 0.12 | SENSOR | (1) Missing data for 1 run (6%) |
| 15 | xxxx-15 | 10000 | 50 (L0) | 0.38 (L1) | 1.5 (L2) | 19.83 | 266.49 | 581.45 | 923.16 | 1315.53 | 2178.79 | 0.00 | POWER SUPPLY | |
| 16 | xxxx-16 | 10000 | 66.5 (L0) | 0.29 (L1) | 1.15 (L2) | 692.17 | 528.26 | 916.09 | 1224.80 | 1612.98 | 2303.18 | 0.12 | PCBA | (1) Missing data for 3 runs (13%) |
| 17 | xxxx-17 | 10000 | 66.5 (L0) | 0.29 (L1) | 1.15 (L2) | 692.17 | 414.99 | 617.51 | 762.53 | 940.06 | 1285.00 | 0.38 | ACCELEROMETER | (1) Missing data for 3 runs (13%) |
| 18 | xxxx-18 | 10000 | 66.5 (L0) | 0.29 (L1) | 1.15 (L2) | 692.17 | 414.99 | 617.51 | 762.53 | 940.06 | 1285.00 | 0.38 | ACCELEROMETER | (1) Missing data for 3 runs (13%) |
| 19 | xxxx-19 | 10000 | 66.5 (L0) | 0.29 (L1) | 1.15 (L2) | 692.17 | 514.10 | 748.73 | 927.73 | 1151.33 | 1714.98 | 0.19 | MAGNETOMETER | (1) Missing data for 3 runs (13%) |
| 20 | xxxx-20 | 10000 | 50 (L0) | 0.38 (L1) | 1.5 (L2) | 19.83 | 123.18 | 202.70 | 264.81 | 316.51 | 416.86 | 0.00 | BATTERY ASSY | |

Figure 6: Graphical Representation of Predicted Life of Electronic Parts in Drilling Tool – April 2014


The probability of failure increases as the part's life (measured in drilling hours) approaches the 75% life estimate of the prediction (as shown in the chart on the left-hand side of Figure 7). If the part were to remain unchanged in the drilling tool and be operated under the same or similar conditions as shown in Table 4, the probability of failure would continue to increase.

Table 4: Predicted Life versus Run for Part Outside of Risk Threshold

Decision Case

The drilling tool was analyzed again in May 2014, prior to deployment. In this case study, the risk threshold is set at 50%

again. Maintenance was previously performed based on the recommendations made in Table 3. Fig. 8 shows the prediction

range for actual drilling hours and a fuel gauge chart that enables the user to distinguish the percentage of life consumed.

Table 4 data:

| Last Job No | Cumulative Temperature (C) | Cumulative Lateral (g_RMS) | Cumulative StickSlip (g_RMS) | DrillHrs [h] | Worst Case Life | 25Q | Predicted Mean Life | 75Q | Best Case Life | Risk | Comments |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 10000 | 66.50 | 1.15 | 0.29 | 692.17 | 378.20 | 470.53 | 543.91 | 647.85 | 922.11 | 0.81 | |
| 10000 | 66.72 | 1.15 | 0.28 | 677.75 | 377.84 | 470.27 | 543.06 | 647.52 | 921.70 | 0.79 | |
| 10000 | 66.77 | 1.15 | 0.28 | 677.42 | 378.16 | 470.66 | 543.40 | 648.06 | 922.53 | 0.79 | |
| 10000 | 66.80 | 1.15 | 0.28 | 674.42 | 377.97 | 470.41 | 543.08 | 647.71 | 922.10 | 0.79 | |
| 10000 | 66.81 | 1.15 | 0.28 | 673.92 | 377.80 | 470.20 | 542.84 | 647.42 | 921.70 | 0.79 | |
| 9000 | 66.91 | 1.14 | 0.28 | 672.34 | 378.17 | 470.66 | 543.45 | 648.15 | 922.93 | 0.78 | Missing data for this run |
| 9000 | 66.91 | 1.14 | 0.28 | 599.51 | 381.24 | 474.37 | 548.12 | 653.31 | 930.57 | 0.63 | Missing data for this run |
| 9000 | 66.91 | 1.14 | 0.28 | 595.68 | 381.26 | 474.40 | 548.16 | 653.36 | 930.63 | 0.62 | Missing data for this run |
| 6000 | 66.91 | 1.14 | 0.28 | 594.01 | 381.21 | 474.33 | 548.08 | 653.25 | 930.50 | 0.62 | |
| 6000 | 68.60 | 1.25 | 0.31 | 585.11 | 347.15 | 432.64 | 499.96 | 595.60 | 846.42 | 0.72 | |
| 5800 | 67.58 | 1.32 | 0.33 | 541.21 | 335.40 | 417.57 | 483.78 | 574.98 | 811.59 | 0.67 | |
| 5123 | 69.40 | 1.44 | 0.36 | 495.11 | 309.81 | 386.13 | 447.29 | 530.82 | 745.99 | 0.65 | |
| 5123 | 71.92 | 1.44 | 0.34 | 432.61 | 170.63 | 325.55 | 450.20 | 619.38 | 1060.60 | 0.47 | |
| 4726 | 76.43 | 1.42 | 0.23 | 348.44 | 202.54 | 385.29 | 535.00 | 736.66 | 1279.67 | 0.20 | |
| 4726 | 77.31 | 1.46 | 0.24 | 339.24 | 197.54 | 375.06 | 521.84 | 716.61 | 1244.64 | 0.20 | |
| 4575 | 78.99 | 1.56 | 0.25 | 317.14 | 188.14 | 356.96 | 496.13 | 681.03 | 1179.46 | 0.20 | |
| 4575 | 75.97 | 1.46 | 0.24 | 283.72 | 200.54 | 381.58 | 528.37 | 727.27 | 1263.72 | 0.13 | |
| 3723 | 76.04 | 1.47 | 0.24 | 273.80 | 200.15 | 380.77 | 527.33 | 725.79 | 1261.03 | 0.12 | |
| 3723 | 50.00 | 1.25 | 0.13 | 85.70 | 338.55 | 629.71 | 867.77 | 1188.73 | 1764.98 | 0.00 | |
| 3723 | 50.00 | 1.12 | 0.15 | 46.80 | 312.93 | 584.04 | 801.60 | 1102.97 | 1764.98 | 0.00 | |
| 2555 | 50.00 | 0.85 | 0.12 | 42.70 | 337.10 | 623.24 | 857.42 | 1189.40 | 1764.98 | 0.00 | |
| 1859 | 50.00 | 0.82 | 0.11 | 30.00 | 356.99 | 661.58 | 913.72 | 1261.60 | 1764.98 | 0.00 | |

Figure 7: Predicted Life versus Run for Part Outside of Risk Threshold


Fig. 8: Graphical Representation of Predicted Life of Electronic Parts in Drilling Tool – May 2014

Most of the serial numbers have remained the same. However, based on the previous data, some changes have been made to the tool build. There is also a noticeable change in the percentage of missing data, which these dynamic models take into account, and a new risk is calculated for each part. The updated data have also had a positive impact on the results, as shown in Table 5.

Table 5: Predicted Life of Electronic Parts in Drilling Tool Results – May 2014

Results Interpretation Key

Low Risk: 0.0 – 0.25
Medium Risk: 0.25 – 0.50
High Risk: >0.50
Uncertainty in prediction due to missing data (>30%)

Serial numbers 7, 9, 11, 17 and 18 were identified as high or medium risk in Table 3, and appropriate maintenance actions were taken. Parts that had risk lower than the risk threshold were used, and previous parts that met the risk threshold remained unchanged. Table 5 enables better decision making because the probability of failure can be assessed before a tool is sent into the field.

Case Study 2: Sparing Optimization

This section shows the application of the optimization technique developed in the previous section to determine the best possible selection of sub-components to maximize the overall system reliability of both assets (example: Fig. 9).

Table 5 data:

| Part Number | Serial Number | Last Job No | Cumulative Temperature (C) | Cumulative StickSlip | Cumulative Lateral (g_RMS) | DrillHrs [h] | Worst Case Life | 25Q | Predicted Mean Life | 75Q | Best Case Life | Risk | Part Description | Comments |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | xxxx-1 | 10000 | 50 (L0) | 0.35 (L1) | 1.39 (L2) | 23.33 | 301.10 | 428.47 | 518.63 | 622.46 | 834.54 | 0.00 | PCBA | (1) Missing data for 1 run (12%) |
| 2 | xxxx-2 | 10000 | 50 (L0) | 0.38 (L1) | 1.5 (L2) | 19.83 | 132.53 | 236.27 | 307.20 | 377.36 | 505.40 | 0.00 | BATTERY | |
| 3 | xxxx-3 | 10000 | 91.57 (L1) | 0.32 (L1) | 1.39 (L2) | 332.25 | 278.65 | 451.23 | 601.87 | 731.80 | 997.62 | 0.11 | PCBA | (1) Missing data for 1 run (6%) |
| 4 | xxxx-4 | 10000 | 50 (L0) | 0.38 (L1) | 1.5 (L2) | 19.83 | 373.68 | 587.34 | 744.87 | 937.90 | 1390.76 | 0.00 | PCBA | |
| 5 | xxxx-5 | 10000 | 50 (L0) | 0.32 (L1) | 1.34 (L2) | 216.99 | 216.73 | 389.55 | 534.26 | 739.48 | 1229.68 | 0.03 | TRANSDUCER | (1) Missing data for 2 runs (11%) |
| 6 | xxxx-6 | 10000 | 50 (L0) | 0.35 (L1) | 1.39 (L2) | 23.33 | 124.35 | 204.65 | 266.52 | 318.90 | 418.51 | 0.00 | BATTERY ASSY | (1) Missing data for 1 run (12%) |
| 7 | xxxx-7 | 10000 | 59.32 (L0) | 0.21 (L1) | 0.97 (L1) | 614.74 | 237.15 | 448.23 | 628.43 | 865.80 | 1511.01 | 0.49 | PCBA | (1) Missing data for 2 runs (8%) |
| 8 | xxxx-8 | 10000 | 59.32 (L0) | 0.21 (L1) | 0.97 (L1) | 614.74 | 571.43 | 850.71 | 1066.84 | 1343.04 | 2055.70 | 0.05 | MAGNETOMETER | (1) Missing data for 2 runs (8%) |
| 9 | xxxx-9 | 10000 | 59.32 (L0) | 0.21 (L1) | 0.97 (L1) | 614.74 | 426.25 | 638.38 | 800.76 | 989.99 | 1285.00 | 0.22 | ACCELEROMETER | (1) Missing data for 2 runs (8%) |
| 10 | xxxx-10 | 10000 | 59.32 (L0) | 0.21 (L1) | 0.97 (L1) | 614.74 | 453.21 | 792.79 | 1076.48 | 1410.04 | 2236.77 | 0.13 | PCBA | (1) Missing data for 2 runs (8%) |
| 11 | xxxx-11 | 10000 | 59.32 (L0) | 0.21 (L1) | 0.97 (L1) | 614.74 | 254.15 | 538.13 | 773.65 | 1101.58 | 2038.32 | 0.33 | PCBA | (1) Missing data for 2 runs (8%) |
| 12 | xxxx-12 | 10000 | 91.57 (L1) | 0.32 (L1) | 1.39 (L2) | 332.25 | 223.98 | 480.48 | 685.26 | 969.48 | 1686.29 | 0.12 | SENSOR | (1) Missing data for 1 run (6%) |
| 13 | xxxx-13 | 10000 | 50 (L0) | 0.38 (L1) | 1.5 (L2) | 19.83 | 266.49 | 581.45 | 923.16 | 1315.53 | 2178.79 | 0.00 | POWER SUPPLY | |
| 14 | xxxx-14 | 10000 | 91.57 (L1) | 0.32 (L1) | 1.39 (L2) | 332.25 | 223.98 | 480.48 | 685.26 | 969.48 | 1686.29 | 0.12 | SENSOR | (1) Missing data for 1 run (6%) |
| 15 | xxxx-15 | 10000 | 50 (L0) | 0.38 (L1) | 1.5 (L2) | 19.83 | 266.49 | 581.45 | 923.16 | 1315.53 | 2178.79 | 0.00 | POWER SUPPLY | |
| 16 | xxxx-201 | 9859 | 59.32 (L0) | 0.21 (L1) | 0.97 (L1) | 614.74 | 453.21 | 792.79 | 1076.48 | 1410.04 | 2236.77 | 0.13 | PCBA | (1) Missing data for 2 runs (8%) |
| 17 | xxxx-1715 | 9859 | 59.32 (L0) | 0.21 (L1) | 0.97 (L1) | 614.74 | 426.25 | 638.38 | 800.76 | 989.99 | 1285.00 | 0.22 | ACCELEROMETER | (1) Missing data for 2 runs (8%) |
| 18 | xxxx-1869 | 9859 | 59.32 (L0) | 0.21 (L1) | 0.97 (L1) | 614.74 | 426.25 | 638.38 | 800.76 | 989.99 | 1285.00 | 0.22 | ACCELEROMETER | (1) Missing data for 2 runs (8%) |
| 19 | xxxx-1975 | 9859 | 59.32 (L0) | 0.21 (L1) | 0.97 (L1) | 614.74 | 571.43 | 851.33 | 1066.84 | 1344.01 | 2055.70 | 0.05 | MAGNETOMETER | (1) Missing data for 2 runs (8%) |
| 20 | xxxx-20 | 10000 | 50 (L0) | 0.38 (L1) | 1.5 (L2) | 19.83 | 123.18 | 202.70 | 264.81 | 316.51 | 416.86 | 0.00 | BATTERY ASSY | |


Fig. 9: Asset Swapping

The reliability of each component is calculated using the life prediction method described in Kale et al. (2014). The system reliability of the asset is calculated using Equation 2. The baseline reliability of the two assets is shown in Table 6. The first and third columns in the table list the names of the subcomponents that make up each asset; the second and fourth columns show the risk of failure of each individual subcomponent in the asset. The part name shown in each row represents a unique part serial number present in each asset. For example, Asset1-01 is functionally different from Asset1-02, Asset1-02 is functionally different from Asset1-03, and so on. Part names with the same numerical index represent different samples of the same functional part. For example, Asset1-01 and Asset2-01 represent identically manufactured parts that have the same functionality. The last row in the table shows the overall system reliability of the assets computed using Equation 2.

Table 6: Risk of Individual Sub-Components in Two Assets

| Part Name Asset 1 | Risk of Asset 1 | Part Name Asset 2 | Risk of Asset 2 |
|---|---|---|---|
| Asset1-01 | 5×10^-4 | Asset2-01 | 5×10^-4 |
| Asset1-02 | 5×10^-4 | Asset2-02 | 5×10^-4 |
| Asset1-03 | 5×10^-4 | Asset2-03 | 0.025 |
| Asset1-04 | 5×10^-4 | Asset2-04 | 5×10^-4 |
| Asset1-05 | 0.051 | Asset2-05 | 5×10^-4 |
| Asset1-06 | 5×10^-4 | Asset2-06 | 5×10^-4 |
| Asset1-07 | 0.076 | Asset2-07 | 0.038 |
| Asset1-08 | 0.001 | Asset2-08 | 5×10^-4 |
| Asset1-09 | 5×10^-4 | Asset2-09 | 0.029 |
| Asset1-10 | 0.031 | Asset2-10 | 5×10^-4 |
| Asset1-11 | 5×10^-4 | Asset2-11 | 5×10^-4 |
| Asset1-12 | 5×10^-4 | Asset2-12 | 5×10^-4 |
| Asset1-13 | 5×10^-4 | Asset2-13 | 0.061 |
| Asset1-14 | 5×10^-4 | Asset2-14 | 0.061 |
| Asset1-15 | 5×10^-4 | Asset2-15 | 5×10^-4 |
| Asset1-16 | 5×10^-4 | Asset2-16 | 5×10^-4 |
| Asset1-17 | 0.030 | Asset2-17 | 0.0287 |
| Asset1-18 | 0.030 | Asset2-18 | 0.0287 |
| Asset1-19 | 0.030 | Asset2-19 | 0.0287 |
| System Reliability | 0.774 | | 0.735 |
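Equation 2 is not reproduced in this part of the paper. As a rough check on Table 6, the sketch below assumes it is the usual series-system product of part reliabilities, R_sys = (1 - risk_1)(1 - risk_2)...(1 - risk_n); that form is an assumption here, not a formula confirmed by the text. The function name `system_reliability` is hypothetical, and the risk lists are transcribed from Table 6.

```python
from math import prod

# Part risks transcribed from Table 6, in part order 01..19.
ASSET1_RISK = [5e-4, 5e-4, 5e-4, 5e-4, 0.051, 5e-4, 0.076, 0.001, 5e-4, 0.031,
               5e-4, 5e-4, 5e-4, 5e-4, 5e-4, 5e-4, 0.030, 0.030, 0.030]
ASSET2_RISK = [5e-4, 5e-4, 0.025, 5e-4, 5e-4, 5e-4, 0.038, 5e-4, 0.029, 5e-4,
               5e-4, 5e-4, 0.061, 0.061, 5e-4, 5e-4, 0.0287, 0.0287, 0.0287]


def system_reliability(risks):
    """Series-system reliability (assumed form of Equation 2):
    the asset survives only if every part survives."""
    return prod(1.0 - r for r in risks)


r1 = system_reliability(ASSET1_RISK)
r2 = system_reliability(ASSET2_RISK)
print(f"Asset 1: {r1:.3f}  Asset 2: {r2:.3f}  Sum: {r1 + r2:.2f}")
```

Under this assumption the computed values fall within about 0.005 of the tabulated 0.774 and 0.735. The small difference may come from rounding of the tabulated part risks, or Equation 2 may differ in detail from the product form used here.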

The sum of the system reliabilities of the two assets (calculated by Equation 3) is 1.51. The total system reliability of these two assets is maximized by swapping parts using the simplex linear programming method. The condition for swapping parts is that the system reliability of each asset must be at least 50%. Table 7 shows the results of this optimization.


Table 7: Risk of Individual Sub-Components in Two Assets after Swapping Sub-Components Between Them to Maximize Cumulative System Reliability

| Part Name Asset 1 | Risk of Asset 1 | Part Name Asset 2 | Risk of Asset 2 | Comments |
|---|---|---|---|---|
| Asset1-01 | 5×10^-4 | Asset2-01 | 5×10^-4 | |
| Asset1-02 | 5×10^-4 | Asset2-02 | 5×10^-4 | |
| Asset1-03 | 5×10^-4 | Asset2-03 | 0.025 | |
| Asset1-04 | 5×10^-4 | Asset2-04 | 5×10^-4 | |
| Asset2-05 | 5×10^-4 | Asset1-05 | 0.051 | Swap |
| Asset1-06 | 5×10^-4 | Asset2-06 | 5×10^-4 | |
| Asset2-07 | 0.038 | Asset1-07 | 0.076 | Swap |
| Asset2-08 | 5×10^-4 | Asset1-08 | 0.00055 | Swap |
| Asset1-09 | 5×10^-4 | Asset2-09 | 0.03 | |
| Asset2-10 | 5×10^-4 | Asset1-10 | 0.03 | Swap |
| Asset1-11 | 5×10^-4 | Asset2-11 | 5×10^-4 | |
| Asset1-12 | 5×10^-4 | Asset2-12 | 5×10^-4 | |
| Asset1-13 | 5×10^-4 | Asset2-13 | 0.061 | |
| Asset1-14 | 5×10^-4 | Asset2-14 | 0.061 | |
| Asset1-15 | 5×10^-4 | Asset2-15 | 5×10^-4 | |
| Asset1-16 | 5×10^-4 | Asset2-16 | 5×10^-4 | |
| Asset2-17 | 0.0287 | Asset1-17 | 0.03 | Swap |
| Asset2-18 | 0.0287 | Asset1-18 | 0.03 | Swap |
| Asset2-19 | 0.0287 | Asset1-19 | 0.03 | Swap |
| System Reliability | 0.881 | | 0.646 | |
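The paper performs this optimization with the simplex linear programming method, whose formulation is not given in this section. The sketch below swaps in a different and much simpler technique, exhaustive enumeration of swap decisions, only to illustrate the objective (maximize the sum of the two system reliabilities) and the constraint (each asset at least 50% reliable). It assumes the series-product form of system reliability, which is not confirmed by the text; the names `best_swaps` and `MIN_RELIABILITY` are hypothetical, and the risks are the baseline values from Table 6.

```python
from itertools import product
from math import prod

# Baseline part risks from Table 6, in part order 01..19.
ASSET1_RISK = [5e-4, 5e-4, 5e-4, 5e-4, 0.051, 5e-4, 0.076, 0.001, 5e-4, 0.031,
               5e-4, 5e-4, 5e-4, 5e-4, 5e-4, 5e-4, 0.030, 0.030, 0.030]
ASSET2_RISK = [5e-4, 5e-4, 0.025, 5e-4, 5e-4, 5e-4, 0.038, 5e-4, 0.029, 5e-4,
               5e-4, 5e-4, 0.061, 0.061, 5e-4, 5e-4, 0.0287, 0.0287, 0.0287]
MIN_RELIABILITY = 0.5  # each asset must stay at or above 50%


def system_reliability(risks):
    """Series-system reliability (assumed form of Equation 2)."""
    return prod(1.0 - r for r in risks)


def best_swaps(risk_a, risk_b, floor=MIN_RELIABILITY):
    """Try every combination of like-for-like swaps and keep the best one.

    Returns (total, rel_a, rel_b, swapped_positions), with positions
    given as zero-based indices into the risk lists.
    """
    # Swapping two parts with identical risk cannot change the result.
    candidates = [i for i, (a, b) in enumerate(zip(risk_a, risk_b)) if a != b]
    best = None
    for flags in product((False, True), repeat=len(candidates)):
        new_a, new_b = list(risk_a), list(risk_b)
        for i, swap in zip(candidates, flags):
            if swap:
                new_a[i], new_b[i] = new_b[i], new_a[i]
        rel_a, rel_b = system_reliability(new_a), system_reliability(new_b)
        if rel_a < floor or rel_b < floor:
            continue  # violates the 50% constraint
        total = rel_a + rel_b
        if best is None or total > best[0]:
            swapped = [i for i, s in zip(candidates, flags) if s]
            best = (total, rel_a, rel_b, swapped)
    return best


baseline = system_reliability(ASSET1_RISK) + system_reliability(ASSET2_RISK)
total, rel_a, rel_b, swapped = best_swaps(ASSET1_RISK, ASSET2_RISK)
print(f"baseline sum {baseline:.3f}, optimized sum {total:.3f}")
print("swapped part numbers:", [f"{i + 1:02d}" for i in swapped])
```

Enumeration is practical here only because 11 of the 19 part positions differ between the assets (2,048 combinations). For larger fleets or spare pools a proper mathematical programming formulation, such as the one the paper refers to, would be needed.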

Table 7 shows that the combined reliability of the two assets can be maximized by swapping parts between them. The fifth column in Table 7 shows which parts were swapped between the two assets. As a result of the swaps, the reliability of the first asset increases from a baseline value of 0.774 to 0.881, and the reliability of the second asset decreases from a baseline value of 0.735 to 0.646.

A second scenario is presented in which additional new spare parts (assumed to have a risk of 0) are available to replace Asset-07, Asset-17, Asset-18 and Asset-19. The cumulative system reliability of the two assets is maximized by swapping parts between them and utilizing the additional spare parts. The results of the optimization are shown in Table 8. The overall reliability of the assets can be enhanced by optimally utilizing the existing subcomponents in the tool. Cost-based optimization can add the economics of spares and repairs to the decision-making process to determine which components must be replaced and which must remain active. A cost-of-failure factor may be added to decide the optimal maintenance interval and the level of repairs and replacements from the calculated risk of failure and cost of maintenance.


Table 8: Risk of Individual Sub-Components in Two Assets after Swapping Sub-Components between Them and Utilizing Additional Spare Parts to Maximize Cumulative System Reliability

| Part Name Asset 1 | Risk of Asset 1 | Part Name Asset 2 | Risk of Asset 2 | Comments |
|---|---|---|---|---|
| Asset1-01 | 5×10^-4 | Asset2-01 | 5×10^-4 | |
| Asset1-02 | 5×10^-4 | Asset2-02 | 5×10^-4 | |
| Asset2-03 | 0.025 | Asset1-03 | 5×10^-4 | Swap |
| Asset1-04 | 5×10^-4 | Asset2-04 | 5×10^-4 | |
| Asset1-05 | 0.051 | Asset2-05 | 5×10^-4 | |
| Asset1-06 | 5×10^-4 | Asset2-06 | 5×10^-4 | |
| Asset2-07 | 0.038 | Asset1-07 | 0.000 | New part/Swap |
| Asset1-08 | 0.001 | Asset2-08 | 5×10^-4 | |
| Asset2-09 | 0.029 | Asset1-09 | 5×10^-4 | Swap |
| Asset1-10 | 0.031 | Asset2-10 | 5×10^-4 | |
| Asset1-11 | 5×10^-4 | Asset2-11 | 5×10^-4 | |
| Asset1-12 | 5×10^-4 | Asset2-12 | 5×10^-4 | |
| Asset2-13 | 0.061 | Asset1-13 | 5×10^-4 | Swap |
| Asset2-14 | 0.061 | Asset1-14 | 5×10^-4 | Swap |
| Asset1-15 | 5×10^-4 | Asset2-15 | 5×10^-4 | |
| Asset1-16 | 5×10^-4 | Asset2-16 | 5×10^-4 | |
| Asset2-17 | 0.0287 | Asset1-17 | 0.000 | New part/Swap |
| Asset2-18 | 0.0287 | Asset1-18 | 0.000 | New part/Swap |
| Asset2-19 | 0.0287 | Asset1-19 | 0.000 | New part/Swap |
| System Reliability | 0.676 | | 0.999 | |

Conclusion

The paper presents how maintenance plans and reliability for drilling tools can be improved, while reducing cost, by taking advantage of the forecasting capability that lifetime prediction provides. The improvements ultimately help prevent costly failures in the field. Lifetime prediction is a way to make risk-informed decisions and is a catalyst for maintaining drilling tools by letting the data show where improvements or changes must be made. This shift in standard practice leads to lower maintenance costs without sacrificing reliability. Improvement in sparing forecasts is an additional benefit of this methodology because higher-risk parts are more easily identified.

Future work will focus on refining model predictions by using additional environmental variables, incorporating other statistical methodologies and integrating data from design and qualification tests to optimize drilling performance.

Acknowledgements

The authors thank Baker Hughes for permitting them the chance to work on such a trailblazing methodology.

Nomenclature

Drillhrs = Drilling hours

BHA = Bottomhole assembly

FRACAS = Failure Reporting and Corrective Action System

HALT = Highly accelerated life test

HAST = Highly accelerated stress test

MLE = Maximum likelihood estimation

NPT = Nonproductive Time

PCBA = Printed circuit board assembly

PHM = Prognostics and health management

PoF = Probability of failure

PN = Part number

RIDM = Risk-Informed Decision Making

RPM = Revolutions per minute

SN = Serial number

F = Failure

L = Lateral vibration

Mi = ith model identifier


N = Symbol used to represent negative decision, generally “no” or “0”

S = Symbol used to represent stick-slip or suspensions

T = Temperature

X = Vector of parameters such as temperature and vibrations

Y = Symbol used to represent affirmative decision, generally “yes” or “1”

f = Probability density function

m = Number of models

n = Number of records

p = Probability

p(a|b) = Conditional probability of occurrence of event a provided b is true

revid = Revision identifier

tf = Time to failure (drilling hours)

wi = Weight of ith data point

xave = Average value of parameter x

xstdev = Standard deviation of parameter x

α = Calibration parameters of reliability model

= Likelihood

η = Characteristic life or scale factor of a probability distribution

β = Shape factor of a probability distribution

σ = Standard deviation

λ = Hazard function

{CF} = Set of life data for confirmed failure

{O} = Set of outliers

{S} = Set of life data for suspension

{UF} = Set of life data for unconfirmed failure

Load, Stress and Severity are used interchangeably to describe the impact of the operational environment (mechanical and

thermal) on the durability of parts.

Nominal part is a representative part that has a life equal to the average of several parts produced using the same manufacturing process and operating under identical conditions.

Suspensions are used in reliability modeling to represent hours accumulated on parts that are in operation or removed from

service for reasons other than failure.

Bibliography

Barker, D., Dasgupta, A., and Pecht, M. (1992, February). PWB solder joint life calculations under thermal and vibrational loading. Journal of the IES, 35(1), 17-25.

Chatterjee, K., Modarres, M., and Bernstein, J. (2012). Fifty Years of Physics of Failure. Journal of Reliability Information Analysis Center.

Dantzig, G. B., and Thapa, M. N. (1997). Linear Programming 1: Introduction. Springer-Verlag.

Duffek, D. (2004). Effect of Combined Thermal and Mechanical Loading on the Fatigue of Solder Joints. Master's Thesis. Notre Dame: University of Notre Dame.

Garvey, D. R., Baumann, J., Lehr, J., and Hines, J. (2009). Pattern Recognition Based Remaining Useful Life Estimation of Bottom Hole Assembly Tools. SPE/IADC Drilling Conference and Exhibition. Amsterdam, The Netherlands.

Kale, A. A., Carter-Journet, K., Heuermann-Kuehn, L., Falgout, T., and Zurcher, D. (2014). A Probabilistic Approach to Reliability and Life Prediction of Electronics in Drilling and Evaluation Tools.

Litt, J., Soditus, S., Hendricks, R., and Zaretsky, E. (2001). Structural Life and Reliability Metrics Benchmarking and Verification of Probabilistic Life Prediction Codes. 5th Annual FAA/Air Force/NASA/Navy Workshop.

Mishra, S., and Pecht, M. (2002). In-situ Sensors for Product Reliability Monitoring. Proceedings of SPIE, 4755, pp. 10-19.

Murty, K. G. (1983). Linear Programming. New York: John Wiley and Sons Inc. pp. xix+482. ISBN 0-471-09725-X. MR 720547.

Reich, M. (2004). The Fascinating World of Drilling Technology: Products from Baker Hughes and their Functions. Celle: Baker Hughes.

Tuchband, B. A. (2007). Implementation of Prognostics and Health Management for Electronic Systems. College Park: University of Maryland.


Appendix A

A. General Log-Linear Model

The relation between characteristic life and stress variables is represented using one of three models: general log-linear (GLL), proportional hazard (PH) and cumulative damage (CD). The GLL model represents life using Equation A-1

( ̅) ∑ ∑ ∑

(A-1)

where $\bar{X} = \{T, L, S\}$. For a Weibull distribution, the probability density function is shown in Equation A-2, where β is the shape parameter, η is the scale parameter and the α's are unknown parameters calculated from field data using the maximum likelihood estimation technique.

$$f(t, \bar{X}) = \frac{\beta}{\eta(\bar{X})} \left( \frac{t}{\eta(\bar{X})} \right)^{\beta - 1} \exp\left[ -\left( \frac{t}{\eta(\bar{X})} \right)^{\beta} \right] \qquad \text{(A-2)}$$

The probability density function (PDF) for an exponential distribution can be obtained by setting β = 1 in Equation A-2. For a lognormal distribution, the probability density function for a GLL stress function is shown in Equation A-3

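$$f(t, \bar{X}) = \frac{1}{t\,\sigma\sqrt{2\pi}} \exp\left[ -\frac{1}{2} \left( \frac{\ln(t) - \ln \eta(\bar{X})}{\sigma} \right)^{2} \right] \qquad \text{(A-3)}$$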
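The Weibull case can be sketched numerically. The exact terms of Equation A-1 are not legible in this copy, so the sketch assumes the generic log-linear relation ln η = α0 + Σ αj·xj over the stresses (T, L, S); the coefficient values below are purely hypothetical placeholders, whereas in the paper the α's are fitted to field data by maximum likelihood. All function names are illustrative.

```python
import math


def characteristic_life(alpha0, alphas, stresses):
    """Assumed log-linear life relation: ln(eta) = alpha0 + sum(alpha_j * x_j)."""
    return math.exp(alpha0 + sum(a * x for a, x in zip(alphas, stresses)))


def weibull_pdf(t, beta, eta):
    """Weibull probability density at time t > 0 (standard two-parameter form)."""
    z = t / eta
    return (beta / eta) * z ** (beta - 1.0) * math.exp(-(z ** beta))


def weibull_failure_probability(t, beta, eta):
    """Probability that failure has occurred by time t (Weibull CDF)."""
    return 1.0 - math.exp(-((t / eta) ** beta))


# Hypothetical coefficients, NOT values from the paper.
ALPHA0 = 7.0
ALPHAS = (-0.005, -0.2, -0.5)   # for temperature T, lateral L, stick-slip S
BETA = 2.0                      # hypothetical shape parameter

# Stress levels from the first row of Table 4: T = 66.50, L = 1.15, S = 0.29.
stresses = (66.5, 1.15, 0.29)
eta = characteristic_life(ALPHA0, ALPHAS, stresses)
risk = weibull_failure_probability(692.17, BETA, eta)
print(f"eta = {eta:.1f} h, probability of failure at 692.17 h = {risk:.2f}")
```

Because the coefficients are invented, the printed risk is not expected to match the tabulated value. The sketch only shows how a stress vector maps to a characteristic life and then to a probability of failure; setting `beta` to 1 in `weibull_pdf` gives the exponential density, as noted in the text.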

B. Proportional Hazard Model

For a proportional hazard model, the hazard rate of a component is affected by hours in operation and stress variables. The instantaneous hazard rate of a part is given by Equation A-4

$$\lambda(t, \bar{X}) = \frac{f(t, \bar{X})}{R(t, \bar{X})} = \lambda_0(t)\, \eta(\bar{X}, \bar{\alpha}) \qquad \text{(A-4)}$$

where f is the probability density function and R is the reliability function. The baseline hazard rate, $\lambda_0$, is a function of time only, and the stress function, η, is a function of operating stresses such as temperature and vibration. The list of unknown model parameters $\bar{\alpha}$ is obtained by calibrating the model to test data using maximum likelihood estimation (MLE). The stress function, η, is given by Equation A-5

( ̅) ∑ ∑ ∑

(A-5)

Substituting Equation A-5 in Equation A-2, the hazard function for a Weibull distribution is written using Eq. (A-6)

( ̅)

(

)

∑ ∑ ∑

(A-6)

C. Cumulative Damage Model

The cumulative damage model incorporates the effect of time-varying stress on the life of components. The model takes into account the impact of damage accumulated at each stress level on the reliability of parts. Damage accumulation can take place at various rates for various stress levels and can be determined using the linear damage sum (Miner's rule), the inverse power law or cycle counting techniques such as rainflow counting. The cumulative damage model used in this paper is established from Miner's rule, which is based on the hypothesis that if there are n different stress levels and the time to failure at the ith stress $\sigma_i$ is $T_{fi}$, then the damage fraction, p, is given by Equation A-7

$$p = \sum_{i=1}^{n} \frac{t_i}{T_{fi}} \qquad \text{(A-7)}$$

where $t_i$ is the number of cycles accumulated at stress $\sigma_i$ and failure occurs when the damage fraction equals unity. The probability distribution functions for Weibull and lognormal distributions are obtained by substituting Equation A-7 in Equations A-2 and A-3, respectively. Given the stress variables $\bar{X} = \{T, L, S\}$, the PDF for a Weibull distribution is given by

( ̅) ∫

∑ ( ) ∑ ∑ ( ) ( )


( ̅) ( ̅)( ( ̅))

(( ( ̅)))

(A-8)
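The linear damage sum behind the cumulative damage model (Miner's rule) can be illustrated with a few lines of code. This is a minimal sketch: the function name `damage_fraction` and the exposure values are hypothetical and are not taken from the paper's data.

```python
def damage_fraction(exposures):
    """Miner's rule: sum of (time spent at a stress level) divided by
    (time to failure at that stress level). Failure is predicted when
    the sum reaches 1.

    exposures: iterable of (t_i, tf_i) pairs in the same units.
    """
    total = 0.0
    for t_i, tf_i in exposures:
        if tf_i <= 0:
            raise ValueError("time to failure must be positive")
        total += t_i / tf_i
    return total


# Hypothetical history: hours accumulated at three stress levels and the
# corresponding hours to failure if the part ran at that level alone.
history = [(100.0, 1000.0), (200.0, 800.0), (50.0, 250.0)]
p = damage_fraction(history)
print(f"damage fraction = {p:.2f}, remaining = {1.0 - p:.2f}")
```

Each pair contributes its own share of consumed life, so a short exposure at a severe stress level (small time to failure) can consume as much life as a long exposure at a benign one.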

D. Characteristic Life Function

The life characteristic function describes a general relation between failure time and stress levels. The life characteristic can be any time-to-failure measure, such as the mean, median or hazard rate, that represents a bulk property of a probability distribution. Ideally, the function must incorporate the governing equations that represent the physical phenomenon of degradation of the material under the application of load. Typical electronic circuit boards used in drilling and evaluation tools are complex, and the governing equations representing degradation and failure mechanisms are difficult to model; therefore, the paper evaluates several empirical functions between stress variables and selects the one that best fits the field data.